Sample records for carlo mcmc algorithm

  1. An Efficient MCMC Algorithm to Sample Binary Matrices with Fixed Marginals

    ERIC Educational Resources Information Center

    Verhelst, Norman D.

    2008-01-01

    Uniform sampling of binary matrices with fixed margins is known as a difficult problem. Two classes of algorithms to sample from a distribution not too different from the uniform are studied in the literature: importance sampling and Markov chain Monte Carlo (MCMC). Existing MCMC algorithms converge slowly, require a long burn-in period and yield…

  2. Experiences with Markov Chain Monte Carlo Convergence Assessment in Two Psychometric Examples

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2004-01-01

    There is an increasing use of Markov chain Monte Carlo (MCMC) algorithms for fitting statistical models in psychometrics, especially in situations where the traditional estimation techniques are very difficult to apply. One of the disadvantages of using an MCMC algorithm is that it is not straightforward to determine the convergence of the…
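
    The abstract is cut off before it says which convergence checks were used, so the following is only a generic illustration of one widely used diagnostic, the Gelman-Rubin potential scale reduction factor, and not the procedure of the paper. The function name `gelman_rubin` and the synthetic chains are assumptions made for the sketch.

```python
import random
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one scalar."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: mean within-chain variance; B/n: variance of the chain means.
    w = statistics.fmean([statistics.variance(c) for c in chains])
    b_over_n = statistics.variance(chain_means)
    var_plus = (n - 1) / n * w + b_over_n
    return (var_plus / w) ** 0.5

rng = random.Random(0)
# Four chains that all sample the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four chains stuck around different values.
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (-3.0, 0.0, 3.0, 6.0)]
rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

    Values close to 1 are consistent with convergence, while values well above 1 indicate that the chains disagree; a value near 1 does not by itself prove convergence.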

  3. Annealed Importance Sampling Reversible Jump MCMC algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karagiannis, Georgios; Andrieu, Christophe

    2013-03-20

    It will soon be 20 years since reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithms were proposed. They have significantly extended the scope of Markov chain Monte Carlo simulation methods, offering the promise to be able to routinely tackle transdimensional sampling problems, as encountered in Bayesian model selection problems for example, in a principled and flexible fashion. Their practical efficient implementation, however, still remains a challenge. A particular difficulty encountered in practice is in the choice of the dimension matching variables (both their nature and their distribution) and the reversible transformations which allow one to define the one-to-one mappings underpinning the design of these algorithms. Indeed, even seemingly sensible choices can lead to algorithms with very poor performance. The focus of this paper is the development and performance evaluation of a method, annealed importance sampling RJ-MCMC (aisRJ), which addresses this problem by mitigating the sensitivity of RJ-MCMC algorithms to the aforementioned poor design. As we shall see, the algorithm can be understood as being an “exact approximation” of an idealized MCMC algorithm that would sample from the model probabilities directly in a model selection set-up. Such an idealized algorithm may have good theoretical convergence properties, but typically cannot be implemented, and our algorithms can approximate the performance of such idealized algorithms to an arbitrary degree while not introducing any bias for any degree of approximation. Our approach combines the dimension matching ideas of RJ-MCMC with annealed importance sampling and its Markov chain Monte Carlo implementation. We illustrate the performance of the algorithm with numerical simulations which indicate that, although the approach may at first appear computationally involved, it is in fact competitive.

  4. Bayesian Estimation of Multidimensional Item Response Models. A Comparison of Analytic and Simulation Algorithms

    ERIC Educational Resources Information Center

    Martin-Fernandez, Manuel; Revuelta, Javier

    2017-01-01

    This study compares the performance of two estimation algorithms of new usage, the Metropolis-Hastings Robbins-Monro (MHRM) and the Hamiltonian MCMC (HMC), with two consolidated algorithms in the psychometric literature, the marginal likelihood via EM algorithm (MML-EM) and the Markov chain Monte Carlo (MCMC), in the estimation of multidimensional…

  5. An NCME Instructional Module on Estimating Item Response Theory Models Using Markov Chain Monte Carlo Methods

    ERIC Educational Resources Information Center

    Kim, Jee-Seon; Bolt, Daniel M.

    2007-01-01

    The purpose of this ITEMS module is to provide an introduction to Markov chain Monte Carlo (MCMC) estimation for item response models. A brief description of Bayesian inference is followed by an overview of the various facets of MCMC algorithms, including discussion of prior specification, sampling procedures, and methods for evaluating chain…

  6. A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data.

    PubMed

    Liang, Faming; Kim, Jinsu; Song, Qifan

    2016-01-01

    Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, which typically requires a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this paper, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is, to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively.

  7. A Bootstrap Metropolis–Hastings Algorithm for Bayesian Analysis of Big Data

    PubMed Central

    Kim, Jinsu; Song, Qifan

    2016-01-01

    Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, which typically requires a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this paper, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is, to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively. PMID:29033469

  8. Searching for efficient Markov chain Monte Carlo proposal kernels

    PubMed Central

    Yang, Ziheng; Rodríguez, Carlos E.

    2013-01-01

    Markov chain Monte Carlo (MCMC) or the Metropolis–Hastings algorithm is a simulation algorithm that has made modern Bayesian statistical inference possible. Nevertheless, the efficiency of different Metropolis–Hastings proposal kernels has rarely been studied except for the Gaussian proposal. Here we propose a unique class of Bactrian kernels, which avoid proposing values that are very close to the current value, and compare their efficiency with a number of proposals for simulating different target distributions, with efficiency measured by the asymptotic variance of a parameter estimate. The uniform kernel is found to be more efficient than the Gaussian kernel, whereas the Bactrian kernel is even better. When optimal scales are used for both, the Bactrian kernel is at least 50% more efficient than the Gaussian. Implementation in a Bayesian program for molecular clock dating confirms the general applicability of our results to generic MCMC algorithms. Our results refute a previous claim that all proposals had nearly identical performance and will prompt further research into efficient MCMC proposals. PMID:24218600
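
    To make the idea of a proposal that avoids values very close to the current one concrete, here is a minimal sketch of a Metropolis sampler with a two-humped proposal. It assumes the Bactrian step is an equal mixture of two normals centred at -m and +m with variance 1 - m^2 (so the step has mean 0 and variance 1); the value m = 0.95, the scale 2.0, the standard normal target and all function names are illustrative choices, not settings taken from the abstract.

```python
import math
import random

def bactrian_step(rng, m=0.95):
    """Draw from an equal mixture of N(-m, 1 - m^2) and N(+m, 1 - m^2)."""
    centre = m if rng.random() < 0.5 else -m
    return rng.gauss(centre, math.sqrt(1.0 - m * m))

def metropolis(log_target, x0, n_steps, scale, rng, m=0.95):
    x, lp = x0, log_target(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = x + scale * bactrian_step(rng, m)
        lp_y = log_target(y)
        # The proposal is symmetric, so the acceptance ratio is the target ratio.
        if math.log(1.0 - rng.random()) < lp_y - lp:
            x, lp = y, lp_y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps

rng = random.Random(1)
samples, acc_rate = metropolis(lambda x: -0.5 * x * x, 0.0, 20000, 2.0, rng)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

    Because the mixture is symmetric about zero, no Hastings correction is needed; only the shape of the step distribution differs from a Gaussian random walk.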

  9. MCMC-ODPR: primer design optimization using Markov Chain Monte Carlo sampling.

    PubMed

    Kitchen, James L; Moore, Jonathan D; Palmer, Sarah A; Allaby, Robin G

    2012-11-05

    Next generation sequencing technologies often require numerous primer designs that require good target coverage that can be financially costly. We aimed to develop a system that would implement primer reuse to design degenerate primers that could be designed around SNPs, thus find the fewest necessary primers and the lowest cost whilst maintaining an acceptable coverage and provide a cost effective solution. We have implemented Metropolis-Hastings Markov Chain Monte Carlo for optimizing primer reuse. We call it the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR) algorithm. After repeating the program 1020 times to assess the variance, an average of 17.14% fewer primers were found to be necessary using MCMC-ODPR for an equivalent coverage without implementing primer reuse. The algorithm was able to reuse primers up to five times. We compared MCMC-ODPR with single sequence primer design programs Primer3 and Primer-BLAST and achieved a lower primer cost per amplicon base covered of 0.21 and 0.19 and 0.18 primer nucleotides on three separate gene sequences, respectively. With multiple sequences, MCMC-ODPR achieved a lower cost per base covered of 0.19 than programs BatchPrimer3 and PAMPS, which achieved 0.25 and 0.64 primer nucleotides, respectively. MCMC-ODPR is a useful tool for designing primers at various melting temperatures at good target coverage. By combining degeneracy with optimal primer reuse the user may increase coverage of sequences amplified by the designed primers at significantly lower costs. Our analyses showed that overall MCMC-ODPR outperformed the other primer-design programs in our study in terms of cost per covered base.

  10. MCMC-ODPR: Primer design optimization using Markov Chain Monte Carlo sampling

    PubMed Central

    2012-01-01

    Background: Next generation sequencing technologies often require numerous primer designs that require good target coverage that can be financially costly. We aimed to develop a system that would implement primer reuse to design degenerate primers that could be designed around SNPs, thus find the fewest necessary primers and the lowest cost whilst maintaining an acceptable coverage and provide a cost effective solution. We have implemented Metropolis-Hastings Markov Chain Monte Carlo for optimizing primer reuse. We call it the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR) algorithm. Results: After repeating the program 1020 times to assess the variance, an average of 17.14% fewer primers were found to be necessary using MCMC-ODPR for an equivalent coverage without implementing primer reuse. The algorithm was able to reuse primers up to five times. We compared MCMC-ODPR with single sequence primer design programs Primer3 and Primer-BLAST and achieved a lower primer cost per amplicon base covered of 0.21 and 0.19 and 0.18 primer nucleotides on three separate gene sequences, respectively. With multiple sequences, MCMC-ODPR achieved a lower cost per base covered of 0.19 than programs BatchPrimer3 and PAMPS, which achieved 0.25 and 0.64 primer nucleotides, respectively. Conclusions: MCMC-ODPR is a useful tool for designing primers at various melting temperatures at good target coverage. By combining degeneracy with optimal primer reuse the user may increase coverage of sequences amplified by the designed primers at significantly lower costs. Our analyses showed that overall MCMC-ODPR outperformed the other primer-design programs in our study in terms of cost per covered base. PMID:23126469

  11. On an adaptive preconditioned Crank-Nicolson MCMC algorithm for infinite dimensional Bayesian inference

    NASA Astrophysics Data System (ADS)

    Hu, Zixi; Yao, Zhewei; Li, Jinglai

    2017-03-01

    Many scientific and engineering problems require Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, which is referred to as being dimension dependent. To address this, a family of dimension-independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, was proposed to sample the infinite dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally we provide numerical examples to demonstrate the performance of the proposed method.
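
    For orientation, the standard (non-adaptive) pCN step can be sketched as follows; this is the baseline method, not the adaptive variant developed in the paper. The sketch assumes a zero-mean Gaussian prior with diagonal covariance on a truncated coefficient vector, a toy likelihood that observes only the first coefficient, and illustrative names (`pcn`, `prior_std`, `beta`).

```python
import math
import random

def pcn(neg_log_like, prior_std, n_steps, beta, rng):
    """pCN sampler for a N(0, diag(prior_std^2)) prior; records coordinate 0."""
    u = [rng.gauss(0.0, s) for s in prior_std]  # start from a prior draw
    phi_u = neg_log_like(u)
    shrink = math.sqrt(1.0 - beta * beta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # Proposal: v = sqrt(1 - beta^2) * u + beta * xi, with xi drawn from the prior.
        v = [shrink * ui + beta * rng.gauss(0.0, s) for ui, s in zip(u, prior_std)]
        phi_v = neg_log_like(v)
        # The prior terms cancel; only the likelihood enters the acceptance ratio.
        if math.log(1.0 - rng.random()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        chain.append(u[0])
    return chain, accepted / n_steps

rng = random.Random(2)
prior_std = [1.0 / (k + 1) for k in range(20)]   # decaying prior scales
neg_log_like = lambda u: 0.5 * (u[0] - 2.0) ** 2  # observe u[0] = 2 with unit noise
chain, pcn_acc = pcn(neg_log_like, prior_std, 20000, 0.5, rng)
kept = chain[2000:]                               # discard burn-in
post_mean = sum(kept) / len(kept)
```

    In this toy setting the posterior mean of the first coefficient is 1 in closed form, which gives a simple check. The acceptance rule does not involve the prior density, which is what keeps the method well defined as the number of coefficients grows.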

  12. Teaching Markov Chain Monte Carlo: Revealing the Basic Ideas behind the Algorithm

    ERIC Educational Resources Information Center

    Stewart, Wayne; Stewart, Sepideh

    2014-01-01

    For many scientists, researchers and students Markov chain Monte Carlo (MCMC) simulation is an important and necessary tool to perform Bayesian analyses. The simulation is often presented as a mathematical algorithm and then translated into an appropriate computer program. However, this can result in overlooking the fundamental and deeper…

  13. Modeling and Bayesian parameter estimation for shape memory alloy bending actuators

    NASA Astrophysics Data System (ADS)

    Crews, John H.; Smith, Ralph C.

    2012-04-01

    In this paper, we employ a homogenized energy model (HEM) for shape memory alloy (SMA) bending actuators. Additionally, we utilize a Bayesian method for quantifying parameter uncertainty. The system consists of a SMA wire attached to a flexible beam. As the actuator is heated, the beam bends, providing endoscopic motion. The model parameters are fit to experimental data using an ordinary least-squares approach. The uncertainty in the fit model parameters is then quantified using Markov Chain Monte Carlo (MCMC) methods. The MCMC algorithm provides bounds on the parameters, which will ultimately be used in robust control algorithms. One purpose of the paper is to test the feasibility of the Random Walk Metropolis algorithm, the MCMC method used here.

  14. Modelling maximum river flow by using Bayesian Markov Chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Cheong, R. Y.; Gabda, D.

    2017-09-01

    Analysis of flood trends is vital since flooding threatens human life financially, environmentally and in terms of security. The data of annual maximum river flows in Sabah were fitted to the generalized extreme value (GEV) distribution. The maximum likelihood estimator (MLE) arises naturally when working with the GEV distribution. However, previous research showed that the MLE provides unstable results, especially for small sample sizes. In this study, we used different Bayesian Markov Chain Monte Carlo (MCMC) methods based on the Metropolis-Hastings algorithm to estimate GEV parameters. The Bayesian MCMC method is a statistical inference approach that estimates parameters using the posterior distribution based on Bayes’ theorem. The Metropolis-Hastings algorithm is used to overcome the high dimensional state space faced in the Monte Carlo method. This approach also accounts for more uncertainty in parameter estimation, which then yields a better prediction of maximum river flow in Sabah.

  15. Enhancing Data Assimilation by Evolutionary Particle Filter and Markov Chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Moradkhani, H.; Abbaszadeh, P.; Yan, H.

    2016-12-01

    Particle Filters (PFs) have received increasing attention from researchers in different disciplines in hydro-geosciences as an effective method to improve model predictions in nonlinear and non-Gaussian dynamical systems. The implication of dual state and parameter estimation by means of data assimilation in hydrology and geoscience has evolved since 2005 from SIR-PF to PF-MCMC and now to the most effective and robust framework through evolutionary PF approach based on Genetic Algorithm (GA) and Markov Chain Monte Carlo (MCMC), the so-called EPF-MCMC. In this framework, the posterior distribution undergoes an evolutionary process to update an ensemble of prior states that more closely resemble realistic posterior probability distribution. The premise of this approach is that the particles move to optimal position using the GA optimization coupled with MCMC increasing the number of effective particles, hence the particle degeneracy is avoided while the particle diversity is improved. The proposed algorithm is applied on a conceptual and highly nonlinear hydrologic model and the effectiveness, robustness and reliability of the method in jointly estimating the states and parameters and also reducing the uncertainty is demonstrated for a few river basins across the United States.

  16. Using Stan for Item Response Theory Models

    ERIC Educational Resources Information Center

    Ames, Allison J.; Au, Chi Hang

    2018-01-01

    Stan is a flexible probabilistic programming language providing full Bayesian inference through Hamiltonian Monte Carlo algorithms. The benefits of Hamiltonian Monte Carlo include improved efficiency and faster inference, when compared to other MCMC software implementations. Users can interface with Stan through a variety of computing…

  17. emcee: The MCMC Hammer

    NASA Astrophysics Data System (ADS)

    Foreman-Mackey, Daniel; Hogg, David W.; Lang, Dustin; Goodman, Jonathan

    2013-03-01

    We introduce a stable, well tested Python implementation of the affine-invariant ensemble sampler for Markov chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010). The code is open source and has already been used in several published projects in the astrophysics literature. The algorithm behind emcee has several advantages over traditional MCMC sampling methods and it has excellent performance as measured by the autocorrelation time (or function calls per independent sample). One major advantage of the algorithm is that it requires hand-tuning of only 1 or 2 parameters compared to ~N^2 for a traditional algorithm in an N-dimensional parameter space. In this document, we describe the algorithm and the details of our implementation. Exploiting the parallelism of the ensemble method, emcee permits any user to take advantage of multiple CPU cores without extra effort. The code is available online at http://dan.iel.fm/emcee under the GNU General Public License v2.
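
    The core move of the affine-invariant ensemble sampler can be illustrated with a short pure-Python sketch of the stretch move. This is not emcee's code or API: it is a serial version (emcee updates the ensemble in parallel halves), and the stretch parameter a = 2, the 2-D Gaussian target with very different scales, and the function names are assumptions made for the example.

```python
import math
import random

def sample_z(rng, a=2.0):
    # Inverse-CDF draw from the density g(z) proportional to 1/sqrt(z) on [1/a, a].
    return ((a - 1.0) * rng.random() + 1.0) ** 2 / a

def stretch_sweep(walkers, log_prob, log_target, rng, a=2.0):
    """Update every walker once, in place, using another walker as the anchor."""
    n, d = len(walkers), len(walkers[0])
    for k in range(n):
        j = rng.randrange(n - 1)
        if j >= k:
            j += 1  # guarantees j != k
        z = sample_z(rng, a)
        y = [xj + z * (xk - xj) for xj, xk in zip(walkers[j], walkers[k])]
        lp_y = log_target(y)
        # Accept with probability min(1, z^(d-1) * p(y) / p(x_k)).
        if math.log(1.0 - rng.random()) < (d - 1) * math.log(z) + lp_y - log_prob[k]:
            walkers[k], log_prob[k] = y, lp_y

rng = random.Random(3)
log_target = lambda x: -0.5 * (x[0] ** 2 + (x[1] / 10.0) ** 2)  # sd 1 and sd 10
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
log_prob = [log_target(w) for w in walkers]
draws = []
for sweep in range(2000):
    stretch_sweep(walkers, log_prob, log_target, rng)
    if sweep >= 500:  # discard burn-in sweeps
        draws.extend(list(w) for w in walkers)
m0 = sum(d[0] for d in draws) / len(draws)
m1 = sum(d[1] for d in draws) / len(draws)
var0 = sum((d[0] - m0) ** 2 for d in draws) / len(draws)
var1 = sum((d[1] - m1) ** 2 for d in draws) / len(draws)
```

    No per-dimension step size appears anywhere: the proposal scale is inherited from the spread of the ensemble itself, which is why only a (and the number of walkers) needs to be chosen by hand.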

  18. Synthesis of MCMC and Belief Propagation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Sungsoo; Chertkov, Michael; Shin, Jinwoo

    Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast, empirically very successful, however in general lacking control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach which allows one to express the BP error as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial-time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we first propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes.

  19. The estimation of lower refractivity uncertainty from radar sea clutter using the Bayesian—MCMC method

    NASA Astrophysics Data System (ADS)

    Sheng, Zheng

    2013-02-01

    The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov Chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to the global optimization algorithm, the Bayesian—MCMC can obtain not only the approximate solutions, but also the probability distributions of the solutions, that is, uncertainty analyses of solutions. The Bayesian—MCMC algorithm is implemented on the simulation radar sea-clutter data and the real radar sea-clutter data. Reference data are assumed to be simulation data and refractivity profiles are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles from the assumed simulation and the helicopter sounding data; (ii) the one-dimensional (1D) and two-dimensional (2D) posterior probability distribution of solutions.

  20. Iterative Importance Sampling Algorithms for Parameter Estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grout, Ray W; Morzfeld, Matthias; Day, Marcus S.

    In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is a challenging task. Several sampling algorithms have been proposed over the past years that take an iterative approach to constructing a proposal distribution. We investigate the applicability of such algorithms by applying them to two realistic and challenging test problems, one in subsurface flow, and one in combustion modeling. More specifically, we implement importance sampling algorithms that iterate over the mean and covariance matrix of Gaussian or multivariate t-proposal distributions. Our implementation leverages massively parallel computers, and we present strategies to initialize the iterations using 'coarse' MCMC runs or Gaussian mixture models.
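
    The idea of iterating over the mean and covariance of a Gaussian proposal can be sketched in one dimension as follows. This is a generic illustration under simplifying assumptions (a scalar parameter, a Gaussian toy target centred at 3 with standard deviation 0.5, a fixed number of iterations), not the implementation or test problems of the report; all names are hypothetical.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def iterate_proposal(log_target, mu, sigma, n_samples, n_iters, rng):
    """Refit a Gaussian proposal to the weighted samples at every iteration."""
    ess = 0.0
    for _ in range(n_iters):
        xs = [rng.gauss(mu, sigma) for _ in range(n_samples)]  # independent draws
        log_w = [log_target(x) - log_normal_pdf(x, mu, sigma) for x in xs]
        top = max(log_w)  # subtract the maximum for numerical stability
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        mu = sum(wi * x for wi, x in zip(w, xs)) / total
        var = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs)) / total
        sigma = max(math.sqrt(var), 1e-6)  # guard against a collapsed proposal
        # Effective sample size of the weights drawn from the previous proposal.
        ess = total ** 2 / sum(wi * wi for wi in w)
    return mu, sigma, ess

rng = random.Random(4)
log_target = lambda x: -0.5 * ((x - 3.0) / 0.5) ** 2  # unnormalized toy posterior
fit_mu, fit_sigma, fit_ess = iterate_proposal(log_target, 0.0, 2.0, 5000, 5, rng)
```

    Each batch of samples is drawn independently, so the inner loop is the part that could be distributed across cores; the effective sample size gives a rough indication of how well the current proposal matches the target.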

  1. Finite element model updating using the shadow hybrid Monte Carlo technique

    NASA Astrophysics Data System (ADS)

    Boulkaibet, I.; Mthembu, L.; Marwala, T.; Friswell, M. I.; Adhikari, S.

    2015-02-01

    Recent research in the field of finite element model updating (FEM) advocates the adoption of Bayesian analysis techniques to deal with the uncertainties associated with these models. However, Bayesian formulations require the evaluation of the Posterior Distribution Function which may not be available in analytical form. This is the case in FEM updating. In such cases sampling methods can provide good approximations of the Posterior distribution when implemented in the Bayesian context. Markov Chain Monte Carlo (MCMC) algorithms are the most popular sampling tools used to sample probability distributions. However, the efficiency of these algorithms is affected by the complexity of the systems (the size of the parameter space). The Hybrid Monte Carlo (HMC) offers a very important MCMC approach for dealing with higher-dimensional complex problems. The HMC uses the molecular dynamics (MD) steps as the global Monte Carlo (MC) moves to reach areas of high probability where the gradient of the log-density of the Posterior acts as a guide during the search process. However, the acceptance rate of HMC is sensitive to the system size as well as the time step used to evaluate the MD trajectory. To overcome this limitation we propose the use of the Shadow Hybrid Monte Carlo (SHMC) algorithm. The SHMC algorithm is a modified version of the Hybrid Monte Carlo (HMC) and designed to improve sampling for large-system sizes and time steps. This is done by sampling from a modified Hamiltonian function instead of the normal Hamiltonian function. In this paper, the efficiency and accuracy of the SHMC method are tested on the updating of two real structures: an unsymmetrical H-shaped beam structure and a GARTEUR SM-AG19 structure, and are compared to the application of the HMC algorithm on the same structures.
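
    As a point of reference, plain HMC (the baseline method, not the shadow variant proposed in the paper) can be sketched with a leapfrog integrator as the MD step and an accept/reject test on the change in the Hamiltonian. The 2-D Gaussian target, unit mass matrix, step size 0.2 and 8 leapfrog steps are illustrative assumptions, as are the function names.

```python
import math
import random

def leapfrog(q, p, grad_log_post, step, n_steps):
    """Integrate Hamiltonian dynamics with unit masses using leapfrog steps."""
    q, p = list(q), list(p)
    g = grad_log_post(q)
    for _ in range(n_steps):
        p = [pi + 0.5 * step * gi for pi, gi in zip(p, g)]  # half step in momentum
        q = [qi + step * pi for qi, pi in zip(q, p)]        # full step in position
        g = grad_log_post(q)
        p = [pi + 0.5 * step * gi for pi, gi in zip(p, g)]  # half step in momentum
    return q, p

def hmc(log_post, grad_log_post, q0, n_samples, step, n_steps, rng):
    q, samples, accepted = list(q0), [], 0
    for _ in range(n_samples):
        p = [rng.gauss(0.0, 1.0) for _ in q]                # fresh momentum
        h_old = -log_post(q) + 0.5 * sum(pi * pi for pi in p)
        q_new, p_new = leapfrog(q, p, grad_log_post, step, n_steps)
        h_new = -log_post(q_new) + 0.5 * sum(pi * pi for pi in p_new)
        # Accept with probability min(1, exp(H_old - H_new)).
        if math.log(1.0 - rng.random()) < h_old - h_new:
            q, accepted = q_new, accepted + 1
        samples.append(q)
    return samples, accepted / n_samples

rng = random.Random(5)
log_post = lambda q: -0.5 * (q[0] ** 2 + q[1] ** 2 / 4.0)   # variances 1 and 4
grad_log_post = lambda q: [-q[0], -q[1] / 4.0]
hmc_samples, hmc_acc = hmc(log_post, grad_log_post, [0.0, 0.0], 10000, 0.2, 8, rng)
hv0 = sum(s[0] ** 2 for s in hmc_samples) / len(hmc_samples)
hv1 = sum(s[1] ** 2 for s in hmc_samples) / len(hmc_samples)
```

    The acceptance test depends on the energy error accumulated along the trajectory, which grows with the step size and the system size; that sensitivity is the limitation the abstract refers to.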

  2. The Full Monte Carlo: A Live Performance with Stars

    NASA Astrophysics Data System (ADS)

    Meng, Xiao-Li

    2014-06-01

    Markov chain Monte Carlo (MCMC) is being applied increasingly often in modern Astrostatistics. It is indeed incredibly powerful, but also very dangerous. It is popular because of its apparent generality (from simple to highly complex problems) and simplicity (the availability of out-of-the-box recipes). It is dangerous because it always produces something but there is no surefire way to verify or even diagnose that the “something” is remotely close to what the MCMC theory predicts or one hopes. Using very simple models (e.g., conditionally Gaussian), this talk starts with a tutorial of the two most popular MCMC algorithms, namely, the Gibbs Sampler and the Metropolis-Hastings Algorithm, and illustrates their good, bad, and ugly implementations via live demonstration. The talk ends with a story of how a recent advance, the Ancillary-Sufficient Interweaving Strategy (ASIS) (Yu and Meng, 2011, http://www.stat.harvard.edu/Faculty_Content/meng/jcgs.2011-article.pdf) reduces the danger. It was discovered almost by accident during a Ph.D. student’s (Yaming Yu) struggle with fitting a Cox process model for detecting changes in source intensity of photon counts observed by the Chandra X-ray telescope from a (candidate) neutron/quark star.
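
    A conditionally Gaussian toy model of the kind mentioned in the abstract makes the Gibbs sampler easy to show. The sketch below assumes a standard bivariate normal with correlation rho, for which each full conditional is N(rho * other, 1 - rho^2); it is a textbook example, not material from the talk, and the names are hypothetical.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, rng, x0=0.0, y0=0.0):
    """Alternate draws from x | y and y | x for a standard bivariate normal."""
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = x0, y0
    draws = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, cond_sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_sd)  # y | x ~ N(rho * x, 1 - rho^2)
        draws.append((x, y))
    return draws

def sample_correlation(pairs):
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(6)
gibbs_draws = gibbs_bivariate_normal(0.8, 20000, rng)
gibbs_corr = sample_correlation(gibbs_draws)
```

    As rho approaches 1 the conditional standard deviation shrinks and successive draws become highly correlated, which is one simple way a formally correct Gibbs sampler can turn into a slow one.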

  3. Geometric MCMC for infinite-dimensional inverse problems

    NASA Astrophysics Data System (ADS)

    Beskos, Alexandros; Girolami, Mark; Lan, Shiwei; Farrell, Patrick E.; Stuart, Andrew M.

    2017-04-01

    Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank-Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.

  4. An introduction of Markov chain Monte Carlo method to geochemical inverse problems: Reading melting parameters from REE abundances in abyssal peridotites

    NASA Astrophysics Data System (ADS)

    Liu, Boda; Liang, Yan

    2017-04-01

    Markov chain Monte Carlo (MCMC) simulation is a powerful statistical method in solving inverse problems that arise from a wide range of applications. In Earth sciences applications of MCMC simulations are primarily in the field of geophysics. The purpose of this study is to introduce MCMC methods to geochemical inverse problems related to trace element fractionation during mantle melting. MCMC methods have several advantages over least squares methods in deciphering melting processes from trace element abundances in basalts and mantle rocks. Here we use an MCMC method to invert for extent of melting, fraction of melt present during melting, and extent of chemical disequilibrium between the melt and residual solid from REE abundances in clinopyroxene in abyssal peridotites from Mid-Atlantic Ridge, Central Indian Ridge, Southwest Indian Ridge, Lena Trough, and American-Antarctic Ridge. We consider two melting models: one with exact analytical solution and the other without. We solve the latter numerically in a chain of melting models according to the Metropolis-Hastings algorithm. The probability distribution of inverted melting parameters depends on assumptions of the physical model, knowledge of mantle source composition, and constraints from the REE data. Results from MCMC inversion are consistent with and provide more reliable uncertainty estimates than results based on nonlinear least squares inversion. We show that chemical disequilibrium is likely to play an important role in fractionating LREE in residual peridotites during partial melting beneath mid-ocean ridge spreading centers. MCMC simulation is well suited for more complicated but physically more realistic melting problems that do not have analytical solutions.

  5. EFFICIENT MODEL-FITTING AND MODEL-COMPARISON FOR HIGH-DIMENSIONAL BAYESIAN GEOSTATISTICAL MODELS. (R826887)

    EPA Science Inventory

    Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...

  6. Asteroid mass estimation using Markov-chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Siltala, Lauri; Granvik, Mikael

    2017-11-01

    Estimates for asteroid masses are based on their gravitational perturbations on the orbits of other objects such as Mars, spacecraft, or other asteroids and/or their satellites. In the case of asteroid-asteroid perturbations, this leads to an inverse problem in at least 13 dimensions where the aim is to derive the mass of the perturbing asteroid(s) and six orbital elements for both the perturbing asteroid(s) and the test asteroid(s) based on astrometric observations. We have developed and implemented three different mass estimation algorithms utilizing asteroid-asteroid perturbations: the very rough 'marching' approximation, in which the asteroids' orbital elements are not fitted, thereby reducing the problem to a one-dimensional estimation of the mass, an implementation of the Nelder-Mead simplex method, and most significantly, a Markov-chain Monte Carlo (MCMC) approach. We describe each of these algorithms with particular focus on the MCMC algorithm, and present example results using both synthetic and real data. Our results agree with the published mass estimates, but suggest that the published uncertainties may be misleading as a consequence of using linearized mass-estimation methods. Finally, we discuss remaining challenges with the algorithms as well as future plans.

  7. Auxiliary Parameter MCMC for Exponential Random Graph Models

    NASA Astrophysics Data System (ADS)

    Byshkin, Maksym; Stivala, Alex; Mira, Antonietta; Krause, Rolf; Robins, Garry; Lomi, Alessandro

    2016-11-01

    Exponential random graph models (ERGMs) are a well-established family of statistical models for analyzing social networks. Computational complexity has so far limited the appeal of ERGMs for the analysis of large social networks. Efficient computational methods are highly desirable in order to extend the empirical scope of ERGMs. In this paper we report results of a research project on the development of snowball sampling methods for ERGMs. We propose an auxiliary parameter Markov chain Monte Carlo (MCMC) algorithm for sampling from the relevant probability distributions. The method is designed to decrease the number of allowed network states without worsening the mixing of the Markov chains, and suggests a new approach for the developments of MCMC samplers for ERGMs. We demonstrate the method on both simulated and actual (empirical) network data and show that it reduces CPU time for parameter estimation by an order of magnitude compared to current MCMC methods.

  8. Enhancing hydrologic data assimilation by evolutionary Particle Filter and Markov Chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Abbaszadeh, Peyman; Moradkhani, Hamid; Yan, Hongxiang

    2018-01-01

    Particle Filters (PFs) have received increasing attention by researchers from different disciplines including the hydro-geosciences, as an effective tool to improve model predictions in nonlinear and non-Gaussian dynamical systems. The implication of dual state and parameter estimation using the PFs in hydrology has evolved since 2005 from the PF-SIR (sampling importance resampling) to PF-MCMC (Markov Chain Monte Carlo), and now to the most effective and robust framework through evolutionary PF approach based on Genetic Algorithm (GA) and MCMC, the so-called EPFM. In this framework, the prior distribution undergoes an evolutionary process based on the designed mutation and crossover operators of GA. The merit of this approach is that the particles move to an appropriate position by using the GA optimization and then the number of effective particles is increased by means of MCMC, whereby the particle degeneracy is avoided and the particle diversity is improved. In this study, the usefulness and effectiveness of the proposed EPFM is investigated by applying the technique on a conceptual and highly nonlinear hydrologic model over four river basins located in different climate and geographical regions of the United States. Both synthetic and real case studies demonstrate that the EPFM improves both the state and parameter estimation more effectively and reliably as compared with the PF-MCMC.

  9. Asteroid mass estimation using Markov-Chain Monte Carlo techniques

    NASA Astrophysics Data System (ADS)

    Siltala, Lauri; Granvik, Mikael

    2016-10-01

    Estimates for asteroid masses are based on their gravitational perturbations on the orbits of other objects such as Mars, spacecraft, or other asteroids and/or their satellites. In the case of asteroid-asteroid perturbations, this leads to a 13-dimensional inverse problem where the aim is to derive the mass of the perturbing asteroid and six orbital elements for both the perturbing asteroid and the test asteroid using astrometric observations. We have developed and implemented three different mass estimation algorithms utilizing asteroid-asteroid perturbations into the OpenOrb asteroid-orbit-computation software: the very rough 'marching' approximation, in which the asteroid orbits are fixed at a given epoch, reducing the problem to a one-dimensional estimation of the mass, an implementation of the Nelder-Mead simplex method, and most significantly, a Markov-Chain Monte Carlo (MCMC) approach. We will introduce each of these algorithms with particular focus on the MCMC algorithm, and present example results for both synthetic and real data. Our results agree with the published mass estimates, but suggest that the published uncertainties may be misleading as a consequence of using linearized mass-estimation methods. Finally, we discuss remaining challenges with the algorithms as well as future plans, particularly in connection with ESA's Gaia mission.

  10. Characterizing the Trade Space Between Capability and Complexity in Next Generation Cloud and Precipitation Observing Systems Using Markov Chain Monte Carlos Techniques

    NASA Astrophysics Data System (ADS)

    Xu, Z.; Mace, G. G.; Posselt, D. J.

    2017-12-01

    As we begin to contemplate the next generation atmospheric observing systems, it will be critically important that we are able to make informed decisions regarding the trade space between scientific capability and the need to keep complexity and cost within definable limits. To explore this trade space as it pertains to understanding key cloud and precipitation processes, we are developing a Markov Chain Monte Carlo (MCMC) algorithm suite that allows us to arbitrarily define the specifications of candidate observing systems and then explore how the uncertainties in key retrieved geophysical parameters respond to that observing system. MCMC algorithms produce a more complete posterior solution space, and allow for an objective examination of information contained in measurements. In our initial implementation, MCMC experiments are performed to retrieve vertical profiles of cloud and precipitation properties from a spectrum of active and passive measurements collected by aircraft during the ACE Radiation Definition Experiments (RADEX). Focusing on shallow cumulus clouds observed during the Integrated Precipitation and Hydrology EXperiment (IPHEX), we consider in this study observing systems comprising W- and Ka-band radar reflectivity, path-integrated attenuation at those frequencies, 31 and 94 GHz brightness temperatures, and visible and near-infrared reflectance. By varying the sensitivity and uncertainty of these measurements, we quantify the capacity of various combinations of observations to characterize the physical properties of clouds and precipitation.

  11. Bayesian Atmospheric Radiative Transfer (BART): Model, Statistics Driver, and Application to HD 209458b

    NASA Astrophysics Data System (ADS)

    Cubillos, Patricio; Harrington, Joseph; Blecic, Jasmina; Stemm, Madison M.; Lust, Nate B.; Foster, Andrew S.; Rojo, Patricio M.; Loredo, Thomas J.

    2014-11-01

    Multi-wavelength secondary-eclipse and transit depths probe the thermo-chemical properties of exoplanets. In recent years, several research groups have developed retrieval codes to analyze the existing data and study the prospects of future facilities. However, the scientific community has limited access to these packages. Here we premiere the open-source Bayesian Atmospheric Radiative Transfer (BART) code. We discuss the key aspects of the radiative-transfer algorithm and the statistical package. The radiation code includes line databases for all HITRAN molecules, high-temperature H2O, TiO, and VO, and includes a preprocessor for adding additional line databases without recompiling the radiation code. Collision-induced absorption lines are available for H2-H2 and H2-He. The parameterized thermal and molecular abundance profiles can be modified arbitrarily without recompilation. The generated spectra are integrated over arbitrary bandpasses for comparison to data. BART's statistical package, Multi-core Markov-chain Monte Carlo (MC3), is a general-purpose MCMC module. MC3 implements the Differential-evolution Markov-chain Monte Carlo algorithm (ter Braak 2006, 2009). MC3 converges 20-400 times faster than the usual Metropolis-Hastings MCMC algorithm, and in addition uses the Message Passing Interface (MPI) to parallelize the MCMC chains. We apply the BART retrieval code to the HD 209458b data set to estimate the planet's temperature profile and molecular abundances. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G. JB holds a NASA Earth and Space Science Fellowship.

  12. Markov Chain Monte Carlo: an introduction for epidemiologists

    PubMed Central

    Hamra, Ghassan; MacLehose, Richard; Richardson, David

    2013-01-01

    Markov Chain Monte Carlo (MCMC) methods are increasingly popular among epidemiologists. The reason for this may in part be that MCMC offers an appealing approach to handling some difficult types of analyses. Additionally, MCMC methods are those most commonly used for Bayesian analysis. However, epidemiologists are still largely unfamiliar with MCMC. They may lack familiarity either with the implementation of MCMC or with interpretation of the resultant output. As with tutorials outlining the calculus behind maximum likelihood in previous decades, a simple description of the machinery of MCMC is needed. We provide an introduction to conducting analyses with MCMC, and show that, given the same data and under certain model specifications, the results of an MCMC simulation match those of methods based on standard maximum-likelihood estimation (MLE). In addition, we highlight examples of instances in which MCMC approaches to data analysis provide a clear advantage over MLE. We hope that this brief tutorial will encourage epidemiologists to consider MCMC approaches as part of their analytic tool-kit. PMID:23569196

  13. An Evaluation of a Markov Chain Monte Carlo Method for the Two-Parameter Logistic Model.

    ERIC Educational Resources Information Center

    Kim, Seock-Ho; Cohen, Allan S.

    The accuracy of the Markov Chain Monte Carlo (MCMC) procedure Gibbs sampling was considered for estimation of item parameters of the two-parameter logistic model. Data for the Law School Admission Test (LSAT) Section 6 were analyzed to illustrate the MCMC procedure. In addition, simulated data sets were analyzed using the MCMC, marginal Bayesian…

  14. A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.

    PubMed

    Wang, Tingting; Chen, Yi-Ping Phoebe; Bowman, Phil J; Goddard, Michael E; Hayes, Ben J

    2016-09-21

    Bayesian mixture models in which the effects of SNP are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Monte Carlo Markov Chain (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm, followed by a limited number of MCMC iterations without the requirement for burn-in. To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breed that were not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC; however, computational time was reduced to as little as 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300 K SNPs in a case/control design, from the Wellcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic predictions and genetic architecture inference while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR. The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform equally well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, while running up to 17 times faster than the full MCMC implementations. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets.

  15. Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler.

    PubMed

    Jin, Ick Hoon; Yuan, Ying; Liang, Faming

    2013-10-01

    Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the intractable normalizing constant and model degeneracy. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the intractable normalizing constant and model degeneracy issues encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.

  16. MC3: Multi-core Markov-chain Monte Carlo code

    NASA Astrophysics Data System (ADS)

    Cubillos, Patricio; Harrington, Joseph; Lust, Nate; Foster, AJ; Stemm, Madison; Loredo, Tom; Stevenson, Kevin; Campo, Chris; Hardin, Matt; Hardy, Ryan

    2016-10-01

    MC3 (Multi-core Markov-chain Monte Carlo) is a Bayesian statistics tool that can be executed from the shell prompt or interactively through the Python interpreter with single- or multiple-CPU parallel computing. It offers Markov-chain Monte Carlo (MCMC) posterior-distribution sampling for several algorithms, Levenberg-Marquardt least-squares optimization, and uniform non-informative, Jeffreys non-informative, or Gaussian-informative priors. MC3 can share the same value among multiple parameters and fix the value of parameters to constant values, and offers Gelman-Rubin convergence testing and correlated-noise estimation with time-averaging or wavelet-based likelihood estimation methods.
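
    The Gelman-Rubin convergence test mentioned in the abstract above compares between-chain and within-chain variance. The sketch below computes one common form of the resulting statistic (often written R-hat) in plain Python. It does not use or reproduce MC3's own implementation; the function name `gelman_rubin` and the synthetic chains are assumptions made for illustration.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length scalar chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the target variance.
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


rng = random.Random(1)
# Four chains drawn from the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Two of four chains sit around a different mean: R-hat should be well above 1.
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 0.0, 3.0, 3.0)]
rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"R-hat (mixed chains): {rhat_mixed:.3f}")
print(f"R-hat (stuck chains): {rhat_stuck:.3f}")
```

    Values near 1 are usually read as no evidence against convergence, while values well above 1 indicate that the chains have not yet explored the same distribution; the cutoff used in practice is a matter of convention.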

  17. The use of simple reparameterizations to improve the efficiency of Markov chain Monte Carlo estimation for multilevel models with applications to discrete time survival models.

    PubMed

    Browne, William J; Steele, Fiona; Golalizadeh, Mousa; Green, Martin J

    2009-06-01

    We consider the application of Markov chain Monte Carlo (MCMC) estimation methods to random-effects models and in particular the family of discrete time survival models. Survival models can be used in many situations in the medical and social sciences and we illustrate their use through two examples that differ in terms of both substantive area and data structure. A multilevel discrete time survival analysis involves expanding the data set so that the model can be cast as a standard multilevel binary response model. For such models it has been shown that MCMC methods have advantages in terms of reducing estimate bias. However, the data expansion results in very large data sets for which MCMC estimation is often slow and can produce chains that exhibit poor mixing. Any way of improving the mixing will result in both speeding up the methods and more confidence in the estimates that are produced. The MCMC methodological literature is full of alternative algorithms designed to improve mixing of chains and we describe three reparameterization techniques that are easy to implement in available software. We consider two examples of multilevel survival analysis: incidence of mastitis in dairy cattle and contraceptive use dynamics in Indonesia. For each application we show where the reparameterization techniques can be used and assess their performance.

  18. Inferring soil salinity in a drip irrigation system from multi-configuration EMI measurements using adaptive Markov chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Zaib Jadoon, Khan; Umer Altaf, Muhammad; McCabe, Matthew Francis; Hoteit, Ibrahim; Muhammad, Nisar; Moghadas, Davood; Weihermüller, Lutz

    2017-10-01

    A substantial interpretation of electromagnetic induction (EMI) measurements requires quantifying optimal model parameters and uncertainty of a nonlinear inverse problem. For this purpose, an adaptive Bayesian Markov chain Monte Carlo (MCMC) algorithm is used to assess multi-orientation and multi-offset EMI measurements in an agricultural field with non-saline and saline soil. In MCMC the posterior distribution is computed using Bayes' rule. The electromagnetic forward model based on the full solution of Maxwell's equations was used to simulate the apparent electrical conductivity measured with the configurations of the EMI instrument, the CMD Mini-Explorer. Uncertainty in the parameters of the three-layered earth model is investigated using synthetic data. Our results show that in the scenario of non-saline soil, the layer-thickness parameters, as compared to the layers' electrical conductivities, are not very informative and are therefore difficult to resolve. Application of the proposed MCMC-based inversion to field measurements in a drip irrigation system demonstrates that the parameters of the model can be well estimated for the saline soil as compared to the non-saline soil, and provides useful insight about parameter uncertainty for the assessment of the model outputs.

  19. Markov chain Monte Carlo techniques applied to parton distribution functions determination: Proof of concept

    NASA Astrophysics Data System (ADS)

    Gbedo, Yémalin Gabin; Mangin-Brinet, Mariane

    2017-07-01

    We present a new procedure to determine parton distribution functions (PDFs), based on Markov chain Monte Carlo (MCMC) methods. The aim of this paper is to show that we can replace the standard χ2 minimization by procedures grounded on statistical methods, and on Bayesian inference in particular, thus offering additional insight into the rich field of PDFs determination. After a basic introduction to these techniques, we introduce the algorithm we have chosen to implement—namely Hybrid (or Hamiltonian) Monte Carlo. This algorithm, initially developed for Lattice QCD, turns out to be very interesting when applied to PDFs determination by global analyses; we show that it allows us to circumvent the difficulties due to the high dimensionality of the problem, in particular concerning the acceptance. A first feasibility study is performed and presented, which indicates that Markov chain Monte Carlo can successfully be applied to the extraction of PDFs and of their uncertainties.
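
    As a rough illustration of the Hybrid (Hamiltonian) Monte Carlo algorithm named in the abstract above, the following Python sketch samples a one-dimensional Gaussian using leapfrog integration followed by a Metropolis accept/reject step. It is unrelated to the authors' PDF analysis: the toy target, the unit mass, the step size, the trajectory length, and the name `hmc_step` are all assumptions.

```python
import math
import random


def log_prob(q):
    """Toy target (assumption): Gaussian with mean 1 and standard deviation 2."""
    return -0.5 * ((q - 1.0) / 2.0) ** 2


def grad_log_prob(q):
    """Derivative of log_prob with respect to q."""
    return -(q - 1.0) / 4.0


def hmc_step(q, step_size, n_leapfrog, rng):
    """One HMC update with unit mass; returns the new position and an accepted flag."""
    p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
    q_new, p_new = q, p
    # Leapfrog integration: half momentum step, alternating full steps, half momentum step.
    p_new += 0.5 * step_size * grad_log_prob(q_new)
    for i in range(n_leapfrog):
        q_new += step_size * p_new
        if i < n_leapfrog - 1:
            p_new += step_size * grad_log_prob(q_new)
    p_new += 0.5 * step_size * grad_log_prob(q_new)
    # Metropolis correction for the integration error in the Hamiltonian.
    h_old = -log_prob(q) + 0.5 * p * p
    h_new = -log_prob(q_new) + 0.5 * p_new * p_new
    if math.log(1.0 - rng.random()) < h_old - h_new:
        return q_new, True
    return q, False


rng = random.Random(2)
q = 0.0
samples = []
n_accepted = 0
for _ in range(5000):
    q, ok = hmc_step(q, step_size=0.5, n_leapfrog=10, rng=rng)
    n_accepted += ok
    samples.append(q)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
accept_rate = n_accepted / len(samples)
print(f"mean {mean:.2f}, variance {var:.2f}, acceptance rate {accept_rate:.2f}")
```

    Because the leapfrog trajectory follows the gradient of the log density, proposals can travel far from the current point while still being accepted with high probability, which is the property the abstract refers to when discussing acceptance in high dimensions.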

  20. Of bugs and birds: Markov Chain Monte Carlo for hierarchical modeling in wildlife research

    USGS Publications Warehouse

    Link, W.A.; Cam, E.; Nichols, J.D.; Cooch, E.G.

    2002-01-01

    Markov chain Monte Carlo (MCMC) is a statistical innovation that allows researchers to fit far more complex models to data than is feasible using conventional methods. Despite its widespread use in a variety of scientific fields, MCMC appears to be underutilized in wildlife applications. This may be due to a misconception that MCMC requires the adoption of a subjective Bayesian analysis, or perhaps simply to its lack of familiarity among wildlife researchers. We introduce the basic ideas of MCMC and software BUGS (Bayesian inference using Gibbs sampling), stressing that a simple and satisfactory intuition for MCMC does not require extraordinary mathematical sophistication. We illustrate the use of MCMC with an analysis of the association between latent factors governing individual heterogeneity in breeding and survival rates of kittiwakes (Rissa tridactyla). We conclude with a discussion of the importance of individual heterogeneity for understanding population dynamics and designing management plans.

  1. A trans-dimensional Bayesian Markov chain Monte Carlo algorithm for model assessment using frequency-domain electromagnetic data

    USGS Publications Warehouse

    Minsley, Burke J.

    2011-01-01

    A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, ‘best’ model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favorably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment.

  2. Probabilistic Magnetotelluric Inversion with Adaptive Regularisation Using the No-U-Turns Sampler

    NASA Astrophysics Data System (ADS)

    Conway, Dennis; Simpson, Janelle; Didana, Yohannes; Rugari, Joseph; Heinson, Graham

    2018-04-01

    We present the first inversion of magnetotelluric (MT) data using a Hamiltonian Monte Carlo algorithm. The inversion of MT data is an underdetermined problem which leads to an ensemble of feasible models for a given dataset. A standard approach in MT inversion is to perform a deterministic search for the single solution which is maximally smooth for a given data-fit threshold. An alternative approach is to use Markov Chain Monte Carlo (MCMC) methods, which have been used in MT inversion to explore the entire solution space and produce a suite of likely models. This approach has the advantage of assigning confidence to resistivity models, leading to better geological interpretations. Recent advances in MCMC techniques include the No-U-Turns Sampler (NUTS), an efficient and rapidly converging method which is based on Hamiltonian Monte Carlo. We have implemented a 1D MT inversion which uses the NUTS algorithm. Our model includes a fixed number of layers of variable thickness and resistivity, as well as probabilistic smoothing constraints which allow sharp and smooth transitions. We present the results of a synthetic study and show the accuracy of the technique, as well as the fast convergence, independence of starting models, and sampling efficiency. Finally, we test our technique on MT data collected from a site in Boulia, Queensland, Australia to show its utility in geological interpretation and ability to provide probabilistic estimates of features such as depth to basement.

  3. Bayesian Inference on Malignant Breast Cancer in Nigeria: A Diagnosis of MCMC Convergence

    PubMed Central

    Ogunsakin, Ropo Ebenezer; Siaka, Lougue

    2017-01-01

    Background: There has been no previous study to classify malignant breast tumors in detail based on Markov Chain Monte Carlo (MCMC) convergence in Western Nigeria. This study therefore aims to profile patients living with benign and malignant breast tumors in two different hospitals among women of Western Nigeria, with a focus on prognostic factors and MCMC convergence. Materials and Methods: A hospital-based record was used to identify prognostic factors for malignant breast cancer among women of Western Nigeria. This paper describes Bayesian inference and demonstrates its use in estimating the parameters of a logistic regression via a Markov Chain Monte Carlo (MCMC) algorithm. The result of the Bayesian approach is compared with that of classical statistics. Results: The mean age of the respondents was 42.2 ± 16.6 years, with 52% of the women aged between 35 and 49 years. The results of both techniques suggest that age and having at least a high school education are significantly associated with a higher risk of being diagnosed with malignant rather than benign breast tumors. The results also indicate that a reduction of standard errors is associated with the coefficients obtained from the Bayesian approach. In addition, simulation results reveal that women with at least a high school education are 1.3 times more at risk of having a malignant breast lesion in Western Nigeria than a benign breast lesion. Conclusion: We conclude that more efforts are required towards creating awareness and advocacy campaigns on how the prevalence of malignant breast lesions can be reduced, especially among women. The Bayesian approach produces precise estimates for modeling malignant breast cancer. PMID:29072396

  4. Recovery of Graded Response Model Parameters: A Comparison of Marginal Maximum Likelihood and Markov Chain Monte Carlo Estimation

    ERIC Educational Resources Information Center

    Kieftenbeld, Vincent; Natesan, Prathiba

    2012-01-01

    Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…

  5. A computer program for uncertainty analysis integrating regression and Bayesian methods

    USGS Publications Warehouse

    Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary

    2014-01-01

    This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.

  6. A trans-dimensional Bayesian Markov chain Monte Carlo algorithm for model assessment using frequency-domain electromagnetic data

    USGS Publications Warehouse

    Minsley, B.J.

    2011-01-01

    A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, 'best' model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favourably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment. © 2011. Geophysical Journal International © 2011 RAS.

  7. Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach

    NASA Technical Reports Server (NTRS)

    Warner, James E.; Hochhalter, Jacob D.

    2016-01-01

    This work presents a computationally-efficient approach for damage determination that quantifies uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modifications are made to the traditional Bayesian inference approach to provide significant computational speedup. First, an efficient surrogate model is constructed using sparse grid interpolation to replace a costly finite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modified using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more efficiently. Numerical examples demonstrate that the proposed framework effectively provides damage estimates with uncertainty quantification and can yield orders of magnitude speedup over standard Bayesian approaches.

  8. Smoothing spline ANOVA frailty model for recurrent event data.

    PubMed

    Du, Pang; Jiang, Yihua; Wang, Yuedong

    2011-12-01

    Gap time hazard estimation is of particular interest in recurrent event data. This article proposes a fully nonparametric approach for estimating the gap time hazard. Smoothing spline analysis of variance (ANOVA) decompositions are used to model the log gap time hazard as a joint function of gap time and covariates, and general frailty is introduced to account for between-subject heterogeneity and within-subject correlation. We estimate the nonparametric gap time hazard function and parameters in the frailty distribution using a combination of the Newton-Raphson procedure, the stochastic approximation algorithm (SAA), and the Markov chain Monte Carlo (MCMC) method. The convergence of the algorithm is guaranteed by decreasing the step size of parameter update and/or increasing the MCMC sample size along iterations. Model selection procedure is also developed to identify negligible components in a functional ANOVA decomposition of the log gap time hazard. We evaluate the proposed methods with simulation studies and illustrate its use through the analysis of bladder tumor data. © 2011, The International Biometric Society.

  9. Recovery of Item Parameters in the Nominal Response Model: A Comparison of Marginal Maximum Likelihood Estimation and Markov Chain Monte Carlo Estimation.

    ERIC Educational Resources Information Center

    Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun

    2002-01-01

    Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)

  10. Hydrologic Process Parameterization of Electrical Resistivity Imaging of Solute Plumes Using POD McMC

    NASA Astrophysics Data System (ADS)

    Awatey, M. T.; Irving, J.; Oware, E. K.

    2016-12-01

    Markov chain Monte Carlo (McMC) inversion frameworks are becoming increasingly popular in geophysics due to their ability to recover multiple equally plausible geologic features that honor the limited noisy measurements. Standard McMC methods, however, become computationally intractable with increasing dimensionality of the problem, for example, when working with spatially distributed geophysical parameter fields. We present a McMC approach based on a sparse proper orthogonal decomposition (POD) model parameterization that implicitly incorporates the physics of the underlying process. First, we generate training images (TIs) via Monte Carlo simulations of the target process constrained to a conceptual model. We then apply POD to construct basis vectors from the TIs. A small number of basis vectors can represent most of the variability in the TIs, leading to dimensionality reduction. A projection of the starting model into the reduced basis space generates the starting POD coefficients. At each iteration, only coefficients within a specified sampling window are resimulated assuming a Gaussian prior. The sampling window grows at a specified rate as the number of iterations increases, starting from the coefficients corresponding to the highest ranked basis and extending to those of the least informative basis. We found this gradual increment in the sampling window to be more stable than resampling all the coefficients right from the first iteration. We demonstrate the performance of the algorithm with both synthetic and lab-scale electrical resistivity imaging of saline tracer experiments, employing the same set of basis vectors for all inversions. We consider two scenarios of unimodal and bimodal plumes. The unimodal plume is consistent with the hypothesis underlying the generation of the TIs whereas bimodality in plume morphology was not theorized. We show that uncertainty quantification using McMC can proceed in the reduced dimensionality space while accounting for the physics of the underlying process.

  11. Uncertainty Analysis Based on Sparse Grid Collocation and Quasi-Monte Carlo Sampling with Application in Groundwater Modeling

    NASA Astrophysics Data System (ADS)

    Zhang, G.; Lu, D.; Ye, M.; Gunzburger, M.

    2011-12-01

    Markov Chain Monte Carlo (MCMC) methods have been widely used in many fields of uncertainty analysis to estimate the posterior distributions of parameters and credible intervals of predictions in the Bayesian framework. However, in practice, MCMC may be computationally unaffordable due to slow convergence and the excessive number of forward model executions required, especially when the forward model is expensive to compute. Both disadvantages arise from the curse of dimensionality, i.e., the posterior distribution is usually a multivariate function of parameters. Recently, the sparse grid method has been demonstrated to be an effective technique for coping with high-dimensional interpolation or integration problems. Thus, in order to accelerate the forward model and avoid the slow convergence of MCMC, we propose a new method for uncertainty analysis based on sparse grid interpolation and quasi-Monte Carlo sampling. First, we construct a polynomial approximation of the forward model in the parameter space by using the sparse grid interpolation. This approximation then defines an accurate surrogate posterior distribution that can be evaluated repeatedly at minimal computational cost. Second, instead of using MCMC, a quasi-Monte Carlo method is applied to draw samples in the parameter space. Then, the desired probability density function of each prediction is approximated by accumulating the posterior density values of all the samples according to the prediction values. Our method has the following advantages: (1) the polynomial approximation of the forward model on the sparse grid provides a very efficient evaluation of the surrogate posterior distribution; (2) the quasi-Monte Carlo method retains the same accuracy in approximating the PDF of predictions but avoids all disadvantages of MCMC. The proposed method is applied to a controlled numerical experiment of groundwater flow modeling. The results show that our method attains the same accuracy much more efficiently than traditional MCMC.
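
    The quasi-Monte Carlo step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Halton sequence is one common low-discrepancy choice (the abstract does not say which sequence was used), the names `halton` and `qmc_posterior_mean` are hypothetical, and a closed-form Gaussian log-density on the unit square stands in for the sparse-grid surrogate posterior. Each point is weighted by its posterior density, which is the accumulation idea the abstract describes.

```python
import math


def halton(index, base):
    """Radical-inverse (van der Corput) value of a positive integer index."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def qmc_posterior_mean(log_post, n_points, bases=(2, 3)):
    """Posterior mean on the unit hypercube from density-weighted Halton points.

    log_post is an unnormalised log posterior; one coprime base per dimension.
    """
    dim = len(bases)
    weight_sum = 0.0
    weighted = [0.0] * dim
    for i in range(1, n_points + 1):
        theta = [halton(i, b) for b in bases]
        w = math.exp(log_post(theta))  # posterior density acts as the weight
        weight_sum += w
        for d in range(dim):
            weighted[d] += w * theta[d]
    return [v / weight_sum for v in weighted]


def toy_log_post(theta):
    """Assumed stand-in for a surrogate posterior: Gaussian centred at (0.3, 0.6)."""
    return -0.5 * (((theta[0] - 0.3) / 0.05) ** 2 + ((theta[1] - 0.6) / 0.1) ** 2)


if __name__ == "__main__":
    print(qmc_posterior_mean(toy_log_post, 4096))
```

    Because the points are deterministic, repeated runs give identical estimates, and no burn-in or convergence diagnosis is involved; the cost is that uniform coverage wastes points where the posterior mass is negligible.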

  12. Hydrologic Model Selection using Markov chain Monte Carlo methods

    NASA Astrophysics Data System (ADS)

    Marshall, L.; Sharma, A.; Nott, D.

    2002-12-01

    Estimation of parameter uncertainty (and in turn model uncertainty) allows assessment of the risk in likely applications of hydrological models. Bayesian statistical inference provides an ideal means of assessing parameter uncertainty whereby prior knowledge about the parameter is combined with information from the available data to produce a probability distribution (the posterior distribution) that describes uncertainty about the parameter and serves as a basis for selecting appropriate values for use in modelling applications. Widespread use of Bayesian techniques in hydrology has been hindered by difficulties in summarizing and exploring the posterior distribution. These difficulties have been largely overcome by recent advances in Markov chain Monte Carlo (MCMC) methods that involve random sampling of the posterior distribution. This study presents an adaptive MCMC sampling algorithm which has characteristics that are well suited to model parameters with a high degree of correlation and interdependence, as is often evident in hydrological models. The MCMC sampling technique is used to compare six alternative configurations of a commonly used conceptual rainfall-runoff model, the Australian Water Balance Model (AWBM), using 11 years of daily rainfall runoff data from the Bass river catchment in Australia. The alternative configurations considered fall into two classes - those that consider model errors to be independent of prior values, and those that model the errors as an autoregressive process. Each such class consists of three formulations that represent increasing levels of complexity (and parameterisation) of the original model structure. The results from this study point both to the importance of using Bayesian approaches in evaluating model performance, as well as the simplicity of the MCMC sampling framework that has the ability to bring such approaches within the reach of the applied hydrological community.

  13. Corruption of accuracy and efficiency of Markov chain Monte Carlo simulation by inaccurate numerical implementation of conceptual hydrologic models

    NASA Astrophysics Data System (ADS)

    Schoups, G.; Vrugt, J. A.; Fenicia, F.; van de Giesen, N. C.

    2010-10-01

    Conceptual rainfall-runoff models have traditionally been applied without paying much attention to numerical errors induced by temporal integration of water balance dynamics. Reliance on first-order, explicit, fixed-step integration methods leads to computationally cheap simulation models that are easy to implement. Computational speed is especially desirable for estimating parameter and predictive uncertainty using Markov chain Monte Carlo (MCMC) methods. Confirming earlier work of Kavetski et al. (2003), we show here that the computational speed of first-order, explicit, fixed-step integration methods comes at a cost: for a case study with a spatially lumped conceptual rainfall-runoff model, it introduces artificial bimodality in the marginal posterior parameter distributions, which is not present in numerically accurate implementations of the same model. The resulting effects on MCMC simulation include (1) inconsistent estimates of posterior parameter and predictive distributions, (2) poor performance and slow convergence of the MCMC algorithm, and (3) unreliable convergence diagnosis using the Gelman-Rubin statistic. We studied several alternative numerical implementations to remedy these problems, including various adaptive-step finite difference schemes and an operator splitting method. Our results show that adaptive-step, second-order methods, based on either explicit finite differencing or operator splitting with analytical integration, provide the best alternative for accurate and efficient MCMC simulation. Fixed-step or adaptive-step implicit methods may also be used for increased accuracy, but they cannot match the efficiency of adaptive-step explicit finite differencing or operator splitting. Of the latter two, explicit finite differencing is more generally applicable and is preferred if the individual hydrologic flux laws cannot be integrated analytically, as the splitting method then loses its advantage.
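
    The numerical error that this abstract attributes to first-order, explicit, fixed-step integration can be seen on a much simpler problem. The sketch below is an assumption-laden illustration, not the model from the study: it uses a hypothetical linear reservoir dS/dt = -k*S, whose exact solution is S0*exp(-k*t), and compares it with explicit Euler at several step sizes. The function names are invented for this example.

```python
import math


def euler_reservoir(s0, k, dt, t_end):
    """Fixed-step explicit Euler for the linear reservoir dS/dt = -k*S."""
    n_steps = int(round(t_end / dt))
    s = s0
    for _ in range(n_steps):
        s += dt * (-k * s)  # first-order update using the current state only
    return s


def exact_reservoir(s0, k, t_end):
    """Analytical solution of the same reservoir equation."""
    return s0 * math.exp(-k * t_end)


if __name__ == "__main__":
    s0, k, t_end = 100.0, 0.5, 10.0
    exact = exact_reservoir(s0, k, t_end)
    for dt in (1.0, 0.1, 0.01):
        approx = euler_reservoir(s0, k, dt, t_end)
        print(dt, approx, abs(approx - exact) / exact)
```

    With a coarse step the simulated storage differs from the exact value by a large relative margin, and for k*dt above 2 the scheme becomes unstable. An error of this kind depends on the parameter value itself, which is one way a likelihood surface can be distorted before any sampler sees it.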

  14. An MCMC method for the evaluation of the Fisher information matrix for non-linear mixed effect models.

    PubMed

    Riviere, Marie-Karelle; Ueckert, Sebastian; Mentré, France

    2016-10-01

    Non-linear mixed effect models (NLMEMs) are widely used for the analysis of longitudinal data. To design these studies, optimal design based on the expected Fisher information matrix (FIM) can be used instead of performing time-consuming clinical trial simulations. In recent years, estimation algorithms for NLMEMs have transitioned from linearization toward more exact higher-order methods. Optimal design, on the other hand, has mainly relied on first-order (FO) linearization to calculate the FIM. Although efficient in general, FO cannot be applied to complex non-linear models and is difficult to apply in studies with discrete data. We propose an approach to evaluate the expected FIM in NLMEMs for both discrete and continuous outcomes. We used Markov Chain Monte Carlo (MCMC) to integrate the derivatives of the log-likelihood over the random effects, and Monte Carlo to evaluate its expectation w.r.t. the observations. Our method was implemented in R using Stan, which efficiently draws MCMC samples and calculates partial derivatives of the log-likelihood. Evaluated on several examples, our approach showed good performance with relative standard errors (RSEs) close to those obtained by simulations. We studied the influence of the number of MC and MCMC samples and computed the uncertainty of the FIM evaluation. We also compared our approach to Adaptive Gaussian Quadrature, Laplace approximation, and FO. Our method is available in R-package MIXFIM and can be used to evaluate the FIM, its determinant with confidence intervals (CIs), and RSEs with CIs. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  15. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method.

    PubMed

    Norris, Peter M; da Silva, Arlindo M

    2016-07-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.

  16. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 1: Method

    NASA Technical Reports Server (NTRS)

    Norris, Peter M.; Da Silva, Arlindo M.

    2016-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.

  17. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method

    PubMed Central

    Norris, Peter M.; da Silva, Arlindo M.

    2018-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847

  18. Comparison of statistical sampling methods with ScannerBit, the GAMBIT scanning module

    NASA Astrophysics Data System (ADS)

    Martinez, Gregory D.; McKay, James; Farmer, Ben; Scott, Pat; Roebber, Elinore; Putze, Antje; Conrad, Jan

    2017-11-01

    We introduce ScannerBit, the statistics and sampling module of the public, open-source global fitting framework GAMBIT. ScannerBit provides a standardised interface to different sampling algorithms, enabling the use and comparison of multiple computational methods for inferring profile likelihoods, Bayesian posteriors, and other statistical quantities. The current version offers random, grid, raster, nested sampling, differential evolution, Markov Chain Monte Carlo (MCMC) and ensemble Monte Carlo samplers. We also announce the release of a new standalone differential evolution sampler, Diver, and describe its design, usage and interface to ScannerBit. We subject Diver and three other samplers (the nested sampler MultiNest, the MCMC GreAT, and the native ScannerBit implementation of the ensemble Monte Carlo algorithm T-Walk) to a battery of statistical tests. For this we use a realistic physical likelihood function, based on the scalar singlet model of dark matter. We examine the performance of each sampler as a function of its adjustable settings, and the dimensionality of the sampling problem. We evaluate performance on four metrics: optimality of the best fit found, completeness in exploring the best-fit region, number of likelihood evaluations, and total runtime. For Bayesian posterior estimation at high resolution, T-Walk provides the most accurate and timely mapping of the full parameter space. For profile likelihood analysis in less than about ten dimensions, we find that Diver and MultiNest score similarly in terms of best fit and speed, outperforming GreAT and T-Walk; in ten or more dimensions, Diver substantially outperforms the other three samplers on all metrics.

  19. Joint simulation of regional areas burned in Canadian forest fires: A Markov Chain Monte Carlo approach

    Treesearch

    Steen Magnussen

    2009-01-01

    Areas burned annually in 29 Canadian forest fire regions show a patchy and irregular correlation structure that significantly influences the distribution of annual totals for Canada and for groups of regions. A binary Monte Carlo Markov Chain (MCMC) is constructed for the purpose of joint simulation of regional areas burned in forest fires. For each year the MCMC...

  20. Bayesian forecasting and uncertainty quantifying of stream flows using Metropolis-Hastings Markov Chain Monte Carlo algorithm

    NASA Astrophysics Data System (ADS)

    Wang, Hongrui; Wang, Cheng; Wang, Ying; Gao, Xiong; Yu, Chen

    2017-06-01

    This paper presents a Bayesian approach using the Metropolis-Hastings Markov Chain Monte Carlo algorithm and applies this method to daily river flow rate forecasting and uncertainty quantification for the Zhujiachuan River, using data collected from Qiaotoubao Gage Station and 13 other gage stations in the Zhujiachuan watershed in China. The proposed method is also compared with conventional maximum likelihood estimation (MLE) for parameter estimation and quantification of associated uncertainties. While the Bayesian method performs similarly in estimating the mean value of daily flow rate, it outperforms the conventional MLE method on uncertainty quantification, providing a relatively narrower reliable interval than the MLE confidence interval and thus more precise estimation by using the related information from regional gage stations. The Bayesian MCMC method might be more favorable in uncertainty analysis and risk management.

  1. Data Analysis Recipes: Using Markov Chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Hogg, David W.; Foreman-Mackey, Daniel

    2018-05-01

    Markov Chain Monte Carlo (MCMC) methods for sampling probability density functions (combined with abundant computational resources) have transformed the sciences, especially in performing probabilistic inferences, or fitting models to data. In this primarily pedagogical contribution, we give a brief overview of the most basic MCMC method and some practical advice for the use of MCMC in real inference problems. We give advice on method choice, tuning for performance, methods for initialization, tests of convergence, troubleshooting, and use of the chain output to produce or report parameter estimates with associated uncertainties. We argue that autocorrelation time is the most important test for convergence, as it directly connects to the uncertainty on the sampling estimate of any quantity of interest. We emphasize that sampling is a method for doing integrals; this guides our thinking about how MCMC output is best used.
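
    As a rough illustration of the two ideas named in this abstract, the sketch below pairs a random-walk Metropolis sampler with a simple estimate of the integrated autocorrelation time. It is not code from the paper: the function names are hypothetical, the target is an assumed one-dimensional standard normal, and the estimator truncates the sum at the first non-positive autocorrelation, which is a simple heuristic rather than the windowing procedure a careful analysis would use.

```python
import math
import random


def metropolis(log_prob, x0, step, n_steps, rng):
    """Random-walk Metropolis sampler for a 1-D target; returns chain and acceptance rate."""
    x, lp = x0, log_prob(x0)
    chain, accepted = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        lp_y = log_prob(y)
        # Accept with probability min(1, p(y)/p(x)); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < lp_y - lp:
            x, lp = y, lp_y
            accepted += 1
        chain.append(x)  # a rejected move repeats the current state
    return chain, accepted / n_steps


def integrated_autocorr_time(chain, max_lag=200):
    """Estimate tau = 1 + 2 * sum(rho_lag), truncated at the first non-positive rho."""
    n = len(chain)
    mean = sum(chain) / n
    dev = [c - mean for c in chain]
    var = sum(d * d for d in dev) / n
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


def std_normal_log_prob(x):
    """Assumed toy target: log density of N(0, 1) up to a constant."""
    return -0.5 * x * x


if __name__ == "__main__":
    rng = random.Random(42)
    chain, rate = metropolis(std_normal_log_prob, 0.0, 2.4, 10000, rng)
    tau = integrated_autocorr_time(chain)
    print("acceptance rate:", rate)
    print("tau:", tau, "effective samples:", len(chain) / tau)
```

    Dividing the chain length by the estimated tau gives an approximate number of independent samples, which is what sets the uncertainty on any average computed from the chain.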

  2. Bayesian calibration of terrestrial ecosystem models: a study of advanced Markov chain Monte Carlo methods

    NASA Astrophysics Data System (ADS)

    Lu, Dan; Ricciuto, Daniel; Walker, Anthony; Safta, Cosmin; Munger, William

    2017-09-01

    Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. The result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
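
    The differential evolution idea behind multi-chain samplers of this family can be sketched in a few lines. The code below is a simplified differential evolution Markov chain update in the style of ter Braak's DE-MC, not the DREAM algorithm itself, which adds further refinements that are omitted here. The function name `de_mc`, the jump scale 2.38/sqrt(2d), the small noise term, and the two-dimensional standard normal target are all assumptions made for illustration.

```python
import math
import random


def de_mc(log_prob, init, n_gen, rng, noise=1e-4):
    """Differential evolution Markov chain sampler (needs at least 3 chains).

    Each chain proposes a jump along the difference of two other chains,
    so the proposal scale and orientation adapt to the population spread.
    """
    chains = [list(c) for c in init]
    n, dim = len(chains), len(chains[0])
    gamma = 2.38 / math.sqrt(2.0 * dim)  # commonly quoted default jump scale
    lps = [log_prob(c) for c in chains]
    history = []
    for _ in range(n_gen):
        for i in range(n):
            others = [m for m in range(n) if m != i]
            r1, r2 = rng.sample(others, 2)  # two distinct partner chains
            prop = [
                chains[i][d]
                + gamma * (chains[r1][d] - chains[r2][d])
                + rng.gauss(0.0, noise)
                for d in range(dim)
            ]
            lp = log_prob(prop)
            # Metropolis acceptance; the difference-vector proposal is symmetric.
            if math.log(1.0 - rng.random()) < lp - lps[i]:
                chains[i], lps[i] = prop, lp
            history.append(list(chains[i]))
    return history


def gauss2d_log_prob(x):
    """Assumed toy target: 2-D standard normal, up to a constant."""
    return -0.5 * (x[0] * x[0] + x[1] * x[1])


if __name__ == "__main__":
    rng = random.Random(1)
    start = [[rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)] for _ in range(8)]
    draws = de_mc(gauss2d_log_prob, start, 3000, rng)
    kept = draws[len(draws) // 2:]  # discard the first half as burn-in
    print(sum(p[0] for p in kept) / len(kept), sum(p[1] for p in kept) / len(kept))
```

    Because jumps are built from differences between chains, a population that straddles two modes can propose moves between them, which is the intuition behind the multimodality finding reported in the abstract.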

  3. Bayesian calibration of terrestrial ecosystem models: A study of advanced Markov chain Monte Carlo methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Dan; Ricciuto, Daniel; Walker, Anthony

    Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this study, a Differential Evolution Adaptive Metropolis (DREAM) algorithm was used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The DREAM is a multi-chain method and uses differential evolution technique for chain movement, allowing it to be efficiently applied to high-dimensional problems, and can reliably estimate heavy-tailed and multimodal distributions that are difficult for single-chain schemes using a Gaussian proposal distribution. The results were evaluated against the popular Adaptive Metropolis (AM) scheme. DREAM indicated that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identified one mode. The calibration of DREAM resulted in a better model fit and predictive performance compared to the AM. DREAM provides means for a good exploration of the posterior distributions of model parameters. Lastly, it reduces the risk of false convergence to a local optimum and potentially improves the predictive performance of the calibrated model.

  4. Bayesian calibration of terrestrial ecosystem models: A study of advanced Markov chain Monte Carlo methods

    DOE PAGES

    Lu, Dan; Ricciuto, Daniel; Walker, Anthony; ...

    2017-02-22

    Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this study, a Differential Evolution Adaptive Metropolis (DREAM) algorithm was used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The DREAM is a multi-chain method and uses differential evolution technique for chain movement, allowing it to be efficiently applied to high-dimensional problems, and can reliably estimate heavy-tailed and multimodal distributions that are difficult for single-chain schemes using a Gaussian proposal distribution. The results were evaluated against the popular Adaptive Metropolis (AM) scheme. DREAM indicated that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identified one mode. The calibration of DREAM resulted in a better model fit and predictive performance compared to the AM. DREAM provides means for a good exploration of the posterior distributions of model parameters. Lastly, it reduces the risk of false convergence to a local optimum and potentially improves the predictive performance of the calibrated model.

  5. [Comparison of different methods in dealing with HIV viral load data with diversified missing value mechanism on HIV positive MSM].

    PubMed

    Jiang, Z; Dou, Z; Song, W L; Xu, J; Wu, Z Y

    2017-11-10

    Objective: To compare the results of different methods for handling HIV viral load (VL) data under different missing-value mechanisms. Methods: We used SPSS 17.0 to simulate complete and missing data with different missing-value mechanisms from HIV viral load data collected from MSM in 16 cities in China in 2013. Maximum likelihood using the expectation-maximization (EM) algorithm, the regression method, mean imputation, the deletion method, and Markov Chain Monte Carlo (MCMC) were each used to impute the missing data. The results of the different methods were compared according to distribution characteristics, accuracy and precision. Results: HIV VL data could not be transformed into a normal distribution. All the methods showed good results in imputing data that were missing completely at random (MCAR). For the other types of missing data, the regression and MCMC methods preserved the main characteristics of the original data. The means of the imputed databases with different methods were all close to the original one. The EM, regression, mean imputation, and deletion methods underestimated VL while MCMC overestimated it. Conclusion: MCMC can be used as the main imputation method for missing HIV viral load data. The imputed data can be used as a reference for mean HIV VL estimation among the investigated population.

  6. Profile-Based LC-MS Data Alignment—A Bayesian Approach

    PubMed Central

    Tsai, Tsung-Heng; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.

    2014-01-01

    A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set, and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time-warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets. PMID:23929872

  7. Comparison of sampling techniques for Bayesian parameter estimation

    NASA Astrophysics Data System (ADS)

    Allison, Rupert; Dunkley, Joanna

    2014-02-01

    The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hastings sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
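
    As an illustration of the simplest of the three samplers, the following is a minimal sketch of random-walk Metropolis-Hastings on a one-dimensional toy Gaussian target. It is not the authors' code: the function names, the step size of 2.4, the starting point and the burn-in length are all assumptions chosen for the example.

```python
import math
import random


def log_gaussian(x, mean=0.0, sigma=1.0):
    """Unnormalised log-density of a 1-D Gaussian (hypothetical toy target)."""
    return -0.5 * ((x - mean) / sigma) ** 2


def metropolis_hastings(log_target, x0, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    x = x0
    log_p = log_target(x)
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p_new / p_old); the symmetric
        # proposal density cancels out of the Hastings ratio.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            n_accepted += 1
        samples.append(x)
    return samples, n_accepted / n_steps


rng = random.Random(0)
samples, acceptance_rate = metropolis_hastings(
    log_gaussian, x0=3.0, n_steps=20000, step_size=2.4, rng=rng
)
kept = samples[2000:]  # discard an assumed burn-in of 2000 steps
sample_mean = sum(kept) / len(kept)
sample_var = sum((s - sample_mean) ** 2 for s in kept) / len(kept)
```

    After burn-in, the sample mean and variance should lie close to the target values of 0 and 1. The step size is the main "requisite tuning" for this sampler: too small and the chain moves slowly, too large and most proposals are rejected.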

  8. Locating hazardous gas leaks in the atmosphere via modified genetic, MCMC and particle swarm optimization algorithms

    NASA Astrophysics Data System (ADS)

    Wang, Ji; Zhang, Ru; Yan, Yuting; Dong, Xiaoqiang; Li, Jun Ming

    2017-05-01

    Hazardous gas leaks in the atmosphere can cause significant economic losses in addition to environmental hazards, such as fires and explosions. A three-stage hazardous gas leak source localization method was developed that uses movable and stationary gas concentration sensors. The method calculates a preliminary source inversion with a modified genetic algorithm (MGA) and has the potential to crossover with eliminated individuals from the population, following the selection of the best candidate. The method then determines a search zone using Markov Chain Monte Carlo (MCMC) sampling, utilizing a partial evaluation strategy. The leak source is then accurately localized using a modified guaranteed convergence particle swarm optimization algorithm with several bad-performing individuals, following selection of the most successful individual with dynamic updates. The first two stages are based on data collected by motionless sensors, and the last stage is based on data from movable robots with sensors. The measurement error adaptability and the effect of the leak source location were analyzed. The test results showed that this three-stage localization process can localize a leak source within 1.0 m of the source for different leak source locations, with measurement error standard deviation smaller than 2.0.

  9. Bayesian inference on EMRI signals using low frequency approximations

    NASA Astrophysics Data System (ADS)

    Ali, Asad; Christensen, Nelson; Meyer, Renate; Röver, Christian

    2012-07-01

    Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions.

  10. Accuracy of Reaction Cross Section for Exotic Nuclei in Glauber Model Based on MCMC Diagnostics

    NASA Astrophysics Data System (ADS)

    Rueter, Keiti; Novikov, Ivan

    2017-01-01

    Parameters of a nuclear density distribution for exotic nuclei with halo or skin structures can be determined from the experimentally measured reaction cross-section. In the presented work, to extract parameters such as nuclear size information for a halo and core, we compare experimental data on reaction cross-sections with values obtained using expressions of the Glauber Model. These calculations are performed using a Markov Chain Monte Carlo algorithm. We discuss the accuracy of the Monte Carlo approach and its dependence on k*, the power law turnover point in the discrete power spectrum of the random number sequence and on the lag-1 autocorrelation time of the random number sequence.
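
    The autocorrelation diagnostic mentioned above can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: the test sequence is a synthetic AR(1) process (a stand-in for an MCMC trace), and the names `autocorrelation`, `ar1_chain` and `tau_ar1` are hypothetical.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a sequence at the given non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum(
        (chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)
    ) / n
    return cov / var


def ar1_chain(phi, n, rng):
    """Synthetic AR(1) sequence x_t = phi * x_{t-1} + noise (assumed test data)."""
    x = 0.0
    chain = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


rng = random.Random(42)
chain = ar1_chain(phi=0.8, n=20000, rng=rng)
rho1 = autocorrelation(chain, lag=1)
# For an AR(1) process the integrated autocorrelation time is
# (1 + rho1) / (1 - rho1); for a real MCMC trace this is only an approximation.
tau_ar1 = (1 + rho1) / (1 - rho1)
```

    A lag-1 autocorrelation near zero indicates nearly independent draws, while values close to one mean that many more iterations are needed for the same Monte Carlo accuracy.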

  11. MontePython 3: Parameter inference code for cosmology

    NASA Astrophysics Data System (ADS)

    Brinckmann, Thejs; Lesgourgues, Julien; Audren, Benjamin; Benabed, Karim; Prunet, Simon

    2018-05-01

    MontePython 3 provides numerous ways to explore parameter space using Monte Carlo Markov Chain (MCMC) sampling, including Metropolis-Hastings, Nested Sampling, Cosmo Hammer, and a Fisher sampling method. This improved version of the Monte Python (ascl:1307.002) parameter inference code for cosmology offers new ingredients that improve the performance of Metropolis-Hastings sampling, speeding up convergence and offering significant time improvement in difficult runs. Additional likelihoods and plotting options are available, as are post-processing algorithms such as Importance Sampling and Adding Derived Parameter.

  12. A sampling algorithm for segregation analysis

    PubMed Central

    Tier, Bruce; Henshall, John

    2001-01-01

    Methods for detecting Quantitative Trait Loci (QTL) without markers have generally used iterative peeling algorithms for determining genotype probabilities. These algorithms have considerable shortcomings in complex pedigrees. A Monte Carlo Markov chain (MCMC) method which samples the pedigree of the whole population jointly is described. Simultaneous sampling of the pedigree was achieved by sampling descent graphs using the Metropolis-Hastings algorithm. A descent graph describes the inheritance state of each allele and provides pedigrees guaranteed to be consistent with Mendelian sampling. Sampling descent graphs overcomes most, if not all, of the limitations incurred by iterative peeling algorithms. The algorithm was able to find the QTL in most of the simulated populations. However, when the QTL was not modeled or found then its effect was ascribed to the polygenic component. No QTL were detected when they were not simulated. PMID:11742631

  13. Bayesian forecasting and uncertainty quantifying of stream flows using Metropolis–Hastings Markov Chain Monte Carlo algorithm

    DOE PAGES

    Wang, Hongrui; Wang, Cheng; Wang, Ying; ...

    2017-04-05

    This paper presents a Bayesian approach using the Metropolis-Hastings Markov Chain Monte Carlo algorithm and applies this method to daily river flow rate forecasting and uncertainty quantification for the Zhujiachuan River, using data collected from Qiaotoubao Gage Station and 13 other gage stations in the Zhujiachuan watershed in China. The proposed method is also compared with conventional maximum likelihood estimation (MLE) for parameter estimation and quantification of associated uncertainties. While the Bayesian method performs similarly in estimating the mean value of daily flow rate, it outperforms the conventional MLE method on uncertainty quantification, providing a relatively narrower reliable interval than the MLE confidence interval and thus more precise estimation by using the related information from regional gage stations. As a result, the Bayesian MCMC method might be more favorable in uncertainty analysis and risk management.

  14. Development of reversible jump Markov Chain Monte Carlo algorithm in the Bayesian mixture modeling for microarray data in Indonesia

    NASA Astrophysics Data System (ADS)

    Astuti, Ani Budi; Iriawan, Nur; Irhamah; Kuswanto, Heri

    2017-12-01

    Bayesian mixture modeling requires identifying the most appropriate number of mixture components, so that the resulting mixture model fits the data in a data-driven way. Reversible Jump Markov Chain Monte Carlo (RJMCMC) combines the reversible jump (RJ) concept with the Markov Chain Monte Carlo (MCMC) concept and has been used by some researchers to identify the number of mixture components when that number is not known with certainty. In application, RJMCMC uses the birth/death and split-merge concepts with six types of move: w updating, θ updating, z updating, hyperparameter β updating, split-merge for components, and birth/death of empty components. The RJMCMC algorithm needs to be developed according to the case under study. The purpose of this study is to assess the performance of the developed RJMCMC algorithm in identifying the unknown number of mixture components in Bayesian mixture modeling for microarray data in Indonesia. The results show that the developed RJMCMC algorithm is able to properly identify the number of mixture components in the Bayesian normal mixture model when the number of components for the microarray data in Indonesia is not known with certainty.

  15. Trans-dimensional MCMC methods for fully automatic motion analysis in tagged MRI.

    PubMed

    Smal, Ihor; Carranza-Herrezuelo, Noemí; Klein, Stefan; Niessen, Wiro; Meijering, Erik

    2011-01-01

    Tagged magnetic resonance imaging (tMRI) is a well-known noninvasive method allowing quantitative analysis of regional heart dynamics. Its clinical use has so far been limited, in part due to the lack of robustness and accuracy of existing tag tracking algorithms in dealing with low (and intrinsically time-varying) image quality. In this paper, we propose a novel probabilistic method for tag tracking, implemented by means of Bayesian particle filtering and a trans-dimensional Markov chain Monte Carlo (MCMC) approach, which efficiently combines information about the imaging process and tag appearance with prior knowledge about the heart dynamics obtained by means of non-rigid image registration. Experiments using synthetic image data (with ground truth) and real data (with expert manual annotation) from preclinical (small animal) and clinical (human) studies confirm that the proposed method yields higher consistency, accuracy, and intrinsic tag reliability assessment in comparison with other frequently used tag tracking methods.

  16. Bayesian calibration of terrestrial ecosystem models: a study of advanced Markov chain Monte Carlo methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.

    Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.

  17. Bayesian calibration of terrestrial ecosystem models: a study of advanced Markov chain Monte Carlo methods

    DOE PAGES

    Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.; ...

    2017-09-27

    Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.

  18. Bayesian-MCMC-based parameter estimation of stealth aircraft RCS models

    NASA Astrophysics Data System (ADS)

    Xia, Wei; Dai, Xiao-Xia; Feng, Yuan

    2015-12-01

    When modeling a stealth aircraft with low RCS (Radar Cross Section), conventional parameter estimation methods may cause a deviation from the actual distribution, owing to the fact that the characteristic parameters are estimated by directly calculating the statistics of RCS. The Bayesian-Markov Chain Monte Carlo (Bayesian-MCMC) method is introduced herein to estimate the parameters so as to improve the fitting accuracies of fluctuation models. The parameter estimations of the lognormal and the Legendre polynomial models are reformulated in the Bayesian framework. The MCMC algorithm is then adopted to calculate the parameter estimates. Numerical results show that the distribution curves obtained by the proposed method exhibit improved consistency with the actual ones, compared with those fitted by the conventional method. The fitting accuracy could be improved by no less than 25% for both fluctuation models, which implies that the Bayesian-MCMC method might be a good candidate among the optimal parameter estimation methods for stealth aircraft RCS models. Project supported by the National Natural Science Foundation of China (Grant No. 61101173), the National Basic Research Program of China (Grant No. 613206), the National High Technology Research and Development Program of China (Grant No. 2012AA01A308), the State Scholarship Fund by the China Scholarship Council (CSC), and the Oversea Academic Training Funds, and University of Electronic Science and Technology of China (UESTC).

  19. Estimation of the four-wave mixing noise probability-density function by the multicanonical Monte Carlo method.

    PubMed

    Neokosmidis, Ioannis; Kamalakis, Thomas; Chipouras, Aristides; Sphicopoulos, Thomas

    2005-01-01

    The performance of high-powered wavelength-division multiplexed (WDM) optical networks can be severely degraded by four-wave-mixing- (FWM-) induced distortion. The multicanonical Monte Carlo method (MCMC) is used to calculate the probability-density function (PDF) of the decision variable of a receiver, limited by FWM noise. Compared with the conventional Monte Carlo method previously used to estimate this PDF, the MCMC method is much faster and can accurately estimate smaller error probabilities. The method takes into account the correlation between the components of the FWM noise, unlike the Gaussian model, which is shown not to provide accurate results.

  20. Monte Carlo Analysis of Reservoir Models Using Seismic Data and Geostatistical Models

    NASA Astrophysics Data System (ADS)

    Zunino, A.; Mosegaard, K.; Lange, K.; Melnikova, Y.; Hansen, T. M.

    2013-12-01

    We present a study on the analysis of petroleum reservoir models consistent with seismic data and geostatistical constraints performed on a synthetic reservoir model. Our aim is to invert directly for structure and rock bulk properties of the target reservoir zone. To infer the rock facies, porosity and oil saturation seismology alone is not sufficient but a rock physics model must be taken into account, which links the unknown properties to the elastic parameters. We then combine a rock physics model with a simple convolutional approach for seismic waves to invert the "measured" seismograms. To solve this inverse problem, we employ a Markov chain Monte Carlo (MCMC) method, because it offers the possibility to handle non-linearity, complex and multi-step forward models and provides realistic estimates of uncertainties. However, for large data sets the MCMC method may be impractical because of a very high computational demand. To face this challenge one strategy is to feed the algorithm with realistic models, hence relying on proper prior information. To address this problem, we utilize an algorithm drawn from geostatistics to generate geologically plausible models which represent samples of the prior distribution. The geostatistical algorithm learns the multiple-point statistics from prototype models (in the form of training images), then generates thousands of different models which are accepted or rejected by a Metropolis sampler. To further reduce the computation time we parallelize the software and run it on multi-core machines. The solution of the inverse problem is then represented by a collection of reservoir models in terms of facies, porosity and oil saturation, which constitute samples of the posterior distribution. We are finally able to produce probability maps of the properties we are interested in by performing statistical analysis on the collection of solutions.
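
    The accept/reject step described above, in which candidate models are drawn from the prior and a Metropolis rule decides whether to keep them, can be sketched on a toy problem. This is a minimal illustration under strong assumptions, not the authors' software: a scalar Gaussian prior stands in for the geostatistical model generator, a single noisy observation stands in for the seismograms, and all names and constants are hypothetical.

```python
import math
import random

OBSERVED = 1.0   # assumed single "measured" datum
NOISE_SD = 1.0   # assumed observational noise level


def sample_prior(rng):
    """Stand-in for the geostatistical generator: draw a model from the prior."""
    return rng.gauss(0.0, 1.0)


def log_likelihood(model):
    """Gaussian misfit between the observed datum and the (identity) forward model."""
    return -0.5 * ((OBSERVED - model) / NOISE_SD) ** 2


def prior_proposal_metropolis(n_iters, rng):
    """Metropolis sampler whose proposals are independent prior draws.

    Because candidates are sampled from the prior, the prior terms cancel
    and the acceptance probability is min(1, L(candidate) / L(current)).
    """
    current = sample_prior(rng)
    log_l = log_likelihood(current)
    chain = []
    for _ in range(n_iters):
        candidate = sample_prior(rng)
        log_l_new = log_likelihood(candidate)
        if rng.random() < math.exp(min(0.0, log_l_new - log_l)):
            current, log_l = candidate, log_l_new
        chain.append(current)
    return chain


rng = random.Random(7)
chain = prior_proposal_metropolis(20000, rng)[1000:]
post_mean = sum(chain) / len(chain)
post_var = sum((m - post_mean) ** 2 for m in chain) / len(chain)
```

    For this conjugate toy case the exact posterior is Gaussian with mean 0.5 and variance 0.5, which gives a simple check on the sampler. In a realistic setting the prior draw would be a full reservoir model and the likelihood evaluation would dominate the cost.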

  1. DNA motif alignment by evolving a population of Markov chains.

    PubMed

    Bi, Chengpeng

    2009-01-30

    Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. Markov chain Monte Carlo (MCMC) methods such as Gibbs motif samplers have been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence a new algorithm design that enables such information exchange would be worthwhile. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, dubbed the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

  2. Time-domain induced polarization - an analysis of Cole-Cole parameter resolution and correlation using Markov Chain Monte Carlo inversion

    NASA Astrophysics Data System (ADS)

    Madsen, Line Meldgaard; Fiandaca, Gianluca; Auken, Esben; Christiansen, Anders Vest

    2017-12-01

    The application of time-domain induced polarization (TDIP) is increasing with advances in acquisition techniques, data processing and spectral inversion schemes. An inversion of TDIP data for the spectral Cole-Cole parameters is a non-linear problem, but by applying a 1-D Markov Chain Monte Carlo (MCMC) inversion algorithm, a full non-linear uncertainty analysis of the parameters and the parameter correlations can be accessed. This is essential to understand to what degree the spectral Cole-Cole parameters can be resolved from TDIP data. MCMC inversions of synthetic TDIP data, which show bell-shaped probability distributions with a single maximum, show that the Cole-Cole parameters can be resolved from TDIP data if an acquisition range above two decades in time is applied. Linear correlations between the Cole-Cole parameters are observed and by decreasing the acquisition ranges, the correlations increase and become non-linear. It is further investigated how waveform and parameter values influence the resolution of the Cole-Cole parameters. A limiting factor is the value of the frequency exponent, C. As C decreases, the resolution of all the Cole-Cole parameters decreases and the results become increasingly non-linear. While the values of the time constant, τ, must be in the acquisition range to resolve the parameters well, the choice between a 50 per cent and a 100 per cent duty cycle for the current injection does not have an influence on the parameter resolution. The limits of resolution and linearity are also studied in a comparison between the MCMC and a linearized gradient-based inversion approach. The two methods are consistent for resolved models, but the linearized approach tends to underestimate the uncertainties for poorly resolved parameters due to the corresponding non-linear features. Finally, an MCMC inversion of 1-D field data verifies that spectral Cole-Cole parameters can also be resolved from TD field measurements.

  3. Estimating Model Probabilities using Thermodynamic Markov Chain Monte Carlo Methods

    NASA Astrophysics Data System (ADS)

    Ye, M.; Liu, P.; Beerli, P.; Lu, D.; Hill, M. C.

    2014-12-01

    Markov chain Monte Carlo (MCMC) methods are widely used to evaluate model probability for quantifying model uncertainty. In a general procedure, MCMC simulations are first conducted for each individual model, and MCMC parameter samples are then used to approximate the marginal likelihood of the model by calculating the geometric mean of the joint likelihood of the model and its parameters. It has been found that the method of evaluating the geometric mean suffers from the numerical problem of a low convergence rate. A simple test case shows that even millions of MCMC samples are insufficient to yield an accurate estimate of the marginal likelihood. To resolve this problem, a thermodynamic method is used to have multiple MCMC runs with different values of a heating coefficient between zero and one. When the heating coefficient is zero, the MCMC run is equivalent to a random walk MC in the prior parameter space; when the heating coefficient is one, the MCMC run is the conventional one. For a simple case with an analytical form of the marginal likelihood, the thermodynamic method yields a more accurate estimate than the method of using the geometric mean. This is also demonstrated for a case of groundwater modeling with consideration of four alternative models postulated based on different conceptualizations of a confining layer. This groundwater example shows that model probabilities estimated using the thermodynamic method are more reasonable than those obtained using the geometric method. The thermodynamic method is general, and can be used for a wide range of environmental problems for model uncertainty quantification.
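
    The thermodynamic idea can be sketched on a case where the marginal likelihood is known analytically. The example below is an illustration under stated assumptions, not the authors' implementation: the data, the Gaussian likelihood and prior, the power-law schedule for the heating coefficient, and the trapezoid rule are all choices made for the sketch. It relies on the identity that the log marginal likelihood equals the integral, over heating coefficient beta from 0 to 1, of the expected log-likelihood under the tempered posterior proportional to likelihood^beta times prior.

```python
import math
import random

DATA = [0.8, 1.2, 1.0, 0.5, 1.5]  # assumed toy observations
SIGMA = 1.0                        # known noise standard deviation
TAU = 2.0                          # prior standard deviation of theta


def log_likelihood(theta):
    n = len(DATA)
    sq = sum((y - theta) ** 2 for y in DATA)
    return -0.5 * n * math.log(2 * math.pi * SIGMA ** 2) - 0.5 * sq / SIGMA ** 2


def log_prior(theta):
    return -0.5 * math.log(2 * math.pi * TAU ** 2) - 0.5 * theta ** 2 / TAU ** 2


def mean_log_likelihood(beta, rng, n_steps=6000, burn_in=1000, step=1.0):
    """Metropolis run on likelihood**beta * prior; returns the average log-likelihood."""
    theta = 0.0
    ll = log_likelihood(theta)
    log_p = beta * ll + log_prior(theta)
    total = 0.0
    for i in range(n_steps):
        cand = theta + rng.gauss(0.0, step)
        ll_c = log_likelihood(cand)
        log_p_c = beta * ll_c + log_prior(cand)
        if rng.random() < math.exp(min(0.0, log_p_c - log_p)):
            theta, ll, log_p = cand, ll_c, log_p_c
        if i >= burn_in:
            total += ll
    return total / (n_steps - burn_in)


def thermodynamic_log_evidence(n_temps=30, seed=1):
    """Trapezoid-rule integral of E_beta[log L] over beta in [0, 1]."""
    rng = random.Random(seed)
    # Power-law schedule concentrates heating coefficients near zero.
    betas = [(k / n_temps) ** 5 for k in range(n_temps + 1)]
    means = [mean_log_likelihood(b, rng) for b in betas]
    return sum(
        0.5 * (betas[k + 1] - betas[k]) * (means[k] + means[k + 1])
        for k in range(n_temps)
    )


def exact_log_evidence():
    """Closed form: DATA ~ N(0, SIGMA^2 I + TAU^2 11^T) after integrating out theta."""
    n = len(DATA)
    s1 = sum(DATA)
    s2 = sum(y * y for y in DATA)
    log_det = (n - 1) * math.log(SIGMA ** 2) + math.log(SIGMA ** 2 + n * TAU ** 2)
    quad = (s2 - TAU ** 2 * s1 ** 2 / (SIGMA ** 2 + n * TAU ** 2)) / SIGMA ** 2
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * log_det - 0.5 * quad


estimate = thermodynamic_log_evidence()
reference = exact_log_evidence()
```

    Comparing `estimate` with `reference` gives a direct check of the method on this toy model. The accuracy depends on the number of heating coefficients and on the length of each run, both of which are arbitrary here.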

  4. Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations

    NASA Astrophysics Data System (ADS)

    Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit

    2016-07-01

    A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.

  5. Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sandhu, Rimple; Poirel, Dominique; Pettit, Chris

    2016-07-01

    A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.

  6. A Bayesian approach to the modelling of α Cen A

    NASA Astrophysics Data System (ADS)

    Bazot, M.; Bourguignon, S.; Christensen-Dalsgaard, J.

    2012-12-01

    Determining the physical characteristics of a star is an inverse problem consisting of estimating the parameters of models for the stellar structure and evolution, and knowing certain observable quantities. We use a Bayesian approach to solve this problem for α Cen A, which allows us to incorporate prior information on the parameters to be estimated, in order to better constrain the problem. Our strategy is based on the use of a Markov chain Monte Carlo (MCMC) algorithm to estimate the posterior probability densities of the stellar parameters: mass, age, initial chemical composition, etc. We use the stellar evolutionary code ASTEC to model the star. To constrain this model both seismic and non-seismic observations were considered. Several different strategies were tested to fit these values, using either two free parameters or five free parameters in ASTEC. We are thus able to show evidence that MCMC methods become efficient with respect to more classical grid-based strategies when the number of parameters increases. The results of our MCMC algorithm allow us to derive estimates for the stellar parameters and robust uncertainties thanks to the statistical analysis of the posterior probability densities. We are also able to compute odds for the presence of a convective core in α Cen A. When using core-sensitive seismic observational constraints, these can rise above ~40 per cent. The comparison of results to previous studies also indicates that these seismic constraints are of critical importance for our knowledge of the structure of this star.

  7. A Bayesian approach to modeling diffraction profiles and application to ferroelectric materials

    DOE PAGES

    Iamsasri, Thanakorn; Guerrier, Jonathon; Esteves, Giovanni; ...

    2017-02-01

    A new statistical approach for modeling diffraction profiles is introduced, using Bayesian inference and a Markov chain Monte Carlo (MCMC) algorithm. This method is demonstrated by modeling the degenerate reflections during application of an electric field to two different ferroelectric materials: thin-film lead zirconate titanate (PZT) of composition PbZr0.3Ti0.7O3 and a bulk commercial PZT polycrystalline ferroelectric. Here, the new method offers a unique uncertainty quantification of the model parameters that can be readily propagated into new calculated parameters.

  8. Bayesian tomography by interacting Markov chains

    NASA Astrophysics Data System (ADS)

    Romary, T.

    2017-12-01

    In seismic tomography, we seek to determine the velocity of the underground from noisy first arrival travel time observations. In most situations, this is an ill-posed inverse problem that admits several imperfect solutions. Given an a priori distribution over the parameters of the velocity model, the Bayesian formulation allows us to state this problem as a probabilistic one, with a solution in the form of a posterior distribution. The posterior distribution is generally high dimensional and may exhibit multimodality. Moreover, as it is known only up to a constant, the only sensible way to address this problem is to try to generate simulations from the posterior. The natural tools to perform these simulations are Monte Carlo Markov chains (MCMC). Classical implementations of MCMC algorithms generally suffer from slow mixing: the generated states are slow to enter the stationary regime, that is to fit the observations, and when one mode of the posterior is eventually identified, it may become difficult to visit others. Using a varying temperature parameter relaxing the constraint on the data may help to enter the stationary regime. Besides, the sequential nature of MCMC makes them ill fitted to parallel implementation. Running a large number of chains in parallel may be suboptimal as the information gathered by each chain is not shared. Parallel tempering (PT) can be seen as a first attempt to make parallel chains at different temperatures communicate, but they only exchange information between current states. In this talk, I will show that PT actually belongs to a general class of interacting Markov chain algorithms. I will also show that this class makes it possible to design interacting schemes that can take advantage of the whole history of the chain, by authorizing exchanges toward already visited states. The algorithms will be illustrated with toy examples and an application to first arrival traveltime tomography.
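
    Standard parallel tempering, the baseline that the talk generalizes, can be sketched on a bimodal toy posterior. This is a minimal illustration and not the interacting-chain scheme proposed by the author: the two-mode target, the four-level temperature ladder, the proposal scaling and the single random adjacent swap per iteration are all assumptions made for the example.

```python
import math
import random


def log_target(x, separation=4.0, width=0.5):
    """Unnormalised log-density of an equal mixture of two Gaussians at +/- separation."""
    a = -0.5 * ((x + separation) / width) ** 2
    b = -0.5 * ((x - separation) / width) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    return rng.random() < math.exp(min(0.0, log_ratio))


def parallel_tempering(betas, n_iters, rng):
    """Chain k targets target**betas[k]; betas[0] == 1 is the cold chain."""
    states = [-4.0] * len(betas)  # every chain starts in the left-hand mode
    log_ps = [log_target(x) for x in states]
    cold_samples = []
    for _ in range(n_iters):
        # Within-chain random-walk Metropolis update at each temperature.
        for k, beta in enumerate(betas):
            cand = states[k] + rng.gauss(0.0, 1.0 / math.sqrt(beta))
            lp = log_target(cand)
            if accept(beta * (lp - log_ps[k]), rng):
                states[k], log_ps[k] = cand, lp
        # Propose exchanging the current states of one random adjacent pair.
        k = rng.randrange(len(betas) - 1)
        log_ratio = (betas[k] - betas[k + 1]) * (log_ps[k + 1] - log_ps[k])
        if accept(log_ratio, rng):
            states[k], states[k + 1] = states[k + 1], states[k]
            log_ps[k], log_ps[k + 1] = log_ps[k + 1], log_ps[k]
        cold_samples.append(states[0])
    return cold_samples


rng = random.Random(3)
cold = parallel_tempering([1.0, 0.3, 0.1, 0.03], n_iters=20000, rng=rng)
fraction_right = sum(1 for x in cold if x > 0) / len(cold)
```

    With the hot chains crossing the barrier freely and passing states down the ladder, the cold chain should spend a comparable share of its time in each mode. Note that only current states are exchanged here, which is exactly the limitation the abstract points out.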

  9. Estimation Methods for One-Parameter Testlet Models

    ERIC Educational Resources Information Center

    Jiao, Hong; Wang, Shudong; He, Wei

    2013-01-01

    This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…

  10. Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals

    PubMed Central

    Fourment, Mathieu; Claywell, Brian C; Dinh, Vu; McCoy, Connor; Matsen IV, Frederick A; Darling, Aaron E

    2018-01-01

    Abstract Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop “guided” proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy. PMID:29186587

  11. Lotic ecosystem response to chronic metal contamination assessed by the resazurin-resorufin smart tracer with data assimilation by the Markov chain Monte Carlo method

    NASA Astrophysics Data System (ADS)

    Stanaway, D. J.; Flores, A. N.; Haggerty, R.; Benner, S. G.; Feris, K. P.

    2011-12-01

    Concurrent assessment of biogeochemical and solute transport data (i.e. advection, dispersion, transient storage) within lotic systems remains a challenge in eco-hydrological research. Recently, the Resazurin-Resorufin Smart Tracer System (RRST) was proposed as a mechanism to measure microbial activity at the sediment-water interface [Haggerty et al., 2008, 2009] associating metabolic and hydrologic processes and allowing for the reach scale extrapolation of biotic function in the context of a dynamic physical environment. This study presents a Markov Chain Monte Carlo (MCMC) data assimilation technique to solve the inverse model of the Raz Rru Advection Dispersion Equation (RRADE). The RRADE is a suite of dependent 1-D reactive ADEs, associated through the microbially mediated reduction of Raz to Rru (k12). This reduction is proportional to DO consumption (R^2=0.928). MCMC is a suite of algorithms that solve Bayes theorem to condition uncertain model states and parameters on imperfect observations. Here, the RRST is employed to quantify the effect of chronic metal exposure on hyporheic microbial metabolism along a 100+ year old metal contamination gradient in the Clark Fork River (CF). We hypothesized that 1) the energetic cost of metal tolerance limits heterotrophic microbial respiration in communities evolved in chronic metal contaminated environments, with respiration inhibition directly correlated to degree of contamination (observational experiment) and 2) when experiencing acute metal stress, respiration rate inhibition of metal tolerant communities is less than that of naïve communities (manipulative experiment). To test these hypotheses, 4 replicate columns containing sediment collected from differently contaminated CF reaches and reference sites were fed a solution of RRST, NaCl, and cadmium (manipulative experiment only) within 24 hrs post collection. 
Column effluent was collected and measured for Raz, Rru, and EC to determine the Raz Rru breakthrough curves (BTC), subsequently modeled by the RRADE and thereby allowing derivation of in situ rates of metabolism. RRADE parameter values are estimated through Metropolis Hastings MCMC optimization. Unknown prior parameter distributions (PD) were constrained via a sensitivity analysis, except for the empirically estimated velocity. MCMC simulations were initiated at random points within the PD. Convergence of target distributions (TD) is achieved when the variance of the mode values of the six RRADE parameters in independent model replication is at least 10^{-3} less than the mode value. Convergence of k12, the parameter of interest, was more resolved, with modal variance of replicate simulations ranging from 10^{-4} less than the modal value to 0. The MCMC algorithm presented here offers a robust approach to solve the inverse RRST model and could be easily adapted to other inverse problems.
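As a rough illustration of the workflow described above (Metropolis-Hastings chains started at random points within the prior range, then compared across replicates), here is a minimal sketch. It does not implement the RRADE: the forward model is a hypothetical first-order decay `exp(-k * t)`, the data are synthetic, and the names, prior bounds, noise level and step size are assumptions.

```python
import math
import random


def make_log_posterior(times, obs, sigma, k_min, k_max):
    """Gaussian likelihood for a toy decay model with a uniform prior on k."""
    def log_post(k):
        if not (k_min < k < k_max):
            return -math.inf  # outside the prior support
        sse = sum((y - math.exp(-k * t)) ** 2 for t, y in zip(times, obs))
        return -0.5 * sse / sigma ** 2
    return log_post


def metropolis_hastings(log_post, start, n_iter, step, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    chain = [start]
    current_lp = log_post(start)
    for _ in range(n_iter):
        proposal = chain[-1] + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain


def replicate_means(n_chains=3, n_iter=5000, burn_in=1000, seed=0):
    """Run replicate chains from random starts and return their posterior means."""
    rng = random.Random(seed)
    true_k, sigma = 0.5, 0.05  # synthetic "truth" and noise level (assumed)
    times = [0.5 * i for i in range(21)]
    obs = [math.exp(-true_k * t) + rng.gauss(0.0, sigma) for t in times]
    log_post = make_log_posterior(times, obs, sigma, 0.0, 2.0)
    means = []
    for _ in range(n_chains):
        start = rng.uniform(0.05, 1.95)  # random start inside the prior range
        chain = metropolis_hastings(log_post, start, n_iter, 0.05, rng)
        kept = chain[burn_in:]
        means.append(sum(kept) / len(kept))
    return means
```

Agreement between the values returned by `replicate_means()` plays the role of the replicate-based convergence check in the abstract, although the abstract compares modal values rather than means.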

  12. Assessing an ensemble Kalman filter inference of Manning's n coefficient of an idealized tidal inlet against a polynomial chaos-based MCMC

    NASA Astrophysics Data System (ADS)

    Siripatana, Adil; Mayo, Talea; Sraj, Ihab; Knio, Omar; Dawson, Clint; Le Maitre, Olivier; Hoteit, Ibrahim

    2017-08-01

    Bayesian estimation/inversion is commonly used to quantify and reduce modeling uncertainties in coastal ocean model, especially in the framework of parameter estimation. Based on Bayes rule, the posterior probability distribution function (pdf) of the estimated quantities is obtained conditioned on available data. It can be computed either directly, using a Markov chain Monte Carlo (MCMC) approach, or by sequentially processing the data following a data assimilation approach, which is heavily exploited in large dimensional state estimation problems. The advantage of data assimilation schemes over MCMC-type methods arises from the ability to algorithmically accommodate a large number of uncertain quantities without significant increase in the computational requirements. However, only approximate estimates are generally obtained by this approach due to the restricted Gaussian prior and noise assumptions that are generally imposed in these methods. This contribution aims at evaluating the effectiveness of utilizing an ensemble Kalman-based data assimilation method for parameter estimation of a coastal ocean model against an MCMC polynomial chaos (PC)-based scheme. We focus on quantifying the uncertainties of a coastal ocean ADvanced CIRCulation (ADCIRC) model with respect to the Manning's n coefficients. Based on a realistic framework of observation system simulation experiments (OSSEs), we apply an ensemble Kalman filter and the MCMC method employing a surrogate of ADCIRC constructed by a non-intrusive PC expansion for evaluating the likelihood, and test both approaches under identical scenarios. We study the sensitivity of the estimated posteriors with respect to the parameters of the inference methods, including ensemble size, inflation factor, and PC order. 
A full analysis of both methods, in the context of coastal ocean model, suggests that an ensemble Kalman filter with appropriate ensemble size and well-tuned inflation provides reliable mean estimates and uncertainties of Manning's n coefficients compared to the full posterior distributions inferred by MCMC.
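To make the ensemble Kalman side of this comparison concrete, the following is a minimal sketch of a stochastic (perturbed-observation) ensemble Kalman update for a single scalar parameter with multiplicative inflation. It is not the ADCIRC setup from the study: the function name, the scalar forward model and the way inflation is applied are assumptions for illustration.

```python
import random


def enkf_parameter_update(ensemble, forward, observation, obs_var, rng,
                          inflation=1.0):
    """One perturbed-observation EnKF update of a scalar parameter ensemble."""
    n = len(ensemble)
    mean_theta = sum(ensemble) / n
    # Multiplicative inflation: widen the ensemble around its mean.
    ensemble = [mean_theta + inflation * (t - mean_theta) for t in ensemble]
    predicted = [forward(t) for t in ensemble]
    mean_pred = sum(predicted) / n
    # Sample cross-covariance between parameter and predicted observation.
    cov_tp = sum((t - mean_theta) * (p - mean_pred)
                 for t, p in zip(ensemble, predicted)) / (n - 1)
    var_p = sum((p - mean_pred) ** 2 for p in predicted) / (n - 1)
    gain = cov_tp / (var_p + obs_var)  # scalar Kalman gain
    # Each member assimilates its own noisy copy of the observation.
    return [t + gain * (observation + rng.gauss(0.0, obs_var ** 0.5) - p)
            for t, p in zip(ensemble, predicted)]
```

For a linear forward model and a Gaussian prior the updated ensemble approximates the exact posterior. With a nonlinear model it is only an approximation, which is the limitation the abstract attributes to the Gaussian assumptions of such schemes.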

  13. Quantifying MCMC exploration of phylogenetic tree space.

    PubMed

    Whidden, Chris; Matsen, Frederick A

    2015-05-01

    In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  14. Bayesian seismic tomography by parallel interacting Markov chains

    NASA Astrophysics Data System (ADS)

    Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas

    2014-05-01

    The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret the results quantitatively, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is highly related to the computational cost of the forward model. Although fast algorithms have been recently developed for computing first arrival traveltimes of seismic waves, a complete exploration of the posterior distribution of the velocity model is rarely achieved, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of classical single MCMC, we propose to make several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters, without increasing the global computational cost with respect to classical MCMC and is therefore particularly suited for Bayesian inversion. 
The exchanges between the chains allow a precise sampling of the high probability zones of the model space while preventing the chains from getting stuck in a local probability maximum. This approach thus supplies a robust way to analyze the tomography imaging uncertainties. The interacting MCMC approach is illustrated on two synthetic examples of tomography of calibration shots such as encountered in induced microseismic studies. In the second application, a wavelet-based model parameterization is presented that significantly reduces the dimension of the problem, thus making the algorithm efficient even for a complex velocity model.

  15. Bayesian inference for dynamic transcriptional regulation; the Hes1 system as a case study.

    PubMed

    Heron, Elizabeth A; Finkenstädt, Bärbel; Rand, David A

    2007-10-01

    In this study, we address the problem of estimating the parameters of regulatory networks and provide the first application of Markov chain Monte Carlo (MCMC) methods to experimental data. As a case study, we consider a stochastic model of the Hes1 system expressed in terms of stochastic differential equations (SDEs) to which rigorous likelihood methods of inference can be applied. When fitting continuous-time stochastic models to discretely observed time series the lengths of the sampling intervals are important, and much of our study addresses the problem when the data are sparse. We estimate the parameters of an autoregulatory network providing results both for simulated and real experimental data from the Hes1 system. We develop an estimation algorithm using MCMC techniques which are flexible enough to allow for the imputation of latent data on a finer time scale and the presence of prior information about parameters which may be informed from other experiments as well as additional measurement error.

  16. Stochastic Inversion of 2D Magnetotelluric Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Jinsong

    2010-07-01

    The algorithm is developed to invert 2D magnetotelluric (MT) data based on sharp boundary parametrization using a Bayesian framework. Within the algorithm, we consider the locations and the resistivity of regions formed by the interfaces as unknowns. We use a parallel, adaptive finite-element algorithm to forward simulate frequency-domain MT responses of 2D conductivity structure. Those unknown parameters are spatially correlated and are described by a geostatistical model. The joint posterior probability distribution function is explored by Markov Chain Monte Carlo (MCMC) sampling methods. The developed stochastic model is effective for estimating the interface locations and resistivity. Most importantly, it provides detailed uncertainty information on each unknown parameter. Hardware requirements: PC, Supercomputer, Multi-platform, Workstation; Software requirements: C and Fortran; Operating systems/version: Linux/Unix or Windows

  17. Standard Error Estimation of 3PL IRT True Score Equating with an MCMC Method

    ERIC Educational Resources Information Center

    Liu, Yuming; Schulz, E. Matthew; Yu, Lei

    2008-01-01

    A Markov chain Monte Carlo (MCMC) method and a bootstrap method were compared in the estimation of standard errors of item response theory (IRT) true score equating. Three test form relationships were examined: parallel, tau-equivalent, and congeneric. Data were simulated based on Reading Comprehension and Vocabulary tests of the Iowa Tests of…

  18. Comparison of missing value imputation methods in time series: the case of Turkish meteorological data

    NASA Astrophysics Data System (ADS)

    Yozgatligil, Ceylan; Aslan, Sipan; Iyigun, Cem; Batmaz, Inci

    2013-04-01

    This study aims to compare several imputation methods to complete the missing values of spatio-temporal meteorological time series. To this end, six imputation methods are assessed with respect to various criteria including accuracy, robustness, precision, and efficiency for artificially created missing data in monthly total precipitation and mean temperature series obtained from the Turkish State Meteorological Service. Of these methods, simple arithmetic average, normal ratio (NR), and NR weighted with correlations comprise the simple ones, whereas multilayer perceptron type neural network and multiple imputation strategy adopted by Monte Carlo Markov Chain based on expectation-maximization (EM-MCMC) are computationally intensive ones. In addition, we propose a modification on the EM-MCMC method. Besides using a conventional accuracy measure based on squared errors, we also suggest the correlation dimension (CD) technique of nonlinear dynamic time series analysis which takes spatio-temporal dependencies into account for evaluating imputation performances. Depending on the detailed graphical and quantitative analysis, it can be said that although computational methods, particularly EM-MCMC method, are computationally inefficient, they seem favorable for imputation of meteorological time series with respect to different missingness periods considering both measures and both series studied. To conclude, using the EM-MCMC algorithm for imputing missing values before conducting any statistical analyses of meteorological data will definitely decrease the amount of uncertainty and give more robust results. Moreover, the CD measure can be suggested for the performance evaluation of missing data imputation particularly with computational methods since it gives more precise results in meteorological time series.

  19. Model-based Bayesian inference for ROC data analysis

    NASA Astrophysics Data System (ADS)

    Lei, Tianhu; Bae, K. Ty

    2013-03-01

    This paper presents a study of model-based Bayesian inference for Receiver Operating Characteristics (ROC) data. The model is a simple version of the general non-linear regression model. Unlike the Dorfman model, it uses a probit link function with a covariate taking the values zero and one to express binormal distributions in a single formula. The model also includes a scale parameter. Bayesian inference is implemented by a Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach considers model parameters as random variables characterized by prior distributions. With a substantial number of simulated samples generated by the sampling algorithm, posterior distributions of parameters as well as the parameters themselves can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires the probability density function (pdf) from which samples are drawn to be log-concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log-concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed by the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.

  20. Bayesian inference for OPC modeling

    NASA Astrophysics Data System (ADS)

    Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.

    2016-03-01

    The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
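The walker-based exploration described above can be sketched with the stretch move that is commonly used in affine-invariant ensemble samplers. The abstract does not say which move or software the authors used, so the function name `stretch_move_sampler`, the scale parameter `a = 2.0` and the one-walker-at-a-time update order are assumptions for illustration only.

```python
import math
import random


def stretch_move_sampler(log_prob, walkers, n_steps, rng, a=2.0):
    """Ensemble sampler using the stretch move, updating one walker at a time.

    `walkers` is a list of equal-length position lists. Returns every walker
    position recorded after each sweep over the ensemble.
    """
    dim = len(walkers[0])
    walkers = [list(w) for w in walkers]
    log_probs = [log_prob(w) for w in walkers]
    samples = []
    for _ in range(n_steps):
        for k in range(len(walkers)):
            # Pick a different walker j uniformly at random.
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1
            # Draw z from g(z) proportional to 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            # Move walker k along the line through walker j.
            proposal = [xj + z * (xk - xj)
                        for xk, xj in zip(walkers[k], walkers[j])]
            proposal_lp = log_prob(proposal)
            # The z ** (dim - 1) factor keeps the move in detailed balance.
            log_accept = (dim - 1) * math.log(z) + proposal_lp - log_probs[k]
            if math.log(1.0 - rng.random()) < log_accept:
                walkers[k] = proposal
                log_probs[k] = proposal_lp
        samples.extend(list(w) for w in walkers)
    return samples
```

Because each proposal is built from the position of another walker, the walkers are only semi-independent, and the sampler needs no hand-tuned proposal covariance. This is one reason ensemble samplers of this kind are popular for parameter spaces with very different scales.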

  1. Improving the Fitness of High-Dimensional Biomechanical Models via Data-Driven Stochastic Exploration

    PubMed Central

    Bustamante, Carlos D.; Valero-Cuevas, Francisco J.

    2010-01-01

    The field of complex biomechanical modeling has begun to rely on Monte Carlo techniques to investigate the effects of parameter variability and measurement uncertainty on model outputs, search for optimal parameter combinations, and define model limitations. However, advanced stochastic methods to perform data-driven explorations, such as Markov chain Monte Carlo (MCMC), become necessary as the number of model parameters increases. Here, we demonstrate the feasibility and, what to our knowledge is, the first use of an MCMC approach to improve the fitness of realistically large biomechanical models. We used a Metropolis–Hastings algorithm to search increasingly complex parameter landscapes (3, 8, 24, and 36 dimensions) to uncover underlying distributions of anatomical parameters of a “truth model” of the human thumb on the basis of simulated kinematic data (thumbnail location, orientation, and linear and angular velocities) polluted by zero-mean, uncorrelated multivariate Gaussian “measurement noise.” Driven by these data, ten Markov chains searched each model parameter space for the subspace that best fit the data (posterior distribution). As expected, the convergence time increased, more local minima were found, and marginal distributions broadened as the parameter space complexity increased. In the 36-D scenario, some chains found local minima but the majority of chains converged to the true posterior distribution (confirmed using a cross-validation dataset), thus demonstrating the feasibility and utility of these methods for realistically large biomechanical problems. PMID:19272906

  2. Low Frequency Flats for Imaging Cameras on the Hubble Space Telescope

    NASA Astrophysics Data System (ADS)

    Kossakowski, Diana; Avila, Roberto J.; Borncamp, David; Grogin, Norman A.

    2017-01-01

    We created a revamped Low Frequency Flat (L-Flat) algorithm for the Hubble Space Telescope (HST) and all of its imaging cameras. The current program that makes these calibration files does not compile on modern computer systems and it requires translation to Python. We took the opportunity to explore various methods that reduce the scatter of photometric observations using chi-squared optimizers along with Markov Chain Monte Carlo (MCMC). We created simulations to validate the algorithms and then worked with the UV photometry of the globular cluster NGC6681 to update the calibration files for the Advanced Camera for Surveys (ACS) and Solar Blind Channel (SBC). The new software was made for general usage and therefore can be applied to any of the current imaging cameras on HST.

  3. Development of Cloud and Precipitation Property Retrieval Algorithms and Measurement Simulators from ASR Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mace, Gerald G.

    What has made the ASR program unique is the amount of information that is available. The suite of recently deployed instruments significantly expands the scope of the program (Mather and Voyles, 2013). The breadth of this information allows us to pose sophisticated process-level questions. Our ASR project, now entering its third year, has been about developing algorithms that use this information in ways that fully exploit the new capacity of the ARM data streams. Using optimal estimation (OE) and Markov Chain Monte Carlo (MCMC) inversion techniques, we have developed methodologies that allow us to use multiple radar frequency Doppler spectra along with lidar and passive constraints where data streams can be added or subtracted efficiently and algorithms can be reformulated for various combinations of hydrometeors by exchanging sets of empirical coefficients. These methodologies have been applied to boundary layer clouds, mixed phase snow cloud systems, and cirrus.

  4. Exact Bayesian Inference for Phylogenetic Birth-Death Models.

    PubMed

    Parag, K V; Pybus, O G

    2018-04-26

    Inferring the rates of change of a population from a reconstructed phylogeny of genetic sequences is a central problem in macro-evolutionary biology, epidemiology, and many other disciplines. A popular solution involves estimating the parameters of a birth-death process (BDP), which links the shape of the phylogeny to its birth and death rates. Modern BDP estimators rely on random Markov chain Monte Carlo (MCMC) sampling to infer these rates. Such methods, while powerful and scalable, cannot be guaranteed to converge, leading to results that may be hard to replicate or difficult to validate. We present a conceptually and computationally different parametric BDP inference approach using flexible and easy to implement Snyder filter (SF) algorithms. This method is deterministic so its results are provable, guaranteed, and reproducible. We validate the SF on constant rate BDPs and find that it solves BDP likelihoods known to produce robust estimates. We then examine more complex BDPs with time-varying rates. Our estimates compare well with a recently developed parametric MCMC inference method. Lastly, we perform model selection on an empirical Agamid species phylogeny, obtaining results consistent with the literature. The SF makes no approximations, beyond those required for parameter quantisation and numerical integration, and directly computes the posterior distribution of model parameters. It is a promising alternative inference algorithm that may serve either as a standalone Bayesian estimator or as a useful diagnostic reference for validating more involved MCMC strategies. The Snyder filter is implemented in Matlab and the time-varying BDP models are simulated in R. The source code and data are freely available at https://github.com/kpzoo/snyder-birth-death-code. kris.parag@zoo.ox.ac.uk. Supplementary material is available at Bioinformatics online.

  5. Birth-death prior on phylogeny and speed dating

    PubMed Central

    2008-01-01

    Background In recent years there has been a trend of leaving the strict molecular clock in order to infer dating of speciations and other evolutionary events. Explicit modeling of substitution rates and divergence times makes formulation of informative prior distributions for branch lengths possible. Models with birth-death priors on tree branching and auto-correlated or iid substitution rates among lineages have been proposed, enabling simultaneous inference of substitution rates and divergence times. This problem has, however, mainly been analysed in the Markov chain Monte Carlo (MCMC) framework, an approach requiring computation times of hours or days when applied to large phylogenies. Results We demonstrate that a hill-climbing maximum a posteriori (MAP) adaptation of the MCMC scheme results in considerable gain in computational efficiency. We demonstrate also that a novel dynamic programming (DP) algorithm for branch length factorization, useful both in the hill-climbing and in the MCMC setting, further reduces computation time. For the problem of inferring rates and times parameters on a fixed tree, we perform simulations, comparisons between hill-climbing and MCMC on a plant rbcL gene dataset, and dating analysis on an animal mtDNA dataset, showing that our methodology enables efficient, highly accurate analysis of very large trees. Datasets requiring a computation time of several days with MCMC can with our MAP algorithm be accurately analysed in less than a minute. From the results of our example analyses, we conclude that our methodology generally avoids getting trapped early in local optima. For the cases where this nevertheless can be a problem, for instance when we in addition to the parameters also infer the tree topology, we show that the problem can be evaded by using a simulated-annealing like (SAL) method in which we favour tree swaps early in the inference while biasing our focus towards rate and time parameter changes later on. 
Conclusion Our contribution leaves the field open for fast and accurate dating analysis of nucleotide sequence data. Modeling branch substitutions rates and divergence times separately allows us to include birth-death priors on the times without the assumption of a molecular clock. The methodology is easily adapted to take data from fossil records into account and it can be used together with a broad range of rate and substitution models. PMID:18318893

  6. Birth-death prior on phylogeny and speed dating.

    PubMed

    Akerborg, Orjan; Sennblad, Bengt; Lagergren, Jens

    2008-03-04

    In recent years there has been a trend of leaving the strict molecular clock in order to infer dating of speciations and other evolutionary events. Explicit modeling of substitution rates and divergence times makes formulation of informative prior distributions for branch lengths possible. Models with birth-death priors on tree branching and auto-correlated or iid substitution rates among lineages have been proposed, enabling simultaneous inference of substitution rates and divergence times. This problem has, however, mainly been analysed in the Markov chain Monte Carlo (MCMC) framework, an approach requiring computation times of hours or days when applied to large phylogenies. We demonstrate that a hill-climbing maximum a posteriori (MAP) adaptation of the MCMC scheme results in considerable gain in computational efficiency. We demonstrate also that a novel dynamic programming (DP) algorithm for branch length factorization, useful both in the hill-climbing and in the MCMC setting, further reduces computation time. For the problem of inferring rates and times parameters on a fixed tree, we perform simulations, comparisons between hill-climbing and MCMC on a plant rbcL gene dataset, and dating analysis on an animal mtDNA dataset, showing that our methodology enables efficient, highly accurate analysis of very large trees. Datasets requiring a computation time of several days with MCMC can with our MAP algorithm be accurately analysed in less than a minute. From the results of our example analyses, we conclude that our methodology generally avoids getting trapped early in local optima. For the cases where this nevertheless can be a problem, for instance when we in addition to the parameters also infer the tree topology, we show that the problem can be evaded by using a simulated-annealing like (SAL) method in which we favour tree swaps early in the inference while biasing our focus towards rate and time parameter changes later on. 
Our contribution leaves the field open for fast and accurate dating analysis of nucleotide sequence data. Modeling branch substitutions rates and divergence times separately allows us to include birth-death priors on the times without the assumption of a molecular clock. The methodology is easily adapted to take data from fossil records into account and it can be used together with a broad range of rate and substitution models.

  7. BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC

    PubMed Central

    Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun

    2009-01-01

    Background We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. Results We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from PMID:19715598

  8. BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC.

    PubMed

    Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun

    2009-08-28

    We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the alpha-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/

  9. A surrogate-based sensitivity quantification and Bayesian inversion of a regional groundwater flow model

    NASA Astrophysics Data System (ADS)

    Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor

    2018-02-01

    Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using the Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate a training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to the sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing effort. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
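
    The surrogate idea in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `expensive_model` is a hypothetical one-parameter stand-in for a slow simulator (not MODFLOW), and a piecewise-linear interpolant replaces BMARS, which is not reproduced here. The point is only that the costly model is called a fixed number of times to train the surrogate, after which the Metropolis sampler calls the cheap surrogate.

```python
import math
import random


def expensive_model(k):
    """Hypothetical stand-in for a slow simulator (not MODFLOW)."""
    return 10.0 / k + 0.5 * k


def build_surrogate(model, lo, hi, n_train):
    """Piecewise-linear interpolant trained on n_train runs of the model."""
    xs = [lo + (hi - lo) * i / (n_train - 1) for i in range(n_train)]
    ys = [model(x) for x in xs]  # the only calls to the expensive model

    def surrogate(x):
        x = min(max(x, lo), hi)  # clamp to the training range
        i = min(int((x - lo) / (hi - lo) * (n_train - 1)), n_train - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return surrogate


def sample_posterior(forward, observed, sigma, lo, hi, x0, n_samples,
                     step=0.1, seed=0):
    """Random-walk Metropolis with a uniform prior on [lo, hi] and a
    Gaussian likelihood around a single observation."""
    rng = random.Random(seed)

    def log_post(k):
        if not lo <= k <= hi:
            return -math.inf  # outside the prior support
        return -0.5 * ((forward(k) - observed) / sigma) ** 2

    x, logp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_post(proposal)
        # Symmetric proposal: accept with probability min(1, ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        samples.append(x)
    return samples


surrogate = build_surrogate(expensive_model, lo=1.0, hi=4.0, n_train=31)
chain = sample_posterior(surrogate, observed=6.0, sigma=0.1,
                         lo=1.0, hi=4.0, x0=3.0, n_samples=20000)
posterior = chain[2000:]  # discard an assumed burn-in of 2000 draws
post_mean = sum(posterior) / len(posterior)
```

    In this toy setting the sampler makes 20000 surrogate evaluations but only 31 calls to the stand-in simulator; the quality of the posterior then depends on how well the surrogate matches the model near the data.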

  10. A comparison between Gauss-Newton and Markov chain Monte Carlo based methods for inverting spectral induced polarization data for Cole-Cole parameters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Jinsong; Kemna, Andreas; Hubbard, Susan S.

    2008-05-15

    We develop a Bayesian model to invert spectral induced polarization (SIP) data for Cole-Cole parameters using Markov chain Monte Carlo (MCMC) sampling methods. We compare the performance of the MCMC based stochastic method with an iterative Gauss-Newton based deterministic method for Cole-Cole parameter estimation through inversion of synthetic and laboratory SIP data. The Gauss-Newton based method can provide an optimal solution for given objective functions under constraints, but the obtained optimal solution generally depends on the choice of initial values and the estimated uncertainty information is often inaccurate or insufficient. In contrast, the MCMC based inversion method provides extensive global information on unknown parameters, such as the marginal probability distribution functions, from which we can obtain better estimates and tighter uncertainty bounds of the parameters than with the deterministic method. Additionally, the results obtained with the MCMC method are independent of the choice of initial values. Because the MCMC based method does not explicitly offer a single optimal solution for given objective functions, the deterministic and stochastic methods can complement each other. For example, the stochastic method can first be used to obtain the means of the unknown parameters by starting from an arbitrary set of initial values and the deterministic method can then be initiated using the means as starting values to obtain the optimal estimates of the Cole-Cole parameters.

  11. Bioinactivation: Software for modelling dynamic microbial inactivation.

    PubMed

    Garre, Alberto; Fernández, Pablo S; Lindqvist, Roland; Egea, Jose A

    2017-03-01

    This contribution presents the bioinactivation software, which implements functions for the modelling of isothermal and non-isothermal microbial inactivation. This software offers features such as user-friendliness, modelling of dynamic conditions, possibility to choose the fitting algorithm and generation of prediction intervals. The software is offered in two different formats: Bioinactivation core and Bioinactivation SE. Bioinactivation core is a package for the R programming language, which includes features for the generation of predictions and for the fitting of models to inactivation experiments using non-linear regression or a Markov Chain Monte Carlo algorithm (MCMC). The calculations are based on inactivation models common in academia and industry (Bigelow, Peleg, Mafart and Geeraerd). Bioinactivation SE supplies a user-friendly interface to selected functions of Bioinactivation core, namely the model fitting of non-isothermal experiments and the generation of prediction intervals. The capabilities of bioinactivation are presented in this paper through a case study, modelling the non-isothermal inactivation of Bacillus sporothermodurans. This study has provided a full characterization of the response of the bacteria to dynamic temperature conditions, including confidence intervals for the model parameters and a prediction interval of the survivor curve. We conclude that the MCMC algorithm produces a better characterization of the biological uncertainty and variability than non-linear regression. The bioinactivation software can be relevant to the food and pharmaceutical industry, as well as to regulatory agencies, as part of a (quantitative) microbial risk assessment. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Improved Assimilation of Streamflow and Satellite Soil Moisture with the Evolutionary Particle Filter and Geostatistical Modeling

    NASA Astrophysics Data System (ADS)

    Yan, Hongxiang; Moradkhani, Hamid; Abbaszadeh, Peyman

    2017-04-01

    Assimilation of satellite soil moisture and streamflow data into hydrologic models has received increasing attention over the past few years. Currently, these observations are increasingly used to improve the model streamflow and soil moisture predictions. However, the performance of this land data assimilation (DA) system still suffers from two limitations: 1) satellite data scarcity and quality; and 2) particle weight degeneration. In order to overcome these two limitations, we propose two possible solutions in this study. First, the general Gaussian geostatistical approach is proposed to overcome the limitation in the space/time resolution of satellite soil moisture products thus improving their accuracy at uncovered/biased grid cells. Secondly, an evolutionary PF approach based on Genetic Algorithm (GA) and Markov Chain Monte Carlo (MCMC), the so-called EPF-MCMC, is developed to further reduce weight degeneration and improve the robustness of the land DA system. This study provides a detailed analysis of the joint and separate assimilation of streamflow and satellite soil moisture into a distributed Sacramento Soil Moisture Accounting (SAC-SMA) model, with the use of recently developed EPF-MCMC and the general Gaussian geostatistical approach. Performance is assessed over several basins in the USA selected from Model Parameter Estimation Experiment (MOPEX) and located in different climate regions. The results indicate that: 1) the general Gaussian approach can predict the soil moisture at uncovered grid cells within the expected satellite data quality threshold; 2) assimilation of satellite soil moisture inferred from the general Gaussian model can significantly improve the soil moisture predictions; and 3) in terms of both deterministic and probabilistic measures, the EPF-MCMC can achieve better streamflow predictions. These results suggest that the geostatistical model is a helpful aid to remote sensing techniques and that the EPF-MCMC is a reliable and effective DA approach in hydrologic applications.

  13. Real-time individual predictions of prostate cancer recurrence using joint models

    PubMed Central

    Taylor, Jeremy M. G.; Park, Yongseok; Ankerst, Donna P.; Proust-Lima, Cecile; Williams, Scott; Kestin, Larry; Bae, Kyoungwha; Pickles, Tom; Sandler, Howard

    2012-01-01

    Summary Patients who were previously treated for prostate cancer with radiation therapy are monitored at regular intervals using a laboratory test called Prostate Specific Antigen (PSA). If the value of the PSA test starts to rise, this is an indication that the prostate cancer is more likely to recur, and the patient may wish to initiate new treatments. Such patients could be helped in making medical decisions by an accurate estimate of the probability of recurrence of the cancer in the next few years. In this paper, we describe the methodology for giving the probability of recurrence for a new patient, as implemented on a web-based calculator. The methods use a joint longitudinal survival model. The model is developed on a training dataset of 2,386 patients and tested on a dataset of 846 patients. Bayesian estimation methods are used with one Markov chain Monte Carlo (MCMC) algorithm developed for estimation of the parameters from the training dataset and a second quick MCMC developed for prediction of the risk of recurrence that uses the longitudinal PSA measures from a new patient. PMID:23379600

  14. Bayesian analysis of the flutter margin method in aeroelasticity

    DOE PAGES

    Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit

    2016-08-27

    A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-square based estimation technique which relies on the Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis–Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. In conclusion, it will be shown that the probabilistic (Bayesian) approach reduces the number of test points required in providing a flutter speed estimate for a given accuracy and precision.
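
    For readers unfamiliar with the Metropolis-Hastings step named in this abstract, a minimal random-walk sketch follows. It is not the authors' implementation: the target here is a toy standard normal density rather than the posterior of aeroelastic modal parameters, and the function names and the step size of 2.4 are illustrative assumptions.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a scalar parameter.

    log_target returns the log of an unnormalised density.
    Returns the chain and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


def log_std_normal(x):
    """Toy target (assumption): standard normal, up to a constant."""
    return -0.5 * x * x


samples, acc_rate = metropolis_hastings(log_std_normal, x0=0.0,
                                        n_samples=20000, step=2.4)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

    Because the proposal is symmetric, the Hastings correction cancels and only the ratio of target densities appears in the acceptance test; the sample mean and variance should approach 0 and 1 for this toy target.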

  15. Bayesian Peptide Peak Detection for High Resolution TOF Mass Spectrometry.

    PubMed

    Zhang, Jianqiu; Zhou, Xiaobo; Wang, Honghui; Suffredini, Anthony; Zhang, Lin; Huang, Yufei; Wong, Stephen

    2010-11-01

    In this paper, we address the issue of peptide ion peak detection for high resolution time-of-flight (TOF) mass spectrometry (MS) data. A novel Bayesian peptide ion peak detection method is proposed for TOF data with resolution of 10 000-15 000 full width at half-maximum (FWHM). MS spectra exhibit distinct characteristics at this resolution, which are captured in a novel parametric model. Based on the proposed parametric model, a Bayesian peak detection algorithm based on Markov chain Monte Carlo (MCMC) sampling is developed. The proposed algorithm is tested on both simulated and real datasets. The results show a significant improvement in detection performance over a commonly employed method. The results also agree with expert's visual inspection. Moreover, better detection consistency is achieved across MS datasets from patients with identical pathological condition.

  16. Bayesian Peptide Peak Detection for High Resolution TOF Mass Spectrometry

    PubMed Central

    Zhang, Jianqiu; Zhou, Xiaobo; Wang, Honghui; Suffredini, Anthony; Zhang, Lin; Huang, Yufei; Wong, Stephen

    2011-01-01

    In this paper, we address the issue of peptide ion peak detection for high resolution time-of-flight (TOF) mass spectrometry (MS) data. A novel Bayesian peptide ion peak detection method is proposed for TOF data with resolution of 10 000–15 000 full width at half-maximum (FWHM). MS spectra exhibit distinct characteristics at this resolution, which are captured in a novel parametric model. Based on the proposed parametric model, a Bayesian peak detection algorithm based on Markov chain Monte Carlo (MCMC) sampling is developed. The proposed algorithm is tested on both simulated and real datasets. The results show a significant improvement in detection performance over a commonly employed method. The results also agree with expert’s visual inspection. Moreover, better detection consistency is achieved across MS datasets from patients with identical pathological condition. PMID:21544266

  17. Gradient-free MCMC methods for dynamic causal modelling

    DOE PAGES

    Sengupta, Biswa; Friston, Karl J.; Penny, Will D.

    2015-03-14

    Here, we compare the performance of four gradient-free MCMC samplers (random walk Metropolis sampling, slice-sampling, adaptive MCMC sampling and population-based MCMC sampling with tempering) in terms of the number of independent samples they can produce per unit computational time. For the Bayesian inversion of a single-node neural mass model, both adaptive and population-based samplers are more efficient compared with random walk Metropolis sampler or slice-sampling; yet adaptive MCMC sampling is more promising in terms of compute time. Slice-sampling yields the highest number of independent samples from the target density -- albeit at almost 1000% increase in computational time, in comparison to the most efficient algorithm (i.e., the adaptive MCMC sampler).
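
    The comparison metric in this abstract, independent samples per unit computational time, is commonly estimated as an effective sample size (ESS) divided by wall-clock time. The sketch below is one simple estimator and is an assumption, not the procedure used in the paper: it sums autocorrelations up to the first non-positive estimate. The two test chains (independent draws and an AR(1) process with coefficient 0.9) are illustrative.

```python
import random


def effective_sample_size(chain, max_lag=1000):
    """ESS = N / (1 + 2 * sum of autocorrelations), with the sum
    truncated at the first non-positive autocorrelation estimate."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [c - mean for c in chain]
    var = sum(c * c for c in centred) / n
    if var == 0.0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        acov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ess_per_second(chain, wall_time_seconds):
    """Independent samples produced per second of compute time."""
    return effective_sample_size(chain) / wall_time_seconds


rng = random.Random(1)
iid = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # uncorrelated draws
ar1 = [0.0]                                       # strongly correlated chain
for _ in range(4999):
    ar1.append(0.9 * ar1[-1] + rng.gauss(0.0, 1.0))

ess_iid = effective_sample_size(iid)
ess_ar1 = effective_sample_size(ar1)
```

    A sampler with a higher ESS can still lose on this metric if each iteration is much slower, which is the trade-off the abstract reports for slice-sampling.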

  18. A Bayesian Uncertainty Framework for Conceptual Snowmelt and Hydrologic Models Applied to the Tenderfoot Creek Experimental Forest

    NASA Astrophysics Data System (ADS)

    Smith, T.; Marshall, L.

    2007-12-01

    In many mountainous regions, the single most important parameter in forecasting the controls on regional water resources is snowpack (Williams et al., 1999). In an effort to bridge the gap between theoretical understanding and functional modeling of snow-driven watersheds, a flexible hydrologic modeling framework is being developed. The aim is to create a suite of models that move from parsimonious structures, concentrated on aggregated watershed response, to those focused on representing finer scale processes and distributed response. This framework will operate as a tool to investigate the link between hydrologic model predictive performance, uncertainty, model complexity, and observable hydrologic processes. Bayesian methods, and particularly Markov chain Monte Carlo (MCMC) techniques, are extremely useful in uncertainty assessment and parameter estimation of hydrologic models. However, these methods have some difficulties in implementation. In a traditional Bayesian setting, it can be difficult to reconcile multiple data types, particularly those offering different spatial and temporal coverage, depending on the model type. These difficulties are also exacerbated by sensitivity of MCMC algorithms to model initialization and complex parameter interdependencies. As a way of circumventing some of the computational complications, adaptive MCMC algorithms have been developed to take advantage of the information gained from each successive iteration. Two adaptive algorithms are compared in this study, the Adaptive Metropolis (AM) algorithm, developed by Haario et al. (2001), and the Delayed Rejection Adaptive Metropolis (DRAM) algorithm, developed by Haario et al. (2006). While neither algorithm is truly Markovian, it has been proven that each satisfies the desired ergodicity and stationarity properties of Markov chains.
Both algorithms were implemented as the uncertainty and parameter estimation framework for a conceptual rainfall-runoff model based on the Probability Distributed Model (PDM), developed by Moore (1985). We implement the modeling framework in Stringer Creek watershed in the Tenderfoot Creek Experimental Forest (TCEF), Montana. The snowmelt-driven watershed offers the additional challenge of modeling snow accumulation and melt, and current efforts are aimed at developing a temperature- and radiation-index snowmelt model. Auxiliary data available from within TCEF's watersheds are used to support the understanding of information value as it relates to predictive performance. Because the model is based on lumped parameters, auxiliary data are hard to incorporate directly. However, these additional data offer benefits through the ability to inform prior distributions of the lumped model parameters. By incorporating data offering different information into the uncertainty assessment process, a cross-validation technique is engaged to better ensure that modeled results reflect real process complexity.
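
    The adaptation idea behind the Adaptive Metropolis algorithm mentioned above can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the AM or DRAM code used in the study: the proposal variance is set to 2.4^2 times the running variance of the chain (plus a small constant) after an initial non-adaptive period, delayed rejection is omitted, and the toy target is a normal density with mean 3 and standard deviation 2.

```python
import math
import random


def adaptive_metropolis(log_target, x0, n_samples, init_step=1.0,
                        adapt_start=200, eps=1e-6, seed=0):
    """One-dimensional Adaptive Metropolis sketch.

    For the first adapt_start (>= 1) iterations a fixed step is used;
    afterwards the proposal variance is s_d * (chain variance + eps)
    with s_d = 2.4**2 / d and d = 1.
    """
    rng = random.Random(seed)
    s_d = 2.4 ** 2
    x, logp = x0, log_target(x0)
    # Welford accumulators over the whole chain history, including x0.
    count, run_mean, run_m2 = 1, x0, 0.0
    samples = []
    for t in range(n_samples):
        if t < adapt_start:
            step = init_step
        else:
            step = math.sqrt(s_d * (run_m2 / (count - 1) + eps))
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        samples.append(x)
        # Update the running mean and sum of squared deviations.
        count += 1
        delta = x - run_mean
        run_mean += delta / count
        run_m2 += delta * (x - run_mean)
    return samples


def log_target(x):
    """Toy target (assumption): normal with mean 3 and std 2."""
    return -0.5 * ((x - 3.0) / 2.0) ** 2


chain = adaptive_metropolis(log_target, x0=0.0, n_samples=20000)
tail = chain[10000:]  # second half, after the proposal has adapted
tail_mean = sum(tail) / len(tail)
tail_std = math.sqrt(sum((c - tail_mean) ** 2 for c in tail) / len(tail))
```

    Because the proposal depends on the full history, the chain is not Markovian, which is the property the abstract alludes to; the practical benefit is that the step size tunes itself to the scale of the posterior.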

  19. An investigation into exoplanet transits and uncertainties

    NASA Astrophysics Data System (ADS)

    Ji, Y.; Banks, T.; Budding, E.; Rhodes, M. D.

    2017-06-01

    A simple transit model is described along with tests of this model against published results for 4 exoplanet systems (Kepler-1, 2, 8, and 77). Data from the Kepler mission are used. The Markov Chain Monte Carlo (MCMC) method is applied to obtain realistic error estimates. Optimisation of limb darkening coefficients is subject to data quality. It is more likely for MCMC to derive an empirical limb darkening coefficient for light curves with S/N (signal to noise) above 15. Finally, the model is applied to Kepler data for 4 Kepler candidate systems (KOI 760.01, 767.01, 802.01, and 824.01) with previously unpublished results. Error estimates for these systems are obtained via the MCMC method.

  20. SPOTting model parameters using a ready-made Python package

    NASA Astrophysics Data System (ADS)

    Houska, Tobias; Kraft, Philipp; Breuer, Lutz

    2015-04-01

    The selection and parameterization of reliable process descriptions in ecological modelling is driven by several uncertainties. The procedure is highly dependent on various criteria, like the used algorithm, the likelihood function selected and the definition of the prior parameter distributions. A wide variety of tools have been developed in the past decades to optimize parameters. Some of the tools are closed source. Due to this, the choice for a specific parameter estimation method is sometimes more dependent on its availability than the performance. A toolbox with a large set of methods can support users in deciding about the most suitable method. Further, it enables users to test and compare different methods. We developed the SPOT (Statistical Parameter Optimization Tool), an open source Python package containing a comprehensive set of modules, to analyze and optimize parameters of (environmental) models. SPOT comes along with a selected set of algorithms for parameter optimization and uncertainty analyses (Monte Carlo, MC; Latin Hypercube Sampling, LHS; Maximum Likelihood, MLE; Markov Chain Monte Carlo, MCMC; Shuffled Complex Evolution, SCE-UA; Differential Evolution Markov Chain, DE-MCZ), together with several likelihood functions (Bias, (log-) Nash-Sutcliffe model efficiency, Correlation Coefficient, Coefficient of Determination, Covariance, (Decomposed-, Relative-, Root-) Mean Squared Error, Mean Absolute Error, Agreement Index) and prior distributions (Binomial, Chi-Square, Dirichlet, Exponential, Laplace, (log-, multivariate-) Normal, Pareto, Poisson, Cauchy, Uniform, Weibull) to sample from. The model-independent structure makes it suitable to analyze a wide range of applications. We apply all algorithms of the SPOT package in three different case studies. Firstly, we investigate the response of the Rosenbrock function, where the MLE algorithm shows its strengths. Secondly, we study the Griewank function, which has a challenging response surface for optimization methods. Here we see simple algorithms like the MCMC struggling to find the global optimum of the function, while algorithms like SCE-UA and DE-MCZ show their strengths. Thirdly, we apply an uncertainty analysis of a one-dimensional physically based hydrological model built with the Catchment Modelling Framework (CMF). The model is driven by meteorological and groundwater data from a Free Air Carbon Enrichment (FACE) experiment in Linden (Hesse, Germany). Simulation results are evaluated with measured soil moisture data. We search for optimal parameter sets of the van Genuchten-Mualem function and find different equally optimal solutions with some of the algorithms. The case studies reveal that the implemented SPOT methods work sufficiently well. They further show the benefit of having one tool at hand that includes a number of parameter search methods, likelihood functions and a priori parameter distributions within one platform-independent package.
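
    The two benchmark functions named in the case studies have standard textbook definitions, sketched below together with a plain Monte Carlo (uniform random) search. The code deliberately does not use the SPOT package API; the helper `monte_carlo_search`, the bounds and the number of draws are assumptions chosen only to illustrate why the many local minima of the Griewank surface make it harder than the Rosenbrock valley.

```python
import math
import random


def rosenbrock(x):
    """Rosenbrock function; global minimum 0 at (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))


def griewank(x):
    """Griewank function; global minimum 0 at the origin, surrounded
    by many regularly spaced local minima."""
    total = sum(xi * xi for xi in x) / 4000.0
    prod = 1.0
    for i, xi in enumerate(x, start=1):
        prod *= math.cos(xi / math.sqrt(i))
    return 1.0 + total - prod


def monte_carlo_search(objective, bounds, n_draws, seed=0):
    """Hypothetical helper: draw uniformly within bounds, keep the best."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_draws):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f


best_x, best_f = monte_carlo_search(griewank, [(-10.0, 10.0)] * 2,
                                    n_draws=5000)
```

    Swapping `griewank` for `rosenbrock` in the last call gives a quick way to compare how the same search behaves on the two response surfaces.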

  1. Uncertainty Quantification of GEOS-5 L-band Radiative Transfer Model Parameters Using Bayesian Inference and SMOS Observations

    NASA Technical Reports Server (NTRS)

    DeLannoy, Gabrielle J. M.; Reichle, Rolf H.; Vrugt, Jasper A.

    2013-01-01

    Uncertainties in L-band (1.4 GHz) radiative transfer modeling (RTM) affect the simulation of brightness temperatures (Tb) over land and the inversion of satellite-observed Tb into soil moisture retrievals. In particular, accurate estimates of the microwave soil roughness, vegetation opacity and scattering albedo for large-scale applications are difficult to obtain from field studies and often lack an uncertainty estimate. Here, a Markov Chain Monte Carlo (MCMC) simulation method is used to determine satellite-scale estimates of RTM parameters and their posterior uncertainty by minimizing the misfit between long-term averages and standard deviations of simulated and observed Tb at a range of incidence angles, at horizontal and vertical polarization, and for morning and evening overpasses. Tb simulations are generated with the Goddard Earth Observing System (GEOS-5) and confronted with Tb observations from the Soil Moisture Ocean Salinity (SMOS) mission. The MCMC algorithm suggests that the relative uncertainty of the RTM parameter estimates is typically less than 25% of the maximum a posteriori density (MAP) parameter value. Furthermore, the actual root-mean-square-differences in long-term Tb averages and standard deviations are found consistent with the respective estimated total simulation and observation error standard deviations of σ_m = 3.1 K and σ_s = 2.4 K. It is also shown that the MAP parameter values estimated through MCMC simulation are in close agreement with those obtained with Particle Swarm Optimization (PSO).

  2. Gradient-free MCMC methods for dynamic causal modelling.

    PubMed

    Sengupta, Biswa; Friston, Karl J; Penny, Will D

    2015-05-15

    In this technical note we compare the performance of four gradient-free MCMC samplers (random walk Metropolis sampling, slice-sampling, adaptive MCMC sampling and population-based MCMC sampling with tempering) in terms of the number of independent samples they can produce per unit computational time. For the Bayesian inversion of a single-node neural mass model, both adaptive and population-based samplers are more efficient compared with random walk Metropolis sampler or slice-sampling; yet adaptive MCMC sampling is more promising in terms of compute time. Slice-sampling yields the highest number of independent samples from the target density - albeit at almost 1000% increase in computational time, in comparison to the most efficient algorithm (i.e., the adaptive MCMC sampler). Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  3. A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis

    ERIC Educational Resources Information Center

    Edwards, Michael C.

    2010-01-01

    Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show…

  4. Markov Chain Monte Carlo Estimation of Item Parameters for the Generalized Graded Unfolding Model

    ERIC Educational Resources Information Center

    de la Torre, Jimmy; Stark, Stephen; Chernyshenko, Oleksandr S.

    2006-01-01

    The authors present a Markov Chain Monte Carlo (MCMC) parameter estimation procedure for the generalized graded unfolding model (GGUM) and compare it to the marginal maximum likelihood (MML) approach implemented in the GGUM2000 computer program, using simulated and real personality data. In the simulation study, test length, number of response…

  5. The Bayesian group lasso for confounded spatial data

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.

    2017-01-01

    Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.

  6. Inference of multi-Gaussian property fields by probabilistic inversion of crosshole ground penetrating radar data using an improved dimensionality reduction

    NASA Astrophysics Data System (ADS)

    Hunziker, Jürg; Laloy, Eric; Linde, Niklas

    2016-04-01

    Deterministic inversion procedures can often explain field data, but they only deliver one final subsurface model that depends on the initial model and regularization constraints. This leads to poor insights about the uncertainties associated with the inferred model properties. In contrast, probabilistic inversions can provide an ensemble of model realizations that accurately span the range of possible models that honor the available calibration data and prior information allowing a quantitative description of model uncertainties. We reconsider the problem of inferring the dielectric permittivity (directly related to radar velocity) structure of the subsurface by inversion of first-arrival travel times from crosshole ground penetrating radar (GPR) measurements. We rely on the DREAM_(ZS) algorithm that is a state-of-the-art Markov chain Monte Carlo (MCMC) algorithm. Such algorithms need several orders of magnitude more forward simulations than deterministic algorithms and often become infeasible in high parameter dimensions. To enable high-resolution imaging with MCMC, we use a recently proposed dimensionality reduction approach that allows reproducing 2D multi-Gaussian fields with far fewer parameters than a classical grid discretization. We consider herein a dimensionality reduction from 5000 to 257 unknowns. The first 250 parameters correspond to a spectral representation of random and uncorrelated spatial fluctuations while the remaining seven geostatistical parameters are (1) the standard deviation of the data error, (2) the mean and (3) the variance of the relative electric permittivity, (4) the integral scale along the major axis of anisotropy, (5) the anisotropy angle, (6) the ratio of the integral scale along the minor axis of anisotropy to the integral scale along the major axis of anisotropy and (7) the shape parameter of the Matérn function. The latter essentially defines the type of covariance function (e.g., exponential, Whittle, Gaussian). 
We present an improved formulation of the dimensionality reduction, and numerically show how it reduces artifacts in the generated models and provides better posterior estimation of the subsurface geostatistical structure. We next show that the results of the method compare very favorably against previous deterministic and stochastic inversion results obtained at the South Oyster Bacterial Transport Site in Virginia, USA. The long-term goal of this work is to enable MCMC-based full waveform inversion of crosshole GPR data.

  7. Asteroid orbital inversion using uniform phase-space sampling

    NASA Astrophysics Data System (ADS)

    Muinonen, K.; Pentikäinen, H.; Granvik, M.; Oszkiewicz, D.; Virtanen, J.

    2014-07-01

    We review statistical inverse methods for asteroid orbit computation from a small number of astrometric observations and short time intervals of observations. With the help of Markov-chain Monte Carlo methods (MCMC), we present a novel inverse method that utilizes uniform sampling of the phase space for the orbital elements. The statistical orbital ranging method (Virtanen et al. 2001, Muinonen et al. 2001) was set out to resolve the long-lasting challenges in the initial computation of orbits for asteroids. The ranging method starts from the selection of a pair of astrometric observations. Thereafter, the topocentric ranges and angular deviations in R.A. and Decl. are randomly sampled. The two Cartesian positions allow for the computation of orbital elements and, subsequently, the computation of ephemerides for the observation dates. Candidate orbital elements are included in the sample of accepted elements if the χ^2-value between the observed and computed observations is within a pre-defined threshold. The sample orbital elements obtain weights based on a certain debiasing procedure. When the weights are available, the full sample of orbital elements allows the probabilistic assessments for, e.g., object classification and ephemeris computation as well as the computation of collision probabilities. The MCMC ranging method (Oszkiewicz et al. 2009; see also Granvik et al. 2009) replaces the original sampling algorithm described above with a proposal probability density function (p.d.f.), and a chain of sample orbital elements results in the phase space. MCMC ranging is based on a bivariate Gaussian p.d.f. for the topocentric ranges, and allows for the sampling to focus on the phase-space domain with most of the probability mass. In the virtual-observation MCMC method (Muinonen et al. 2012), the proposal p.d.f. for the orbital elements is chosen to mimic the a posteriori p.d.f. 
for the elements: first, random errors are simulated for each observation, resulting in a set of virtual observations; second, corresponding virtual least-squares orbital elements are derived using the Nelder-Mead downhill simplex method; third, repeating the procedure two times allows for a computation of a difference for two sets of virtual orbital elements; and, fourth, this orbital-element difference constitutes a symmetric proposal in a random-walk Metropolis-Hastings algorithm, avoiding the explicit computation of the proposal p.d.f. In a discrete approximation, the allowed proposals coincide with the differences that are based on a large number of pre-computed sets of virtual least-squares orbital elements. The virtual-observation MCMC method is thus based on the characterization of the relevant volume in the orbital-element phase space. Here we utilize MCMC to map the phase-space domain of acceptable solutions. We can make use of the proposal p.d.f.s from the MCMC ranging and virtual-observation methods. The present phase-space mapping produces, upon convergence, a uniform sampling of the solution space within a pre-defined χ^2-value. The weights of the sampled orbital elements are then computed on the basis of the corresponding χ^2-values. The present method resembles the original ranging method. On one hand, MCMC mapping is insensitive to local extrema in the phase space and efficiently maps the solution space. This is somewhat contrary to the MCMC methods described above. On the other hand, MCMC mapping can suffer from producing a small number of sample elements with small χ^2-values, in resemblance to the original ranging method. We apply the methods to example near-Earth, main-belt, and transneptunian objects, and highlight the utilization of the methods in the data processing and analysis pipeline of the ESA Gaia space mission.
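
    The symmetric difference proposal described in this abstract can be sketched as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: the target is a toy two-dimensional Gaussian standing in for the orbital-element posterior, and `make_virtual_pool` is a hypothetical stand-in for the pre-computed virtual least-squares solutions (here simply random draws). Because the two pool members are drawn exchangeably, the step a - b is exactly as likely as b - a, so the proposal is symmetric and its density never has to be evaluated.

```python
import math
import random


def log_posterior(x):
    """Toy target: standard 2-D Gaussian (assumed stand-in, not an orbit model)."""
    return -0.5 * (x[0] ** 2 + x[1] ** 2)


def make_virtual_pool(n, rng):
    """Hypothetical pool of pre-computed 'virtual' solutions (random draws here)."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def difference_proposal_metropolis(n_steps, pool, rng, start=(0.0, 0.0)):
    """Random-walk Metropolis whose step is the difference of two pool members."""
    current = start
    current_lp = log_posterior(current)
    chain, accepted = [], 0
    for _ in range(n_steps):
        a, b = rng.sample(pool, 2)  # two distinct virtual solutions
        candidate = (current[0] + a[0] - b[0], current[1] + a[1] - b[1])
        candidate_lp = log_posterior(candidate)
        # Symmetric proposal: acceptance depends only on the posterior ratio.
        accept_prob = math.exp(min(0.0, candidate_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = candidate, candidate_lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_steps


rng = random.Random(0)
pool = make_virtual_pool(200, rng)
chain, acceptance_rate = difference_proposal_metropolis(5000, pool, rng)
mean_x = sum(p[0] for p in chain) / len(chain)
print(f"acceptance rate {acceptance_rate:.2f}, mean of first coordinate {mean_x:.2f}")
```

    In a real application the pool would hold least-squares solutions fitted to noise-perturbed observations, so the step sizes would automatically follow the scale and correlations of the posterior.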

  8. Application of the Markov Chain Monte Carlo method for snow water equivalent retrieval based on passive microwave measurements

    NASA Astrophysics Data System (ADS)

    Pan, J.; Durand, M. T.; Vanderjagt, B. J.

    2015-12-01

    The Markov chain Monte Carlo (MCMC) method is a retrieval algorithm based on Bayes' rule. It starts from an initial state of snow/soil parameters and updates it through a series of new states by comparing the posterior probability of the simulated snow microwave signals before and after each random-walk step. As a realization of Bayes' rule, it approximates the probability of the snow/soil parameters conditional on the measured microwave brightness temperature (TB) signals at different bands. Although this method can solve for all snow parameters (depth, density, grain size and temperature) at the same time, it still needs prior information on these parameters to calculate the posterior probability. How the priors influence the snow water equivalent (SWE) retrieval is a major concern. Therefore, in this paper, a sensitivity test will first be carried out to study how accurate the snow emission models and how explicit the snow priors need to be to keep the SWE error within a given bound. Synthetic TB simulated from the measured snow properties, plus a 2-K observation error, will be used for this purpose. The test aims to provide guidance on applying MCMC under different circumstances. Later, the method will be applied to snowpits at different sites, including Sodankyla, Finland; Churchill, Canada; and Colorado, USA, using the TB measured by ground-based radiometers at different bands. Based on the previous work, the error in these practical cases will be studied, and the error sources will be separated and quantified.

  9. Combined state and parameter identification of nonlinear structural dynamical systems based on Rao-Blackwellization and Markov chain Monte Carlo simulations

    NASA Astrophysics Data System (ADS)

    Abhinav, S.; Manohar, C. S.

    2018-03-01

    The problem of combined state and parameter estimation in nonlinear state space models, based on Bayesian filtering methods, is considered. A novel approach, which combines Rao-Blackwellized particle filters for state estimation with Markov chain Monte Carlo (MCMC) simulations for parameter identification, is proposed. In order to ensure successful performance of the MCMC samplers, in situations involving large amount of dynamic measurement data and (or) low measurement noise, the study employs a modified measurement model combined with an importance sampling based correction. The parameters of the process noise covariance matrix are also included as quantities to be identified. The study employs the Rao-Blackwellization step at two stages: one, associated with the state estimation problem in the particle filtering step, and, secondly, in the evaluation of the ratio of likelihoods in the MCMC run. The satisfactory performance of the proposed method is illustrated on three dynamical systems: (a) a computational model of a nonlinear beam-moving oscillator system, (b) a laboratory scale beam traversed by a loaded trolley, and (c) an earthquake shake table study on a bending-torsion coupled nonlinear frame subjected to uniaxial support motion.

  10. Parameter identifiability and regional calibration for reservoir inflow prediction

    NASA Astrophysics Data System (ADS)

    Kolberg, Sjur; Engeland, Kolbjørn; Tøfte, Lena S.; Bruland, Oddbjørn

    2013-04-01

    The large hydropower producer Statkraft is currently testing regional, distributed models for operational reservoir inflow prediction. The need for simultaneous forecasts and consistent updating in a large number of catchments supports the shift from catchment-oriented to regional models. Low-quality naturalized inflow series in the reservoir catchments further encourages the use of donor catchments and regional simulation for calibration purposes. MCMC based parameter estimation (the Dream algorithm; Vrugt et al, 2009) is adapted to regional parameter estimation, and implemented within the open source ENKI framework. The likelihood is based on the concept of effectively independent number of observations, spatially as well as in time. Marginal and conditional (around an optimum) parameter distributions for each catchment may be extracted, even though the MCMC algorithm itself is guided only by the regional likelihood surface. Early results indicate that the average performance loss associated with regional calibration (difference in Nash-Sutcliffe R2 between regionally and locally optimal parameters) is in the range of 0.06. The importance of the seasonal snow storage and melt in Norwegian mountain catchments probably contributes to the high degree of similarity among catchments. The evaluation continues for several regions, focusing on posterior parameter uncertainty and identifiability. Vrugt, J. A., C. J. F. ter Braak, C. G. H. Diks, B. A. Robinson, J. M. Hyman and D. Higdon: Accelerating Markov Chain Monte Carlo Simulation by Differential Evolution with Self-Adaptive Randomized Subspace Sampling. Int. J. of nonlinear sciences and numerical simulation 10, 3, 273-290, 2009.
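
    The performance loss quoted in this abstract is a difference between two Nash-Sutcliffe scores. As a reminder of how that score is computed, here is a minimal sketch; the three short series are hypothetical numbers chosen for illustration and are not taken from the study or from the ENKI framework.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated):
        raise ValueError("series must have equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Hypothetical inflow series (arbitrary units), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
local_fit = [1.0, 2.0, 3.0, 4.5]      # simulation with locally optimal parameters
regional_fit = [1.0, 2.0, 3.5, 4.5]   # simulation with regionally optimal parameters

loss = nash_sutcliffe(observed, local_fit) - nash_sutcliffe(observed, regional_fit)
print(f"performance loss from regional calibration: {loss:.2f}")
```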

  11. Reciprocal Sliding Friction Model for an Electro-Deposited Coating and Its Parameter Estimation Using Markov Chain Monte Carlo Method

    PubMed Central

    Kim, Kyungmok; Lee, Jaewook

    2016-01-01

    This paper describes a sliding friction model for an electro-deposited coating. Reciprocating sliding tests using ball-on-flat plate test apparatus are performed to determine an evolution of the kinetic friction coefficient. The evolution of the friction coefficient is classified into the initial running-in period, steady-state sliding, and transition to higher friction. The friction coefficient during the initial running-in period and steady-state sliding is expressed as a simple linear function. The friction coefficient in the transition to higher friction is described with a mathematical model derived from Kachanov-type damage law. The model parameters are then estimated using the Markov Chain Monte Carlo (MCMC) approach. It is identified that estimated friction coefficients obtained by MCMC approach are in good agreement with measured ones. PMID:28773359

  12. Application of Bayesian Inversion for Multilayer Reservoir Mapping while Drilling Measurements

    NASA Astrophysics Data System (ADS)

    Wang, J.; Chen, H.; Wang, X.

    2017-12-01

    Real-time geosteering technology plays a key role in horizontal well development by keeping wellbore trajectories within target zones to maximize reservoir contact. The new generation of logging-while-drilling (LWD) resistivity tools has longer spacings and a deeper depth of investigation, but it also brings a new challenge to the inversion of logging data: the formation model can no longer be restricted to a small number of layers, such as the typical three-layer model. Adopting inappropriate starting models in deterministic, gradient-based methods may mislead geophysicists in the interpretation of subsurface structure. To take advantage of the richness of the measurements and the deep depth of investigation across multiple formation boundaries, a trans-dimensional Markov chain Monte Carlo (MCMC) inversion algorithm has been developed that combines phase and attenuation measurements at various frequencies and spacings. Unlike conventional gradient-based inversion approaches, the MCMC algorithm does not introduce bias from prior information and does not require any subjective choice of regularization parameter. A synthetic three-layer model example demonstrates how the algorithm can be used to image the subsurface using the LWD data. When the tool is far from the top boundary, the inversion clearly resolves the boundary position; that is where the boundary histogram shows a large peak. But the measurements cannot resolve the bottom boundary; the large spread between quantiles reflects the uncertainty associated with the bed resolution. As the tool moves closer to the top boundary, the middle and bottom layers are resolved, the retained models become more similar, and the uncertainty associated with these two beds decreases. From the spread observed between models, we can evaluate the actual depth of investigation, uncertainty, and sensitivity, which is more useful than just a single best model.

  13. Assessing the pollution risk of a groundwater source field at western Laizhou Bay under seawater intrusion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zeng, Xiankui; Wu, Jichun; Wang, Dong, E-mail: wangdong@nju.edu.cn

    Coastal areas are of great significance for human life and for economic and social development worldwide. With the rapid increase of pressures from human activities and climate change, the safety of groundwater resources in coastal areas is threatened by seawater intrusion. The Laizhou Bay area has been one of the most seriously seawater-intruded areas in China since the phenomenon was first recognized in the mid-1970s. This study assessed the pollution risk of a groundwater source field in the western Laizhou Bay area by inferring the probability distribution of groundwater Cl⁻ concentration. The numerical model of the seawater intrusion process is built using SEAWAT4. The parameter uncertainty of this model is evaluated by Markov Chain Monte Carlo (MCMC) simulation, with DREAM(ZS) as the sampling algorithm. Then, the predictive distribution of Cl⁻ concentration at the groundwater source field is inferred from the samples of model parameters obtained from MCMC. After that, the pollution risk of the groundwater source field is assessed by the predictive quantiles of Cl⁻ concentration. The results of model calibration and verification demonstrate that the DREAM(ZS)-based MCMC is efficient and reliable for estimating model parameters under current observations. At the 95% confidence level, the groundwater source point will not be polluted by seawater intrusion in the next five years (2015–2019). In addition, the 2.5% and 97.5% predictive quantiles show that the Cl⁻ concentration of the groundwater source field always varies between 175 mg/l and 200 mg/l. Highlights: • The parameter uncertainty of the seawater intrusion model is evaluated by MCMC. • The groundwater source field won’t be polluted by seawater intrusion in the next 5 years. • The pollution risk is assessed by the predictive quantiles of Cl⁻ concentration.

  14. Multiple-Event Seismic Location Using the Markov-Chain Monte Carlo Technique

    NASA Astrophysics Data System (ADS)

    Myers, S. C.; Johannesson, G.; Hanley, W.

    2005-12-01

    We develop a new multiple-event location algorithm (MCMCloc) that utilizes the Markov-Chain Monte Carlo (MCMC) method. Unlike most inverse methods, the MCMC approach produces a suite of solutions, each of which is consistent with observations and prior estimates of data and model uncertainties. Model parameters in MCMCloc consist of event hypocenters, and travel-time predictions. Data are arrival time measurements and phase assignments. Posteriori estimates of event locations, path corrections, pick errors, and phase assignments are made through analysis of the posteriori suite of acceptable solutions. Prior uncertainty estimates include correlations between travel-time predictions, correlations between measurement errors, the probability of misidentifying one phase for another, and the probability of spurious data. Inclusion of prior constraints on location accuracy allows direct utilization of ground-truth locations or well-constrained location parameters (e.g. from InSAR) that aid in the accuracy of the solution. Implementation of a correlation structure for travel-time predictions allows MCMCloc to operate over arbitrarily large geographic areas. Transition in behavior between a multiple-event locator for tightly clustered events and a single-event locator for solitary events is controlled by the spatial correlation of travel-time predictions. We test the MCMC locator on a regional data set of Nevada Test Site nuclear explosions. Event locations and origin times are known for these events, allowing us to test the features of MCMCloc using a high-quality ground truth data set. Preliminary tests suggest that MCMCloc provides excellent relative locations, often outperforming traditional multiple-event location algorithms, and excellent absolute locations are attained when constraints from one or more ground truth event are included. 
When phase assignments are switched, we find that MCMCloc properly corrects the error when predicted arrival times are separated by several seconds. In cases where the predicted arrival times are within the combined uncertainty of prediction and measurement errors, MCMCloc determines the probability of one or the other phase assignment and propagates this uncertainty into all model parameters. We find that MCMCloc is a promising method for simultaneously locating large, geographically distributed data sets. Because we incorporate prior knowledge on many parameters, MCMCloc is ideal for combining trusted data with data of unknown reliability. This work was performed under the auspices of the U.S. Department of Energy by the University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48, Contribution UCRL-ABS-215048

  15. Comparison of Bootstrapping and Markov Chain Monte Carlo for Copula Analysis of Hydrological Droughts

    NASA Astrophysics Data System (ADS)

    Yang, P.; Ng, T. L.; Yang, W.

    2015-12-01

    Effective water resources management depends on the reliable estimation of the uncertainty of drought events. Confidence intervals (CIs) are commonly applied to quantify this uncertainty. A CI seeks to be at the minimal length necessary to cover the true value of the estimated variable with the desired probability. In drought analysis where two or more variables (e.g., duration and severity) are often used to describe a drought, copulas have been found suitable for representing the joint probability behavior of these variables. However, the comprehensive assessment of the parameter uncertainties of copulas of droughts has been largely ignored, and the few studies that have recognized this issue have not explicitly compared the various methods to produce the best CIs. Thus, the objective of this study is to compare the CIs generated using two widely applied uncertainty estimation methods, bootstrapping and Markov Chain Monte Carlo (MCMC). To achieve this objective, (1) the marginal distributions lognormal, Gamma, and Generalized Extreme Value, and the copula functions Clayton, Frank, and Plackett are selected to construct joint probability functions of two drought-related variables; (2) the resulting joint functions are then fitted to 200 sets of simulated realizations of drought events with known distribution and extreme parameters; and (3) from there, using bootstrapping and MCMC, CIs of the parameters are generated and compared. The effect of an informative prior on the CIs generated by MCMC is also evaluated. CIs are produced for different sample sizes (50, 100, and 200) of the simulated drought events for fitting the joint probability functions. Preliminary results assuming lognormal marginal distributions and the Clayton copula function suggest that for cases with small or medium sample sizes (~50-100), MCMC is the superior method if an informative prior exists. Where an informative prior is unavailable, for small sample sizes (~50), both bootstrapping and MCMC yield the same level of performance, and for medium sample sizes (~100), bootstrapping is better. For cases with a large sample size (~200), there is little difference between the CIs generated using bootstrapping and MCMC regardless of whether or not an informative prior exists.
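
    For readers unfamiliar with the bootstrap side of this comparison, a percentile bootstrap interval can be sketched as below. This is an illustration under assumptions, not the study's procedure: the estimator is the sample mean (a stand-in for a copula parameter estimator), the data are 50 hypothetical lognormal "severities", and the function name `percentile_bootstrap_ci` is invented here. An MCMC interval would instead be read off the quantiles of posterior draws.

```python
import random
import statistics


def percentile_bootstrap_ci(sample, estimator, n_resamples=2000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for estimator(sample)."""
    rng = rng or random.Random()
    # Re-estimate on resamples drawn with replacement, same size as the data.
    estimates = sorted(
        estimator([rng.choice(sample) for _ in sample]) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


rng = random.Random(1)
# Hypothetical drought severities: 50 lognormal draws (the smallest sample size above).
severities = [rng.lognormvariate(0.0, 0.5) for _ in range(50)]
low, high = percentile_bootstrap_ci(severities, statistics.mean, rng=rng)
print(f"95% bootstrap CI for the mean severity: ({low:.2f}, {high:.2f})")
```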

  16. An adaptive Bayesian inference algorithm to estimate the parameters of a hazardous atmospheric release

    NASA Astrophysics Data System (ADS)

    Rajaona, Harizo; Septier, François; Armand, Patrick; Delignon, Yves; Olry, Christophe; Albergel, Armand; Moussafir, Jacques

    2015-12-01

    In the eventuality of an accidental or intentional atmospheric release, the reconstruction of the source term using measurements from a set of sensors is an important and challenging inverse problem. A rapid and accurate estimation of the source allows faster and more efficient action for first-response teams, in addition to providing better damage assessment. This paper presents a Bayesian probabilistic approach to estimate the location and the temporal emission profile of a pointwise source. The release rate is evaluated analytically by using a Gaussian assumption on its prior distribution, and is enhanced with a positivity constraint to improve the estimation. The source location is obtained by the means of an advanced iterative Monte-Carlo technique called Adaptive Multiple Importance Sampling (AMIS), which uses a recycling process at each iteration to accelerate its convergence. The proposed methodology is tested using synthetic and real concentration data in the framework of the Fusion Field Trials 2007 (FFT-07) experiment. The quality of the obtained results is comparable to those coming from the Markov Chain Monte Carlo (MCMC) algorithm, a popular Bayesian method used for source estimation. Moreover, the adaptive processing of the AMIS provides a better sampling efficiency by reusing all the generated samples.

  17. Multi-chain Markov chain Monte Carlo methods for computationally expensive models

    NASA Astrophysics Data System (ADS)

    Huang, M.; Ray, J.; Ren, H.; Hou, Z.; Bao, J.

    2017-12-01

    Markov chain Monte Carlo (MCMC) methods are used to infer model parameters from observational data. The parameters are inferred as probability densities, thus capturing estimation error due to sparsity of the data, and the shortcomings of the model. Multiple communicating chains executing the MCMC method have the potential to explore the parameter space better, and conceivably accelerate the convergence to the final distribution. We present results from tests conducted with the multi-chain method to show how the acceleration occurs i.e., for loose convergence tolerances, the multiple chains do not make much of a difference. The ensemble of chains also seems to have the ability to accelerate the convergence of a few chains that might start from suboptimal starting points. Finally, we show the performance of the chains in the estimation of O(10) parameters using computationally expensive forward models such as the Community Land Model, where the sampling burden is distributed over multiple chains.
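
    The abstract does not say which convergence measure its tolerances refer to. One widely used multi-chain diagnostic is the Gelman-Rubin potential scale reduction factor, sketched here purely as an assumed example; the two sets of chains are synthetic draws, not output from the Community Land Model study.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length scalar chains."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    within = statistics.mean(statistics.variance(c) for c in chains)
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(2)
# Four chains sampling the same distribution: the factor should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains stuck around different values: the factor should be far above 1.
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 5.0, 10.0, 15.0)]

print(f"well-mixed chains: {gelman_rubin(mixed):.3f}")
print(f"separated chains:  {gelman_rubin(stuck):.3f}")
```

    A loose tolerance accepts chains whose factor is still noticeably above 1, while a strict one demands values very close to 1.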

  18. Ascertainment correction for Markov chain Monte Carlo segregation and linkage analysis of a quantitative trait.

    PubMed

    Ma, Jianzhong; Amos, Christopher I; Warwick Daw, E

    2007-09-01

    Although extended pedigrees are often sampled through probands with extreme levels of a quantitative trait, Markov chain Monte Carlo (MCMC) methods for segregation and linkage analysis have not been able to perform ascertainment corrections. Further, the extent to which ascertainment of pedigrees leads to biases in the estimation of segregation and linkage parameters has not been previously studied for MCMC procedures. In this paper, we studied these issues with a Bayesian MCMC approach for joint segregation and linkage analysis, as implemented in the package Loki. We first simulated pedigrees ascertained through individuals with extreme values of a quantitative trait in the spirit of the sequential sampling theory of Cannings and Thompson [1977, Clin. Genet. 12:208-212]. Using our simulated data, we detected no bias in estimates of the trait locus location. However, in addition to allele frequencies, when the ascertainment threshold was higher than or close to the true value of the highest genotypic mean, bias was also found in the estimation of this parameter. When there were multiple trait loci, this bias destroyed the additivity of the effects of the trait loci, and caused biases in the estimation of all genotypic means when a purely additive model was used for analyzing the data. To account for pedigree ascertainment with sequential sampling, we developed a Bayesian ascertainment approach and implemented Metropolis-Hastings updates in the MCMC samplers used in Loki. Ascertainment correction greatly reduced biases in parameter estimates. Our method is designed for multiple trait loci, provided their number is fixed. Copyright (c) 2007 Wiley-Liss, Inc.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crowder, Jeff; Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California 91109; Cornish, Neil J.

    Low frequency gravitational wave detectors, such as the Laser Interferometer Space Antenna (LISA), will have to contend with large foregrounds produced by millions of compact galactic binaries in our galaxy. While these galactic signals are interesting in their own right, the unresolved component can obscure other sources. The science yield for the LISA mission can be improved if the brighter and more isolated foreground sources can be identified and regressed from the data. Since the signals overlap with one another, we are faced with a 'cocktail party' problem of picking out individual conversations in a crowded room. Here we present and implement an end-to-end solution to the galactic foreground problem that is able to resolve tens of thousands of sources from across the LISA band. Our algorithm employs a variant of the Markov chain Monte Carlo (MCMC) method, which we call the blocked annealed Metropolis-Hastings (BAM) algorithm. Following a description of the algorithm and its implementation, we give several examples ranging from searches for a single source to searches for hundreds of overlapping sources. Our examples include data sets from the first round of mock LISA data challenges.

  20. A new approach for handling longitudinal count data with zero-inflation and overdispersion: poisson geometric process model.

    PubMed

    Wan, Wai-Yin; Chan, Jennifer S K

    2009-08-01

    For time series of count data, correlated measurements, clustering as well as excessive zeros occur simultaneously in biomedical applications. Ignoring such effects might contribute to misleading treatment outcomes. A generalized mixture Poisson geometric process (GMPGP) model and a zero-altered mixture Poisson geometric process (ZMPGP) model are developed from the geometric process model, which was originally developed for modelling positive continuous data and was extended to handle count data. These models are motivated by evaluating the trend development of new tumour counts for bladder cancer patients as well as by identifying useful covariates which affect the count level. The models are implemented using Bayesian method with Markov chain Monte Carlo (MCMC) algorithms and are assessed using deviance information criterion (DIC).

  1. Modelling past land use using archaeological and pollen data

    NASA Astrophysics Data System (ADS)

    Pirzamanbein, Behnaz; Lindström, Johan; Poska, Anneli; Gaillard-Lemdahl, Marie-José

    2016-04-01

    Accurate maps of past land use are necessary for studying the impact of anthropogenic land-cover changes on climate and biodiversity. We develop a Bayesian hierarchical model to reconstruct the land use using Gaussian Markov random fields. The model uses two observation sets: 1) archaeological data, representing human settlements, urbanization and agricultural findings; and 2) pollen-based land estimates of the three land-cover types Coniferous forest, Broadleaved forest and Unforested/Open land. The pollen-based estimates are obtained from the REVEALS model, based on pollen counts from lakes and bogs. Our model uses the sparse pollen-based estimates to reconstruct the spatially continuous cover of the three land-cover types. Using the open-land component and the archaeological data, the extent of land use is reconstructed. The model is applied to three time periods, centred around 1900 CE, 1000 and 4000 BCE, over Sweden, for which both pollen-based estimates and archaeological data are available. To estimate the model parameters and land use, a block-updated Markov chain Monte Carlo (MCMC) algorithm is applied. Using the MCMC posterior samples, uncertainties in land-use predictions are computed. Due to the lack of good historic land-use data, model results are evaluated by cross-validation. Keywords: Spatial reconstruction, Gaussian Markov random field, Fossil pollen records, Archaeological data, Human land-use, Prediction uncertainty

  2. EXOFIT: orbital parameters of extrasolar planets from radial velocities

    NASA Astrophysics Data System (ADS)

    Balan, Sreekumar T.; Lahav, Ofer

    2009-04-01

    Retrieval of orbital parameters of extrasolar planets poses considerable statistical challenges. Due to sparse sampling, measurement errors, parameters degeneracy and modelling limitations, there are no unique values of basic parameters, such as period and eccentricity. Here, we estimate the orbital parameters from radial velocity data in a Bayesian framework by utilizing Markov Chain Monte Carlo (MCMC) simulations with the Metropolis-Hastings algorithm. We follow a methodology recently proposed by Gregory and Ford. Our implementation of MCMC is based on the object-oriented approach outlined by Graves. We make our resulting code, EXOFIT, publicly available with this paper. It can search for either one or two planets as illustrated on mock data. As an example we re-analysed the orbital solution of companions to HD 187085 and HD 159868 from the published radial velocity data. We confirm the degeneracy reported for orbital parameters of the companion to HD 187085, and show that a low-eccentricity orbit is more probable for this planet. For HD 159868, we obtained slightly different orbital solution and a relatively high `noise' factor indicating the presence of an unaccounted signal in the radial velocity data. EXOFIT is designed in such a way that it can be extended for a variety of probability models, including different Bayesian priors.

  3. Efficient estimation of ideal-observer performance in classification tasks involving high-dimensional complex backgrounds

    PubMed Central

    Park, Subok; Clarkson, Eric

    2010-01-01

    The Bayesian ideal observer is optimal among all observers and sets an absolute upper bound for the performance of any observer in classification tasks [Van Trees, Detection, Estimation, and Modulation Theory, Part I (Academic, 1968).]. Therefore, the ideal observer should be used for objective image quality assessment whenever possible. However, computation of ideal-observer performance is difficult in practice because this observer requires the full description of unknown, statistical properties of high-dimensional, complex data arising in real life problems. Previously, Markov-chain Monte Carlo (MCMC) methods were developed by Kupinski et al. [J. Opt. Soc. Am. A 20, 430(2003) ] and by Park et al. [J. Opt. Soc. Am. A 24, B136 (2007) and IEEE Trans. Med. Imaging 28, 657 (2009) ] to estimate the performance of the ideal observer and the channelized ideal observer (CIO), respectively, in classification tasks involving non-Gaussian random backgrounds. However, both algorithms had the disadvantage of long computation times. We propose a fast MCMC for real-time estimation of the likelihood ratio for the CIO. Our simulation results show that our method has the potential to speed up ideal-observer performance in tasks involving complex data when efficient channels are used for the CIO. PMID:19884916

  4. Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets.

    PubMed

    Datta, Abhirup; Banerjee, Sudipto; Finley, Andrew O; Gelfand, Alan E

    2016-01-01

    Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations become large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.

  5. Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

    PubMed Central

    Datta, Abhirup; Banerjee, Sudipto; Finley, Andrew O.; Gelfand, Alan E.

    2018-01-01

    Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations become large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online. PMID:29720777

  6. Bayesian posterior distributions without Markov chains.

    PubMed

    Cole, Stephen R; Chu, Haitao; Greenland, Sander; Hamra, Ghassan; Richardson, David B

    2012-03-01

    Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976-1983) assessing the relation between residential exposure to magnetic fields and the development of childhood cancer. Results from rejection sampling (odds ratio (OR) = 1.69, 95% posterior interval (PI): 0.57, 5.00) were similar to MCMC results (OR = 1.69, 95% PI: 0.58, 4.95) and approximations from data-augmentation priors (OR = 1.74, 95% PI: 0.60, 5.06). In example 2, the authors apply rejection sampling to a cohort study of 315 human immunodeficiency virus seroconverters (1984-1998) to assess the relation between viral load after infection and 5-year incidence of acquired immunodeficiency syndrome, adjusting for (continuous) age at seroconversion and race. In this more complex example, rejection sampling required a notably longer run time than MCMC sampling but remained feasible and again yielded similar results. The transparency of the proposed approach comes at a price of being less broadly applicable than MCMC.
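
    The rejection sampling idea described in this abstract can be sketched in a few lines. The example below is an illustrative assumption, not the authors' analysis: it uses a hypothetical binomial data set (7 events in 20 trials) with a Uniform(0, 1) prior, proposes values from the prior, and accepts each one with probability equal to its likelihood divided by the maximum likelihood. The function name and data are invented for illustration.

```python
import random


def rejection_sample_posterior(k, n, n_draws, seed=0):
    """Sample the posterior of a binomial proportion under a Uniform(0, 1) prior.

    Assumes 0 < k < n so that the maximum-likelihood estimate is interior.
    """
    rng = random.Random(seed)

    def likelihood(theta):
        # Binomial likelihood up to a constant that cancels in the ratio below.
        return theta ** k * (1.0 - theta) ** (n - k)

    # The likelihood is maximised at the MLE, which bounds the acceptance ratio by 1.
    max_likelihood = likelihood(k / n)

    samples = []
    while len(samples) < n_draws:
        theta = rng.random()  # propose from the prior
        # Accept with probability likelihood(theta) / max_likelihood.
        if rng.random() * max_likelihood <= likelihood(theta):
            samples.append(theta)
    return samples


# Hypothetical data: 7 events in 20 trials.
samples = rejection_sample_posterior(k=7, n=20, n_draws=5000)
posterior_mean = sum(samples) / len(samples)
```

    Under these assumptions the accepted draws follow a Beta(8, 14) distribution, so their mean should be close to 8/22. No Markov chain is involved, which is the transparency the abstract refers to; the price is that many proposals are discarded.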

  7. Approximate Bayesian computation for spatial SEIR(S) epidemic models.

    PubMed

    Brown, Grant D; Porter, Aaron T; Oleson, Jacob J; Hinman, Jessica A

    2018-02-01

    Approximate Bayesian Computation (ABC) provides an attractive approach to estimation in complex Bayesian inferential problems for which evaluation of the kernel of the posterior distribution is impossible or computationally expensive. These highly parallelizable techniques have been successfully applied to many fields, particularly in cases where more traditional approaches such as Markov chain Monte Carlo (MCMC) are impractical. In this work, we demonstrate the application of approximate Bayesian inference to spatially heterogeneous Susceptible-Exposed-Infectious-Removed (SEIR) stochastic epidemic models. These models have a tractable posterior distribution; however, MCMC techniques nevertheless become computationally infeasible for moderately sized problems. We discuss the practical implementation of these techniques via the open source ABSEIR package for R. The performance of ABC relative to traditional MCMC methods in a small problem is explored under simulation, as well as in the spatially heterogeneous context of the 2014 epidemic of Chikungunya in the Americas. Copyright © 2017 Elsevier Ltd. All rights reserved.
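
    The core ABC idea (simulate from the model, keep parameter draws whose simulated data resemble the observations) can be illustrated with a minimal rejection sketch. This is not the ABSEIR implementation and not an SEIR model: the toy model, the function name `abc_rejection`, and the data (9 infections among 30 exposed individuals) are assumptions chosen only to keep the example short.

```python
import random


def abc_rejection(observed, n_exposed, n_accept, tolerance=0, seed=1):
    """Basic ABC rejection for an infection probability p with a Uniform(0, 1) prior.

    The likelihood is never evaluated; only forward simulation is used.
    """
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_accept:
        p = rng.random()  # draw a candidate parameter from the prior
        # Forward-simulate the toy model: each exposed individual is infected w.p. p.
        simulated = sum(rng.random() < p for _ in range(n_exposed))
        # Keep the candidate only if the simulated summary is close to the data.
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


# Hypothetical observation: 9 infections among 30 exposed individuals.
abc_samples = abc_rejection(observed=9, n_exposed=30, n_accept=1000)
abc_mean = sum(abc_samples) / len(abc_samples)
```

    With a tolerance of zero and this toy model, the accepted draws are exact posterior samples (Beta(10, 22), mean 10/32). In realistic epidemic models the tolerance must be positive, which is where the approximation enters. Each candidate is simulated independently, which is why such schemes parallelize well.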

  8. Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Chunyuan; Stevens, Andrew J.; Chen, Changyou

    2016-08-10

    Learning the representation of shape cues in 2D & 3D objects for recognition is a fundamental task in computer vision. Deep neural networks (DNNs) have shown promising performance on this task. Due to the large variability of shapes, accurate recognition relies on good estimates of model uncertainty, ignored in traditional training of DNNs, typically learned via stochastic optimization. This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (SG-MCMC) to learn weight uncertainty in DNNs. It yields principled Bayesian interpretations for the commonly used Dropout/DropConnect techniques and incorporates them into the SG-MCMC framework. Extensive experiments on 2D & 3D shape datasets and various DNN models demonstrate the superiority of the proposed approach over stochastic optimization. Our approach yields higher recognition accuracy when used in conjunction with Dropout and Batch-Normalization.

  9. Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)

    NASA Astrophysics Data System (ADS)

    Kasibhatla, P.

    2004-12-01

    In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities and inverse source estimates are derived for fixed values of pdf parameters. While the advantage of this approach is that closed form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.

  10. An efficient interpolation technique for jump proposals in reversible-jump Markov chain Monte Carlo calculations

    PubMed Central

    Farr, W. M.; Mandel, I.; Stevens, D.

    2015-01-01

    Selection among alternative theoretical models given an observed dataset is an important challenge in many areas of physics and astronomy. Reversible-jump Markov chain Monte Carlo (RJMCMC) is an extremely powerful technique for performing Bayesian model selection, but it suffers from a fundamental difficulty: it requires jumps between model parameter spaces, but cannot efficiently explore both parameter spaces at once. Thus, a naive jump between parameter spaces is unlikely to be accepted in the Markov chain Monte Carlo (MCMC) algorithm and convergence is correspondingly slow. Here, we demonstrate an interpolation technique that uses samples from single-model MCMCs to propose intermodel jumps from an approximation to the single-model posterior of the target parameter space. The interpolation technique, based on a kD-tree data structure, is adaptive and efficient in modest dimensionality. We show that our technique leads to improved convergence over naive jumps in an RJMCMC, and compare it to other proposals in the literature to improve the convergence of RJMCMCs. We also demonstrate the use of the same interpolation technique as a way to construct efficient ‘global’ proposal distributions for single-model MCMCs without prior knowledge of the structure of the posterior distribution, and discuss improvements that permit the method to be used in higher dimensional spaces efficiently. PMID:26543580

  11. SaaS enabled admission control for MCMC simulation in cloud computing infrastructures

    NASA Astrophysics Data System (ADS)

    Vázquez-Poletti, J. L.; Moreno-Vozmediano, R.; Han, R.; Wang, W.; Llorente, I. M.

    2017-02-01

    Markov Chain Monte Carlo (MCMC) methods are widely used in the field of simulation and modelling of materials, producing applications that require a great amount of computational resources. Cloud computing represents a seamless source for these resources in the form of HPC. However, resource over-consumption can be an important drawback, especially if the cloud provision process is not appropriately optimized. In the present contribution we propose a two-level solution that, on the one hand, takes advantage of approximate computing for reducing the resource demand and, on the other, uses admission control policies for guaranteeing an optimal provision to running applications.

  12. Model Reduction via Principal Component Analysis and Markov Chain Monte Carlo (MCMC) Methods

    NASA Astrophysics Data System (ADS)

    Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.

    2011-12-01

    Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on the parameterization and the problem undertaken. This makes inverse estimation and uncertainty quantification very challenging, especially for those problems in two- or three-dimensional spatial domains. Model reduction techniques have the potential of mitigating the curse of dimensionality by reducing the total number of unknowns while describing the complex subsurface systems adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA methods with geometric sampling (referred to as 'Method 1'), (2) PCA methods with MCMC sampling (referred to as 'Method 2'), and (3) PCA methods with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with noise added and inverted them under two different situations: (1) the noisy data and the covariance matrix for PCA analysis are consistent (referred to as the unbiased case), and (2) the noisy data and the covariance matrix are inconsistent (referred to as the biased case). In the unbiased case, comparison between the analytical solutions and the inversion results shows that all three methods provide good estimates of the true values and Method 1 is computationally more efficient. In terms of uncertainty quantification, Method 1 performs poorly because of the relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to inclusion of random effects. However, in the biased case, only Method 3 correctly estimates all the unknown parameters, and both Methods 1 and 2 provide wrong values for the biased parameters. The synthetic case study demonstrates that if the covariance matrix for PCA analysis is inconsistent with true models, the PCA methods with geometric or MCMC sampling will provide incorrect estimates.

  13. Fast-NPS-A Markov Chain Monte Carlo-based analysis tool to obtain structural information from single-molecule FRET measurements

    NASA Astrophysics Data System (ADS)

    Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens

    2017-10-01

    The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Programme Files doi: http://dx.doi.org/10.17632/7ztzj63r68.1. Licencing provisions: Apache-2.0. Programming language: GUI in MATLAB (The MathWorks) and the core sampling engine in C++. Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.

  14. Bayesian inversion using a geologically realistic and discrete model space

    NASA Astrophysics Data System (ADS)

    Jaeggli, C.; Julien, S.; Renard, P.

    2017-12-01

    Since the early days of groundwater modeling, inverse methods have played a crucial role. Many research and engineering groups aim to infer extensive knowledge of aquifer parameters from a sparse set of observations. Despite decades of dedicated research on this topic, there are still several major issues to be solved. In the hydrogeological framework, one is often confronted with underground structures that present very sharp contrasts of geophysical properties. In particular, subsoil structures such as karst conduits, channels, faults, or lenses strongly influence groundwater flow and transport behavior of the underground. For this reason it can be essential to identify their location and shape very precisely. Unfortunately, when inverse methods are specially trained to consider such complex features, their computational effort often becomes unaffordably high. The following work is an attempt to solve this dilemma. We present a new method that is, in some sense, a compromise between the ergodicity of Markov chain Monte Carlo (McMC) methods and the efficient handling of data by the ensemble based Kalman filters. The realistic and complex random fields are generated by a Multiple-Point Statistics (MPS) tool. Nonetheless, it is applicable with any conditional geostatistical simulation tool. Furthermore, the algorithm is independent of any parametrization, which becomes most important when two parametric systems are equivalent (permeability and resistivity, speed and slowness, etc.). When compared to two existing McMC schemes, the computational effort was divided by a factor of 12.

  15. Nonstationary Extreme Value Analysis in a Changing Climate: A Software Package

    NASA Astrophysics Data System (ADS)

    Cheng, L.; AghaKouchak, A.; Gilleland, E.

    2013-12-01

    Numerous studies show that climatic extremes have increased substantially in the second half of the 20th century. For this reason, analysis of extremes under a nonstationary assumption has received a great deal of attention. This paper presents a software package developed for estimation of return levels, return periods, and risks of climatic extremes in a changing climate. This MATLAB software package offers tools for analysis of climate extremes under both stationary and non-stationary assumptions. The Nonstationary Extreme Value Analysis (hereafter, NEVA) provides an efficient and generalized framework for analyzing extremes using Bayesian inference. NEVA estimates the extreme value parameters using a Differential Evolution Markov Chain (DE-MC), which combines the genetic algorithm Differential Evolution (DE) for global optimization over the real parameter space with the Markov Chain Monte Carlo (MCMC) approach and has the advantage of simplicity, speed of calculation and convergence over conventional MCMC. NEVA also offers the confidence interval and uncertainty bounds of estimated return levels based on the sampled parameters. NEVA integrates extreme value design concepts, data analysis tools, optimization and visualization, explicitly designed to facilitate analysis of extremes in geosciences. The generalized input and output files of this software package make it attractive for users from across different fields. Both stationary and nonstationary components of the package are validated for a number of case studies using empirical return levels. The results show that NEVA reliably describes extremes and their return levels.

  16. Parameter Identification and Uncertainty Analysis for Visual MODFLOW based Groundwater Flow Model in a Small River Basin, Eastern India

    NASA Astrophysics Data System (ADS)

    Jena, S.

    2015-12-01

    The overexploitation of groundwater resulted in the abandonment of many shallow tube wells in the river basin in Eastern India. For the sustainability of groundwater resources, basin-scale modelling of groundwater flow is essential for the efficient planning and management of the water resources. The main intent of this study is to develop a 3-D groundwater flow model of the study basin using the Visual MODFLOW package and successfully calibrate and validate it using 17 years of observed data. The sensitivity analysis was carried out to quantify the susceptibility of the aquifer system to river bank seepage, recharge from rainfall and agricultural practices, horizontal and vertical hydraulic conductivities, and specific yield. To quantify the impact of parameter uncertainties, the Sequential Uncertainty Fitting Algorithm (SUFI-2) and Markov chain Monte Carlo (MCMC) techniques were implemented. Results from the two techniques were compared and the advantages and disadvantages were analysed. The Nash-Sutcliffe coefficient (NSE) and the coefficient of determination (R2) were adopted as two criteria during calibration and validation of the developed model. NSE and R2 values of the groundwater flow model for the calibration and validation periods were in the acceptable range. Also, the MCMC technique was able to provide more reasonable results than SUFI-2. The calibrated and validated model will be useful to identify the aquifer properties, analyse the groundwater flow dynamics and the change in groundwater levels in future forecasts.
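
    The two goodness-of-fit criteria named in this abstract have standard definitions, sketched below for reference. The abstract does not state which variant of R2 was used; the sketch assumes the common choice of the squared Pearson correlation between observed and simulated values. The groundwater levels in the example are invented.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated series."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical groundwater levels (m): the simulation is biased high by 1 m.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [2.0, 3.0, 4.0, 5.0, 6.0]
nse = nash_sutcliffe(obs, sim)
r2 = r_squared(obs, sim)
```

    The example shows why the two criteria are usually reported together: a constant bias leaves R2 at 1 because the series are perfectly correlated, while NSE drops to 0.5 because it penalizes the systematic offset.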

  17. Hierarchical multistage MCMC follow-up of continuous gravitational wave candidates

    NASA Astrophysics Data System (ADS)

    Ashton, G.; Prix, R.

    2018-05-01

    Leveraging Markov chain Monte Carlo optimization of the F statistic, we introduce a method for the hierarchical follow-up of continuous gravitational wave candidates identified by wide-parameter space semicoherent searches. We demonstrate parameter estimation for continuous wave sources and develop a framework and tools to understand and control the effective size of the parameter space, critical to the success of the method. Monte Carlo tests of simulated signals in noise demonstrate that this method is close to the theoretical optimal performance.

  18. Bayesian Monte Carlo and Maximum Likelihood Approach for ...

    EPA Pesticide Factsheets

    Model uncertainty estimation and risk assessment are essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood estimation (BMCML), to calibrate a lake oxygen recovery model. We first derive an analytical solution of the differential equation governing lake-averaged oxygen dynamics as a function of time-variable wind speed. Statistical inferences on model parameters and predictive uncertainty are then drawn by Bayesian conditioning of the analytical solution on observed daily wind speed and oxygen concentration data obtained from an earlier study during two recovery periods on a eutrophic lake in upstate New York. The model is calibrated using oxygen recovery data for one year and statistical inferences were validated using recovery data for another year. Compared with the essentially two-step regression and optimization approach, the BMCML results are more comprehensive and performed relatively better in predicting the observed temporal dissolved oxygen levels (DO) in the lake. BMCML also produced comparable calibration and validation results with those obtained using the popular Markov Chain Monte Carlo technique (MCMC) and is computationally simpler and easier to implement than the MCMC. Next, using the calibrated model, we derive an optimal relationship between liquid film-transfer coefficien

  19. MCMC genome rearrangement.

    PubMed

    Miklós, István

    2003-10-01

    As more and more genomes have been sequenced, genomic data is rapidly accumulating. Genome-wide mutations are believed to be more neutral than local mutations such as substitutions, insertions and deletions; therefore, phylogenetic investigations based on inversions, transpositions and inverted transpositions are less biased by the hypothesis on neutral evolution. Although efficient algorithms exist for obtaining the inversion distance of two signed permutations, there is no reliable algorithm when both inversions and transpositions are considered. Moreover, different types of mutations happen at different rates, and it is not clear how to weight them in a distance based approach. We introduce a Markov Chain Monte Carlo method to genome rearrangement based on a stochastic model of evolution, which can estimate the number of different evolutionary events needed to sort a signed permutation. The performance of the method was tested on simulated data, and the estimated numbers of different types of mutations were reliable. Human and Drosophila mitochondrial data were also analysed with the new method. The mixing time of the Markov Chain is short both in terms of CPU times and number of proposals. The source code in C is available on request from the author.

  20. Bayesian Analysis of Evolutionary Divergence with Genomic Data under Diverse Demographic Models.

    PubMed

    Chung, Yujin; Hey, Jody

    2017-06-01

    We present a new Bayesian method for estimating demographic and phylogenetic history using population genomic data. Several key innovations are introduced that allow the study of diverse models within an Isolation-with-Migration framework. The new method implements a 2-step analysis, with an initial Markov chain Monte Carlo (MCMC) phase that samples simple coalescent trees, followed by the calculation of the joint posterior density for the parameters of a demographic model. In step 1, the MCMC sampling phase, the method uses a reduced state space, consisting of coalescent trees without migration paths, and a simple importance sampling distribution without the demography of interest. Once obtained, a single sample of trees can be used in step 2 to calculate the joint posterior density for model parameters under multiple diverse demographic models, without having to repeat MCMC runs. Because migration paths are not included in the state space of the MCMC phase, but rather are handled by analytic integration in step 2 of the analysis, the method is scalable to a large number of loci with excellent MCMC mixing properties. With an implementation of the new method in the computer program MIST, we demonstrate the method's accuracy, scalability, and other advantages using simulated data and DNA sequences of two common chimpanzee subspecies: Pan troglodytes (P. t.) troglodytes and P. t. verus. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  1. Vector Mesons in Cold Nuclear Matter

    NASA Astrophysics Data System (ADS)

    Rodrigues, Tulio E.; Dias de Toledo Arruda-Neto, João

    2013-03-01

    The attenuation of vector mesons in cold nuclear matter is studied through the mechanism of incoherent photoproduction off complex nuclei. The latter is described via the time-dependent multi-collisional Monte Carlo (MCMC) intranuclear cascade model. The results for the transparency ratios of ω mesons reproduce previous measurements of CB-ELSA/TAPS with an inelastic ωN cross section around 40 mb for p_ω ~ 1.1 GeV/c. The corresponding in-medium width (nuclear rest frame) is extracted dynamically from the algorithm and depends on the average nuclear density ρ_N and target nucleus: ~ 49.2 MeV/c² for carbon (ρ_N ~ 0.114 fm⁻³) and ~ 77.3 MeV/c² for lead (ρ_N ~ 0.137 fm⁻³). The calculations fail to reproduce the huge absorption observed at JLab assuming the same inelastic cross section and the discrepancy between the two experiments remains a challenge.

  2. Bayesian analysis of stochastic volatility-in-mean model with leverage and asymmetrically heavy-tailed error using generalized hyperbolic skew Student’s t-distribution

    PubMed Central

    Leão, William L.; Chen, Ming-Hui

    2017-01-01

    A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor’s 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model. PMID:29333210

  3. Bayesian Group Bridge for Bi-level Variable Selection.

    PubMed

    Mallick, Himel; Yi, Nengjun

    2017-06-01

    A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.

  4. Bayesian analysis of stochastic volatility-in-mean model with leverage and asymmetrically heavy-tailed error using generalized hyperbolic skew Student's t-distribution.

    PubMed

    Leão, William L; Abanto-Valle, Carlos A; Chen, Ming-Hui

    2017-01-01

    A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor's 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model.

  5. Item Response Theory Equating Using Bayesian Informative Priors.

    ERIC Educational Resources Information Center

    de la Torre, Jimmy; Patz, Richard J.

    This paper seeks to extend the application of Markov chain Monte Carlo (MCMC) methods in item response theory (IRT) to include the estimation of equating relationships along with the estimation of test item parameters. A method is proposed that incorporates estimation of the equating relationship in the item calibration phase. Item parameters from…

  6. Bayesian Estimation of the Logistic Positive Exponent IRT Model

    ERIC Educational Resources Information Center

    Bolfarine, Heleno; Bazan, Jorge Luis

    2010-01-01

    A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…

  7. Markov chain Monte Carlo linkage analysis: effect of bin width on the probability of linkage.

    PubMed

    Slager, S L; Juo, S H; Durner, M; Hodge, S E

    2001-01-01

    We analyzed part of the Genetic Analysis Workshop (GAW) 12 simulated data using Monte Carlo Markov chain (MCMC) methods that are implemented in the computer program Loki. The MCMC method reports the "probability of linkage" (PL) across the chromosomal regions of interest. The point of maximum PL can then be taken as a "location estimate" for the location of the quantitative trait locus (QTL). However, Loki does not provide a formal statistical test of linkage. In this paper, we explore how the bin width used in the calculations affects the max PL and the location estimate. We analyzed age at onset (AO) and quantitative trait number 5, Q5, from 26 replicates of the general simulated data in one region where we knew a major gene, MG5, is located. For each trait, we found the max PL and the corresponding location estimate, using four different bin widths. We found that bin width, as expected, does affect the max PL and the location estimate, and we recommend that users of Loki explore how their results vary with different bin widths.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hou Fengji; Hogg, David W.; Goodman, Jonathan

    Markov chain Monte Carlo (MCMC) proves to be powerful for Bayesian inference and in particular for exoplanet radial velocity fitting because MCMC provides more statistical information and makes better use of data than common approaches like chi-square fitting. However, the nonlinear density functions encountered in these problems can make MCMC time-consuming. In this paper, we apply an ensemble sampler respecting affine invariance to orbital parameter extraction from radial velocity data. This new sampler has only one free parameter, and does not require much tuning for good performance, which is important for automatization. The autocorrelation time of this sampler is approximately the same for all parameters and far smaller than Metropolis-Hastings, which means it requires many fewer function calls to produce the same number of independent samples. The affine-invariant sampler speeds up MCMC by hundreds of times compared with Metropolis-Hastings in the same computing situation. This novel sampler would be ideal for projects involving large data sets such as statistical investigations of planet distribution. The biggest obstacle to ensemble samplers is the existence of multiple local optima; we present a clustering technique to deal with local optima by clustering based on the likelihood of the walkers in the ensemble. We demonstrate the effectiveness of the sampler on real radial velocity data.
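
    An affine-invariant ensemble sampler with a single free parameter is commonly realized with the "stretch move" of Goodman and Weare. The sketch below assumes that move; the abstract does not spell out the update rule, so this should be read as an illustration rather than the authors' code. The target density (independent Gaussians with standard deviations 1 and 10), the function names, and the run lengths are hypothetical.

```python
import math
import random


def log_target(x):
    # Hypothetical target: independent Gaussians with standard deviations 1 and 10.
    return -0.5 * (x[0] ** 2 + (x[1] / 10.0) ** 2)


def stretch_move_sampler(log_prob, walkers, n_steps, a=2.0, seed=0):
    """Ensemble sampler using the stretch move; `a` is the only tuning parameter."""
    rng = random.Random(seed)
    walkers = [list(w) for w in walkers]
    dim = len(walkers[0])
    log_p = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(len(walkers)):
            # Pick a different walker j uniformly at random.
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1
            # Draw z on [1/a, a] with density proportional to 1/sqrt(z).
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            # Stretch walker k along the line through walker j.
            proposal = [walkers[j][i] + z * (walkers[k][i] - walkers[j][i])
                        for i in range(dim)]
            log_p_new = log_prob(proposal)
            # Acceptance probability: min(1, z^(dim-1) * p(proposal) / p(current)).
            log_accept = (dim - 1) * math.log(z) + log_p_new - log_p[k]
            if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
                walkers[k] = proposal
                log_p[k] = log_p_new
        chain.append([list(w) for w in walkers])
    return chain


init_rng = random.Random(42)
start = [[init_rng.gauss(0.0, 1.0), init_rng.gauss(0.0, 1.0)] for _ in range(20)]
chain = stretch_move_sampler(log_target, start, n_steps=2000)

# Discard the first 500 steps as burn-in and pool all walkers.
kept = [w for step in chain[500:] for w in step]
mean0 = sum(w[0] for w in kept) / len(kept)
var0 = sum((w[0] - mean0) ** 2 for w in kept) / len(kept)
mean1 = sum(w[1] for w in kept) / len(kept)
var1 = sum((w[1] - mean1) ** 2 for w in kept) / len(kept)
```

    Because each proposal is built from the positions of other walkers, the move adapts to the scale of each coordinate without per-parameter step sizes. In this sketch the same value of `a` handles both the narrow and the wide direction of the target, which is the property the abstract describes as needing little tuning.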

  9. An adaptive Gaussian process-based method for efficient Bayesian experimental design in groundwater contaminant source identification problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jiangjiang; Li, Weixuan; Zeng, Lingzao

    Surrogate models are commonly used in Bayesian approaches such as Markov Chain Monte Carlo (MCMC) to avoid repetitive CPU-demanding model evaluations. However, the approximation error of a surrogate may lead to biased estimations of the posterior distribution. This bias can be corrected by constructing a very accurate surrogate or implementing MCMC in a two-stage manner. Since the two-stage MCMC requires extra original model evaluations, the computational cost is still high. If the information of measurement is incorporated, a locally accurate approximation of the original model can be adaptively constructed with low computational cost. Based on this idea, we propose a Gaussian process (GP) surrogate-based Bayesian experimental design and parameter estimation approach for groundwater contaminant source identification problems. A major advantage of the GP surrogate is that it provides a convenient estimation of the approximation error, which can be incorporated in the Bayesian formula to avoid over-confident estimation of the posterior distribution. The proposed approach is tested with a numerical case study. Without sacrificing the estimation accuracy, the new approach achieves a speed-up of about 200 times compared to our previous work using two-stage MCMC.
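
    The two-stage MCMC idea mentioned above (screen each proposal with the cheap surrogate, then correct with the original model) can be sketched as follows. This is a generic delayed-acceptance Metropolis sketch for a one-dimensional parameter with a symmetric random-walk proposal, not the method of this record; the names `two_stage_chain`, `log_sur`, and `log_full` are hypothetical.

```python
import math
import random


def two_stage_chain(x0, log_sur, log_full, n_steps, step=0.5, seed=0):
    """Two-stage Metropolis sampling with a symmetric Gaussian proposal.

    log_sur  : cheap surrogate log-posterior (assumed available)
    log_full : expensive original-model log-posterior
    Returns (samples, number of original-model evaluations).
    """
    rng = random.Random(seed)
    x = x0
    ls_x, lf_x = log_sur(x), log_full(x)
    full_calls = 1
    samples = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        ls_y = log_sur(y)
        # Stage 1: accept or reject using the surrogate only.
        log_a1 = ls_y - ls_x
        if math.log(1.0 - rng.random()) < log_a1:
            # Stage 2: evaluate the original model and correct the
            # surrogate error, so the chain still targets log_full.
            lf_y = log_full(y)
            full_calls += 1
            log_a2 = (lf_y - lf_x) - log_a1
            if math.log(1.0 - rng.random()) < log_a2:
                x, ls_x, lf_x = y, ls_y, lf_y
        samples.append(x)
    return samples, full_calls
```

    Proposals rejected at stage 1 never reach the original model, which is where the savings come from. Every proposal that passes stage 1 still costs one original-model evaluation, which is the remaining expense the abstract refers to.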

  10. Invited commentary: Lost in estimation--searching for alternatives to Markov chains to fit complex Bayesian models.

    PubMed

    Molitor, John

    2012-03-01

    Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.

  11. Dimension-independent likelihood-informed MCMC

    DOE PAGES

    Cui, Tiangang; Law, Kody J. H.; Marzouk, Youssef M.

    2015-10-08

    Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. Our work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. There are two distinct lines of research that intersect in the methods we develop here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Finally, we use two nonlinear inverse problems in order to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.

  12. Inverse Modeling Using Markov Chain Monte Carlo Aided by Adaptive Stochastic Collocation Method with Transformation

    NASA Astrophysics Data System (ADS)

    Zhang, D.; Liao, Q.

    2016-12-01

    Bayesian inference provides a convenient framework to solve statistical inverse problems. In this method, the parameters to be identified are treated as random variables. The prior knowledge, the system nonlinearity, and the measurement errors can be directly incorporated in the posterior probability density function (PDF) of the parameters. The Markov chain Monte Carlo (MCMC) method is a powerful tool to generate samples from the posterior PDF. However, since MCMC usually requires thousands or even millions of forward simulations, it can be a computationally intensive endeavor, particularly when faced with large-scale flow and transport models. To address this issue, we construct a surrogate system for the model responses in the form of polynomials by the stochastic collocation method. In addition, we employ interpolation based on the nested sparse grids and take into account the different importance of the parameters, under the condition of high random dimensions in the stochastic space. Furthermore, in case of low regularity such as a discontinuous or unsmooth relation between the input parameters and the output responses, we introduce an additional transform process to improve the accuracy of the surrogate model. Once we build the surrogate system, we may evaluate the likelihood with very little computational cost. We analyzed the convergence rate of the forward solution and the surrogate posterior by Kullback-Leibler divergence, which quantifies the difference between probability distributions. The fast convergence of the forward solution implies fast convergence of the surrogate posterior to the true posterior. We also tested the proposed algorithm on water-flooding two-phase flow reservoir examples. The posterior PDF calculated from a very long chain with direct forward simulation is assumed to be accurate. The posterior PDF calculated using the surrogate model is in reasonable agreement with the reference, revealing a great improvement in terms of computational efficiency.

  13. Sediment classification using neural networks: An example from the site-U1344A of IODP Expedition 323 in the Bering Sea

    NASA Astrophysics Data System (ADS)

    Ojha, Maheswar; Maiti, Saumen

    2016-03-01

    A novel approach based on the concept of Bayesian neural network (BNN) has been implemented for classifying sediment boundaries using downhole log data obtained during Integrated Ocean Drilling Program (IODP) Expedition 323 in the Bering Sea slope region. The Bayesian framework in conjunction with the Markov Chain Monte Carlo (MCMC)/hybrid Monte Carlo (HMC) learning paradigm has been applied to constrain the lithology boundaries using density, density porosity, gamma ray, sonic P-wave velocity and electrical resistivity at Hole U1344A. We have demonstrated the effectiveness of our supervised classification methodology by comparing our findings with a conventional neural network and a Bayesian neural network optimized by the scaled conjugate gradient method (SCG), and tested the robustness of the algorithm in the presence of red noise in the data. The Bayesian results based on the HMC algorithm (BNN.HMC) resolve detailed finer structures at certain depths in addition to the main lithology such as silty clay, diatom clayey silt and sandy silt. Our method also recovers the lithology information from a zone of no core recovery between 615 and 655 m wireline log matched depth below sea floor. Our analyses demonstrate that the BNN-based approach provides a robust means for the classification of complex lithology successions at Hole U1344A, which could be very useful for other studies and for understanding the oceanic crustal inhomogeneity and structural discontinuities.

  14. Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model

    NASA Astrophysics Data System (ADS)

    Al Sobhi, Mashail M.

    2015-02-01

    Bayesian estimates of the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Also, Bayesian prediction bounds for future DGOS from the exponentiated Weibull model are obtained. Symmetric and asymmetric loss functions are considered for Bayesian computations. Markov chain Monte Carlo (MCMC) methods are used for computing the Bayes estimates and prediction bounds. The results have been specialized to the lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.

  15. Near-optimal alternative generation using modified hit-and-run sampling for non-linear, non-convex problems

    NASA Astrophysics Data System (ADS)

    Rosenberg, D. E.; Alafifi, A.

    2016-12-01

    Water resources systems analysis often focuses on finding optimal solutions. Yet an optimal solution is optimal only for the modelled issues and managers often seek near-optimal alternatives that address un-modelled objectives, preferences, limits, uncertainties, and other issues. Early on, Modelling to Generate Alternatives (MGA) formalized near-optimal as the region comprising the original problem constraints plus a new constraint that allowed performance within a specified tolerance of the optimal objective function value. MGA identified a few maximally-different alternatives from the near-optimal region. Subsequent work applied Markov Chain Monte Carlo (MCMC) sampling to generate a larger number of alternatives that span the near-optimal region of linear problems or select portions for non-linear problems. We extend the MCMC Hit-And-Run method to generate alternatives that span the full extent of the near-optimal region for non-linear, non-convex problems. First, start at a feasible hit point within the near-optimal region, then run a random distance in a random direction to a new hit point. Next, repeat until generating the desired number of alternatives. The key step at each iterate is to run a random distance along the line in the specified direction to a new hit point. If linear equality constraints exist, we construct an orthogonal basis and use a null space transformation to confine hits and runs to a lower-dimensional space. Linear inequality constraints define the convex bounds on the line that runs through the current hit point in the specified direction. We then use slice sampling to identify a new hit point along the line within bounds defined by the non-linear inequality constraints. This technique is computationally efficient compared to prior near-optimal alternative generation techniques such as MGA, MCMC Metropolis-Hastings, evolutionary, or firefly algorithms because search at each iteration is confined to the hit line, the algorithm can move in one step to any point in the near-optimal region, and each iterate generates a new, feasible alternative. We use the method to generate alternatives that span the near-optimal regions of simple and more complicated water management problems and may be preferred to optimal solutions. We also discuss extensions to handle non-linear equality constraints.
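
    The basic hit-and-run loop described above (start at a feasible point, pick a random direction, move a random distance along the feasible part of the line, repeat) can be sketched for the simplest case of a bounded region defined only by linear inequality constraints A x <= b. This is a minimal illustration, not the extended method of this record: the slice-sampling step for non-linear constraints and the null-space handling of equality constraints are not reproduced, and the function name `hit_and_run` is hypothetical.

```python
import math
import random


def hit_and_run(A, b, x0, n_samples, seed=0):
    """Hit-and-run sampling inside the bounded polytope {x : A x <= b}.

    A  : list of constraint rows, b : list of right-hand sides
    x0 : a strictly feasible starting point (assumed to be supplied)
    Returns a list of n_samples feasible points.
    """
    rng = random.Random(seed)
    dim = len(x0)
    x = list(x0)
    points = []
    for _ in range(n_samples):
        # Random direction, uniform on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        # Each linear inequality bounds the step t along the line x + t*d.
        t_lo, t_hi = -math.inf, math.inf
        for row, b_i in zip(A, b):
            a_dot_d = sum(r * v for r, v in zip(row, d))
            slack = b_i - sum(r * v for r, v in zip(row, x))
            if a_dot_d > 1e-12:
                t_hi = min(t_hi, slack / a_dot_d)
            elif a_dot_d < -1e-12:
                t_lo = max(t_lo, slack / a_dot_d)
        # Run a uniformly random distance within the feasible segment.
        t = rng.uniform(t_lo, t_hi)
        x = [x_i + t * d_i for x_i, d_i in zip(x, d)]
        points.append(list(x))
    return points


# Usage: sample the unit square 0 <= x, y <= 1.
A_square = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_square = [1.0, 0.0, 1.0, 0.0]
square_points = hit_and_run(A_square, b_square, [0.5, 0.5], 1000)
```

    Because each step is confined to the feasible segment of the line, every iterate is itself a feasible point, which matches the efficiency argument made in the abstract. The sketch assumes the region is bounded; with an unbounded region the segment limits would be infinite and the uniform draw would fail.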

  16. Stochastic static fault slip inversion from geodetic data with non-negativity and bound constraints

    NASA Astrophysics Data System (ADS)

    Nocquet, J.-M.

    2018-07-01

    Although surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains an ill-posed inverse problem. A widely used approach to circumvent the ill-posedness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach to inverse problems provides a rigorous framework where the a priori information about the searched parameters is combined with the observations in order to derive posterior probabilities of the unknown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip or double truncation to impose positivity and upper bounds on slip for interseismic modelling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a truncated multivariate normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulae for the single, 2-D or n-D marginal pdf. The semi-analytical formula involves the product of a Gaussian by an integral term that can be evaluated using recent developments in TMVN probabilities calculations. Posterior mean and covariance can also be efficiently derived. I show that the maximum posterior (MAP) can be obtained using a non-negative least-squares algorithm for the single truncated case or using the bounded-variable least-squares algorithm for the double truncated case. I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Monte Carlo Markov chain (MCMC) sampling is shown for a synthetic example and a real case for interseismic modelling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need for computer power is largely reduced. Second, unlike the Bayesian MCMC-based approach, marginal pdf, mean, variance or covariance are obtained independently of one another. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the MAP is extremely fast.

  17. Modeling the Hyperdistribution of Item Parameters To Improve the Accuracy of Recovery in Estimation Procedures.

    ERIC Educational Resources Information Center

    Matthews-Lopez, Joy L.; Hombo, Catherine M.

    The purpose of this study was to examine the recovery of item parameters in simulated Automatic Item Generation (AIG) conditions, using Markov chain Monte Carlo (MCMC) estimation methods to attempt to recover the generating distributions. To do this, variability in item and ability parameters was manipulated. Realistic AIG conditions were…

  18. Marginal Maximum A Posteriori Item Parameter Estimation for the Generalized Graded Unfolding Model

    ERIC Educational Resources Information Center

    Roberts, James S.; Thompson, Vanessa M.

    2011-01-01

    A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…

  19. Mass change distribution inverted from space-borne gravimetric data using a Monte Carlo method

    NASA Astrophysics Data System (ADS)

    Zhou, X.; Sun, X.; Wu, Y.; Sun, W.

    2017-12-01

    Mass estimation plays a key role in using temporal satellite gravimetric data to quantify the terrestrial water storage change. GRACE (Gravity Recovery and Climate Experiment) only observes the low-degree gravity field changes, which can be used to estimate the total surface density or equivalent water height (EWH) variation, with a limited spatial resolution of 300 km. There are several methods to estimate the mass variation in an arbitrary region, such as averaging kernel, forward modelling and mass concentration (mascon). The mascon method can isolate the local mass from the gravity change at a large scale by solving the observation equation (objective function), which represents the relationship between unknown masses and the measurements. To avoid unreasonable local masses being inverted from the smoothed gravity change map, regularization has to be used in the inversion. We herein give a Markov chain Monte Carlo (MCMC) method to objectively determine the regularization parameter for the non-negative mass inversion problem. We first apply this approach to the mass inversion from synthetic data. Results show that MCMC can effectively reproduce the local mass variation taking GRACE measurement error into consideration. We then use MCMC to estimate the ground water change rate of the North China Plain from the GRACE gravity change rate from 2003 to 2014 under the supposition of continuous ground water loss in this region. Inversion results show that the ground water loss rate in the North China Plain is 7.6 ± 0.2 Gt/yr during the past 12 years, which is consistent with that from previous research.

  20. SCoPE: an efficient method of Cosmological Parameter Estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Das, Santanu; Souradeep, Tarun

    The Markov Chain Monte Carlo (MCMC) sampler is widely used for cosmological parameter estimation from CMB and other data. However, due to the intrinsic serial nature of the MCMC sampler, convergence is often very slow. Here we present a fast and independently written Monte Carlo method for cosmological parameter estimation named Slick Cosmological Parameter Estimator (SCoPE), which employs delayed rejection to increase the acceptance rate of a chain, and pre-fetching that helps an individual chain to run on parallel CPUs. An inter-chain covariance update is also incorporated to prevent clustering of the chains, allowing faster and better mixing of the chains. We use an adaptive method for covariance calculation to calculate and update the covariance automatically as the chains progress. Our analysis shows that the acceptance probability of each step in SCoPE is more than 95% and the convergence of the chains is faster. Using SCoPE, we carry out some cosmological parameter estimations with different cosmological models using WMAP-9 and Planck results. One of the current research interests in cosmology is quantifying the nature of dark energy. We analyze the cosmological parameters from two illustrative commonly used parameterisations of dark energy models. We also assess whether the primordial helium fraction in the universe can be constrained by the present CMB data from WMAP-9 and Planck. The results from our MCMC analysis on the one hand help us to understand the workability of SCoPE better, and on the other hand provide a completely independent estimation of cosmological parameters from WMAP-9 and Planck data.

  1. Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

    PubMed Central

    Lanfear, Robert; Hua, Xia; Warren, Dan L.

    2016-01-01

    Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
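
    For context, the conventional ESS of a scalar parameter (the quantity that the rule of thumb above is applied to) can be estimated from the chain's autocorrelation. The sketch below uses the common estimator N / (1 + 2 * sum of autocorrelations), truncating the sum at the first non-positive autocorrelation. It does not implement the tree-topology methods of this record; the truncation rule and the names `effective_sample_size` and `ar1_chain` are assumptions chosen for the example.

```python
import random


def effective_sample_size(chain):
    """Estimate the ESS of a scalar MCMC trace from its autocorrelation."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [c - mean for c in chain]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        raise ValueError("ESS is undefined for a constant chain")
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, n):
        rho = sum(centered[i] * centered[i + lag]
                  for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break  # stop once the autocorrelation is lost in noise
        tau += 2.0 * rho
    return n / tau


def ar1_chain(n, phi, seed=0):
    """Hypothetical test trace: AR(1) process with correlation phi."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


# A strongly autocorrelated trace has far fewer effective samples
# than raw samples; an uncorrelated trace keeps most of them.
ess_correlated = effective_sample_size(ar1_chain(5000, 0.9))
ess_independent = effective_sample_size(ar1_chain(2000, 0.0))
```

    This estimator needs a numeric trace, which is why it cannot be applied directly to tree topologies and why a dedicated approach is required for them.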

  2. RadVel: The Radial Velocity Modeling Toolkit

    NASA Astrophysics Data System (ADS)

    Fulton, Benjamin J.; Petigura, Erik A.; Blunt, Sarah; Sinukoff, Evan

    2018-04-01

    RadVel is an open-source Python package for modeling Keplerian orbits in radial velocity (RV) timeseries. RadVel provides a convenient framework to fit RVs using maximum a posteriori optimization and to compute robust confidence intervals by sampling the posterior probability density via Markov Chain Monte Carlo (MCMC). RadVel allows users to float or fix parameters, impose priors, and perform Bayesian model comparison. We have implemented real-time MCMC convergence tests to ensure adequate sampling of the posterior. RadVel can output a number of publication-quality plots and tables. Users may interface with RadVel through a convenient command-line interface or directly from Python. The code is object-oriented and thus naturally extensible. We encourage contributions from the community. Documentation is available at http://radvel.readthedocs.io.

  3. Population forecasts for Bangladesh, using a Bayesian methodology.

    PubMed

    Mahsin, Md; Hossain, Syed Shahadat

    2012-12-01

    Population projection for many developing countries can be quite a challenging task for demographers, mostly due to the lack of enough reliable data. The objective of this paper is to present an overview of the existing methods for population forecasting and to propose an alternative based on Bayesian statistics, combining the formality of inference. The analysis has been made using the Markov Chain Monte Carlo (MCMC) technique for Bayesian methodology available with the software WinBUGS. Convergence diagnostic techniques available with the WinBUGS software have been applied to ensure the convergence of the chains necessary for the implementation of MCMC. The Bayesian approach allows for the use of observed data and expert judgements by means of appropriate priors, and more realistic population forecasts, along with associated uncertainty, have been possible.

  4. An information-theoretic approach to the gravitational-wave burst detection problem

    NASA Astrophysics Data System (ADS)

    Katsavounidis, E.; Lynch, R.; Vitale, S.; Essick, R.; Robinet, F.

    2016-03-01

    The advanced era of gravitational-wave astronomy, with data collected in part by the LIGO gravitational-wave interferometers, has begun as of fall 2015. One potential type of detectable gravitational waves is short-duration gravitational-wave bursts, whose waveforms can be difficult to predict. We present the framework for a new detection algorithm - called oLIB - that can be used in relatively low-latency to turn calibrated strain data into a detection significance statement. This pipeline consists of 1) a sine-Gaussian matched-filter trigger generator based on the Q-transform - known as Omicron -, 2) incoherent down-selection of these triggers to the most signal-like set, and 3) a fully coherent analysis of this signal-like set using the Markov chain Monte Carlo (MCMC) Bayesian evidence calculator LALInferenceBurst (LIB). We optimally extract this information by using a likelihood-ratio test (LRT) to map these search statistics into a significance statement. Using representative archival LIGO data, we show that the algorithm can detect gravitational-wave burst events of realistic strength in realistic instrumental noise with good detection efficiencies across different burst waveform morphologies. With support from the National Science Foundation under Grant PHY-0757058.

  5. Mathematical modeling, analysis and Markov Chain Monte Carlo simulation of Ebola epidemics

    NASA Astrophysics Data System (ADS)

    Tulu, Thomas Wetere; Tian, Boping; Wu, Zunyou

    Ebola virus infection is a severe infectious disease with the highest case fatality rate, which has now become a global public health threat. What makes the disease the worst of all is that no specific effective treatment is available and its dynamics are not well researched or understood. In this article a new mathematical model incorporating both vaccination and quarantine to study the dynamics of the Ebola epidemic has been developed and comprehensively analyzed. The existence as well as uniqueness of the solution to the model is also verified and the basic reproduction number is calculated. Besides, stability conditions are also checked and finally simulation is done using both the Euler method and one of the top ten most influential algorithms, known as the Markov Chain Monte Carlo (MCMC) method. Different rates of vaccination, to predict the effect of vaccination on infected individuals over time, and of quarantine are discussed. The results show that quarantine and vaccination are very effective ways to control the Ebola epidemic. From our study it was also seen that an individual is less likely to contract Ebola virus a second time after surviving a first infection. Last but not least, real data have been fitted to the model, showing that it can be used to predict the dynamics of the Ebola epidemic.

  6. A Bayesian trans-dimensional approach for the fusion of multiple geophysical datasets

    NASA Astrophysics Data System (ADS)

    JafarGandomi, Arash; Binley, Andrew

    2013-09-01

    We propose a Bayesian fusion approach to integrate multiple geophysical datasets with different coverage and sensitivity. The fusion strategy is based on the capability of various geophysical methods to provide enough resolution to identify either subsurface material parameters or subsurface structure, or both. We focus on electrical resistivity as the target material parameter and electrical resistivity tomography (ERT), electromagnetic induction (EMI), and ground penetrating radar (GPR) as the set of geophysical methods. However, extending the approach to different sets of geophysical parameters and methods is straightforward. Different geophysical datasets are entered into a trans-dimensional Markov chain Monte Carlo (McMC) search-based joint inversion algorithm. The trans-dimensional property of the McMC algorithm allows dynamic parameterisation of the model space, which in turn helps to avoid bias of the post-inversion results towards a particular model. Given that we are attempting to develop an approach that has practical potential, we discretize the subsurface into an array of one-dimensional earth-models. Accordingly, the ERT data that are collected by using two-dimensional acquisition geometry are recast as a set of equivalent vertical electric soundings. Different data are inverted either individually or jointly to estimate one-dimensional subsurface models at discrete locations. We use Shannon's information measure to quantify the information obtained from the inversion of different combinations of geophysical datasets. Information from multiple methods is brought together by introducing a joint likelihood function and/or constraining the prior information. A Bayesian maximum entropy approach is used for spatial fusion of spatially dispersed estimated one-dimensional models and mapping of the target parameter. We illustrate the approach with a synthetic dataset and then apply it to a field dataset. We show that the proposed fusion strategy is successful not only in enhancing the subsurface information but also as a survey design tool to identify the appropriate combination of the geophysical tools and show whether application of an individual method for further investigation of a specific site is beneficial.

  7. Genomic prediction using an iterative conditional expectation algorithm for a fast BayesC-like model.

    PubMed

    Dong, Linsong; Wang, Zhiyong

    2018-06-11

    Genomic prediction is feasible for estimating genomic breeding values because of dense genome-wide markers and credible statistical methods, such as Genomic Best Linear Unbiased Prediction (GBLUP) and various Bayesian methods. Compared with GBLUP, Bayesian methods propose more flexible assumptions for the distributions of SNP effects. However, most Bayesian methods are performed based on Markov chain Monte Carlo (MCMC) algorithms, leading to computational efficiency challenges. Hence, some fast Bayesian approaches, such as fast BayesB (fBayesB), were proposed to speed up the calculation. This study proposed another fast Bayesian method termed fast BayesC (fBayesC). The prior distribution of fBayesC assumes that a SNP with probability γ has a non-zero effect which comes from a normal density with a common variance. The simulated data from QTLMAS XII workshop and actual data on large yellow croaker were used to compare the predictive results of fBayesB, fBayesC and (MCMC-based) BayesC. The results showed that when γ was set as a small value, such as 0.01 in the simulated data or 0.001 in the actual data, fBayesB and fBayesC yielded lower prediction accuracies (abilities) than BayesC. In the actual data, fBayesC could yield very similar predictive abilities as BayesC when γ ≥ 0.01. When γ = 0.01, fBayesB could also yield similar results as fBayesC and BayesC. However, fBayesB could not yield an explicit result when γ ≥ 0.1, but a similar situation was not observed for fBayesC. Moreover, the computational speed of fBayesC was significantly faster than that of BayesC, making fBayesC a promising method for genomic prediction.

  8. Robust Bayesian Analysis of Heavy-tailed Stochastic Volatility Models using Scale Mixtures of Normal Distributions

    PubMed Central

    Abanto-Valle, C. A.; Bandyopadhyay, D.; Lachos, V. H.; Enriquez, I.

    2009-01-01

    A Bayesian analysis of stochastic volatility (SV) models using the class of symmetric scale mixtures of normal (SMN) distributions is considered. In the face of non-normality, this provides an appealing robust alternative to the routine use of the normal distribution. Specific distributions examined include the normal, Student-t, slash and the variance gamma distributions. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo (MCMC) algorithm is introduced for parameter estimation. Moreover, the mixing parameters obtained as a by-product of the scale mixture representation can be used to identify outliers. The methods developed are applied to analyze daily stock returns data on the S&P500 index. Bayesian model selection criteria as well as out-of-sample forecasting results reveal that the SV models based on heavy-tailed SMN distributions provide significant improvement in model fit as well as prediction to the S&P500 index data over the usual normal model. PMID:20730043

  9. PyDREAM: high-dimensional parameter inference for biological models in python.

    PubMed

    Shockley, Erin M; Vrugt, Jasper A; Lopez, Carlos F; Valencia, Alfonso

    2018-02-15

    Biological models contain many parameters whose values are difficult to measure directly via experimentation and therefore require calibration against experimental data. Markov chain Monte Carlo (MCMC) methods are suitable to estimate multivariate posterior model parameter distributions, but these methods may exhibit slow or premature convergence in high-dimensional search spaces. Here, we present PyDREAM, a Python implementation of the (Multiple-Try) Differential Evolution Adaptive Metropolis [DREAM(ZS)] algorithm developed by Vrugt and ter Braak (2008) and Laloy and Vrugt (2012). PyDREAM achieves excellent performance for complex, parameter-rich models and takes full advantage of distributed computing resources, facilitating parameter inference and uncertainty estimation of CPU-intensive biological models. PyDREAM is freely available under the GNU GPLv3 license from the Lopez lab GitHub repository at http://github.com/LoLab-VU/PyDREAM. Contact: c.lopez@vanderbilt.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  10. Spectral decompositions of multiple time series: a Bayesian non-parametric approach.

    PubMed

    Macaro, Christian; Prado, Raquel

    2014-01-01

    We consider spectral decompositions of multiple time series that arise in studies where the interest lies in assessing the influence of two or more factors. We write the spectral density of each time series as a sum of the spectral densities associated with the different levels of the factors. We then use Whittle's approximation to the likelihood function and follow a Bayesian non-parametric approach to obtain posterior inference on the spectral densities based on Bernstein-Dirichlet prior distributions. The prior is strategically important as it carries identifiability conditions for the models and allows us to quantify our degree of confidence in such conditions. A Markov chain Monte Carlo (MCMC) algorithm for posterior inference within this class of frequency-domain models is presented. We illustrate the approach by analyzing simulated and real data via spectral one-way and two-way models. In particular, we present an analysis of functional magnetic resonance imaging (fMRI) brain responses measured in individuals who participated in a designed experiment to study pain perception in humans.

  11. Charming dark matter

    NASA Astrophysics Data System (ADS)

    Jubb, Thomas; Kirk, Matthew; Lenz, Alexander

    2017-12-01

    We have considered a model of Dark Minimal Flavour Violation (DMFV), in which a triplet of dark matter particles couple to right-handed up-type quarks via a heavy colour-charged scalar mediator. By studying a large spectrum of possible constraints, and assessing the entire parameter space using a Markov Chain Monte Carlo (MCMC), we can place strong restrictions on the allowed parameter space for dark matter models of this type.

  12. An example of complex modelling in dentistry using Markov chain Monte Carlo (MCMC) simulation.

    PubMed

    Helfenstein, Ulrich; Menghini, Giorgio; Steiner, Marcel; Murati, Francesca

    2002-09-01

    In the usual regression setting one regression line is computed for a whole data set. In a more complex situation, each person may be observed for example at several points in time and thus a regression line might be calculated for each person. Additional complexities, such as various forms of errors in covariables may make a straightforward statistical evaluation difficult or even impossible. During recent years methods have been developed allowing convenient analysis of problems where the data and the corresponding models show these and many other forms of complexity. The methodology makes use of a Bayesian approach and Markov chain Monte Carlo (MCMC) simulations. The methods allow the construction of increasingly elaborate models by building them up from local sub-models. The essential structure of the models can be represented visually by directed acyclic graphs (DAG). This attractive property allows communication and discussion of the essential structure and the substantial meaning of a complex model without needing algebra. After presentation of the statistical methods an example from dentistry is presented in order to demonstrate their application and use. The dataset of the example had a complex structure; each of a set of children was followed up over several years. The number of new fillings in permanent teeth had been recorded at several ages. The dependent variables were markedly different from the normal distribution and could not be transformed to normality. In addition, explanatory variables were assumed to be measured with different forms of error. Illustration of how the corresponding models can be estimated conveniently via MCMC simulation, in particular, 'Gibbs sampling', using the freely available software BUGS is presented. In addition, how the measurement error may influence the estimates of the corresponding coefficients is explored. It is demonstrated that the effect of the independent variable on the dependent variable may be markedly underestimated if the measurement error is not taken into account ('regression dilution bias'). Markov chain Monte Carlo methods may be of great value to dentists in allowing analysis of data sets which exhibit a wide range of different forms of complexity.

  13. Joint analysis of input and parametric uncertainties in watershed water quality modeling: A formal Bayesian approach

    NASA Astrophysics Data System (ADS)

    Han, Feng; Zheng, Yi

    2018-06-01

    Significant input uncertainty is a major source of error in watershed water quality (WWQ) modeling. It remains challenging to address the input uncertainty in a rigorous Bayesian framework. This study develops the Bayesian Analysis of Input and Parametric Uncertainties (BAIPU), an approach for the joint analysis of input and parametric uncertainties through a tight coupling of Markov Chain Monte Carlo (MCMC) analysis and Bayesian Model Averaging (BMA). The formal likelihood function for this approach is derived considering a lag-1 autocorrelated, heteroscedastic, and Skew Exponential Power (SEP) distributed error model. A series of numerical experiments were performed based on a synthetic nitrate pollution case and on a real study case in the Newport Bay Watershed, California. The Soil and Water Assessment Tool (SWAT) and Differential Evolution Adaptive Metropolis (DREAM(ZS)) were used as the representative WWQ model and MCMC algorithm, respectively. The major findings include the following: (1) the BAIPU can be implemented and used to appropriately identify the uncertain parameters and characterize the predictive uncertainty; (2) the compensation effect between the input and parametric uncertainties can seriously mislead modeling-based management decisions if the input uncertainty is not explicitly accounted for; (3) the BAIPU accounts for the interaction between the input and parametric uncertainties and therefore provides more accurate calibration and uncertainty results than a sequential analysis of the uncertainties; and (4) the BAIPU quantifies the credibility of different input assumptions on a statistical basis and can be implemented as an effective inverse modeling approach to the joint inference of parameters and inputs.

  14. A Spitzer five-band analysis of the Jupiter-sized planet TrES-1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cubillos, Patricio; Harrington, Joseph; Foster, Andrew S. D.

    2014-12-10

    With an equilibrium temperature of 1200 K, TrES-1 is one of the coolest hot Jupiters observed by Spitzer. It was also the first planet discovered by any transit survey and one of the first exoplanets from which thermal emission was directly observed. We analyzed all Spitzer eclipse and transit data for TrES-1 and obtained its eclipse depths and brightness temperatures in the 3.6 μm (0.083% ± 0.024%, 1270 ± 110 K), 4.5 μm (0.094% ± 0.024%, 1126 ± 90 K), 5.8 μm (0.162% ± 0.042%, 1205 ± 130 K), 8.0 μm (0.213% ± 0.042%, 1190 ± 130 K), and 16 μm (0.33% ± 0.12%, 1270 ± 310 K) bands. The eclipse depths can be explained, within 1σ errors, by a standard atmospheric model with solar abundance composition in chemical equilibrium, with or without a thermal inversion. The combined analysis of the transit, eclipse, and radial-velocity ephemerides gives an eccentricity of $e = 0.033^{+0.015}_{-0.031}$, consistent with a circular orbit. Since TrES-1's eclipses have low signal-to-noise ratios, we implemented optimal photometry and differential-evolution Markov Chain Monte Carlo (MCMC) algorithms in our Photometry for Orbits, Eclipses, and Transits pipeline. Benefits include higher photometric precision and ∼10 times faster MCMC convergence, with better exploration of the phase space and no manual parameter tuning.
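
    The idea behind differential-evolution MCMC, which explains why no manual tuning of the proposal is needed, can be sketched as follows. This is a minimal illustration in the style of ter Braak's DE-MC and is not the code of the pipeline named in the abstract: each chain proposes a jump along the difference of two other chains, so the proposal scale adapts to the spread of the population. The toy Gaussian target, the function names and all settings are assumptions for the example.

```python
import math
import random

def log_target(x):
    """Toy log-density: independent standard normals (assumed for the demo)."""
    return -0.5 * sum(v * v for v in x)

def de_mc(log_p, n_chains=10, n_dim=2, n_steps=2000, seed=1):
    """Differential-evolution MCMC sketch; returns post-burn-in samples."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2 * n_dim)  # commonly recommended jump scale
    chains = [[rng.uniform(-5.0, 5.0) for _ in range(n_dim)]
              for _ in range(n_chains)]
    logps = [log_p(x) for x in chains]
    samples = []
    for step in range(n_steps):
        for i in range(n_chains):
            # Pick two distinct chains other than chain i
            a, b = rng.sample([k for k in range(n_chains) if k != i], 2)
            # Jump along the difference vector, plus a small random jitter
            proposal = [chains[i][d]
                        + gamma * (chains[a][d] - chains[b][d])
                        + rng.gauss(0.0, 1e-4)
                        for d in range(n_dim)]
            lp = log_p(proposal)
            # Metropolis acceptance with probability min(1, p_new / p_old)
            if rng.random() < math.exp(min(0.0, lp - logps[i])):
                chains[i] = proposal
                logps[i] = lp
        if step >= n_steps // 2:  # discard the first half as burn-in
            samples.extend(list(x) for x in chains)
    return samples

samples = de_mc(log_target)
mean0 = sum(s[0] for s in samples) / len(samples)
print("samples:", len(samples), "mean of first coordinate:", round(mean0, 3))
```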

  15. Parameter Optimisation and Uncertainty Analysis in Visual MODFLOW based Flow Model for predicting the groundwater head in an Eastern Indian Aquifer

    NASA Astrophysics Data System (ADS)

    Mohanty, B.; Jena, S.; Panda, R. K.

    2016-12-01

    The overexploitation of groundwater has resulted in the abandonment of several shallow tube wells in the study basin in Eastern India. For the sustainability of groundwater resources, basin-scale modelling of groundwater flow is indispensable for the effective planning and management of the water resources. The basic intent of this study is to develop a 3-D groundwater flow model of the study basin using the Visual MODFLOW Flex 2014.2 package and successfully calibrate and validate the model using 17 years of observed data. The sensitivity analysis was carried out to quantify the susceptibility of the aquifer system to river bank seepage, recharge from rainfall and agricultural practices, horizontal and vertical hydraulic conductivities, and specific yield. To quantify the impact of parameter uncertainties, the Sequential Uncertainty Fitting Algorithm (SUFI-2) and Markov chain Monte Carlo (McMC) techniques were implemented. Results from the two techniques were compared and the advantages and disadvantages were analysed. Nash-Sutcliffe coefficient (NSE), Coefficient of Determination (R2), Mean Absolute Error (MAE), Mean Percent Deviation (Dv) and Root Mean Squared Error (RMSE) were adopted as criteria of model evaluation during calibration and validation of the developed model. NSE, R2, MAE, Dv and RMSE values for the groundwater flow model during calibration and validation were in the acceptable range. Also, the McMC technique was able to provide more reasonable results than SUFI-2. The calibrated and validated model will be useful to identify the aquifer properties, analyse the groundwater flow dynamics and the change in groundwater levels in future forecasts.

  16. Multilocus lod scores in large pedigrees: combination of exact and approximate calculations.

    PubMed

    Tong, Liping; Thompson, Elizabeth

    2008-01-01

    To detect the positions of disease loci, lod scores are calculated at multiple chromosomal positions given trait and marker data on members of pedigrees. Exact lod score calculations are often impossible when the size of the pedigree and the number of markers are both large. In this case, a Markov Chain Monte Carlo (MCMC) approach provides an approximation. However, to provide accurate results, mixing performance is always a key issue in these MCMC methods. In this paper, we propose two methods to improve MCMC sampling and hence obtain more accurate lod score estimates in shorter computation time. The first improvement generalizes the block-Gibbs meiosis (M) sampler to multiple meiosis (MM) sampler in which multiple meioses are updated jointly, across all loci. The second one divides the computations on a large pedigree into several parts by conditioning on the haplotypes of some 'key' individuals. We perform exact calculations for the descendant parts where more data are often available, and combine this information with sampling of the hidden variables in the ancestral parts. Our approaches are expected to be most useful for data on a large pedigree with a lot of missing data. (c) 2007 S. Karger AG, Basel

  17. Multilocus Lod Scores in Large Pedigrees: Combination of Exact and Approximate Calculations

    PubMed Central

    Tong, Liping; Thompson, Elizabeth

    2007-01-01

    To detect the positions of disease loci, lod scores are calculated at multiple chromosomal positions given trait and marker data on members of pedigrees. Exact lod score calculations are often impossible when the size of the pedigree and the number of markers are both large. In this case, a Markov Chain Monte Carlo (MCMC) approach provides an approximation. However, to provide accurate results, mixing performance is always a key issue in these MCMC methods. In this paper, we propose two methods to improve MCMC sampling and hence obtain more accurate lod score estimates in shorter computation time. The first improvement generalizes the block-Gibbs meiosis (M) sampler to multiple meiosis (MM) sampler in which multiple meioses are updated jointly, across all loci. The second one divides the computations on a large pedigree into several parts by conditioning on the haplotypes of some ‘key’ individuals. We perform exact calculations for the descendant parts where more data are often available, and combine this information with sampling of the hidden variables in the ancestral parts. Our approaches are expected to be most useful for data on a large pedigree with a lot of missing data. PMID:17934317

  18. A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

    PubMed Central

    Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.

    2011-01-01

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities differ widely between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348

  19. MCMC multilocus lod scores: application of a new approach.

    PubMed

    George, Andrew W; Wijsman, Ellen M; Thompson, Elizabeth A

    2005-01-01

    On extended pedigrees with extensive missing data, the calculation of multilocus likelihoods for linkage analysis is often beyond the computational bounds of exact methods. Growing interest therefore surrounds the implementation of Monte Carlo estimation methods. In this paper, we demonstrate the speed and accuracy of a new Markov chain Monte Carlo method for the estimation of linkage likelihoods through an analysis of real data from a study of early-onset Alzheimer's disease. For those data sets where comparison with exact analysis is possible, we achieved up to a 100-fold increase in speed. Our approach is implemented in the program lm_bayes within the framework of the freely available MORGAN 2.6 package for Monte Carlo genetic analysis (http://www.stat.washington.edu/thompson/Genepi/MORGAN/Morgan.shtml).

  20. Toward a probabilistic acoustic emission source location algorithm: A Bayesian approach

    NASA Astrophysics Data System (ADS)

    Schumacher, Thomas; Straub, Daniel; Higgins, Christopher

    2012-09-01

    Acoustic emissions (AE) are stress waves initiated by sudden strain releases within a solid body. These can be caused by internal mechanisms such as crack opening or propagation, crushing, or rubbing of crack surfaces. One application for the AE technique in the field of Structural Engineering is Structural Health Monitoring (SHM). With piezo-electric sensors mounted to the surface of the structure, stress waves can be detected, recorded, and stored for later analysis. An important step in quantitative AE analysis is the estimation of the stress wave source locations. Commonly, source location results are presented in a rather deterministic manner as spatial and temporal points, excluding information about uncertainties and errors. Due to variability in the material properties and uncertainty in the mathematical model, measures of uncertainty are needed beyond best-fit point solutions for source locations. This paper introduces a novel holistic framework for the development of a probabilistic source location algorithm. Bayesian analysis methods with Markov Chain Monte Carlo (MCMC) simulation are employed where all source location parameters are described with posterior probability density functions (PDFs). The proposed methodology is applied to an example employing data collected from a realistic section of a reinforced concrete bridge column. The selected approach is general and has the advantage that it can be extended and refined efficiently. Results are discussed and future steps to improve the algorithm are suggested.

  1. Stochastic static fault slip inversion from geodetic data with non-negativity and bounds constraints

    NASA Astrophysics Data System (ADS)

    Nocquet, J.-M.

    2018-04-01

    Although surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains an ill-posed inverse problem. A widely used approach to circumvent the ill-posedness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach to inverse problems (Tarantola & Valette 1982; Tarantola 2005) provides a rigorous framework where the a priori information about the searched parameters is combined with the observations in order to derive posterior probabilities of the unknown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip or double truncation to impose positivity and upper bounds on slip for interseismic modeling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a Truncated Multi-Variate Normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulas for the single, two-dimensional or n-dimensional marginal pdf. The semi-analytical formula involves the product of a Gaussian by an integral term that can be evaluated using recent developments in TMVN probabilities calculations (e.g. Genz & Bretz 2009). Posterior mean and covariance can also be efficiently derived. I show that the Maximum Posterior (MAP) can be obtained using a Non-Negative Least-Squares algorithm (Lawson & Hanson 1974) for the single truncated case or using the Bounded-Variable Least-Squares algorithm (Stark & Parker 1995) for the double truncated case. I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Monte Carlo Markov Chain (MCMC) sampling is shown for a synthetic example and a real case for interseismic modeling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need for computing power is largely reduced. Second, unlike the Bayesian MCMC-based approach, the marginal pdf, mean, variance or covariance are obtained independently of one another. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the Maximum Posterior (MAP) is extremely fast.
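
    The link between the MAP and non-negative least squares stated in this abstract can be written out explicitly. The block below is a sketch under assumed notation that the abstract does not give: observations d, a linear operator G relating slip m to displacements, data covariance C_d, and a Gaussian prior with mean m_0 and covariance C_m truncated to m ≥ 0.

```latex
% Posterior for a linear Gaussian model with a prior truncated to m >= 0
p(m \mid d) \;\propto\;
\exp\!\left[ -\tfrac{1}{2}\,(Gm-d)^{T} C_d^{-1} (Gm-d)
             -\tfrac{1}{2}\,(m-m_0)^{T} C_m^{-1} (m-m_0) \right]
\mathbf{1}_{\{m \ge 0\}}

% Maximizing it is a non-negative least-squares problem on a stacked system
m_{\mathrm{MAP}} \;=\; \arg\min_{m \ge 0} \; \left\| A m - b \right\|_2^{2},
\qquad
A = \begin{pmatrix} C_d^{-1/2}\, G \\ C_m^{-1/2} \end{pmatrix},
\quad
b = \begin{pmatrix} C_d^{-1/2}\, d \\ C_m^{-1/2}\, m_0 \end{pmatrix}
```

    Expanding ‖Am − b‖² recovers the two quadratic forms in the exponent, so the constrained minimizer is the posterior mode. Under the same assumptions, adding upper bounds on m turns the problem into a bounded-variable least-squares problem, which matches the double-truncation case mentioned in the abstract.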

  2. Orbits for 18 Visual Binaries and Two Double-line Spectroscopic Binaries Observed with HRCAM on the CTIO SOAR 4 m Telescope, Using a New Bayesian Orbit Code Based on Markov Chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Mendez, Rene A.; Claveria, Ruben M.; Orchard, Marcos E.; Silva, Jorge F.

    2017-11-01

    We present orbital elements and mass sums for 18 visual binary stars of spectral types B to K (five of which are new orbits) with periods ranging from 20 to more than 500 yr. For two double-line spectroscopic binaries with no previous orbits, the individual component masses, using combined astrometric and radial velocity data, have a formal uncertainty of ∼0.1 M⊙. Adopting published photometry and trigonometric parallaxes, plus our own measurements, we place these objects on an H-R diagram and discuss their evolutionary status. These objects are part of a survey to characterize the binary population of stars in the Southern Hemisphere using the SOAR 4 m telescope+HRCAM at CTIO. Orbital elements are computed using a newly developed Markov chain Monte Carlo (MCMC) algorithm that delivers maximum-likelihood estimates of the parameters, as well as posterior probability density functions that allow us to evaluate the uncertainty of our derived parameters in a robust way. For spectroscopic binaries, using our approach, it is possible to derive a self-consistent parallax for the system from the combined astrometric and radial velocity data (“orbital parallax”), which compares well with the trigonometric parallaxes. We also present a mathematical formalism that allows a dimensionality reduction of the feature space from seven to three search parameters (or from 10 to seven dimensions, including parallax, in the case of spectroscopic binaries with astrometric data), which makes it possible to explore a smaller number of parameters in each case, improving the computational efficiency of our MCMC code. Based on observations obtained at the Southern Astrophysical Research (SOAR) telescope, which is a joint project of the Ministério da Ciência, Tecnologia, e Inovação (MCTI) da República Federativa do Brasil, the U.S. National Optical Astronomy Observatory (NOAO), the University of North Carolina at Chapel Hill (UNC), and Michigan State University (MSU).

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nitao, J J

    The goal of the Event Reconstruction Project is to find the location and strength of atmospheric release points, both stationary and moving. Source inversion relies on observational data as input. The methodology is sufficiently general to allow various forms of data. In this report, the authors will focus primarily on concentration measurements obtained at point monitoring locations at various times. The algorithms being investigated in the Project are the MCMC (Markov Chain Monte Carlo), SMC (Sequential Monte Carlo) Methods, classical inversion methods, and hybrids of these. They refer the reader to the report by Johannesson et al. (2004) for explanations of these methods. These methods require computing the concentrations at all monitoring locations for a given "proposed" source characteristic (locations and strength history). It is anticipated that the largest portion of the CPU time will be spent performing this computation. MCMC and SMC will require this computation to be done at least tens of thousands of times. Therefore, an efficient means of computing forward model predictions is important to making the inversion practical. In this report they show how Green's functions and reciprocal Green's functions can significantly accelerate forward model computations. First, instead of computing a plume for each possible source strength history, they can compute plumes from unit impulse sources only. By using linear superposition, they can obtain the response for any strength history. This response is given by the forward Green's function. Second, they may use the law of reciprocity. Suppose that they require the concentration at a single monitoring point x_m due to a potential (unit impulse) source that is located at x_s. Instead of computing a plume with source location x_s, they compute a "reciprocal plume" whose (unit impulse) source is at the monitoring location x_m. The reciprocal plume is computed using a reversed-direction wind field. The wind field and transport coefficients must also be appropriately time-reversed. Reciprocity says that the concentration of the reciprocal plume at x_s is related to the desired concentration at x_m. Since there are far fewer monitoring points than potential source locations, the number of forward model computations is drastically reduced.
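
    The superposition step described in this abstract can be sketched as a discrete convolution. The example assumes, beyond what the abstract states, that time is discretized into equal steps and that the unit-impulse response at a monitor depends only on the time elapsed since release (steady transport conditions). The function name and all numbers are hypothetical.

```python
def concentration_history(unit_response, strength_history):
    """Superpose time-shifted unit-impulse responses, scaled by source strength.

    unit_response[k]    : concentration at the monitor k steps after a unit release
    strength_history[j] : amount released at time step j
    """
    n = len(strength_history) + len(unit_response) - 1
    conc = [0.0] * n
    for j, q in enumerate(strength_history):
        for k, g in enumerate(unit_response):
            conc[j + k] += q * g  # release at step j reaches the monitor at step j + k
    return conc

# Hypothetical unit-impulse response at one monitoring point
unit_response = [0.0, 0.5, 0.3, 0.1]
# Hypothetical release history: 2 units at step 0, 1 unit at step 2
strength = [2.0, 0.0, 1.0]

print(concentration_history(unit_response, strength))
```

    Once the unit-impulse response has been computed, evaluating a newly proposed strength history costs only this sum, with no further plume simulation. That matters when a sampler needs tens of thousands of forward evaluations.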

  4. Variable selection models for genomic selection using whole-genome sequence data and singular value decomposition.

    PubMed

    Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen

    2017-12-27

    Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability π and no effect with probability (1 - π). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction. For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.

  5. LATTE Linking Acoustic Tests and Tagging Using Statistical Estimation

    DTIC Science & Technology

    2015-09-30

    the complexity of the model: (from simplest to most complex) Kalman filter, Markov chain Monte-Carlo (MCMC) and ABC. Many of these methods have been...using SMMs fitted using Kalman filters. Therefore, using the DTAG data, we can estimate the distributions associated with 2D horizontal displacement...speed (a key problem in the previous Kalman filter implementation). This new approach also allows the animal’s horizontal movement direction to differ

  6. Evaluation of Electromagnetic Induction (EMI) Resistivity Technologies for Assessing Permafrost Geomorphologies

    DTIC Science & Technology

    2016-08-01

    Structures Laboratory MCMC Markov Chain Monte Carlo NAPL Non-Aqueous Phase Liquids ppm Parts per Million R&D Research and Development SERDP... Research Engineering Laboratory (CRREL) and Fridon Shubitidze at Dartmouth College lead a research group that has constructed several research-grade...by Barrowes’ research group to obtain EC over lines or areas. Another possibility is to use unmanned helicopters to acquire data over larger areas

  7. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan

    In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic amplitude versus angle (AVA) and controlled source electromagnetic (CSEM) data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo (MCMC) sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis (DREAM) and Adaptive Metropolis (AM) samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and CSEM data. The multi-chain MCMC is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic AVA and CSEM joint inversion provides better estimation of reservoir saturations than the seismic AVA-only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated – reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.

  8. The Joker: A Custom Monte Carlo Sampler for Binary-star and Exoplanet Radial Velocity Data

    NASA Astrophysics Data System (ADS)

    Price-Whelan, Adrian M.; Hogg, David W.; Foreman-Mackey, Daniel; Rix, Hans-Walter

    2017-03-01

    Given sparse or low-quality radial velocity measurements of a star, there are often many qualitatively different stellar or exoplanet companion orbit models that are consistent with the data. The consequent multimodality of the likelihood function leads to extremely challenging search, optimization, and Markov chain Monte Carlo (MCMC) posterior sampling over the orbital parameters. Here we create a custom Monte Carlo sampler for sparse or noisy radial velocity measurements of two-body systems that can produce posterior samples for orbital parameters even when the likelihood function is poorly behaved. The six standard orbital parameters for a binary system can be split into four nonlinear parameters (period, eccentricity, argument of pericenter, phase) and two linear parameters (velocity amplitude, barycenter velocity). We capitalize on this by building a sampling method in which we densely sample the prior probability density function (pdf) in the nonlinear parameters and perform rejection sampling using a likelihood function marginalized over the linear parameters. With sparse or uninformative data, the sampling obtained by this rejection sampling is generally multimodal and dense. With informative data, the sampling becomes effectively unimodal but too sparse: in these cases we follow the rejection sampling with standard MCMC. The method produces correct samplings in orbital parameters for data that include as few as three epochs. The Joker can therefore be used to produce proper samplings of multimodal pdfs, which are still informative and can be used in hierarchical (population) modeling. We give some examples that show how the posterior pdf depends sensitively on the number and time coverage of the observations and their uncertainties.
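
    The rejection step at the core of this scheme can be sketched on a toy problem. The example is heavily simplified and is not The Joker itself: it keeps only one nonlinear parameter (the period), fixes the velocity amplitude instead of marginalizing over the linear parameters, and uses noise-free synthetic data. The function names, prior range, epochs and uncertainty are all assumptions for illustration.

```python
import math
import random

def log_likelihood(period, times, velocities, sigma, amplitude=1.0):
    """Gaussian log-likelihood of a sinusoid with known amplitude (toy model)."""
    chi2 = 0.0
    for t, v in zip(times, velocities):
        model = amplitude * math.sin(2.0 * math.pi * t / period)
        chi2 += ((v - model) / sigma) ** 2
    return -0.5 * chi2

def rejection_sample(times, velocities, sigma, n_prior=20000, seed=2):
    """Draw dense prior samples in period, keep each with probability L / L_max."""
    rng = random.Random(seed)
    lo, hi = math.log(1.0), math.log(100.0)  # log-uniform prior, assumed range
    periods = [math.exp(rng.uniform(lo, hi)) for _ in range(n_prior)]
    logls = [log_likelihood(p, times, velocities, sigma) for p in periods]
    max_logl = max(logls)
    return [p for p, ll in zip(periods, logls)
            if rng.random() < math.exp(ll - max_logl)]

# Five sparse epochs of a noise-free signal with a true period of 10 (toy data)
times = [0.0, 3.1, 7.4, 12.9, 21.3]
true_period = 10.0
velocities = [math.sin(2.0 * math.pi * t / true_period) for t in times]

accepted = rejection_sample(times, velocities, sigma=0.5)
print(len(accepted), "posterior samples kept out of 20000 prior draws")
```

    Because the surviving samples are independent draws, widely separated period modes are kept in proportion to their posterior mass, which is the situation the abstract describes as difficult for standard MCMC.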

  9. Cognitive diagnosis modelling incorporating item response times.

    PubMed

    Zhan, Peida; Jiao, Hong; Liao, Dandan

    2018-05-01

    To provide more refined diagnostic feedback with collateral information in item response times (RTs), this study proposed joint modelling of attributes and response speed using item responses and RTs simultaneously for cognitive diagnosis. For illustration, an extended deterministic input, noisy 'and' gate (DINA) model was proposed for joint modelling of responses and RTs. Model parameter estimation was explored using the Bayesian Markov chain Monte Carlo (MCMC) method. The PISA 2012 computer-based mathematics data were analysed first. These real data estimates were treated as true values in a subsequent simulation study. A follow-up simulation study with ideal testing conditions was conducted as well to further evaluate model parameter recovery. The results indicated that model parameters could be well recovered using the MCMC approach. Further, incorporating RTs into the DINA model would improve attribute and profile correct classification rates and result in more accurate and precise estimation of the model parameters. © 2017 The British Psychological Society.

  10. Asteroid mass estimation with Markov-chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Siltala, Lauri; Granvik, Mikael

    2017-10-01

    Estimates for asteroid masses are based on their gravitational perturbations on the orbits of other objects such as Mars, spacecraft, or other asteroids and/or their satellites. In the case of asteroid-asteroid perturbations, this leads to a 13-dimensional inverse problem at minimum where the aim is to derive the mass of the perturbing asteroid and six orbital elements for both the perturbing asteroid and the test asteroid by fitting their trajectories to their observed positions. The fitting has typically been carried out with linearized methods such as the least-squares method. These methods need to make certain assumptions regarding the shape of the probability distributions of the model parameters. This is problematic as these assumptions have not been validated. We have developed a new Markov-chain Monte Carlo method for mass estimation which does not require an assumption regarding the shape of the parameter distribution. Recently, we have implemented several upgrades to our MCMC method including improved schemes for handling observational errors and outlier data alongside the option to consider multiple perturbers and/or test asteroids simultaneously. These upgrades promise significantly improved results: based on two separate results for (19) Fortuna with different test asteroids we previously hypothesized that simultaneous use of both test asteroids would lead to an improved result similar to the average literature value for (19) Fortuna with substantially reduced uncertainties. Our upgraded algorithm indeed finds a result essentially equal to the literature value for this asteroid, confirming our previous hypothesis. Here we show these new results for (19) Fortuna and other example cases, and compare our results to previous estimates. Finally, we discuss our plans to improve our algorithm further, particularly in connection with Gaia.

  11. Systematic evaluation of sequential geostatistical resampling within MCMC for posterior sampling of near-surface geophysical inverse problems

    NASA Astrophysics Data System (ADS)

    Ruggeri, Paolo; Irving, James; Holliger, Klaus

    2015-08-01

    We critically examine the performance of sequential geostatistical resampling (SGR) as a model proposal mechanism for Bayesian Markov-chain-Monte-Carlo (MCMC) solutions to near-surface geophysical inverse problems. Focusing on a series of simple yet realistic synthetic crosshole georadar tomographic examples characterized by different numbers of data, levels of data error and degrees of model parameter spatial correlation, we investigate the efficiency of three different resampling strategies with regard to their ability to generate statistically independent realizations from the Bayesian posterior distribution. Quite importantly, our results show that, no matter what resampling strategy is employed, many of the examined test cases require an unreasonably high number of forward model runs to produce independent posterior samples, meaning that the SGR approach as currently implemented will not be computationally feasible for a wide range of problems. Although use of a novel gradual-deformation-based proposal method can help to alleviate these issues, it does not offer a full solution. Further, we find that the nature of the SGR strongly influences MCMC performance; however, no clear rule exists as to what set of inversion parameters and/or overall proposal acceptance rate will allow for the most efficient implementation. We conclude that although the SGR methodology is highly attractive as it allows for the consideration of complex geostatistical priors as well as conditioning to hard and soft data, further developments are necessary in the context of novel or hybrid MCMC approaches for it to be considered generally suitable for near-surface geophysical inversions.

  12. Reparametrization-based estimation of genetic parameters in multi-trait animal model using Integrated Nested Laplace Approximation.

    PubMed

    Mathew, Boby; Holand, Anna Marie; Koistinen, Petri; Léon, Jens; Sillanpää, Mikko J

    2016-02-01

    A novel reparametrization-based INLA approach as a fast alternative to MCMC for the Bayesian estimation of genetic parameters in multivariate animal models is presented. Multi-trait genetic parameter estimation is a relevant topic in animal and plant breeding programs because multi-trait analysis can take into account the genetic correlation between different traits, which significantly improves the accuracy of the genetic parameter estimates. Generally, multi-trait analysis is computationally demanding and requires initial estimates of genetic and residual correlations among the traits, which are difficult to obtain. In this study, we illustrate how to reparametrize covariance matrices of multivariate animal models using modified Cholesky decompositions. This reparametrization-based approach is used in the Integrated Nested Laplace Approximation (INLA) methodology to estimate genetic parameters of multivariate animal models. Immediate benefits are: (1) it avoids the difficulty of finding good starting values for the analysis, which can be a problem, for example, in Restricted Maximum Likelihood (REML); (2) Bayesian estimation of (co)variance components using INLA is faster to execute than using Markov Chain Monte Carlo (MCMC), especially when realized relationship matrices are dense. The slight drawback is that priors for covariance matrices are assigned to elements of the Cholesky factor and not directly to the covariance matrix elements as in MCMC. Additionally, we illustrate the concordance of the INLA results with traditional methods such as the MCMC and REML approaches. We also present results obtained from simulated data sets with replicates and field data in rice.
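    As a small illustration of what a modified Cholesky reparametrization looks like, the sketch below maps a covariance matrix to unconstrained parameters and back using the factorization Sigma = L D L^T, with L unit lower triangular and D diagonal. The abstract does not give the exact parametrization used with INLA, so the choice of log-variances and raw off-diagonal entries here is an assumption, and the function names are hypothetical.

```python
import numpy as np

def covariance_from_params(log_d, l_offdiag):
    """Build Sigma = L D L^T from unconstrained parameters.
    log_d: logs of the diagonal of D; l_offdiag: strictly lower entries of L."""
    k = len(log_d)
    L = np.eye(k)
    L[np.tril_indices(k, -1)] = l_offdiag
    D = np.diag(np.exp(log_d))
    return L @ D @ L.T

def params_from_covariance(sigma):
    """Inverse map: recover (log_d, l_offdiag) from a positive definite matrix."""
    C = np.linalg.cholesky(sigma)      # sigma = C C^T, with C = L sqrt(D)
    d_sqrt = np.diag(C)
    L = C / d_sqrt                     # scale column j by 1 / d_sqrt[j]
    k = sigma.shape[0]
    return 2.0 * np.log(d_sqrt), L[np.tril_indices(k, -1)]

# Round trip on an illustrative 3-trait covariance matrix.
sigma_example = np.array([[4.0, 1.2, 0.5],
                          [1.2, 3.0, 0.8],
                          [0.5, 0.8, 2.0]])
log_d_example, l_off_example = params_from_covariance(sigma_example)
sigma_rebuilt = covariance_from_params(log_d_example, l_off_example)
```

    Any real-valued parameter vector maps to a valid positive definite matrix, which is why no constrained starting values are needed; the cost, as the abstract notes, is that priors are placed on these factor elements and not on the covariances themselves.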

  13. RadVel: General toolkit for modeling Radial Velocities

    NASA Astrophysics Data System (ADS)

    Fulton, Benjamin J.; Petigura, Erik A.; Blunt, Sarah; Sinukoff, Evan

    2018-01-01

    RadVel models Keplerian orbits in radial velocity (RV) time series. The code is written in Python with a fast Kepler's equation solver written in C. It provides a framework for fitting RVs using maximum a posteriori optimization and computing robust confidence intervals by sampling the posterior probability density via Markov Chain Monte Carlo (MCMC). RadVel can perform Bayesian model comparison and produces publication quality plots and LaTeX tables.

  14. Bayesian Orbit Computation Tools for Objects on Geocentric Orbits

    NASA Astrophysics Data System (ADS)

    Virtanen, J.; Granvik, M.; Muinonen, K.; Oszkiewicz, D.

    2013-08-01

    We consider the space-debris orbital inversion problem via the concept of Bayesian inference. The methodology was put forward for the orbital analysis of solar system small bodies in the early 1990s [7] and results in a full solution of the statistical inverse problem given in terms of an a posteriori probability density function (PDF) for the orbital parameters. We demonstrate the applicability of our statistical orbital analysis software to Earth-orbiting objects, using both well-established Monte Carlo (MC) techniques (for a review, see e.g. [13]) and recently developed Markov-chain MC (MCMC) techniques (e.g., [9]). In particular, we exploit the novel virtual observation MCMC method [8], which is based on the characterization of the phase-space volume of orbital solutions before the actual MCMC sampling. Our statistical methods and the resulting PDFs immediately enable probabilistic impact predictions to be carried out. Furthermore, this can be readily done also for very sparse data sets and data sets of poor quality, provided that some a priori information on the observational uncertainty is available. For asteroids, impact probabilities with the Earth from the discovery night onwards have been provided, e.g., by [11] and [10]; the latter study includes the sampling of the observational-error standard deviation as a random variable.

  15. No control genes required: Bayesian analysis of qRT-PCR data.

    PubMed

    Matz, Mikhail V; Wright, Rachel M; Scott, James G

    2013-01-01

    Model-based analysis of data from quantitative reverse-transcription PCR (qRT-PCR) is potentially more powerful and versatile than traditional methods. Yet existing model-based approaches cannot properly deal with the higher sampling variances associated with low-abundant targets, nor do they provide a natural way to incorporate assumptions about the stability of control genes directly into the model-fitting process. In our method, raw qPCR data are represented as molecule counts, and described using generalized linear mixed models under Poisson-lognormal error. A Markov Chain Monte Carlo (MCMC) algorithm is used to sample from the joint posterior distribution over all model parameters, thereby estimating the effects of all experimental factors on the expression of every gene. The Poisson-based model allows for the correct specification of the mean-variance relationship of the PCR amplification process, and can also glean information from instances of no amplification (zero counts). Our method is very flexible with respect to control genes: any prior knowledge about the expected degree of their stability can be directly incorporated into the model. Yet the method provides sensible answers without such assumptions, or even in the complete absence of control genes. We also present a natural Bayesian analogue of the "classic" analysis, which uses standard data pre-processing steps (logarithmic transformation and multi-gene normalization) but estimates all gene expression changes jointly within a single model. The new methods are considerably more flexible and powerful than the standard delta-delta Ct analysis based on pairwise t-tests. Our methodology expands the applicability of the relative-quantification analysis protocol all the way to the lowest-abundance targets, and provides a novel opportunity to analyze qRT-PCR data without making any assumptions concerning target stability. These procedures have been implemented as the MCMC.qpcr package in R.

  16. Towards a formal genealogical classification of the Lezgian languages (North Caucasus): testing various phylogenetic methods on lexical data.

    PubMed

    Kassian, Alexei

    2015-01-01

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). Due to certain reasons (first of all, high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced the trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies.

  17. Towards a Formal Genealogical Classification of the Lezgian Languages (North Caucasus): Testing Various Phylogenetic Methods on Lexical Data

    PubMed Central

    Kassian, Alexei

    2015-01-01

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). Due to certain reasons (first of all, high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced the trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies. PMID:25719456

  18. Development of engine activity cycles for the prime movers of unconventional natural gas well development.

    PubMed

    Johnson, Derek; Heltzel, Robert; Nix, Andrew; Barrow, Rebekah

    2017-03-01

    With the advent of unconventional natural gas resources, new research focuses on the efficiency and emissions of the prime movers powering these fleets. These prime movers also play important roles in emissions inventories for this sector. Industry seeks to reduce operating costs by decreasing the required fuel demands of these high horsepower engines but conducting in-field or full-scale research on new technologies is cost prohibitive. As such, this research completed extensive in-use data collection efforts for the engines powering over-the-road trucks, drilling engines, and hydraulic stimulation pump engines. These engine activity data were processed in order to make representative test cycles using a Markov Chain Monte Carlo (MCMC) simulation method. Such cycles can be applied under controlled environments on scaled engines for future research. In addition to MCMC, genetic algorithms were used to improve the overall performance values for the test cycles and smoothing was applied to ensure regression criteria were met during implementation on a test engine and dynamometer. The variations in cycle and in-use statistics are presented along with comparisons to conventional test cycles used for emissions compliance. Development of representative engine dynamometer test cycles from in-use activity data is crucial in understanding fuel efficiency and emissions for engine operating modes that are different from cycles mandated by the Code of Federal Regulations. Representative cycles were created for the prime movers of unconventional well development: over-the-road (OTR) trucks and drilling and hydraulic fracturing engines. The representative cycles are implemented on scaled engines to reduce fuel consumption during research and development of new technologies in controlled laboratory environments.
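    One common way to build a synthetic test cycle from in-use activity data is to estimate a Markov transition matrix between discretized operating states and then simulate from it. The sketch below shows that idea only; the abstract does not describe the authors' state definition, so the integer load bins and both function names are assumptions, and the genetic-algorithm and smoothing steps are not included.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate P[i, j] = Pr(next state = j | current state = i)
    from an observed sequence of integer state labels."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # States that were never left get a uniform row so every row is a valid distribution.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1), 1.0 / n_states)

def generate_cycle(P, start, length, rng):
    """Simulate a synthetic cycle of the given length from the Markov chain."""
    cycle = [start]
    for _ in range(length - 1):
        cycle.append(rng.choice(len(P), p=P[cycle[-1]]))
    return np.array(cycle)

# Illustrative data: 500 samples of an engine-load signal binned into 4 states.
rng = np.random.default_rng(0)
observed_states = rng.integers(0, 4, size=500)
P_hat = transition_matrix(observed_states, 4)
synthetic_cycle = generate_cycle(P_hat, start=0, length=50, rng=rng)
```

    In practice many candidate cycles would be generated and the one whose summary statistics best match the in-use data would be kept.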

  19. Iterative updating of model error for Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew

    2018-02-01

    In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.

  20. A High-Resolution Aerosol Retrieval Method for Urban Areas Using MISR Data

    NASA Astrophysics Data System (ADS)

    Moon, T.; Wang, Y.; Liu, Y.; Yu, B.

    2012-12-01

    Satellite-retrieved Aerosol Optical Depth (AOD) can provide a cost-effective way to monitor particulate air pollution without using expensive ground measurement sensors. One of the current state-of-the-art AOD retrieval methods is NASA's Multi-angle Imaging SpectroRadiometer (MISR) operational algorithm, which has a spatial resolution of 17.6 km x 17.6 km. While the MISR baseline scheme already leads to exciting research opportunities to study particle compositions at regional scale, its spatial resolution is too coarse for analyzing urban areas where the AOD level has stronger spatial variations. We develop a novel high-resolution AOD retrieval algorithm that still uses MISR's radiance observations but has a resolution of 4.4 km x 4.4 km. We achieve the high-resolution AOD retrieval by implementing a hierarchical Bayesian model and a Monte-Carlo Markov Chain (MCMC) inference method. Our algorithm not only improves the spatial resolution, but also extends the coverage of AOD retrieval and provides additional composition information on the aerosol components that contribute to the AOD. We validate our method using data from NASA's recent DISCOVER-AQ mission, which contain ground-measured AOD values for the Washington DC and Baltimore area. The validation result shows that, compared to the operational MISR retrievals, our scheme has 41.1% more AOD retrieval coverage for the DISCOVER-AQ data points and a 24.2% improvement in mean-squared error (MSE) with respect to the AERONET ground measurements.

  1. Study of optical and electronic properties of nickel from reflection electron energy loss spectra

    NASA Astrophysics Data System (ADS)

    Xu, H.; Yang, L. H.; Da, B.; Tóth, J.; Tőkési, K.; Ding, Z. J.

    2017-09-01

    We use the classical Monte Carlo transport model of electrons moving near the surface and inside solids to reproduce the measured reflection electron energy-loss spectroscopy (REELS) spectra. With the combination of the classical transport model and the Markov chain Monte Carlo (MCMC) sampling of oscillator parameters the so-called reverse Monte Carlo (RMC) method was developed, and used to obtain optical constants of Ni in this work. A systematic study of the electronic and optical properties of Ni has been performed in an energy loss range of 0-200 eV from the measured REELS spectra at primary energies of 1000 eV, 2000 eV and 3000 eV. The reliability of our method was tested by comparing our results with the previous data. Moreover, the accuracy of our optical data has been confirmed by applying oscillator strength-sum rule and perfect-screening-sum rule.

  2. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability using High-Resolution Cloud Observations

    NASA Astrophysics Data System (ADS)

    Norris, P. M.; da Silva, A. M., Jr.

    2016-12-01

    Norris and da Silva recently published a method to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation (CDA). The gridcolumn model includes assumed-PDF intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used are MODIS cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. The new approach not only significantly reduces mean and standard deviation biases with respect to the assimilated observables, but also improves the simulated rotational-Raman scattering cloud optical centroid pressure against independent (non-assimilated) retrievals from the OMI instrument. One obvious difficulty for the method, and other CDA methods, is the lack of information content in passive cloud observables on cloud vertical structure, beyond cloud-top and thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard is helpful, better honoring inversion structures in the background state.

  3. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  4. Copula-based assessment of the relationship between flood peaks and flood volumes using information on historical floods by Bayesian Monte Carlo Markov Chain simulations

    NASA Astrophysics Data System (ADS)

    Gaál, Ladislav; Szolgay, Ján; Bacigál, Tomáš; Kohnová, Silvia

    2010-05-01

    Copula-based estimation methods of hydro-climatological extremes have increasingly been gaining attention of researchers and practitioners in the last couple of years. Unlike the traditional estimation methods which are based on bivariate cumulative distribution functions (CDFs), copulas are a relatively flexible tool of statistics that allow for modelling dependencies between two or more variables such as flood peaks and flood volumes without making strict assumptions on the marginal distributions. The dependence structure and the reliability of the joint estimates of hydro-climatological extremes, mainly in the right tail of the joint CDF, depend not only on the particular copula adopted but also on the data available for the estimation of the marginal distributions of the individual variables. Generally, data samples for frequency modelling have limited temporal extent, which is a considerable drawback of frequency analyses in practice. Therefore, it is advised to deal with statistical methods that improve any part of the process of copula construction and result in more reliable design values of hydrological variables. The scarcity of the data sample mostly in the extreme tail of the joint CDF can be bypassed, e.g., by using a considerably larger amount of simulated data by rainfall-runoff analysis or by including historical information on the variables under study. The latter approach of data extension is used here to make the quantile estimates of the individual marginals of the copula more reliable. In the presented paper it is proposed to use historical information in the frequency analysis of the marginal distributions in the framework of Bayesian Monte Carlo Markov Chain (MCMC) simulations. Generally, a Bayesian approach allows for a straightforward combination of different sources of information on floods (e.g. flood data from systematic measurements and historical flood records, respectively) in terms of a product of the corresponding likelihood functions. On the other hand, the MCMC algorithm is a numerical approach for sampling from the likelihood distributions. The Bayesian MCMC methods therefore provide an attractive way to estimate the uncertainty in parameters and quantile metrics of frequency distributions. The applicability of the method is demonstrated in a case study of the hydroelectric power station Orlík on the Vltava River. This site has a key role in the flood prevention of Prague, the capital city of the Czech Republic. The record length of the available flood data is 126 years from the period 1877-2002, while the flood event observed in 2002 that caused extensive damages and numerous casualties is treated as a historic one. To estimate the joint probabilities of flood peaks and volumes, different copulas are fitted and their goodness-of-fit is evaluated by bootstrap simulations. Finally, selected quantiles of flood volumes conditioned on given flood peaks are derived and compared with those obtained by the traditional method used in the practice of water management specialists of the Vltava River.
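    The "product of likelihood functions" idea can be sketched for a single marginal as follows. This is an illustration under stated assumptions, not the authors' model: annual flood peaks are taken to follow a Gumbel distribution, the historical period is summarized by a perception threshold that was exceeded only by a few known peaks, the prior is flat in (mu, log beta), and the sampler is a plain random-walk Metropolis algorithm. All numbers are synthetic.

```python
import numpy as np

def gumbel_logpdf(x, mu, beta):
    z = (x - mu) / beta
    return -np.log(beta) - z - np.exp(-z)

def gumbel_logcdf(x, mu, beta):
    return -np.exp(-(x - mu) / beta)

def log_posterior(theta, systematic, hist_peaks, hist_years, threshold):
    """Sum of the systematic and historical log-likelihoods (flat prior)."""
    mu, log_beta = theta
    beta = np.exp(log_beta)
    ll_systematic = np.sum(gumbel_logpdf(systematic, mu, beta))
    # Historical period: known peaks above the threshold, plus
    # (hist_years - number of peaks) years known to have stayed below it.
    ll_historical = np.sum(gumbel_logpdf(hist_peaks, mu, beta))
    ll_historical += (hist_years - len(hist_peaks)) * gumbel_logcdf(threshold, mu, beta)
    return ll_systematic + ll_historical

def metropolis(log_post, theta0, step, n_iter, rng):
    """Random-walk Metropolis sampler with Gaussian proposals."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        proposal = theta + step * rng.normal(size=theta.size)
        lp_proposal = log_post(proposal)
        if np.log(rng.uniform()) < lp_proposal - lp:
            theta, lp = proposal, lp_proposal
        chain[i] = theta
    return chain

# Synthetic example: 100 systematic years, 200 historical years, 2 known historical peaks.
rng = np.random.default_rng(0)
systematic_peaks = rng.gumbel(100.0, 30.0, size=100)
historical_peaks = np.array([260.0, 300.0])
chain = metropolis(
    lambda th: log_posterior(th, systematic_peaks, historical_peaks, 200, 250.0),
    [systematic_peaks.mean(), np.log(systematic_peaks.std())],
    np.array([3.0, 0.1]), 6000, rng)
posterior_draws = chain[1000:]   # discard burn-in
```

    Quantile uncertainty then follows by evaluating the Gumbel quantile function on each retained draw.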

  5. Kepler Uniform Modeling of KOIs: MCMC Notes for Data Release 25

    NASA Technical Reports Server (NTRS)

    Hoffman, Kelsey L.; Rowe, Jason F.

    2017-01-01

    This document describes data products related to the reported planetary parameters and uncertainties for the Kepler Objects of Interest (KOIs) based on a Markov-Chain-Monte-Carlo (MCMC) analysis. Reported parameters, uncertainties and data products can be found at the NASA Exoplanet Archive. The codes used for this data analysis are available on the Github website (Rowe 2016). The relevant paper for details of the calculations is Rowe et al. (2015). The main differences between the model fits discussed here and those in the DR24 catalogue are that the DR25 light curves were used in the analysis, our processing of the MAST light curves took into account different data flags, the number of chains calculated was doubled to 200 000, and the parameters which are reported are based on a damped least-squares fit, instead of the median value from the Markov chain or the chain with the lowest χ2 as reported in the past.

  6. Assessing Model Fitting of Megamaser Disks with Simulated Observations

    NASA Astrophysics Data System (ADS)

    Han, Jiwon; Braatz, James; Pesce, Dominic

    2018-01-01

    The Megamaser Cosmology Project (MCP) measures the Hubble Constant by determining distances to galaxies with observations of 22 GHz H2O megamasers. The megamasers arise in the circumnuclear accretion disks of active galaxies. In this research, we aim to improve the estimation of systematic errors in MCP measurements. Currently, the MCP fits a disk model to the observed maser data with a Markov Chain Monte Carlo (MCMC) code. The disk model is described by up to 14 global parameters, including up to 6 that describe the disk warping. We first assess the model by generating synthetic datasets in which the locations and dynamics of the maser spots are exactly known, and fitting the model to these data. By doing so, we can also test the effects of unmodeled substructure on the estimated uncertainties. Furthermore, in order to gain a better understanding of the physics behind accretion disk warping, we develop a physics-driven model for the warp and test it with the MCMC approach.

  7. Itô-SDE MCMC method for Bayesian characterization of errors associated with data limitations in stochastic expansion methods for uncertainty quantification

    NASA Astrophysics Data System (ADS)

    Arnst, M.; Abello Álvarez, B.; Ponthot, J.-P.; Boman, R.

    2017-11-01

    This paper is concerned with the characterization and the propagation of errors associated with data limitations in polynomial-chaos-based stochastic methods for uncertainty quantification. Such an issue can arise in uncertainty quantification when only a limited amount of data is available. When the available information does not suffice to accurately determine the probability distributions that must be assigned to the uncertain variables, the Bayesian method for assigning these probability distributions becomes attractive because it allows the stochastic model to account explicitly for insufficiency of the available information. In previous work, such applications of the Bayesian method had already been implemented by using the Metropolis-Hastings and Gibbs Markov Chain Monte Carlo (MCMC) methods. In this paper, we present an alternative implementation, which uses an alternative MCMC method built around an Itô stochastic differential equation (SDE) that is ergodic for the Bayesian posterior. We draw together from the mathematics literature a number of formal properties of this Itô SDE that lend support to its use in the implementation of the Bayesian method, and we describe its discretization, including the choice of the free parameters, by using the implicit Euler method. We demonstrate the proposed methodology on a problem of uncertainty quantification in a complex nonlinear engineering application relevant to metal forming.

  8. pyblocxs: Bayesian Low-Counts X-ray Spectral Analysis in Sherpa

    NASA Astrophysics Data System (ADS)

    Siemiginowska, A.; Kashyap, V.; Refsdal, B.; van Dyk, D.; Connors, A.; Park, T.

    2011-07-01

    Typical X-ray spectra have low counts and should be modeled using the Poisson distribution. However, the χ2 statistic is often applied as an alternative and the data are assumed to follow the Gaussian distribution. A variety of weights to the statistic or a binning of the data is performed to overcome the low counts issues. However, such modifications introduce biases and/or a loss of information. Standard modeling packages such as XSPEC and Sherpa provide the Poisson likelihood and allow computation of rudimentary MCMC chains, but so far do not allow for setting a full Bayesian model. We have implemented a sophisticated Bayesian MCMC-based algorithm to carry out spectral fitting of low counts sources in the Sherpa environment. The code is a Python extension to Sherpa and allows fitting a predefined Sherpa model to high-energy X-ray spectral data and other generic data. We present the algorithm and discuss several issues related to the implementation, including flexible definition of priors and allowing for variations in the calibration information.

  9. The Joker: A custom Monte Carlo sampler for binary-star and exoplanet radial velocity data

    NASA Astrophysics Data System (ADS)

    Price-Whelan, Adrian M.; Hogg, David W.; Foreman-Mackey, Daniel; Rix, Hans-Walter

    2017-01-01

    Given sparse or low-quality radial-velocity measurements of a star, there are often many qualitatively different stellar or exoplanet companion orbit models that are consistent with the data. The consequent multimodality of the likelihood function leads to extremely challenging search, optimization, and MCMC posterior sampling over the orbital parameters. The Joker is a custom-built Monte Carlo sampler that can produce a posterior sampling for orbital parameters given sparse or noisy radial-velocity measurements, even when the likelihood function is poorly behaved. The method produces correct samplings in orbital parameters for data that include as few as three epochs. The Joker can therefore be used to produce proper samplings of multimodal pdfs, which are still highly informative and can be used in hierarchical (population) modeling.

  10. Bayesian Treatment of Uncertainty in Environmental Modeling: Optimization, Sampling and Data Assimilation Using the DREAM Software Package

    NASA Astrophysics Data System (ADS)

    Vrugt, J. A.

    2012-12-01

    In the past decade much progress has been made in the treatment of uncertainty in earth systems modeling. Whereas initial approaches have focused mostly on quantification of parameter and predictive uncertainty, recent methods attempt to disentangle the effects of parameter, forcing (input) data, model structural and calibration data errors. In this talk I will highlight some of our recent work involving theory, concepts and applications of Bayesian parameter and/or state estimation. In particular, new methods for sequential Monte Carlo (SMC) and Markov Chain Monte Carlo (MCMC) simulation will be presented with emphasis on massively parallel distributed computing and quantification of model structural errors. The theoretical and numerical developments will be illustrated using model-data synthesis problems in hydrology, hydrogeology and geophysics.

  11. Reconciling a geophysical model to data using a Markov chain Monte Carlo algorithm: An application to the Yellow Sea-Korean Peninsula region

    NASA Astrophysics Data System (ADS)

    Pasyanos, Michael E.; Franz, Gregory A.; Ramirez, Abelardo L.

    2006-03-01

    In an effort to build seismic models that are the most consistent with multiple data sets we have applied a new probabilistic inverse technique. This method uses a Markov chain Monte Carlo (MCMC) algorithm to sample models from a prior distribution and test them against multiple data types to generate a posterior distribution. While computationally expensive, this approach has several advantages over deterministic models, notably the seamless reconciliation of different data types that constrain the model, the proper handling of both data and model uncertainties, and the ability to easily incorporate a variety of prior information, all in a straightforward, natural fashion. A real advantage of the technique is that it provides a more complete picture of the solution space. By mapping out the posterior probability density function, we can avoid simplistic assumptions about the model space and allow alternative solutions to be identified, compared, and ranked. Here we use this method to determine the crust and upper mantle structure of the Yellow Sea and Korean Peninsula region. The model is parameterized as a series of seven layers in a regular latitude-longitude grid, each of which is characterized by thickness and seismic parameters (Vp, Vs, and density). We use surface wave dispersion and body wave traveltime data to drive the model. We find that when properly tuned (i.e., the Markov chains have had adequate time to fully sample the model space and the inversion has converged), the technique behaves as expected. The posterior model reflects the prior information at the edge of the model where there is little or no data to constrain adjustments, but the range of acceptable models is significantly reduced in data-rich regions, producing values of sediment thickness, crustal thickness, and upper mantle velocities consistent with expectations based on knowledge of the regional tectonic setting.

  12. Honest Importance Sampling with Multiple Markov Chains

    PubMed Central

    Tan, Aixin; Doss, Hani; Hobert, James P.

    2017-01-01

    Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. 
The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection. PMID:28701855
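
    The basic importance sampling estimator described in this abstract can be sketched as follows. This is a minimal illustration only: it uses independent draws rather than a Markov chain, it does not implement the regenerative standard-error estimator of the paper, and the two Gaussian densities, the helper names `normal_pdf` and `importance_estimate`, and the sample size are assumptions chosen for the example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def importance_estimate(h, target_pdf, proposal_pdf, draws):
    """Self-normalized importance sampling estimate of E_target[h(X)].

    `draws` are samples from the proposal density (pi_1 in the abstract);
    each draw is reweighted by target_pdf / proposal_pdf.
    """
    weights = [target_pdf(x) / proposal_pdf(x) for x in draws]
    total = sum(weights)
    return sum(w * h(x) for w, x in zip(weights, draws)) / total


rng = random.Random(0)
# Hypothetical example: proposal pi_1 = N(0, 2^2), target pi = N(1, 1).
draws = [rng.gauss(0.0, 2.0) for _ in range(50_000)]
estimate = importance_estimate(
    lambda x: x,                        # h(x) = x, so we estimate the target mean
    lambda x: normal_pdf(x, 1.0, 1.0),  # target density
    lambda x: normal_pdf(x, 0.0, 2.0),  # proposal density
    draws,
)
print(f"estimated target mean: {estimate:.3f}")
```

    The proposal is deliberately wider than the target so that the weights stay bounded; with a proposal narrower than the target the weights can have very large variance.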

  13. Honest Importance Sampling with Multiple Markov Chains.

    PubMed

    Tan, Aixin; Doss, Hani; Hobert, James P

    2015-01-01

    Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. 
The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection.

  14. Markov Chain Monte Carlo estimation of species distributions: a case study of the swift fox in western Kansas

    USGS Publications Warehouse

    Sargeant, Glen A.; Sovada, Marsha A.; Slivinski, Christiane C.; Johnson, Douglas H.

    2005-01-01

    Accurate maps of species distributions are essential tools for wildlife research and conservation. Unfortunately, biologists often are forced to rely on maps derived from observed occurrences recorded opportunistically during observation periods of variable length. Spurious inferences are likely to result because such maps are profoundly affected by the duration and intensity of observation and by methods used to delineate distributions, especially when detection is uncertain. We conducted a systematic survey of swift fox (Vulpes velox) distribution in western Kansas, USA, and used Markov chain Monte Carlo (MCMC) image restoration to rectify these problems. During 1997–1999, we searched 355 townships (ca. 93 km2) 1–3 times each for an average cost of $7,315 per year and achieved a detection rate (probability of detecting swift foxes, if present, during a single search) of 0.69 (95% Bayesian confidence interval [BCI] = [0.60, 0.77]). Our analysis produced an estimate of the underlying distribution, rather than a map of observed occurrences, that reflected the uncertainty associated with estimates of model parameters. To evaluate our results, we analyzed simulated data with similar properties. Results of our simulations suggest negligible bias and good precision when probabilities of detection on ≥1 survey occasions (cumulative probabilities of detection) exceed 0.65. Although the use of MCMC image restoration has been limited by theoretical and computational complexities, alternatives do not possess the same advantages. Image models accommodate uncertain detection, do not require spatially independent data or a census of map units, and can be used to estimate species distributions directly from observations without relying on habitat covariates or parameters that must be estimated subjectively. 
These features facilitate economical surveys of large regions, the detection of temporal trends in distribution, and assessments of landscape-level relations between species and habitats. Requirements for the use of MCMC image restoration include study areas that can be partitioned into regular grids of mapping units, spatially contagious species distributions, reliable methods for identifying target species, and cumulative probabilities of detection ≥0.65.

  15. Markov chain Monte Carlo estimation of species distributions: A case study of the swift fox in western Kansas

    USGS Publications Warehouse

    Sargeant, G.A.; Sovada, M.A.; Slivinski, C.C.; Johnson, D.H.

    2005-01-01

    Accurate maps of species distributions are essential tools for wildlife research and conservation. Unfortunately, biologists often are forced to rely on maps derived from observed occurrences recorded opportunistically during observation periods of variable length. Spurious inferences are likely to result because such maps are profoundly affected by the duration and intensity of observation and by methods used to delineate distributions, especially when detection is uncertain. We conducted a systematic survey of swift fox (Vulpes velox) distribution in western Kansas, USA, and used Markov chain Monte Carlo (MCMC) image restoration to rectify these problems. During 1997-1999, we searched 355 townships (ca. 93 km2) 1-3 times each for an average cost of $7,315 per year and achieved a detection rate (probability of detecting swift foxes, if present, during a single search) of 0.69 (95% Bayesian confidence interval [BCI] = [0.60, 0.77]). Our analysis produced an estimate of the underlying distribution, rather than a map of observed occurrences, that reflected the uncertainty associated with estimates of model parameters. To evaluate our results, we analyzed simulated data with similar properties. Results of our simulations suggest negligible bias and good precision when probabilities of detection on ≥1 survey occasions (cumulative probabilities of detection) exceed 0.65. Although the use of MCMC image restoration has been limited by theoretical and computational complexities, alternatives do not possess the same advantages. Image models accommodate uncertain detection, do not require spatially independent data or a census of map units, and can be used to estimate species distributions directly from observations without relying on habitat covariates or parameters that must be estimated subjectively. 
These features facilitate economical surveys of large regions, the detection of temporal trends in distribution, and assessments of landscape-level relations between species and habitats. Requirements for the use of MCMC image restoration include study areas that can be partitioned into regular grids of mapping units, spatially contagious species distributions, reliable methods for identifying target species, and cumulative probabilities of detection ≥0.65.

  16. Use of Markov Chain Monte Carlo analysis with a physiologically-based pharmacokinetic model of methylmercury to estimate exposures in US women of childbearing age.

    PubMed

    Allen, Bruce C; Hack, C Eric; Clewell, Harvey J

    2007-08-01

    A Bayesian approach, implemented using Markov Chain Monte Carlo (MCMC) analysis, was applied with a physiologically-based pharmacokinetic (PBPK) model of methylmercury (MeHg) to evaluate the variability of MeHg exposure in women of childbearing age in the U.S. population. The analysis made use of the newly available National Health and Nutrition Examination Survey (NHANES) blood and hair mercury concentration data for women of age 16-49 years (sample size, 1,582). Bayesian analysis was performed to estimate the population variability in MeHg exposure (daily ingestion rate) implied by the variation in blood and hair concentrations of mercury in the NHANES database. The measured variability in the NHANES blood and hair data represents the result of a process that includes interindividual variation in exposure to MeHg and interindividual variation in the pharmacokinetics (distribution, clearance) of MeHg. The PBPK model includes a number of pharmacokinetic parameters (e.g., tissue volumes, partition coefficients, rate constants for metabolism and elimination) that can vary from individual to individual within the subpopulation of interest. Using MCMC analysis, it was possible to combine prior distributions of the PBPK model parameters with the NHANES blood and hair data, as well as with kinetic data from controlled human exposures to MeHg, to derive posterior distributions that refine the estimates of both the population exposure distribution and the pharmacokinetic parameters. In general, based on the populations surveyed by NHANES, the results of the MCMC analysis indicate that a small fraction, less than 1%, of the U.S. population of women of childbearing age may have mercury exposures greater than the EPA RfD for MeHg of 0.1 µg/kg/day, and that there are few, if any, exposures greater than the ATSDR MRL of 0.3 µg/kg/day. 
The analysis also indicates that typical exposures may be greater than previously estimated from food consumption surveys, but that the variability in exposure within the population of U.S. women of childbearing age may be less than previously assumed.

  17. Hydrostratigraphy characterization of the Floridan aquifer system using ambient seismic noise

    NASA Astrophysics Data System (ADS)

    James, Stephanie R.; Screaton, Elizabeth J.; Russo, Raymond M.; Panning, Mark P.; Bremner, Paul M.; Stanciu, A. Christian; Torpey, Megan E.; Hongsresawat, Sutatcha; Farrell, Matthew E.

    2017-05-01

    We investigated a new technique for aquifer characterization that uses cross-correlation of ambient seismic noise to determine seismic velocity structure of the Floridan aquifer system (FAS). Accurate characterization of aquifer systems is vital to hydrogeological research and groundwater management but is difficult due to limited subsurface data and heterogeneity. Previous research on the carbonate FAS found that confining units and high permeability flow zones have distinct seismic velocities. We deployed an array of 9 short period seismometers from 11/2013 to 3/2014 in Indian Lake State Forest near Ocala, Florida, to image the hydrostratigraphy of the aquifer system using ambient seismic noise. We find that interstation distance strongly influences the upper and lower frequency limits of the data set. Seismic waves propagating within 1.5 and 7 wavelengths between stations were optimal for reliable group velocity measurements and both an upper and lower wavelength threshold was used. A minimum of 100-250 hr of signal was needed to maximize signal-to-noise ratio and to allow cross-correlation convergence. We averaged measurements of group velocity between station pairs at each frequency band to create a network average dispersion curve. A family of 1-D shear-wave velocity profiles that best represents the network average dispersion was then generated using a Markov Chain Monte Carlo (MCMC) algorithm. The MCMC algorithm was implemented with either a fixed number of layers, or as transdimensional in which the number of layers was a free parameter. Results from both algorithms require a prominent velocity increase at ∼200 m depth. A shallower velocity increase at ∼60 m depth was also observed, but only in model ensembles created by collecting models with the lowest overall misfit to the observed data. 
A final round of modelling with additional prior constraints based on initial results and well logs produced a mean shear-wave velocity profile taken as the preferred solution for the study site. The velocity increases at ∼200 and ∼60 m depth are consistent with the top surfaces of two semi-confining units of the study area and the depths of high-resistivity dolomite units seen in geophysical logs and cores from the study site. Our results suggest that correlation of ambient seismic noise holds promise for hydrogeological investigations. However, complexities in the cross-correlations at high frequencies and short traveltimes at low frequencies added uncertainty to the data set.

  18. Bayesian linkage and segregation analysis: factoring the problem.

    PubMed

    Matthysse, S

    2000-01-01

    Complex segregation analysis and linkage methods are mathematical techniques for the genetic dissection of complex diseases. They are used to delineate complex modes of familial transmission and to localize putative disease susceptibility loci to specific chromosomal locations. The computational problem of Bayesian linkage and segregation analysis is one of integration in high-dimensional spaces. In this paper, three available techniques for Bayesian linkage and segregation analysis are discussed: Markov Chain Monte Carlo (MCMC), importance sampling, and exact calculation. The contribution of each to the overall integration will be explicitly discussed.

  19. Appraisal of jump distributions in ensemble-based sampling algorithms

    NASA Astrophysics Data System (ADS)

    Dejanic, Sanda; Scheidegger, Andreas; Rieckermann, Jörg; Albert, Carlo

    2017-04-01

    Sampling Bayesian posteriors of model parameters is often required for making model-based probabilistic predictions. For complex environmental models, standard Monte Carlo Markov Chain (MCMC) methods are often infeasible because they require too many sequential model runs. Therefore, we focused on ensemble methods that use many Markov chains in parallel, since they can be run on modern cluster architectures. Little is known about how to choose the best performing sampler, for a given application. A poor choice can lead to an inappropriate representation of posterior knowledge. We assessed two different jump moves, the stretch and the differential evolution move, underlying, respectively, the software packages EMCEE and DREAM, which are popular in different scientific communities. For the assessment, we used analytical posteriors with features as they often occur in real posteriors, namely high dimensionality, strong non-linear correlations or multimodality. For posteriors with non-linear features, standard convergence diagnostics based on sample means can be insufficient. Therefore, we resorted to an entropy-based convergence measure. We assessed the samplers by means of their convergence speed, robustness and effective sample sizes. For posteriors with strongly non-linear features, we found that the stretch move outperforms the differential evolution move, w.r.t. all three aspects.
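
    The stretch move assessed in this abstract can be sketched as follows. This is a minimal serial version in the style of the Goodman and Weare affine-invariant ensemble sampler, not the EMCEE code itself; the function name `stretch_move_sweep`, the scale parameter a = 2, the number of walkers, and the 2-D standard normal target are assumptions made for illustration.

```python
import math
import random


def stretch_move_sweep(walkers, log_prob, rng, a=2.0):
    """Apply one stretch-move update to every walker, in place.

    Returns the number of accepted proposals in this sweep.
    """
    n_walkers = len(walkers)
    dim = len(walkers[0])
    accepted = 0
    for k in range(n_walkers):
        # Choose a complementary walker j different from k.
        j = rng.randrange(n_walkers - 1)
        if j >= k:
            j += 1
        # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        # Stretch walker k along the line through walker j.
        proposal = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
        log_ratio = ((dim - 1) * math.log(z)
                     + log_prob(proposal) - log_prob(walkers[k]))
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            walkers[k] = proposal
            accepted += 1
    return accepted


def log_std_normal(x):
    """Log density (up to a constant) of a standard normal in any dimension."""
    return -0.5 * sum(v * v for v in x)


rng = random.Random(1)
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
samples = []
for sweep in range(3000):
    stretch_move_sweep(walkers, log_std_normal, rng)
    if sweep >= 500:  # discard the first sweeps as burn-in
        samples.extend(w[0] for w in walkers)

mean_x0 = sum(samples) / len(samples)
var_x0 = sum((s - mean_x0) ** 2 for s in samples) / len(samples)
print(f"mean = {mean_x0:.3f}, variance = {var_x0:.3f}")
```

    The factor z^(dim - 1) in the acceptance ratio is what keeps the move reversible; the differential evolution move compared in the abstract instead builds its proposal from the difference of two other chains.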

  20. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    NASA Astrophysics Data System (ADS)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions, the prior and the posterior, and the posterior distribution is influenced by the choice of prior. Jeffreys’ prior is a non-informative prior, used when no information about the parameters is available. The non-informative Jeffreys’ prior is combined with the sample information to give the posterior distribution, which is then used to estimate the parameters. The purpose of this research is to estimate the parameters of the multivariate regression model using the Bayesian method with the non-informative Jeffreys’ prior. The estimates of β and Σ are obtained as the expected values of their marginal posterior distributions, which are multivariate normal and inverse Wishart, respectively. However, calculating these expected values involves integrals that are difficult to evaluate. Therefore, an approximation is needed, obtained by generating random samples from the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
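
    The Gibbs sampling idea in this abstract can be illustrated with a much simpler univariate analogue. The sketch below is an assumption-laden stand-in, not the multivariate model of the paper: it treats a single normal sample with the non-informative prior p(μ, σ2) ∝ 1/σ2, for which the full conditionals are a normal distribution for μ and an inverse gamma distribution for σ2 (the one-dimensional counterparts of the multivariate normal and inverse Wishart). The function name `gibbs_normal`, the simulated data, and the chain length are hypothetical.

```python
import random


def gibbs_normal(y, n_iter, rng):
    """Gibbs sampler for (mu, sigma2) of a normal sample under p(mu, sigma2) ∝ 1/sigma2."""
    n = len(y)
    ybar = sum(y) / n
    sigma2 = 1.0  # arbitrary starting value
    draws = []
    for _ in range(n_iter):
        # mu | sigma2, y  ~  Normal(ybar, sigma2 / n)
        mu = rng.gauss(ybar, (sigma2 / n) ** 0.5)
        # sigma2 | mu, y  ~  Inverse-Gamma(n / 2, ss / 2),
        # drawn as the reciprocal of a Gamma(shape = n/2, scale = 2/ss) variate.
        ss = sum((v - mu) ** 2 for v in y)
        sigma2 = 1.0 / rng.gammavariate(n / 2.0, 2.0 / ss)
        draws.append((mu, sigma2))
    return draws


rng = random.Random(2)
# Simulated data with true mean 3 and true variance 4 (illustration only).
y = [rng.gauss(3.0, 2.0) for _ in range(200)]
draws = gibbs_normal(y, 5000, rng)[500:]  # drop burn-in draws

post_mean_mu = sum(d[0] for d in draws) / len(draws)
post_mean_sigma2 = sum(d[1] for d in draws) / len(draws)
print(f"posterior mean of mu = {post_mean_mu:.3f}, of sigma2 = {post_mean_sigma2:.3f}")
```

    Averaging the retained draws approximates the posterior expectations that the abstract describes as difficult to obtain by direct integration.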

  1. Extracting Lane Geometry and Topology Information from Vehicle Fleet Trajectories in Complex Urban Scenarios Using a Reversible Jump MCMC Method

    NASA Astrophysics Data System (ADS)

    Roeth, O.; Zaum, D.; Brenner, C.

    2017-05-01

    Highly automated driving (HAD) requires maps not only of high spatial precision but also of yet unprecedented actuality. Traditionally small highly specialized fleets of measurement vehicles are used to generate such maps. Nevertheless, for achieving city-wide or even nation-wide coverage, automated map update mechanisms based on very large vehicle fleet data gain importance since highly frequent measurements are only to be obtained using such an approach. Furthermore, the processing of imprecise mass data in contrast to few dedicated highly accurate measurements calls for a high degree of automation. We present a method for the generation of lane-accurate road network maps from vehicle trajectory data (GPS or better). Our approach therefore allows for exploiting today's connected vehicle fleets for the generation of HAD maps. The presented algorithm is based on elementary building blocks, which guarantee useful lane models, and uses a Reversible Jump Markov chain Monte Carlo method to explore the model parameters in order to reconstruct the model most likely to have emitted the input data. The approach is applied to a challenging urban real-world scenario of different trajectory accuracy levels and is evaluated against a LIDAR-based ground truth map.

  2. Constrained proper sampling of conformations of transition state ensemble of protein folding

    PubMed Central

    Lin, Ming; Zhang, Jian; Lu, Hsiao-Mei; Chen, Rong; Liang, Jie

    2011-01-01

    Characterizing the conformations of protein in the transition state ensemble (TSE) is important for studying protein folding. A promising approach pioneered by Vendruscolo [Nature (London) 409, 641 (2001)] to study TSE is to generate conformations that satisfy all constraints imposed by the experimentally measured ϕ values that provide information about the native likeness of the transition states. Faísca [J. Chem. Phys. 129, 095108 (2008)] generated conformations of TSE based on the criterion that, starting from a TS conformation, the probabilities of folding and unfolding are about equal through Markov Chain Monte Carlo (MCMC) simulations. In this study, we use the technique of constrained sequential Monte Carlo method [Lin, J. Chem. Phys. 129, 094101 (2008); Zhang, Proteins 66, 61 (2007)] to generate TSE conformations of acylphosphatase of 98 residues that satisfy the ϕ-value constraints, as well as the criterion that each conformation has a folding probability of 0.5 by Monte Carlo simulations. We adopt a two stage process and first generate 5000 contact maps satisfying the ϕ-value constraints. Each contact map is then used to generate 1000 properly weighted conformations. After clustering similar conformations, we obtain a set of properly weighted samples of 4185 candidate clusters. A representative conformation of each of these clusters is then selected, and 50 runs of Markov chain Monte Carlo (MCMC) simulation are carried out using a regrowth move set. We then select a subset of 1501 conformations that have equal probabilities to fold and to unfold as the set of TSE. These 1501 samples characterize well the distribution of transition state ensemble conformations of acylphosphatase. Compared with previous studies, our approach can access much wider conformational space and can objectively generate conformations that satisfy the ϕ-value constraints and the criterion of 0.5 folding probability without bias. 
In contrast to previous studies, our results show that transition state conformations are very diverse and are far from nativelike when measured in cartesian root-mean-square deviation (cRMSD): the average cRMSD between TSE conformations and the native structure is 9.4 Å  for this short protein, instead of 6 Å reported in previous studies. In addition, we found that the average fraction of native contacts in the TSE is 0.37, with enrichment in native-like β-sheets and a shortage of long range contacts, suggesting such contacts form at a later stage of folding. We further calculate the first passage time of folding of TSE conformations through calculation of physical time associated with the regrowth moves in MCMC simulation through mapping such moves to a Markovian state model, whose transition time was obtained by Langevin dynamics simulations. Our results indicate that despite the large structural diversity of the TSE, they are characterized by similar folding time. Our approach is general and can be used to study TSE in other macromolecules. PMID:21341875

  3. Fitting mechanistic epidemic models to data: A comparison of simple Markov chain Monte Carlo approaches.

    PubMed

    Li, Michael; Dushoff, Jonathan; Bolker, Benjamin M

    2018-07-01

    Simple mechanistic epidemic models are widely used for forecasting and parameter estimation of infectious diseases based on noisy case reporting data. Despite the widespread application of models to emerging infectious diseases, we know little about the comparative performance of standard computational-statistical frameworks in these contexts. Here we build a simple stochastic, discrete-time, discrete-state epidemic model with both process and observation error and use it to characterize the effectiveness of different flavours of Bayesian Markov chain Monte Carlo (MCMC) techniques. We use fits to simulated data, where parameters (and future behaviour) are known, to explore the limitations of different platforms and quantify parameter estimation accuracy, forecasting accuracy, and computational efficiency across combinations of modeling decisions (e.g. discrete vs. continuous latent states, levels of stochasticity) and computational platforms (JAGS, NIMBLE, Stan).

  4. Explicitly integrating parameter, input, and structure uncertainties into Bayesian Neural Networks for probabilistic hydrologic forecasting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Xuesong; Liang, Faming; Yu, Beibei

    2011-11-09

    Estimating uncertainty of hydrologic forecasting is valuable to water resources and other relevant decision making processes. Recently, Bayesian Neural Networks (BNNs) have proved to be powerful tools for quantifying uncertainty of streamflow forecasting. In this study, we propose a Markov Chain Monte Carlo (MCMC) framework to incorporate the uncertainties associated with input, model structure, and parameter into BNNs. This framework allows the structure of the neural networks to change by removing or adding connections between neurons and enables scaling of input data by using rainfall multipliers. The results show that the new BNNs outperform the BNNs that only consider uncertainties associated with parameter and model structure. Critical evaluation of posterior distribution of neural network weights, number of effective connections, rainfall multipliers, and hyper-parameters shows that the assumptions held in our BNNs are not well supported. Further understanding of characteristics of different uncertainty sources and including output error into the MCMC framework are expected to enhance the application of neural networks for uncertainty analysis of hydrologic forecasting.

  5. Hierarchical models and the analysis of bird survey information

    USGS Publications Warehouse

    Sauer, J.R.; Link, W.A.

    2003-01-01

    Management of birds often requires analysis of collections of estimates. We describe a hierarchical modeling approach to the analysis of these data, in which parameters associated with the individual species estimates are treated as random variables, and probability statements are made about the species parameters conditioned on the data. A Markov-Chain Monte Carlo (MCMC) procedure is used to fit the hierarchical model. This approach is computer intensive, and is based upon simulation. MCMC allows for estimation both of parameters and of derived statistics. To illustrate the application of this method, we use the case in which we are interested in attributes of a collection of estimates of population change. Using data for 28 species of grassland-breeding birds from the North American Breeding Bird Survey, we estimate the number of species with increasing populations, provide precision-adjusted rankings of species trends, and describe a measure of population stability as the probability that the trend for a species is within a certain interval. Hierarchical models can be applied to a variety of bird survey applications, and we are investigating their use in estimation of population change from survey data.

  6. HMO selection and Medicare costs: Bayesian MCMC estimation of a robust panel data tobit model with survival.

    PubMed

    Hamilton, B H

    1999-08-01

    The fraction of US Medicare recipients enrolled in health maintenance organizations (HMOs) has increased substantially over the past 10 years. However, the impact of HMOs on health care costs is still hotly debated. In particular, it is argued that HMOs achieve cost reduction through 'cream-skimming' and enrolling relatively healthy patients. This paper develops a Bayesian panel data tobit model of HMO selection and Medicare expenditures for recent US retirees that accounts for mortality over the course of the panel. The model is estimated using Markov Chain Monte Carlo (MCMC) simulation methods, and is novel in that a multivariate t-link is used in place of normality to allow for the heavy-tailed distributions often found in health care expenditure data. The findings indicate that HMOs select individuals who are less likely to have positive health care expenditures prior to enrollment. However, there is no evidence that HMOs disenrol high cost patients. The results also indicate the importance of accounting for survival over the panel, since high mortality probabilities are associated with higher health care expenditures in the last year of life.

  7. No Control Genes Required: Bayesian Analysis of qRT-PCR Data

    PubMed Central

    Matz, Mikhail V.; Wright, Rachel M.; Scott, James G.

    2013-01-01

    Background: Model-based analysis of data from quantitative reverse-transcription PCR (qRT-PCR) is potentially more powerful and versatile than traditional methods. Yet existing model-based approaches cannot properly deal with the higher sampling variances associated with low-abundant targets, nor do they provide a natural way to incorporate assumptions about the stability of control genes directly into the model-fitting process. Results: In our method, raw qPCR data are represented as molecule counts, and described using generalized linear mixed models under Poisson-lognormal error. A Markov Chain Monte Carlo (MCMC) algorithm is used to sample from the joint posterior distribution over all model parameters, thereby estimating the effects of all experimental factors on the expression of every gene. The Poisson-based model allows for the correct specification of the mean-variance relationship of the PCR amplification process, and can also glean information from instances of no amplification (zero counts). Our method is very flexible with respect to control genes: any prior knowledge about the expected degree of their stability can be directly incorporated into the model. Yet the method provides sensible answers without such assumptions, or even in the complete absence of control genes. We also present a natural Bayesian analogue of the “classic” analysis, which uses standard data pre-processing steps (logarithmic transformation and multi-gene normalization) but estimates all gene expression changes jointly within a single model. The new methods are considerably more flexible and powerful than the standard delta-delta Ct analysis based on pairwise t-tests. Conclusions: Our methodology expands the applicability of the relative-quantification analysis protocol all the way to the lowest-abundance targets, and provides a novel opportunity to analyze qRT-PCR data without making any assumptions concerning target stability. These procedures have been implemented as the MCMC.qpcr package in R. PMID:23977043

  8. Gene genealogies for genetic association mapping, with application to Crohn's disease

    PubMed Central

    Burkett, Kelly M.; Greenwood, Celia M. T.; McNeney, Brad; Graham, Jinko

    2013-01-01

    A gene genealogy describes relationships among haplotypes sampled from a population. Knowledge of the gene genealogy for a set of haplotypes is useful for estimation of population genetic parameters and it also has potential application in finding disease-predisposing genetic variants. As the true gene genealogy is unknown, Markov chain Monte Carlo (MCMC) approaches have been used to sample genealogies conditional on data at multiple genetic markers. We previously implemented an MCMC algorithm to sample from an approximation to the distribution of the gene genealogy conditional on haplotype data. Our approach samples ancestral trees, recombination and mutation rates at a genomic focal point. In this work, we describe how our sampler can be used to find disease-predisposing genetic variants in samples of cases and controls. We use a tree-based association statistic that quantifies the degree to which case haplotypes are more closely related to each other around the focal point than control haplotypes, without relying on a disease model. As the ancestral tree is a latent variable, so is the tree-based association statistic. We show how the sampler can be used to estimate the posterior distribution of the latent test statistic and corresponding latent p-values, which together comprise a fuzzy p-value. We illustrate the approach on a publicly-available dataset from a study of Crohn's disease that consists of genotypes at multiple SNP markers in a small genomic region. We estimate the posterior distribution of the tree-based association statistic and the recombination rate at multiple focal points in the region. Reassuringly, the posterior mean recombination rates estimated at the different focal points are consistent with previously published estimates. The tree-based association approach finds multiple sub-regions where the case haplotypes are more genetically related than the control haplotypes, and that there may be one or multiple disease-predisposing loci. 
PMID:24348515

  9. SVD/MCMC Data Analysis Pipeline for Global Redshifted 21-cm Spectrum Observations of the Cosmic Dawn and Dark Ages

    NASA Astrophysics Data System (ADS)

    Burns, Jack O.; Tauscher, Keith; Rapetti, David; Mirocha, Jordan; Switzer, Eric

    2018-01-01

    We have designed a complete data analysis pipeline for constraining Cosmic Dawn physics using sky-averaged spectra in the VHF range (40-200 MHz) obtained either from the ground (e.g., the Experiment to Detect Global Epoch of Reionization Signal, EDGES; and the Cosmic Twilight Polarimeter, CTP) or from orbit above the lunar farside (e.g., the Dark Ages Radio Explorer, DARE). In the case of DARE, we avoid Earth-based RFI, ionospheric effects, and radio solar emissions (when observing at night). To extract the 21-cm spectrum, we parametrize the cosmological signal and systematics with two separate sets of modes defined through Singular Value Decomposition (SVD) of training set curves. The training set for the 21-cm spin-flip brightness temperatures is composed of theoretical models of the first stars, galaxies and black holes created by varying physical parameters within the ares code. The systematics training set is created using sky and beam data to model the beam-weighted foregrounds (which are about four orders of magnitude larger than the signal) as well as expected lab data to model receiver systematics. To constrain physical parameters determining the 21-cm spectrum, we apply to the extracted signal a series of consecutive fitting techniques including two usages of a Markov Chain Monte Carlo (MCMC) algorithm. Importantly, our pipeline efficiently utilizes the significant differences between the foreground and the 21-cm signal in spatial and spectral variations. In addition, it incorporates for the first time polarization data, dramatically improving the constraining power. We are currently validating this end-to-end pipeline using detailed simulations of the signal, foregrounds and instruments. This work was directly supported by the NASA Solar System Exploration Research Virtual Institute cooperative agreement number 80ARC017M0006 and funding from the NASA Ames Research Center cooperative agreement NNX16AF59G.

  10. Real-time characterization of partially observed epidemics using surrogate models.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Safta, Cosmin; Ray, Jaideep; Lefantzi, Sophia

    We present a statistical method, predicated on the use of surrogate models, for the 'real-time' characterization of partially observed epidemics. Observations consist of counts of symptomatic patients, diagnosed with the disease, that may be available in the early epoch of an ongoing outbreak. Characterization, in this context, refers to estimation of epidemiological parameters that can be used to provide short-term forecasts of the ongoing epidemic, as well as to provide gross information on the dynamics of the etiologic agent in the affected population, e.g., the time-dependent infection rate. The characterization problem is formulated as a Bayesian inverse problem, and epidemiological parameters are estimated as distributions using a Markov chain Monte Carlo (MCMC) method, thus quantifying the uncertainty in the estimates. In some cases, the inverse problem can be computationally expensive, primarily due to the epidemic simulator used inside the inversion algorithm. We present a method, based on replacing the epidemiological model with computationally inexpensive surrogates, that can reduce the computational time to minutes, without a significant loss of accuracy. The surrogates are created by projecting the output of an epidemiological model on a set of polynomial chaos bases; thereafter, computations involving the surrogate model reduce to evaluations of a polynomial. We find that the epidemic characterizations obtained with the surrogate models are very close to those obtained with the original model. We also find that the number of projections required to construct a surrogate model is O(10)-O(10^2) less than the number of samples required by the MCMC to construct a stationary posterior distribution; thus, depending upon the epidemiological models in question, it may be possible to omit the offline creation and caching of surrogate models, prior to their use in an inverse problem. The technique is demonstrated on synthetic data as well as observations from the 1918 influenza pandemic collected at Camp Custer, Michigan.
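
    The surrogate construction described in this abstract can be illustrated with a minimal Python sketch. It assumes a single uncertain input rescaled to [-1, 1] with a uniform prior, for which Legendre polynomials are the matching polynomial chaos basis. The function names are illustrative, and `toy_model` is a hypothetical stand-in for the epidemic simulator; none of this is taken from the authors' code.

```python
import math

def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) from Bonnet's recurrence."""
    p_prev, p = 0.0, 1.0
    for k in range(n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p, p_prev

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # starting guess
        for _ in range(50):  # Newton iterations on P_n(x) = 0
            p, p_prev = legendre_pair(n, x)
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-14:
                break
        p, p_prev = legendre_pair(n, x)
        dp = n * (x * p - p_prev) / (x * x - 1.0)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def project(model, degree, n_nodes=10):
    """Project model output onto Legendre polynomials up to `degree`."""
    nodes, weights = gauss_legendre(n_nodes)
    runs = [model(x) for x in nodes]  # the only calls to the expensive model
    coeffs = []
    for k in range(degree + 1):
        integral = sum(w * y * legendre_pair(k, x)[0]
                       for w, y, x in zip(weights, runs, nodes))
        coeffs.append((2 * k + 1) / 2.0 * integral)
    return coeffs

def surrogate(coeffs, x):
    """Evaluate the polynomial surrogate; no model run is needed here."""
    return sum(c * legendre_pair(k, x)[0] for k, c in enumerate(coeffs))

def toy_model(x):
    """Hypothetical stand-in for an expensive simulator."""
    return math.exp(x)
```

    With these assumptions, `project(toy_model, degree=6)` costs ten model runs, after which an MCMC sampler could call `surrogate(coeffs, x)` as often as needed at the price of a polynomial evaluation.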

  11. Bayesian parameter estimation for nonlinear modelling of biological pathways.

    PubMed

    Ghasemi, Omid; Lindsey, Merry L; Yang, Tianyi; Nguyen, Nguyen; Huang, Yufei; Jin, Yu-Fang

    2011-01-01

    The availability of temporal measurements on biological experiments has significantly promoted research areas in systems biology. To gain insight into the interaction and regulation of biological systems, mathematical frameworks such as ordinary differential equations have been widely applied to model biological pathways and interpret the temporal data. Hill equations are the preferred formats to represent the reaction rate in differential equation frameworks, due to their simple structures and their capabilities for easy fitting to saturated experimental measurements. However, Hill equations are highly nonlinearly parameterized functions, and parameters in these functions cannot be measured easily. Additionally, because of its high nonlinearity, adaptive parameter estimation algorithms developed for linear parameterized differential equations cannot be applied. Therefore, parameter estimation in nonlinearly parameterized differential equation models for biological pathways is both challenging and rewarding. In this study, we propose a Bayesian parameter estimation algorithm to estimate parameters in nonlinear mathematical models for biological pathways using time series data. We used the Runge-Kutta method to transform differential equations to difference equations assuming a known structure of the differential equations. This transformation allowed us to generate predictions dependent on previous states and to apply a Bayesian approach, namely, the Markov chain Monte Carlo (MCMC) method. We applied this approach to the biological pathways involved in the left ventricle (LV) response to myocardial infarction (MI) and verified our algorithm by estimating two parameters in a Hill equation embedded in the nonlinear model. We further evaluated our estimation performance with different parameter settings and signal to noise ratios. Our results demonstrated the effectiveness of the algorithm for both linearly and nonlinearly parameterized dynamic systems. 
Our proposed Bayesian algorithm successfully estimated parameters in nonlinear mathematical models for biological pathways. This method can be further extended to high order systems and thus provides a useful tool to analyze biological dynamics and extract information using temporal data.
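
    The workflow in this abstract (Runge-Kutta discretization of the differential equation, then MCMC over two Hill parameters) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' left-ventricle model: a hypothetical one-state system dx/dt = hill(u(t)) - decay*x with input u(t) = t, Gaussian measurement noise of known standard deviation, uniform prior bounds, and a plain random-walk Metropolis sampler (the abstract does not say which MCMC variant was used).

```python
import math
import random

def hill(u, K, n):
    """Hill function u^n / (K^n + u^n); equals 0.5 at u == K."""
    return u ** n / (K ** n + u ** n)

def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h * k1 / 2)
    k3 = f(t + h / 2, x + h * k2 / 2)
    k4 = f(t + h, x + h * k3)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def simulate(K, n, times, x0=0.0, decay=0.5):
    """Turn the ODE into a difference equation on the grid `times`."""
    xs = [x0]
    for t0, t1 in zip(times[:-1], times[1:]):
        rhs = lambda t, x: hill(t, K, n) - decay * x
        xs.append(rk4_step(rhs, t0, xs[-1], t1 - t0))
    return xs

def log_posterior(theta, times, data, sigma):
    """Gaussian log-likelihood plus a uniform prior on (0, 10) x (0, 10)."""
    K, n = theta
    if not (0.0 < K < 10.0 and 0.0 < n < 10.0):
        return -math.inf
    pred = simulate(K, n, times)
    return -0.5 * sum((y - p) ** 2 for y, p in zip(data, pred)) / sigma ** 2

def metropolis(logp, theta0, n_iter, step, rng):
    """Random-walk Metropolis; returns the chain and the acceptance rate."""
    theta, lp = list(theta0), logp(theta0)
    chain, accepted = [], 0
    for _ in range(n_iter):
        prop = [th + step * rng.gauss(0.0, 1.0) for th in theta]
        lp_prop = logp(prop)
        delta = lp_prop - lp
        if delta >= 0 or rng.random() < math.exp(delta):
            theta, lp = prop, lp_prop
            accepted += 1
        chain.append(tuple(theta))
    return chain, accepted / n_iter
```

    A typical call would generate synthetic data with `simulate(2.0, 3.0, times)`, add noise, and pass `lambda th: log_posterior(th, times, data, sigma)` to `metropolis`; early samples should be discarded as burn-in before summarizing K and n.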

  12. A Robust Deconvolution Method based on Transdimensional Hierarchical Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Kolb, J.; Lekic, V.

    2012-12-01

    Analysis of P-S and S-P conversions allows us to map receiver side crustal and lithospheric structure. This analysis often involves deconvolution of the parent wave field from the scattered wave field as a means of suppressing source-side complexity. A variety of deconvolution techniques exist including damped spectral division, Wiener filtering, iterative time-domain deconvolution, and the multitaper method. All of these techniques require estimates of noise characteristics as input parameters. We present a deconvolution method based on transdimensional Hierarchical Bayesian inference in which both noise magnitude and noise correlation are used as parameters in calculating the likelihood probability distribution. Because the noise for P-S and S-P conversion analysis in terms of receiver functions is a combination of both background noise - which is relatively easy to characterize - and signal-generated noise - which is much more difficult to quantify - we treat measurement errors as an unknown quantity, characterized by a probability density function whose mean and variance are model parameters. This transdimensional Hierarchical Bayesian approach has been successfully used previously in the inversion of receiver functions in terms of shear and compressional wave speeds of an unknown number of layers [1]. In our method we used a Markov chain Monte Carlo (MCMC) algorithm to find the receiver function that best fits the data while accurately assessing the noise parameters. In order to parameterize the receiver function we model the receiver function as an unknown number of Gaussians of unknown amplitude and width. The algorithm takes multiple steps before calculating the acceptance probability of a new model, in order to avoid getting trapped in local misfit minima.
Using both observed and synthetic data, we show that the MCMC deconvolution method can accurately obtain a receiver function as well as an estimate of the noise parameters given the parent and daughter components. Furthermore, we demonstrate that this new approach is far less susceptible to generating spurious features even at high noise levels. Finally, the method yields not only the most-likely receiver function, but also quantifies its full uncertainty. [1] Bodin, T., M. Sambridge, H. Tkalčić, P. Arroucau, K. Gallagher, and N. Rawlinson (2012), Transdimensional inversion of receiver functions and surface wave dispersion, J. Geophys. Res., 117, B02301

  13. On how to avoid input and structural uncertainties corrupt the inference of hydrological parameters using a Bayesian framework

    NASA Astrophysics Data System (ADS)

    Hernández, Mario R.; Francés, Félix

    2015-04-01

    One phase of the hydrological model implementation process that contributes significantly to the uncertainty of hydrological predictions is the calibration phase, in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the structural deficiencies of the hydrological model. Thus, the biased calibrated parameters cause the divergence model phenomenon, where the error variance of the (spatially and temporally) forecasted flows far exceeds the error variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, they yield a calibrated hydrological model which works well, but not for the right reasons. Besides, an unsuitable error model yields an unreliable predictive uncertainty assessment. Hence, with the aim of preventing all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. The hydrological model used is a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. The MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by obtaining the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly; that is, if non-stationarity in error variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the application of BJI with a GA error model improves the robustness of the hydrological parameters (diminishing the divergence model phenomenon) and the reliability of the streamflow predictive distribution, compared with the results of an unsuitable error model such as SLS. Finally, the most likely prediction in a validation period shows similar performance for both the BJI+GA and SLS error models.

  14. Orbital fitting of imaged planetary companions with high eccentricities and unbound orbits. Their application to Fomalhaut b and PZ Telecopii B

    NASA Astrophysics Data System (ADS)

    Beust, H.; Bonnefoy, M.; Maire, A.-L.; Ehrenreich, D.; Lagrange, A.-M.; Chauvin, G.

    2016-03-01

    Context: Regular follow-up of imaged companions to main-sequence stars often allows a projected orbital motion to be detected. Markov chain Monte Carlo (MCMC) has become very popular in recent years for fitting and constraining their orbits. Some of these imaged companions appear to move on very eccentric, possibly unbound orbits. This is, in particular, the case for the exoplanet Fomalhaut b and the brown dwarf companion PZ Tel B on which we focus here. Aims: For these orbits, standard MCMC codes that assume only bound orbits may be inappropriate. Our goal is to develop a new MCMC implementation that is able to handle both bound and unbound orbits in a continuous manner, and to apply this to the cases of Fomalhaut b and PZ Tel B. Methods: We present here this code, based on the use of universal Keplerian variables and Stumpff functions. We present two versions of this code, the second one using a different set of angular variables that were designed to avoid degeneracies arising when the projected orbital motion is quasi-radial, as is the case for PZ Tel B. We also present additional observations of PZ Tel B. Results: The code is applied to Fomalhaut b and PZ Tel B. We confirm previous results in relation to Fomalhaut b, but we show that on the sole basis of the astrometric data, open orbital solutions are also possible. The eccentricity distribution nevertheless still peaks around ~0.9 in the bound regime. We present a first successful orbital fit of PZ Tel B, which shows in particular that, while both bound and unbound orbital solutions are equally possible, the eccentricity distribution presents a sharp peak very close to e = 1, meaning a quasi-parabolic orbit. Conclusions: It has recently been suggested that the presence of unseen inner companions to imaged ones may lead orbital fitting algorithms to artificially give very high eccentricities. We show that this caveat is unlikely to apply to Fomalhaut b. Concerning PZ Tel B, we derive a possible solution, which involves an inner ~12 MJup companion, that would mimic an e = 1 orbit, despite a real eccentricity of around 0.7, but a dynamical analysis reveals that this type of system would not be stable. We thus conclude that our orbital fit is robust. Based on observations collected at the European Organisation for Astronomical Research in the Southern Hemisphere, Chile (Program ID: 085.C-0867(B) and 085.C-0277(B)).
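
    The Stumpff functions named in the Methods are what let a single formula cover elliptic, parabolic and hyperbolic motion. A minimal Python sketch of their standard series definition, c_k(x) = sum over n >= 0 of (-x)^n / (2n + k)!, is given below. The function name and the fixed number of terms are illustrative choices, not the authors' implementation, and the plain series is only adequate for moderate |x|.

```python
import math

def stumpff(k, x, terms=30):
    """Stumpff function c_k(x) from its power series.

    x > 0 corresponds to bound (elliptic) motion, x == 0 to the parabolic
    limit and x < 0 to unbound (hyperbolic) motion, with no branching.
    """
    total = 0.0
    term = 1.0 / math.factorial(k)  # n = 0 term
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (-x) / ((2n + k + 1)(2n + k + 2))
        term *= -x / ((2 * n + k + 1) * (2 * n + k + 2))
    return total
```

    For example, `stumpff(0, x)` reduces to cos(sqrt(x)) for x > 0 and to cosh(sqrt(-x)) for x < 0, and the recurrence c_k(x) = 1/k! - x c_{k+2}(x) holds across x = 0, which is the property that allows bound and unbound orbits to be sampled in one continuous parameter space.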

  15. Adaptive MCMC in Bayesian phylogenetics: an application to analyzing partitioned data in BEAST.

    PubMed

    Baele, Guy; Lemey, Philippe; Rambaut, Andrew; Suchard, Marc A

    2017-06-15

    Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and different positions within those genes, requiring a large number of evolutionary parameters that have to be estimated, leading to an increased computational burden for such analyses. The past two decades have also seen the rise of multi-core processors, both in the central processing unit (CPU) and graphics processing unit (GPU) processor markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses. We here propose a Markov chain Monte Carlo (MCMC) approach using an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach enables the estimation of these multipartite parameters more efficiently than standard approaches that typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves by > 14-fold. Our implementation is part of the BEAST code base, a widely used open source software package to perform Bayesian phylogenetic inference. Contact: guy.baele@kuleuven.be. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  16. Neural Dynamics as Sampling: A Model for Stochastic Computation in Recurrent Networks of Spiking Neurons

    PubMed Central

    Buesing, Lars; Bill, Johannes; Nessler, Bernhard; Maass, Wolfgang

    2011-01-01

    The organization of computations in networks of spiking neurons in the brain is still largely unknown, in particular in view of the inherently stochastic features of their firing activity and the experimentally observed trial-to-trial variability of neural systems in the brain. In principle there exists a powerful computational framework for stochastic computations, probabilistic inference by sampling, which can explain a large number of macroscopic experimental data in neuroscience and cognitive science. But it has turned out to be surprisingly difficult to create a link between these abstract models for stochastic computations and more detailed models of the dynamics of networks of spiking neurons. Here we create such a link and show that under some conditions the stochastic firing activity of networks of spiking neurons can be interpreted as probabilistic inference via Markov chain Monte Carlo (MCMC) sampling. Since common methods for MCMC sampling in distributed systems, such as Gibbs sampling, are inconsistent with the dynamics of spiking neurons, we introduce a different approach based on non-reversible Markov chains that is able to reflect inherent temporal processes of spiking neuronal activity through a suitable choice of random variables. We propose a neural network model and show by a rigorous theoretical analysis that its neural activity implements MCMC sampling of a given distribution, both for the case of discrete and continuous time. This provides a step towards closing the gap between abstract functional models of cortical computation and more detailed models of networks of spiking neurons. PMID:22096452

  17. Two-dimensional probabilistic inversion of plane-wave electromagnetic data: methodology, model constraints and joint inversion with electrical resistivity data

    NASA Astrophysics Data System (ADS)

    Rosas-Carbajal, Marina; Linde, Niklas; Kalscheuer, Thomas; Vrugt, Jasper A.

    2014-03-01

    Probabilistic inversion methods based on Markov chain Monte Carlo (MCMC) simulation are well suited to quantify parameter and model uncertainty of nonlinear inverse problems. Yet, application of such methods to CPU-intensive forward models can be a daunting task, particularly if the parameter space is high dimensional. Here, we present a 2-D pixel-based MCMC inversion of plane-wave electromagnetic (EM) data. Using synthetic data, we investigate how model parameter uncertainty depends on model structure constraints using different norms of the likelihood function and the model constraints, and study the added benefits of joint inversion of EM and electrical resistivity tomography (ERT) data. Our results demonstrate that model structure constraints are necessary to stabilize the MCMC inversion results of a highly discretized model. These constraints decrease model parameter uncertainty and facilitate model interpretation. A drawback is that these constraints may lead to posterior distributions that do not fully include the true underlying model, because some of its features exhibit a low sensitivity to the EM data, and hence are difficult to resolve. This problem can be partly mitigated if the plane-wave EM data is augmented with ERT observations. The hierarchical Bayesian inverse formulation introduced and used herein is able to successfully recover the probabilistic properties of the measurement data errors and a model regularization weight. Application of the proposed inversion methodology to field data from an aquifer demonstrates that the posterior mean model realization is very similar to that derived from a deterministic inversion with similar model constraints.

  18. RESPONDENT-DRIVEN SAMPLING AS MARKOV CHAIN MONTE CARLO

    PubMed Central

    GOEL, SHARAD; SALGANIK, MATTHEW J.

    2013-01-01

    Respondent-driven sampling (RDS) is a recently introduced, and now widely used, technique for estimating disease prevalence in hidden populations. RDS data are collected through a snowball mechanism, in which current sample members recruit future sample members. In this paper we present respondent-driven sampling as Markov chain Monte Carlo (MCMC) importance sampling, and we examine the effects of community structure and the recruitment procedure on the variance of RDS estimates. Past work has assumed that the variance of RDS estimates is primarily affected by segregation between healthy and infected individuals. We examine an illustrative model to show that this is not necessarily the case, and that bottlenecks anywhere in the networks can substantially affect estimates. We also show that variance is inflated by a common design feature in which sample members are encouraged to recruit multiple future sample members. The paper concludes with suggestions for implementing and evaluating respondent-driven sampling studies. PMID:19572381

  19. Dose calculation accuracy of the Monte Carlo algorithm for CyberKnife compared with other commercially available dose calculation algorithms.

    PubMed

    Sharma, Subhash; Ott, Joseph; Williams, Jamone; Dickow, Danny

    2011-01-01

    Monte Carlo dose calculation algorithms have the potential for greater accuracy than traditional model-based algorithms. This enhanced accuracy is particularly evident in regions of lateral scatter disequilibrium, which can develop during treatments incorporating small field sizes and low-density tissue. A heterogeneous slab phantom was used to evaluate the accuracy of several commercially available dose calculation algorithms, including Monte Carlo dose calculation for CyberKnife, Analytical Anisotropic Algorithm and Pencil Beam convolution for the Eclipse planning system, and convolution-superposition for the Xio planning system. The phantom accommodated slabs of varying density; comparisons between planned and measured dose distributions were accomplished with radiochromic film. The Monte Carlo algorithm provided the closest agreement between planned and measured dose distributions. In each phantom irradiation, the Monte Carlo predictions resulted in gamma analysis comparisons >97%, using acceptance criteria of 3% dose and 3-mm distance to agreement. In general, the gamma analysis comparisons for the other algorithms were <95%. The Monte Carlo dose calculation algorithm for CyberKnife provides more accurate dose distribution calculations in regions of lateral electron disequilibrium than commercially available model-based algorithms. This is primarily because of the ability of Monte Carlo algorithms to implicitly account for tissue heterogeneities; density scaling functions and/or effective depth correction factors are not required. Copyright © 2011 American Association of Medical Dosimetrists. Published by Elsevier Inc. All rights reserved.
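
    The gamma analysis pass rates quoted in this abstract combine a dose criterion (3%) with a distance-to-agreement criterion (3 mm). A minimal one-dimensional Python sketch follows. The study compared film measurements, so the reduction to 1D, the global normalization to the maximum measured dose and the function names are simplifying assumptions made here for illustration.

```python
import math

def gamma_index_1d(positions, measured, planned, dose_tol=0.03, dist_tol=3.0):
    """Gamma value at each measured point (positions and dist_tol in mm).

    dose_tol is a fraction of the maximum measured dose (global
    normalization, an assumption of this sketch).
    """
    dose_norm = dose_tol * max(measured)
    gammas = []
    for x_m, d_m in zip(positions, measured):
        # search every planned point for the smallest combined deviation
        best = min(
            math.hypot((x_c - x_m) / dist_tol, (d_c - d_m) / dose_norm)
            for x_c, d_c in zip(positions, planned)
        )
        gammas.append(best)
    return gammas

def pass_rate(gammas):
    """Percentage of points with gamma <= 1."""
    return 100.0 * sum(g <= 1.0 for g in gammas) / len(gammas)
```

    A point passes when some planned point lies within the combined dose and distance ellipse, so a pass rate above 97% means that fewer than 3% of measured points failed both criteria jointly.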

  20. Development and Tuning of a 3D Stochastic Inversion Methodology to the European Arctic

    DTIC Science & Technology

    2010-09-01

    from previous studies covering the region, in particular from Breivik et al. (2002). Our MCMC algorithm shown in Figure 3 has two major components...criteria, Geophys. J. Int., 156: 483–496, doi:10.1111/j.1365-246X.2004.570 02070.x. Breivik , A., R. Mjelde, P. Grogan, H. Shimamura, Y. Murai, Y

  1. Physical time scale in kinetic Monte Carlo simulations of continuous-time Markov chains.

    PubMed

    Serebrinsky, Santiago A

    2011-03-01

    We rigorously establish a physical time scale for a general class of kinetic Monte Carlo algorithms for the simulation of continuous-time Markov chains. This class of algorithms encompasses rejection-free (or BKL) and rejection (or "standard") algorithms. For rejection algorithms, it was formerly considered that the availability of a physical time scale (instead of Monte Carlo steps) was empirical, at best. Use of Monte Carlo steps as a time unit now becomes completely unnecessary.
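
    As an illustration of the rejection-free (BKL) class mentioned above, the following minimal sketch advances a continuous-time Markov chain by choosing an event with probability proportional to its rate and drawing the physical waiting time from an exponential distribution. The function names, the two-state example, and the rate values are hypothetical and are not taken from the paper; the rejection variant is not shown.

```python
import math
import random


def bkl_step(rates, rng):
    """One rejection-free step: return (event index, physical time increment)."""
    total = sum(rates)
    # Pick event i with probability rates[i] / total.
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1
    for i, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            chosen = i
            break
    # Waiting time is exponential with mean 1 / total.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


def simulate_two_state(k_ab, k_ba, t_end, seed=0):
    """Simulate A <-> B up to physical time t_end; return fraction of time in A."""
    rng = random.Random(seed)
    state, t, time_in_a = 0, 0.0, 0.0  # state 0 is A, state 1 is B
    while t < t_end:
        rates = [k_ab] if state == 0 else [k_ba]
        _, dt = bkl_step(rates, rng)
        dt = min(dt, t_end - t)  # truncate the last interval at t_end
        if state == 0:
            time_in_a += dt
        t += dt
        state = 1 - state
    return time_in_a / t_end


if __name__ == "__main__":
    # Hypothetical rates; the long-run fraction in A should approach k_ba / (k_ab + k_ba).
    print(simulate_two_state(k_ab=1.0, k_ba=3.0, t_end=20000.0))
```

    Because time advances by the sampled waiting times rather than by step counts, the occupancy fraction is expressed directly in physical time.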

  2. Potential uncertainty reduction in model-averaged benchmark dose estimates informed by an additional dose study.

    PubMed

    Shao, Kan; Small, Mitchell J

    2011-10-01

    A methodology is presented for assessing the information value of an additional dosage experiment in existing bioassay studies. The analysis demonstrates the potential reduction in the uncertainty of toxicity metrics derived from expanded studies, providing insights for future studies. Bayesian methods are used to fit alternative dose-response models using Markov chain Monte Carlo (MCMC) simulation for parameter estimation and Bayesian model averaging (BMA) is used to compare and combine the alternative models. BMA predictions for benchmark dose (BMD) are developed, with uncertainty in these predictions used to derive the lower bound BMDL. The MCMC and BMA results provide a basis for a subsequent Monte Carlo analysis that backcasts the dosage where an additional test group would have been most beneficial in reducing the uncertainty in the BMD prediction, along with the magnitude of the expected uncertainty reduction. Uncertainty reductions are measured in terms of reduced interval widths of predicted BMD values and increases in BMDL values that occur as a result of this reduced uncertainty. The methodology is illustrated using two existing data sets for TCDD carcinogenicity, fitted with two alternative dose-response models (logistic and quantal-linear). The example shows that an additional dose at a relatively high value would have been most effective for reducing the uncertainty in BMA BMD estimates, with predicted reductions in the widths of uncertainty intervals of approximately 30%, and expected increases in BMDL values of 5-10%. The results demonstrate that dose selection for studies that subsequently inform dose-response models can benefit from consideration of how these models will be fit, combined, and interpreted. © 2011 Society for Risk Analysis.

  3. Regression without truth with Markov chain Monte-Carlo

    NASA Astrophysics Data System (ADS)

    Madan, Hennadii; Pernuš, Franjo; Likar, Boštjan; Špiclin, Žiga

    2017-03-01

    Regression without truth (RWT) is a statistical technique for estimating error model parameters of each method in a group of methods used for measurement of a certain quantity. A very attractive aspect of RWT is that it does not rely on a reference method or "gold standard" data, which is otherwise difficult to obtain. RWT was first used for a reference-free performance comparison of several methods for measuring left ventricular ejection fraction (EF), i.e. the percentage of blood leaving the ventricle each time the heart contracts, and has since been applied to various other quantitative imaging biomarkers (QIBs). Herein, we show how Markov chain Monte-Carlo (MCMC), a computational technique for drawing samples from a statistical distribution with probability density function known only up to a normalizing coefficient, can be used to augment RWT to gain a number of important benefits compared to the original approach based on iterative optimization. For instance, the proposed MCMC-based RWT enables the estimation of the joint posterior distribution of the parameters of the error model, straightforward quantification of uncertainty of the estimates, and estimation of the true value of the measurand and corresponding credible intervals (CIs); it does not require a finite support for the prior distribution of the measurand, and it generally has a much improved robustness against convergence to non-global maxima. The proposed approach is validated using synthetic data that emulate the EF data for 45 patients measured with 8 different methods. The obtained results show that the 90% CIs of the corresponding parameter estimates contain the true values of all error model parameters and the measurand. A potential real-world application is to take measurements of a certain QIB with several different methods and then use the proposed framework to compute the estimates of the true values and their uncertainty, vital information for diagnosis based on QIB.
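
    The property that MCMC needs the density only up to a normalizing coefficient can be seen in a minimal random-walk Metropolis sketch. This is not the RWT error model from the paper: the target (an unnormalized Gaussian), the step size, and all names below are illustrative assumptions.

```python
import math
import random


def metropolis(log_unnorm_density, x0, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    x = x0
    log_p = log_unnorm_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_unnorm_density(proposal)
        # Accept with probability min(1, p_new / p_old); only the ratio is
        # needed, so the unknown normalizing constant cancels.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_target(x):
    # Hypothetical target: Gaussian with mean 2 and standard deviation 0.5,
    # written without its normalizing constant.
    return -0.5 * ((x - 2.0) / 0.5) ** 2


samples = metropolis(log_target, x0=0.0, step=1.0, n_samples=20000)
kept = samples[2000:]  # discard an initial burn-in segment
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((s - post_mean) ** 2 for s in kept) / len(kept))
```

    Summaries such as credible intervals can then be read off the retained samples, for example from their empirical quantiles.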

  4. Assessing Mediational Models: Testing and Interval Estimation for Indirect Effects.

    PubMed

    Biesanz, Jeremy C; Falk, Carl F; Savalei, Victoria

    2010-08-06

    Theoretical models specifying indirect or mediated effects are common in the social sciences. An indirect effect exists when an independent variable's influence on the dependent variable is mediated through an intervening variable. Classic approaches to assessing such mediational hypotheses (Baron & Kenny, 1986; Sobel, 1982) have in recent years been supplemented by computationally intensive methods such as bootstrapping, the distribution of the product methods, and hierarchical Bayesian Markov chain Monte Carlo (MCMC) methods. These different approaches for assessing mediation are illustrated using data from Dunn, Biesanz, Human, and Finn (2007). However, little is known about how these methods perform relative to each other, particularly in more challenging situations, such as with data that are incomplete and/or nonnormal. This article presents an extensive Monte Carlo simulation evaluating a host of approaches for assessing mediation. We examine Type I error rates, power, and coverage. We study normal and nonnormal data as well as complete and incomplete data. In addition, we adapt a method, recently proposed in statistical literature, that does not rely on confidence intervals (CIs) to test the null hypothesis of no indirect effect. The results suggest that the new inferential method, the partial posterior p value, slightly outperforms existing ones in terms of maintaining Type I error rates while maximizing power, especially with incomplete data. Among confidence interval approaches, the bias-corrected accelerated (BCa) bootstrapping approach often has inflated Type I error rates and inconsistent coverage and is not recommended. In contrast, the bootstrapped percentile confidence interval and the hierarchical Bayesian MCMC method perform best overall, maintaining Type I error rates, exhibiting reasonable power, and producing stable and accurate coverage rates.

  5. Uncertainty analysis for fluorescence tomography with Monte Carlo method

    NASA Astrophysics Data System (ADS)

    Reinbacher-Köstinger, Alice; Freiberger, Manuel; Scharfetter, Hermann

    2011-07-01

    Fluorescence tomography seeks to image an inaccessible fluorophore distribution inside an object like a small animal by injecting light at the boundary and measuring the light emitted by the fluorophore. Optical parameters (e.g. the conversion efficiency or the fluorescence life-time) of certain fluorophores depend on physiologically interesting quantities like the pH value or the oxygen concentration in the tissue, which allows functional rather than just anatomical imaging. To reconstruct the concentration and the life-time from the boundary measurements, a nonlinear inverse problem has to be solved. It is, however, difficult to estimate the uncertainty of the reconstructed parameters in case of iterative algorithms and a large number of degrees of freedom. Uncertainties in fluorescence tomography applications arise from model inaccuracies, discretization errors, data noise and a priori errors. Thus, a Markov chain Monte Carlo (MCMC) method was used to consider all these uncertainty factors, exploiting the Bayesian formulation of conditional probabilities. A 2-D simulation experiment was carried out for a circular object with two inclusions. Both inclusions had a 2-D Gaussian distribution of the concentration and constant life-time inside of a representative area of the inclusion. Forward calculations were done with the diffusion approximation of Boltzmann's transport equation. The reconstruction results show that the percent estimation error of the lifetime parameter is lower than that of the concentration by a factor of approximately 10. This finding suggests that lifetime imaging may provide more accurate information than concentration imaging only. The results must be interpreted with caution, however, because the chosen simulation setup represents a special case, and a more detailed analysis remains to be done in the future to clarify whether the findings can be generalized.

  6. Medical imaging feasibility in body fluids using Markov chains

    NASA Astrophysics Data System (ADS)

    Kavehrad, M.; Armstrong, A. D.

    2017-02-01

    A relatively wide field-of-view and high resolution imaging is necessary for navigating the scope within the body, inspecting tissue, diagnosing disease, and guiding surgical interventions. As the large number of modes available in multimode fibers (MMFs) provides higher resolution, MMFs could replace the millimeters-thick bundles of fibers and lenses currently used in endoscopes. However, attributes of body fluids and obscurants such as blood impose perennial limitations on the resolution and reliability of optical imaging inside the human body. To design and evaluate optimum imaging techniques that operate under realistic body fluid conditions, a good understanding of the channel (medium) behavior is necessary. In most prior works, the Monte-Carlo Ray Tracing (MCRT) algorithm has been used to analyze the channel behavior. This task is quite numerically intensive. The focus of this paper is on investigating the possibility of simplifying this task by a direct extraction of state transition matrices associated with standard Markov modeling from the MCRT computer simulation programs. We show that by tracing a photon's trajectory in the body fluids via a Markov chain model, the angular distribution can be calculated by simple matrix multiplications. We also demonstrate that the new approach produces results that are close to those obtained by MCRT and other known methods. Furthermore, considering the fact that angular, spatial, and temporal distributions of energy are inter-related, the mixing time of the Monte-Carlo Markov Chain (MCMC) for different types of liquid concentrations is calculated based on eigen-analysis of the state transition matrix, and the possibility of imaging in scattering media is investigated. To this end, we have started to characterize the body fluids that reduce the resolution of imaging [1].
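
    The idea of obtaining a distribution over angular bins by repeated multiplication with a state transition matrix can be sketched as follows. The 3-bin matrix below is a made-up toy, not one extracted from MCRT simulations, and the mixing time is measured here empirically through the total variation distance rather than by the eigen-analysis the abstract describes; all names are hypothetical.

```python
def step(dist, P):
    """Advance a row-vector distribution by one transition: dist @ P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def propagate(dist, P, n_steps):
    """Distribution after n_steps transitions."""
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mixing_time(P, start, stationary, eps=0.01, max_steps=10000):
    """Smallest step count at which the distance to stationarity is <= eps."""
    dist = list(start)
    for t in range(max_steps + 1):
        if total_variation(dist, stationary) <= eps:
            return t
        dist = step(dist, P)
    return None  # did not mix within max_steps


# Hypothetical transition matrix over three angular bins (each row sums to 1).
# It is symmetric, so the uniform distribution is stationary.
P = [[0.8, 0.2, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.2, 0.8]]
start = [1.0, 0.0, 0.0]  # photon starts in the first bin
uniform = [1.0 / 3.0] * 3

after_10 = propagate(start, P, 10)
t_mix = mixing_time(P, start, uniform)
```

    For larger matrices the decay rate is governed by the second-largest eigenvalue modulus, which is why an eigen-decomposition gives the mixing behavior without stepping the chain.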

  7. Modelling the spread of American foulbrood in honeybees

    PubMed Central

    Datta, Samik; Bull, James C.; Budge, Giles E.; Keeling, Matt J.

    2013-01-01

    We investigate the spread of American foulbrood (AFB), a disease caused by the bacterium Paenibacillus larvae, that affects bees and can be extremely damaging to beehives. Our dataset comes from an inspection period carried out during an AFB epidemic of honeybee colonies on the island of Jersey during the summer of 2010. The data include the number of hives of honeybees, location and owner of honeybee apiaries across the island. We use a spatial SIR model with an underlying owner network to simulate the epidemic and characterize the epidemic using a Markov chain Monte Carlo (MCMC) scheme to determine model parameters and infection times (including undetected ‘occult’ infections). Likely methods of infection spread can be inferred from the analysis, with both distance- and owner-based transmissions being found to contribute to the spread of AFB. The results of the MCMC are corroborated by simulating the epidemic using a stochastic SIR model, resulting in aggregate levels of infection that are comparable to the data. We use this stochastic SIR model to simulate the impact of different control strategies on controlling the epidemic. It is found that earlier inspections result in smaller epidemics and a higher likelihood of AFB extinction. PMID:24026473

  8. Modeling Nitrogen Dynamics in a Waste Stabilization Pond System Using Flexible Modeling Environment with MCMC.

    PubMed

    Mukhtar, Hussnain; Lin, Yu-Pin; Shipin, Oleg V; Petway, Joy R

    2017-07-12

    This study presents an approach for obtaining realization sets of parameters for nitrogen removal in a pilot-scale waste stabilization pond (WSP) system. The proposed approach was designed for optimal parameterization, local sensitivity analysis, and global uncertainty analysis of a dynamic simulation model for the WSP by using the R software package Flexible Modeling Environment (R-FME) with the Markov chain Monte Carlo (MCMC) method. Additionally, generalized likelihood uncertainty estimation (GLUE) was integrated into the FME to evaluate the major parameters that affect the simulation outputs in the study WSP. Comprehensive modeling analysis was used to simulate and assess nine parameters and concentrations of ON-N, NH₃-N and NO₃-N. Results indicate that the integrated FME-GLUE-based model, with good Nash-Sutcliffe coefficients (0.53-0.69) and correlation coefficients (0.76-0.83), successfully simulates the concentrations of ON-N, NH₃-N and NO₃-N. Moreover, the Arrhenius constant was the only parameter sensitive to model performances of ON-N and NH₃-N simulations. However, Nitrosomonas growth rate, the denitrification constant, and the maximum growth rate at 20 °C were sensitive to ON-N and NO₃-N simulation, which was measured using global sensitivity.
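
    For readers unfamiliar with the Nash-Sutcliffe coefficient reported above, a minimal sketch of its standard definition follows: one minus the ratio of the residual sum of squares to the variance of the observations about their mean. The function name and the sample values are hypothetical and are not data from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 for a perfect fit, 0 when the model is
    no better than predicting the observed mean, negative when worse."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Hypothetical concentrations (mg/L), for illustration only.
obs = [2.0, 3.5, 4.0, 5.5, 6.0]
sim = [2.4, 3.1, 4.3, 5.0, 6.4]
nse = nash_sutcliffe(obs, sim)
```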

  9. Incorporating approximation error in surrogate based Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Zeng, L.; Li, W.; Wu, L.

    2015-12-01

    There is increasing interest in applying surrogates in inverse Bayesian modeling to reduce repetitive evaluations of the original model and thereby save computational cost. However, the approximation error of the surrogate model is usually overlooked. This is partly because it is difficult to evaluate the approximation error for many surrogates. Previous studies have shown that the direct combination of surrogates and Bayesian methods (e.g., Markov Chain Monte Carlo, MCMC) may lead to biased estimations when the surrogate cannot emulate the highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner. However, the computational cost is still high since a relatively large number of original model simulations are required. In this study, we illustrate the importance of incorporating approximation error in inverse Bayesian modeling. A Gaussian process (GP) is chosen to construct the surrogate for its convenience in approximation error evaluation. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is well incorporated into the Bayesian framework, promising results can be obtained even when the surrogate is directly used, and no further original model simulations are required.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vrugt, Jasper A; Robinson, Bruce A; Ter Braak, Cajo J F

    In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.

  11. Triadic split-merge sampler

    NASA Astrophysics Data System (ADS)

    van Rossum, Anne C.; Lin, Hai Xiang; Dubbeldam, Johan; van der Herik, H. Jaap

    2018-04-01

    In machine vision, typical heuristic methods to extract parameterized objects out of raw data points are the Hough transform and RANSAC. Bayesian models carry the promise to optimally extract such parameterized objects given a correct definition of the model and the type of noise at hand. One category of solvers for Bayesian models is Markov chain Monte Carlo (MCMC) methods. Naive implementations of MCMC methods suffer from slow convergence in machine vision due to the complexity of the parameter space. To address this, blocked Gibbs and split-merge samplers have been developed that assign multiple data points to clusters at once. In this paper we introduce a new split-merge sampler, the triadic split-merge sampler, that performs steps between two and three randomly chosen clusters. This has two advantages. First, it reduces the asymmetry between the split and merge steps. Second, it is able to propose a new cluster that is composed of data points from two different clusters. Both advantages speed up convergence, which we demonstrate on a line extraction problem. We show that the triadic split-merge sampler outperforms the conventional split-merge sampler. Although this new MCMC sampler is demonstrated in this machine vision context, its applications extend to the very general domain of statistical inference.

  12. RECONSTRUCTING THREE-DIMENSIONAL JET GEOMETRY FROM TWO-DIMENSIONAL IMAGES

    NASA Astrophysics Data System (ADS)

    Avachat, Sayali; Perlman, Eric S.; Li, Kunyang; Kosak, Katie

    2018-01-01

    Relativistic jets in AGN are among the most interesting and complex structures in the Universe. Some of the jets extend over hundreds of kiloparsecs from the central engine and display various bends, knots and hotspots. Observations of the jets can prove helpful in understanding the emission and particle acceleration processes from sub-arcsec to kiloparsec scales and the role of the magnetic field in them. The M87 jet has many bright knots as well as regions of small and large bends. We attempt to model the jet geometry using the observed two-dimensional structure. The radio and optical images of the jet show evidence of a helical magnetic field throughout. Using the observed structure in the sky frame, our goal is to gain insight into the intrinsic three-dimensional geometry in the jet's frame. The structure of the bends in the jet's frame may be quite different from what we see in the sky frame. Knowledge of the intrinsic structure will be helpful in understanding the appearance of the magnetic field and hence the polarization morphology. To achieve this, we are using numerical methods to solve the non-linear equations based on the jet geometry. We are using the log-likelihood method and an algorithm based on Markov Chain Monte Carlo (MCMC) simulations.

  13. Demonstration of emulator-based Bayesian calibration of safety analysis codes: Theory and formulation

    DOE PAGES

    Yurko, Joseph P.; Buongiorno, Jacopo; Youngblood, Robert

    2015-05-28

    System codes for simulation of safety performance of nuclear plants may contain parameters whose values are not known very accurately. New information from tests or operating experience is incorporated into safety codes by a process known as calibration, which reduces uncertainty in the output of the code and thereby improves its support for decision-making. The work reported here implements several improvements on classic calibration techniques afforded by modern analysis techniques. The key innovation has come from development of code surrogate model (or code emulator) construction and prediction algorithms. Use of a fast emulator makes the calibration processes used here with Markov Chain Monte Carlo (MCMC) sampling feasible. This study uses Gaussian Process (GP) based emulators, which have been used previously to emulate computer codes in the nuclear field. The present work describes the formulation of an emulator that incorporates GPs into a factor analysis-type or pattern recognition-type model. This “function factorization” Gaussian Process (FFGP) model allows overcoming limitations present in standard GP emulators, thereby improving both accuracy and speed of the emulator-based calibration process. Calibration of a friction-factor example using a Method of Manufactured Solution is performed to illustrate key properties of the FFGP based process.

  14. Assessment of Agricultural Water Management in Punjab, India using Bayesian Methods

    NASA Astrophysics Data System (ADS)

    Russo, T. A.; Devineni, N.; Lall, U.; Sidhu, R.

    2013-12-01

    The success of the Green Revolution in Punjab, India is threatened by the declining water table (approx. 1 m/yr). Punjab, a major agricultural supplier for the rest of India, supports irrigation with a canal system and groundwater, which is vastly over-exploited. Groundwater development in many districts is greater than 200% of the annual recharge rate. The hydrologic data required to complete a mass-balance model are not available for this region; therefore, we use Bayesian methods to estimate hydrologic properties and irrigation requirements. Using the known values of precipitation, total canal water delivery, crop yield, and water table elevation, we solve for each unknown parameter (often a coefficient) using a Markov chain Monte Carlo (MCMC) algorithm. Results provide regional estimates of irrigation requirements and groundwater recharge rates under observed climate conditions (1972 to 2002). Model results are used to estimate future water availability and demand to help inform agriculture management decisions under projected climate conditions. We find that changing cropping patterns for the region can maintain food production while balancing groundwater pumping with natural recharge. This computational method can be applied in data-scarce regions across the world, where agricultural water management is required to resolve competition between food security and changing resource availability.

  15. Receptive Field Inference with Localized Priors

    PubMed Central

    Park, Mijung; Pillow, Jonathan W.

    2011-01-01

    The linear receptive field describes a mapping from sensory stimuli to a one-dimensional variable governing a neuron's spike response. However, traditional receptive field estimators such as the spike-triggered average converge slowly and often require large amounts of data. Bayesian methods seek to overcome this problem by biasing estimates towards solutions that are more likely a priori, typically those with small, smooth, or sparse coefficients. Here we introduce a novel Bayesian receptive field estimator designed to incorporate locality, a powerful form of prior information about receptive field structure. The key to our approach is a hierarchical receptive field model that flexibly adapts to localized structure in both spacetime and spatiotemporal frequency, using an inference method known as empirical Bayes. We refer to our method as automatic locality determination (ALD), and show that it can accurately recover various types of smooth, sparse, and localized receptive fields. We apply ALD to neural data from retinal ganglion cells and V1 simple cells, and find it achieves error rates several times lower than standard estimators. Thus, estimates of comparable accuracy can be achieved with substantially less data. Finally, we introduce a computationally efficient Markov Chain Monte Carlo (MCMC) algorithm for fully Bayesian inference under the ALD prior, yielding accurate Bayesian confidence intervals for small or noisy datasets. PMID:22046110

  16. Bayesian Hierarchical Random Intercept Model Based on Three Parameter Gamma Distribution

    NASA Astrophysics Data System (ADS)

    Wirawati, Ika; Iriawan, Nur; Irhamah

    2017-06-01

    Hierarchical data structures are common throughout many areas of research. In the past, the existence of this type of data received little attention in analysis. The appropriate statistical analysis to handle this type of data is the hierarchical linear model (HLM). This article focuses only on the random intercept model (RIM), a subclass of HLM. This model assumes that the intercepts of the models at the lowest level vary among those models, while their slopes are fixed. The differences among intercepts are suspected to be affected by some variables at the upper level. These intercepts, therefore, are regressed against those upper-level variables as predictors. The purpose of this paper is to demonstrate the proposed two-level RIM by modeling per capita household expenditure in Maluku Utara, with five characteristics at the first level and three characteristics of districts/cities at the second level. The per capita household expenditure data at the first level were modeled with the three-parameter Gamma distribution. The model is therefore more complex, owing to the interaction of the many parameters representing the hierarchical structure and distribution pattern of the data. To simplify parameter estimation, a computational Bayesian method coupled with a Markov Chain Monte Carlo (MCMC) algorithm using Gibbs sampling is employed.

  17. RNA folding kinetics using Monte Carlo and Gillespie algorithms.

    PubMed

    Clote, Peter; Bayegan, Amir H

    2018-04-01

    RNA secondary structure folding kinetics is known to be important for the biological function of certain processes, such as the hok/sok system in E. coli. Although linear algebra provides an exact computational solution of secondary structure folding kinetics with respect to the Turner energy model for tiny ([Formula: see text]20 nt) RNA sequences, the folding kinetics for larger sequences can only be approximated by binning structures into macrostates in a coarse-grained model, or by repeatedly simulating secondary structure folding with either the Monte Carlo algorithm or the Gillespie algorithm. Here we investigate the relation between the Monte Carlo algorithm and the Gillespie algorithm. We prove that asymptotically, the expected time for a K-step trajectory of the Monte Carlo algorithm is equal to [Formula: see text] times that of the Gillespie algorithm, where [Formula: see text] denotes the Boltzmann expected network degree. If the network is regular (i.e. every node has the same degree), then the mean first passage time (MFPT) computed by the Monte Carlo algorithm is equal to MFPT computed by the Gillespie algorithm multiplied by [Formula: see text]; however, this is not true for non-regular networks. In particular, RNA secondary structure folding kinetics, as computed by the Monte Carlo algorithm, is not equal to the folding kinetics, as computed by the Gillespie algorithm, although the mean first passage times are roughly correlated. Simulation software for RNA secondary structure folding according to the Monte Carlo and Gillespie algorithms is publicly available, as is our software to compute the expected degree of the network of secondary structures of a given RNA sequence; see http://bioinformatics.bc.edu/clote/RNAexpNumNbors .
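
    The contrast between the two simulation schemes can be sketched on a toy network. The code below assumes Metropolis rates min(1, exp(-ΔE/RT)) for both algorithms and uses a hypothetical 4-node cycle with made-up energies; it is not the authors' software and does not use the Turner energy model. On this regular network (every node has degree 2), a holding-time argument gives a Monte Carlo mean first passage time equal to the node degree times the Gillespie one; whether that factor matches the elided formula in the abstract is an assumption.

```python
import math
import random


def metropolis_rate(e_from, e_to, rt=1.0):
    """Metropolis rate for a move between two states with given energies."""
    return min(1.0, math.exp(-(e_to - e_from) / rt))


def monte_carlo_fpt(neighbors, energy, start, target, rng):
    """First passage time counted in Monte Carlo steps (rejections included)."""
    x, steps = start, 0
    while x != target:
        y = rng.choice(neighbors[x])  # propose a uniformly chosen neighbor
        if rng.random() < metropolis_rate(energy[x], energy[y]):
            x = y
        steps += 1
    return steps


def gillespie_fpt(neighbors, energy, start, target, rng):
    """First passage time in continuous time; every step is a move."""
    x, t = start, 0.0
    while x != target:
        rates = [metropolis_rate(energy[x], energy[y]) for y in neighbors[x]]
        flux = sum(rates)
        t += -math.log(1.0 - rng.random()) / flux  # exponential waiting time
        x = rng.choices(neighbors[x], weights=rates)[0]
    return t


def mean_fpt(simulator, n_trials, seed):
    rng = random.Random(seed)
    total = sum(simulator(NEIGHBORS, ENERGY, 0, 2, rng) for _ in range(n_trials))
    return total / n_trials


# Hypothetical regular network: a cycle 0-1-2-3-0 with made-up energies.
NEIGHBORS = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
ENERGY = [0.0, 1.0, 0.5, -1.0]

mc_mean = mean_fpt(monte_carlo_fpt, 4000, seed=1)
gillespie_mean = mean_fpt(gillespie_fpt, 4000, seed=2)
ratio = mc_mean / gillespie_mean  # should be close to the degree, 2, here
```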

  18. A Bayesian Retrieval of Greenland Ice Sheet Internal Temperature from Ultra-wideband Software-defined Microwave Radiometer (UWBRAD) Measurements

    NASA Astrophysics Data System (ADS)

    Duan, Y.; Durand, M. T.; Jezek, K. C.; Yardim, C.; Bringer, A.; Aksoy, M.; Johnson, J. T.

    2017-12-01

    The ultra-wideband software-defined microwave radiometer (UWBRAD) is designed to provide an ice sheet internal temperature product by measuring low-frequency microwave emission. The instrument covers twelve channels ranging from 0.5 to 2.0 GHz. An airborne demonstration over Greenland in September 2016 provided the first demonstration of ultra-wideband radiometer observations of geophysical scenes, including ice sheets. Another flight is planned for September 2017 to acquire measurements over the central ice sheet. A Bayesian framework is designed to retrieve the ice sheet internal temperature from simulated UWBRAD brightness temperature (Tb) measurements over the Greenland flight path with limited prior information about the ground. A 1-D heat-flow model, the Robin model, was used to model the ice sheet internal temperature profile with ground information. Synthetic UWBRAD Tb observations were generated via the partially coherent radiation transfer model, which takes the Robin model temperature profile and an exponential fit of ice density from borehole measurements as input, and were then corrupted with noise. The effective surface temperature, geothermal heat flux, the variance of upper layer ice density, and the variance of fine scale density variation deeper in the ice sheet were treated as unknown variables within the retrieval framework. Each parameter is given a possible range and assumed to be uniformly distributed within it. The Markov Chain Monte Carlo (MCMC) approach is applied to let the unknown parameters perform a random walk in the parameter space. We investigate whether estimates of the variables can be improved over their priors using the MCMC approach and thereby contribute, in theory, to the temperature retrieval. UWBRAD measurements near Camp Century from 2016 were also processed with the MCMC approach to examine the framework in the presence of scattering effects. The fine scale density fluctuation is an important parameter: it is the most sensitive yet most poorly known parameter in the estimation framework. Including the fine scale density fluctuation greatly improved the retrieval results. The ice sheet vertical temperature profile, especially the 10 m temperature, can be well retrieved via the MCMC process. Future retrieval work will apply the Bayesian approach to UWBRAD airborne measurements.

  19. Skill (or lack thereof) of data-model fusion techniques to provide an early warning signal for an approaching tipping point.

    PubMed

    Singh, Riddhi; Quinn, Julianne D; Reed, Patrick M; Keller, Klaus

    2018-01-01

    Many coupled human-natural systems have the potential to exhibit a highly nonlinear threshold response to external forcings resulting in fast transitions to undesirable states (such as eutrophication in a lake). Often, there are considerable uncertainties that make identifying the threshold challenging. Thus, rapid learning is critical for guiding management actions to avoid abrupt transitions. Here, we adopt the shallow lake problem as a test case to compare the performance of four common data assimilation schemes to predict an approaching transition. In order to demonstrate the complex interactions between management strategies and the ability of the data assimilation schemes to predict eutrophication, we also analyze our results across two different management strategies governing phosphorus emissions into the shallow lake. The compared data assimilation schemes are: ensemble Kalman filtering (EnKF), particle filtering (PF), pre-calibration (PC), and Markov Chain Monte Carlo (MCMC) estimation. While differing in their core assumptions, each data assimilation scheme is based on Bayes' theorem and updates prior beliefs about a system based on new information. For large computational investments, EnKF, PF and MCMC show similar skill in capturing the observed phosphorus in the lake (measured as expected root mean squared prediction error). EnKF, followed by PF, displays the highest learning rates at low computational cost, thus providing a more reliable signal of an impending transition. MCMC approaches the true probability of eutrophication only after a strong signal of an impending transition emerges from the observations. Overall, we find that learning rates are greatest near regions of abrupt transitions, posing a challenge to early learning and preemptive management of systems with such abrupt transitions.

  20. Skill (or lack thereof) of data-model fusion techniques to provide an early warning signal for an approaching tipping point

    PubMed Central

    Quinn, Julianne D.; Reed, Patrick M.; Keller, Klaus

    2018-01-01

    Many coupled human-natural systems have the potential to exhibit a highly nonlinear threshold response to external forcings resulting in fast transitions to undesirable states (such as eutrophication in a lake). Often, there are considerable uncertainties that make identifying the threshold challenging. Thus, rapid learning is critical for guiding management actions to avoid abrupt transitions. Here, we adopt the shallow lake problem as a test case to compare the performance of four common data assimilation schemes to predict an approaching transition. In order to demonstrate the complex interactions between management strategies and the ability of the data assimilation schemes to predict eutrophication, we also analyze our results across two different management strategies governing phosphorus emissions into the shallow lake. The compared data assimilation schemes are: ensemble Kalman filtering (EnKF), particle filtering (PF), pre-calibration (PC), and Markov Chain Monte Carlo (MCMC) estimation. While differing in their core assumptions, each data assimilation scheme is based on Bayes’ theorem and updates prior beliefs about a system based on new information. For large computational investments, EnKF, PF and MCMC show similar skill in capturing the observed phosphorus in the lake (measured as expected root mean squared prediction error). EnKF, followed by PF, displays the highest learning rates at low computational cost, thus providing a more reliable signal of an impending transition. MCMC approaches the true probability of eutrophication only after a strong signal of an impending transition emerges from the observations. Overall, we find that learning rates are greatest near regions of abrupt transitions, posing a challenge to early learning and preemptive management of systems with such abrupt transitions. PMID:29389938

  1. Multi-objective calibration and uncertainty analysis of hydrologic models; A comparative study between formal and informal methods

    NASA Astrophysics Data System (ADS)

    Shafii, M.; Tolson, B.; Matott, L. S.

    2012-04-01

    Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in building of higher levels of complexity into hydrologic models, which eventually makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically-based uncertainty analysis routines such as Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, the Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims at exploring a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality which quantifies the parameter uncertainty using the Pareto solutions, (ii) DDS-AU which uses the weighted sum of objective functions to derive the prediction limits, and (iii) GLUE which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden, and predictive capacity, which are evaluated based on multiple comparative measures. The measures for comparison are calculated both for calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model, called HYMOD.

  2. Accounting for the measurement error of spectroscopically inferred soil carbon data for improved precision of spatial predictions.

    PubMed

    Somarathna, P D S N; Minasny, Budiman; Malone, Brendan P; Stockmann, Uta; McBratney, Alex B

    2018-08-01

    Spatial modelling of environmental data commonly only considers spatial variability as the single source of uncertainty. In reality however, the measurement errors should also be accounted for. In recent years, infrared spectroscopy has been shown to offer low cost, yet invaluable information needed for digital soil mapping at meaningful spatial scales for land management. However, spectrally inferred soil carbon data are known to be less accurate compared to laboratory analysed measurements. This study establishes a methodology to filter out the measurement error variability by incorporating the measurement error variance in the spatial covariance structure of the model. The study was carried out in the Lower Hunter Valley, New South Wales, Australia where a combination of laboratory measured, and vis-NIR and MIR inferred topsoil and subsoil soil carbon data are available. We investigated the applicability of residual maximum likelihood (REML) and Markov Chain Monte Carlo (MCMC) simulation methods to generate parameters of the Matérn covariance function directly from the data in the presence of measurement error. The results revealed that the measurement error can be effectively filtered-out through the proposed technique. When the measurement error was filtered from the data, the prediction variance almost halved, which ultimately yielded a greater certainty in spatial predictions of soil carbon. Further, the MCMC technique was successfully used to define the posterior distribution of measurement error. This is an important outcome, as the MCMC technique can be used to estimate the measurement error if it is not explicitly quantified. Although this study dealt with soil carbon data, this method is amenable for filtering the measurement error of any kind of continuous spatial environmental data. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. Comparison of multipoint linkage analyses for quantitative traits in the CEPH data: parametric LOD scores, variance components LOD scores, and Bayes factors.

    PubMed

    Sung, Yun Ju; Di, Yanming; Fu, Audrey Q; Rothstein, Joseph H; Sieh, Weiva; Tong, Liping; Thompson, Elizabeth A; Wijsman, Ellen M

    2007-01-01

    We performed multipoint linkage analyses with multiple programs and models for several gene expression traits in the Centre d'Etude du Polymorphisme Humain families. All analyses provided consistent results for both peak location and shape. Variance-components (VC) analysis gave wider peaks and Bayes factors gave fewer peaks. Among programs from the MORGAN package, lm_multiple performed better than lm_markers, resulting in less Markov-chain Monte Carlo (MCMC) variability between runs, and the program lm_twoqtl provided higher LOD scores by also including either a polygenic component or an additional quantitative trait locus.

  4. Comparison of multipoint linkage analyses for quantitative traits in the CEPH data: parametric LOD scores, variance components LOD scores, and Bayes factors

    PubMed Central

    Sung, Yun Ju; Di, Yanming; Fu, Audrey Q; Rothstein, Joseph H; Sieh, Weiva; Tong, Liping; Thompson, Elizabeth A; Wijsman, Ellen M

    2007-01-01

    We performed multipoint linkage analyses with multiple programs and models for several gene expression traits in the Centre d'Etude du Polymorphisme Humain families. All analyses provided consistent results for both peak location and shape. Variance-components (VC) analysis gave wider peaks and Bayes factors gave fewer peaks. Among programs from the MORGAN package, lm_multiple performed better than lm_markers, resulting in less Markov-chain Monte Carlo (MCMC) variability between runs, and the program lm_twoqtl provided higher LOD scores by also including either a polygenic component or an additional quantitative trait locus. PMID:18466597

  5. Cell-veto Monte Carlo algorithm for long-range systems.

    PubMed

    Kapfer, Sebastian C; Krauth, Werner

    2016-09-01

    We present a rigorous efficient event-chain Monte Carlo algorithm for long-range interacting particle systems. Using a cell-veto scheme within the factorized Metropolis algorithm, we compute each single-particle move with a fixed number of operations. For slowly decaying potentials such as Coulomb interactions, screening line charges allow us to take into account periodic boundary conditions. We discuss the performance of the cell-veto Monte Carlo algorithm for general inverse-power-law potentials, and illustrate how it provides a new outlook on one of the prominent bottlenecks in large-scale atomistic Monte Carlo simulations.

  6. Quantum speedup of Monte Carlo methods.

    PubMed

    Montanaro, Ashley

    2015-09-08

    Monte Carlo methods use random sampling to estimate numerical quantities which are hard to compute deterministically. One important example is the use in statistical physics of rapidly mixing Markov chains to approximately compute partition functions. In this work, we describe a quantum algorithm which can accelerate Monte Carlo methods in a very general setting. The algorithm estimates the expected output value of an arbitrary randomized or quantum subroutine with bounded variance, achieving a near-quadratic speedup over the best possible classical algorithm. Combining the algorithm with the use of quantum walks gives a quantum speedup of the fastest known classical algorithms with rigorous performance bounds for computing partition functions, which use multiple-stage Markov chain Monte Carlo techniques. The quantum algorithm can also be used to estimate the total variation distance between probability distributions efficiently.
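    As a point of reference for the classical baseline mentioned in the abstract above, the following is a minimal sketch of plain Monte Carlo estimation of an expected value. The function name `mc_estimate`, the test integrand, and the sample size are illustrative assumptions and do not come from the paper. The standard error of such an estimate falls roughly as 1/sqrt(n), which is the classical scaling that a near-quadratic speedup would improve on.

```python
import math
import random


def mc_estimate(f, sampler, n, rng):
    """Average f over n independent draws; return (mean, standard error).

    Hypothetical helper for illustration only.
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        y = f(sampler(rng))
        total += y
        total_sq += y * y
    mean = total / n
    # Population variance of the sampled values, clipped at zero for safety.
    var = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(var / n)


rng = random.Random(0)  # fixed seed so the run is reproducible
# Estimate E[X^2] for X ~ Uniform(0, 1); the exact value is 1/3.
estimate, stderr = mc_estimate(lambda x: x * x, lambda r: r.random(), 100_000, rng)
print(estimate, stderr)
```

    Quadrupling `n` should roughly halve the reported standard error.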

  7. Quantum speedup of Monte Carlo methods

    PubMed Central

    Montanaro, Ashley

    2015-01-01

    Monte Carlo methods use random sampling to estimate numerical quantities which are hard to compute deterministically. One important example is the use in statistical physics of rapidly mixing Markov chains to approximately compute partition functions. In this work, we describe a quantum algorithm which can accelerate Monte Carlo methods in a very general setting. The algorithm estimates the expected output value of an arbitrary randomized or quantum subroutine with bounded variance, achieving a near-quadratic speedup over the best possible classical algorithm. Combining the algorithm with the use of quantum walks gives a quantum speedup of the fastest known classical algorithms with rigorous performance bounds for computing partition functions, which use multiple-stage Markov chain Monte Carlo techniques. The quantum algorithm can also be used to estimate the total variation distance between probability distributions efficiently. PMID:26528079

  8. Unfolding Neutron Spectrum with Markov Chain Monte Carlo at MIT Research Reactor with He-3 Neutral Current Detectors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leder, A.; Anderson, A. J.; Billard, J.

    2018-02-02

    The Ricochet experiment seeks to measure Coherent (neutral-current) Elastic Neutrino-Nucleus Scattering (CEνNS) using dark-matter-style detectors with sub-keV thresholds placed near a neutrino source, such as the MIT (research) Reactor (MITR), which operates at 5.5 MW generating approximately 2.2 × 10^18 ν/second in its core. Currently, Ricochet is characterizing the backgrounds at MITR, the main component of which comes in the form of neutrons emitted from the core simultaneous with the neutrino signal. To characterize this background, we wrapped Bonner cylinders around a ³He thermal neutron detector, whose data was then unfolded via a Markov Chain Monte Carlo (MCMC) to produce a neutron energy spectrum across several orders of magnitude. We discuss the resulting spectrum and its implications for deploying Ricochet at the MITR site as well as the feasibility of reducing this background level via the addition of polyethylene shielding around the detector setup.

  9. Unfolding neutron spectrum with Markov Chain Monte Carlo at MIT research Reactor with He-3 Neutral Current Detectors

    NASA Astrophysics Data System (ADS)

    Leder, A.; Anderson, A. J.; Billard, J.; Figueroa-Feliciano, E.; Formaggio, J. A.; Hasselkus, C.; Newman, E.; Palladino, K.; Phuthi, M.; Winslow, L.; Zhang, L.

    2018-02-01

    The Ricochet experiment seeks to measure Coherent (neutral-current) Elastic Neutrino-Nucleus Scattering (CEνNS) using dark-matter-style detectors with sub-keV thresholds placed near a neutrino source, such as the MIT (research) Reactor (MITR), which operates at 5.5 MW generating approximately 2.2 × 10^18 ν/second in its core. Currently, Ricochet is characterizing the backgrounds at MITR, the main component of which comes in the form of neutrons emitted from the core simultaneous with the neutrino signal. To characterize this background, we wrapped Bonner cylinders around a ³He thermal neutron detector, whose data was then unfolded via a Markov Chain Monte Carlo (MCMC) to produce a neutron energy spectrum across several orders of magnitude. We discuss the resulting spectrum and its implications for deploying Ricochet at the MITR site as well as the feasibility of reducing this background level via the addition of polyethylene shielding around the detector setup.

  10. Measurement of the low-energy quenching factor in germanium using an 88Y/Be photoneutron source

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scholz, B. J.; Chavarria, A. E.; Collar, J. I.

    2016-12-01

    We employ an 88Y/Be photoneutron source to derive the quenching factor for neutron-induced nuclear recoils in germanium, probing recoil energies from a few hundred eVnr to 8.5 keVnr. A comprehensive Monte Carlo simulation of our setup is compared to experimental data employing a Lindhard model with a free electronic energy loss k and an adiabatic correction for sub-keVnr nuclear recoils. The best fit k=0.179±0.001 obtained using a Monte Carlo Markov chain (MCMC) ensemble sampler is in good agreement with previous measurements, confirming the adequacy of the Lindhard model to describe the stopping of few-keV ions in germanium crystals at a temperature of ~77 K. This value of k corresponds to a quenching factor of 13.7% to 25.3% for nuclear recoil energies between 0.3 and 8.5 keVnr, respectively.

  11. The Rational Hybrid Monte Carlo algorithm

    NASA Astrophysics Data System (ADS)

    Clark, Michael

    2006-12-01

    The past few years have seen considerable progress in algorithmic development for the generation of gauge fields including the effects of dynamical fermions. The Rational Hybrid Monte Carlo (RHMC) algorithm, where Hybrid Monte Carlo is performed using a rational approximation in place of the usual inverse quark matrix kernel, is one of these developments. This algorithm has been found to be extremely beneficial in many areas of lattice QCD (chiral fermions, finite temperature, Wilson fermions, etc.). We review the algorithm and some of these benefits, and we compare against other recent algorithm developments. We conclude with an update of the Berlin wall plot comparing costs of all popular fermion formulations.

  12. Do race, ethnicity, and psychiatric diagnoses matter in the prevalence of multiple chronic medical conditions?

    PubMed

    Cabassa, Leopoldo J; Humensky, Jennifer; Druss, Benjamin; Lewis-Fernández, Roberto; Gomes, Arminda P; Wang, Shuai; Blanco, Carlos

    2013-06-01

    The proportion of people in the United States with multiple chronic medical conditions (MCMC) is increasing. Yet, little is known about the relationship that race, ethnicity, and psychiatric disorders have on the prevalence of MCMCs in the general population. This study used data from wave 2 of the National Epidemiologic Survey on Alcohol and Related Conditions (N=33,107). Multinomial logistic regression models adjusting for sociodemographic variables, body mass index, and quality of life were used to examine differences in the 12-month prevalence of MCMC by race/ethnicity, psychiatric diagnosis, and the interactions between race/ethnicity and psychiatric diagnosis. Compared to non-Hispanic Whites, Hispanics reported lower odds of MCMC and African Americans reported higher odds of MCMC after adjusting for covariates. People with psychiatric disorders reported higher odds of MCMC compared with people without psychiatric disorders. There were significant interactions between race and psychiatric diagnosis associated with rates of MCMC. In the presence of certain psychiatric disorders, the odds of MCMC were higher among African Americans with psychiatric disorders compared to non-Hispanic Whites with similar psychiatric disorders. Our study results indicate that race, ethnicity, and psychiatric disorders are associated with the prevalence of MCMC. As the rates of MCMC rise, it is critical to identify which populations are at increased risk and how to best direct services to address their health care needs.

  13. Efficient inference for genetic association studies with multiple outcomes.

    PubMed

    Ruffieux, Helene; Davison, Anthony C; Hager, Jorg; Irincheeva, Irina

    2017-10-01

    Combined inference for heterogeneous high-dimensional data is critical in modern biology, where clinical and various kinds of molecular data may be available from a single study. Classical genetic association studies regress a single clinical outcome on many genetic variants one by one, but there is an increasing demand for joint analysis of many molecular outcomes and genetic variants in order to unravel functional interactions. Unfortunately, most existing approaches to joint modeling are either too simplistic to be powerful or are impracticable for computational reasons. Inspired by Richardson and others (2010, Bayesian Statistics 9), we consider a sparse multivariate regression model that allows simultaneous selection of predictors and associated responses. As Markov chain Monte Carlo (MCMC) inference on such models can be prohibitively slow when the number of genetic variants exceeds a few thousand, we propose a variational inference approach which produces posterior information very close to that of MCMC inference, at a much reduced computational cost. Extensive numerical experiments show that our approach outperforms popular variable selection methods and tailored Bayesian procedures, dealing within hours with problems involving hundreds of thousands of genetic variants and tens to hundreds of clinical or molecular outcomes. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  14. Modeling two strains of disease via aggregate-level infectivity curves.

    PubMed

    Romanescu, Razvan; Deardon, Rob

    2016-04-01

    Well formulated models of disease spread, and efficient methods to fit them to observed data, are powerful tools for aiding the surveillance and control of infectious diseases. Our project considers the problem of the simultaneous spread of two related strains of disease in a context where spatial location is the key driver of disease spread. We start our modeling work with the individual level models (ILMs) of disease transmission, and extend these models to accommodate the competing spread of the pathogens in a two-tier hierarchical population (whose levels we refer to as 'farm' and 'animal'). The postulated interference mechanism between the two strains is a period of cross-immunity following infection. We also present a framework for speeding up the computationally intensive process of fitting the ILM to data, typically done using Markov chain Monte Carlo (MCMC) in a Bayesian framework, by turning the inference into a two-stage process. First, we approximate the number of animals infected on a farm over time by infectivity curves. These curves are fit to data sampled from farms, using maximum likelihood estimation, then, conditional on the fitted curves, Bayesian MCMC inference proceeds for the remaining parameters. Finally, we use posterior predictive distributions of salient epidemic summary statistics, in order to assess the model fitted.

  15. Measurements of Kepler Planet Masses and Eccentricities from Transit Timing Variations: Analytic and N-body Results

    NASA Astrophysics Data System (ADS)

    Hadden, Sam; Lithwick, Yoram

    2015-12-01

    Several Kepler planets reside in multi-planet systems where gravitational interactions result in transit timing variations (TTVs) that provide exquisitely sensitive probes of their masses and orbits. Measuring these planets' masses and orbits constrains their bulk compositions and can provide clues about their formation. However, inverting TTV measurements in order to infer planet properties can be challenging: it involves fitting a nonlinear model with a large number of parameters to noisy data, often with significant degeneracies between parameters. I present results from two complementary approaches to TTV inversion: Markov chain Monte Carlo simulations that use N-body integrations to compute transit times and a simplified analytic model for computing the TTVs of planets near mean motion resonances. The analytic model allows for straightforward interpretations of N-body results and provides an independent estimate of parameter uncertainties that can be compared to MCMC results which may be sensitive to factors such as priors. We have conducted extensive MCMC simulations along with analytic fits to model the TTVs of dozens of Kepler multi-planet systems. We find that the bulk of these sub-Jovian planets have low densities that necessitate significant gaseous envelopes. We also find that the planets' eccentricities are generally small but often definitively non-zero.

  16. Modeling Nitrogen Dynamics in a Waste Stabilization Pond System Using Flexible Modeling Environment with MCMC

    PubMed Central

    Mukhtar, Hussnain; Lin, Yu-Pin; Shipin, Oleg V.; Petway, Joy R.

    2017-01-01

    This study presents an approach for obtaining realization sets of parameters for nitrogen removal in a pilot-scale waste stabilization pond (WSP) system. The proposed approach was designed for optimal parameterization, local sensitivity analysis, and global uncertainty analysis of a dynamic simulation model for the WSP by using the R software package Flexible Modeling Environment (R-FME) with the Markov chain Monte Carlo (MCMC) method. Additionally, generalized likelihood uncertainty estimation (GLUE) was integrated into the FME to evaluate the major parameters that affect the simulation outputs in the study WSP. Comprehensive modeling analysis was used to simulate and assess nine parameters and concentrations of ON-N, NH3-N and NO3-N. Results indicate that the integrated FME-GLUE-based model, with good Nash–Sutcliffe coefficients (0.53–0.69) and correlation coefficients (0.76–0.83), successfully simulates the concentrations of ON-N, NH3-N and NO3-N. Moreover, the Arrhenius constant was the only parameter sensitive to model performances of ON-N and NH3-N simulations. However, Nitrosomonas growth rate, the denitrification constant, and the maximum growth rate at 20 °C were sensitive to ON-N and NO3-N simulation, which was measured using global sensitivity. PMID:28704958
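    The Nash–Sutcliffe coefficient reported in the abstract above is commonly defined as one minus the ratio of the squared simulation error to the variance of the observations about their mean. The sketch below assumes that standard definition; the function name and the sample numbers are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2).

    A value of 1 is a perfect fit; 0 means the simulation is no better
    than always predicting the observed mean.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0.0:
        raise ValueError("observed values are constant; efficiency is undefined")
    return 1.0 - squared_error / obs_variance


# Made-up concentrations purely for illustration.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
nse = nash_sutcliffe(obs, sim)
print(round(nse, 3))
```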

  17. Comparing hierarchical models via the marginalized deviance information criterion.

    PubMed

    Quintero, Adrian; Lesaffre, Emmanuel

    2018-07-20

    Hierarchical models are extensively used in pharmacokinetics and longitudinal studies. When the estimation is performed from a Bayesian approach, model comparison is often based on the deviance information criterion (DIC). In hierarchical models with latent variables, there are several versions of this statistic: the conditional DIC (cDIC) that incorporates the latent variables in the focus of the analysis and the marginalized DIC (mDIC) that integrates them out. Regardless of the asymptotic and coherency difficulties of cDIC, this alternative is usually used in Markov chain Monte Carlo (MCMC) methods for hierarchical models because of practical convenience. The mDIC criterion is more appropriate in most cases but requires integration of the likelihood, which is computationally demanding and not implemented in Bayesian software. Therefore, we consider a method to compute mDIC by generating replicate samples of the latent variables that need to be integrated out. This alternative can be easily conducted from the MCMC output of Bayesian packages and is widely applicable to hierarchical models in general. Additionally, we propose some approximations in order to reduce the computational complexity for large-sample situations. The method is illustrated with simulated data sets and 2 medical studies, evidencing that cDIC may be misleading whilst mDIC appears pertinent. Copyright © 2018 John Wiley & Sons, Ltd.

  18. Analytic continuation of quantum Monte Carlo data by stochastic analytical inference.

    PubMed

    Fuchs, Sebastian; Pruschke, Thomas; Jarrell, Mark

    2010-05-01

    We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data which is strictly based on principles of Bayesian statistical inference. Within this framework we are able to obtain an explicit expression for the calculation of a weighted average over possible energy spectra, which can be evaluated by standard Monte Carlo simulations, yielding as by-product also the distribution function as function of the regularization parameter. Our algorithm thus avoids the usual ad hoc assumptions introduced in similar algorithms to fix the regularization parameter. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum-entropy calculation.

  19. SU-F-T-619: Dose Evaluation of Specific Patient Plans Based On Monte Carlo Algorithm for a CyberKnife Stereotactic Radiosurgery System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Piao, J; PLA 302 Hospital, Beijing; Xu, S

    2016-06-15

    Purpose: This study will use Monte Carlo to simulate the CyberKnife system, and intends to develop a third-party tool to evaluate the dose verification of specific patient plans in the TPS. Methods: By simulating the treatment head using the BEAMnrc and DOSXYZnrc software, the comparison between the calculated and measured data will be done to determine the beam parameters. The dose distributions calculated with the Ray-tracing and Monte Carlo algorithms of the TPS (Multiplan Ver4.0.2) and with the in-house Monte Carlo simulation method for 30 patient plans, which included 10 head, lung and liver cases in each, were analyzed. The γ analysis with the combined 3 mm/3% criteria would be introduced to quantitatively evaluate the difference of the accuracy between the three algorithms. Results: More than 90% of the global error points were less than 2% for the comparison of the PDD and OAR curves after determining the mean energy and FWHM. The relative ideal Monte Carlo beam model had been established. Based on the quantitative evaluation of dose accuracy for the three algorithms, the results of the γ analysis show that the passing rates (84.88±9.67% for head, 98.83±1.05% for liver, 98.26±1.87% for lung) of PTV in 30 plans between the Monte Carlo simulation and the TPS Monte Carlo algorithm were good. The passing rates (95.93±3.12% and 99.84±0.33%, respectively) of PTV in head and liver plans between the Monte Carlo simulation and the TPS Ray-tracing algorithm were also good. But the difference of DVHs in lung plans between the Monte Carlo simulation and the Ray-tracing algorithm was obvious, and the passing rate (51.263±38.964%) of the γ criteria was not good. It is feasible to use Monte Carlo simulation for verifying the dose distribution of patient plans. Conclusion: The Monte Carlo simulation algorithm developed for the CyberKnife system in this study can be used as a reference tool for the third-party tool, which plays an important role in dose verification of patient plans. This work was supported in part by the grant from Chinese Natural Science Foundation (Grant No. 11275105). Thanks for the support from Accuray Corp.

  20. Estimation of under-reporting in epidemics using approximations.

    PubMed

    Gamado, Kokouvi; Streftaris, George; Zachary, Stan

    2017-06-01

    Under-reporting in epidemics, when it is ignored, leads to under-estimation of the infection rate and therefore of the reproduction number. In the case of stochastic models with temporal data, a usual approach for dealing with such issues is to apply data augmentation techniques through Bayesian methodology. Departing from earlier literature approaches implemented using reversible jump Markov chain Monte Carlo (RJMCMC) techniques, we make use of approximations to obtain faster estimation with simple MCMC. Comparisons among the methods developed here, and with the RJMCMC approach, are carried out and highlight that approximation-based methodology offers useful alternative inference tools for large epidemics, with a good trade-off between time cost and accuracy.

  1. Cosmological Constraint on Brans-Dicke Theory

    NASA Astrophysics Data System (ADS)

    Chen, Xuelei; Wu, Fengquan

    We develop the covariant formalism of the cosmological perturbation theory for the Brans-Dicke gravity, and use it to calculate the cosmic microwave background (CMB) anisotropy and large scale structure (LSS) power spectrum. We introduce a new parameter ζ, which is related to the Brans-Dicke parameter ω by ζ = ln(1/ω + 1), and use the Markov-Chain Monte Carlo (MCMC) method to explore the parameter space. Using the latest CMB data published by the WMAP, ACBAR, CBI and Boomerang teams, and the LSS data from the SDSS survey DR4, we find that the 2σ (95.5%) bound on ζ is about |ζ| < 10^-2, or |ω| > 10^2; the precise limit depends somewhat on the prior used.
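    The relation ζ = ln(1/ω + 1) quoted in the abstract above can be inverted algebraically to ω = 1/(e^ζ − 1). The short sketch below, with hypothetical function names, shows that a value of |ζ| = 10^-2 corresponds to |ω| of roughly 10^2. It is only an arithmetic illustration of the stated relation, not a reproduction of the authors' analysis.

```python
import math


def zeta_from_omega(omega):
    """zeta = ln(1/omega + 1); defined for omega > 0 or omega < -1."""
    return math.log(1.0 / omega + 1.0)


def omega_from_zeta(zeta):
    """Inverse of the relation above: omega = 1 / (exp(zeta) - 1), zeta != 0."""
    return 1.0 / math.expm1(zeta)


omega_pos = omega_from_zeta(0.01)   # positive branch, close to +99.5
omega_neg = omega_from_zeta(-0.01)  # negative branch, close to -100.5
print(omega_pos, omega_neg)
```

    As ζ approaches zero, |ω| grows without bound, which is the general-relativity limit of the theory.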

  2. Long-term optical flux and colour variability in quasars

    NASA Astrophysics Data System (ADS)

    Sukanya, N.; Stalin, C. S.; Jeyakumar, S.; Praveen, D.; Dhani, Arnab; Damle, R.

    2016-02-01

    We have used optical V and R band observations from the Massive Compact Halo Object (MACHO) project on a sample of 59 quasars behind the Magellanic clouds to study their long term optical flux and colour variations. These quasars, lying in the redshift range of 0.2 < z < 2.8 and having apparent V band magnitudes between 16.6 and 20.1 mag, have observations ranging from 49 to 1353 epochs spanning over 7.5 yr with frequency of sampling between 2 to 10 days. All the quasars show variability during the observing period. The normalised excess variance (F_var) in V and R bands are in the range 0.2% < F_var^V < 1.6% and 0.1% < F_var^R < 1.5% respectively. In a large fraction of the sources, F_var is larger in the V band compared to the R band. From the z-transformed discrete cross-correlation function analysis, we find that there is no lag between the V and R band variations. Adopting the Markov Chain Monte Carlo (MCMC) approach, and properly taking into account the correlation between the errors in colours and magnitudes, it is found that the majority of sources show a bluer when brighter trend, while a minor fraction of quasars show the opposite behaviour. This is similar to the results obtained from two other independent algorithms, namely the weighted linear least squares fit (FITEXY) and the bivariate correlated errors and intrinsic scatter regression (BCES). However, the ordinary least squares (OLS) fit, normally used in the colour variability studies of quasars, indicates that all the quasars studied here show a bluer when brighter trend. It is therefore very clear that the OLS algorithm cannot be used for the study of colour variability in quasars.

  3. Using Bayesian methods to predict climate impacts on groundwater availability and agricultural production in Punjab, India

    NASA Astrophysics Data System (ADS)

    Russo, T. A.; Devineni, N.; Lall, U.

    2015-12-01

    Lasting success of the Green Revolution in Punjab, India relies on continued availability of local water resources. Supplying primarily rice and wheat for the rest of India, Punjab supports crop irrigation with a canal system and groundwater, which is vastly over-exploited. The detailed data required to physically model future impacts on water supplies and agricultural production are not readily available for this region; therefore, we use Bayesian methods to estimate hydrologic properties and irrigation requirements for an under-constrained mass balance model. Using measured values of historical precipitation, total canal water delivery, crop yield, and water table elevation, we present a method using a Markov chain Monte Carlo (MCMC) algorithm to solve for a distribution of values for each unknown parameter in a conceptual mass balance model. Due to heterogeneity across the state, and the resolution of input data, we estimate model parameters at the district-scale using spatial pooling. The resulting model is used to predict the impact of precipitation change scenarios on groundwater availability under multiple cropping options. Predicted groundwater declines vary across the state, suggesting that crop selection and water management strategies should be determined at a local scale. This computational method can be applied in data-scarce regions across the world, where water resource management is required to resolve competition between food security and available resources in a changing climate.

  4. Non-stationary hydrologic frequency analysis using B-spline quantile regression

    NASA Astrophysics Data System (ADS)

    Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.

    2017-11-01

    Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information for the planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, the assumption of stationarity, which is a prerequisite for traditional frequency analysis, and hence the results of conventional analysis, may become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-Spline quantile regression, which allows data to be modelled in the presence of non-stationarity and/or dependence on covariates, with linear and non-linear dependence. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and Bayesian information criterion (BIC) for quantile regression are used in order to select the best model, i.e. for each quantile, we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for an annual maximum and minimum discharge with high annual non-exceedance probabilities.

  5. Water management can reinforce plant competition in salt-affected semi-arid wetlands

    NASA Astrophysics Data System (ADS)

    Coletti, Janaine Z.; Vogwill, Ryan; Hipsey, Matthew R.

    2017-09-01

    The diversity of vegetation in semi-arid, ephemeral wetlands is determined by niche availability and species competition, both of which are influenced by changes in water availability and salinity. Here, we hypothesise that ignoring physiological differences and competition between species when managing wetland hydrologic regimes can lead to a decrease in vegetation diversity, even when the overall wetland carrying capacity is improved. Using an ecohydrological model capable of resolving water-vegetation-salt feedbacks, we investigate why water surface and groundwater management interventions to combat vegetation decline have been more beneficial to Casuarina obesa than to Melaleuca strobophylla, the co-dominant tree species in Lake Toolibin, a salt-affected wetland in Western Australia. The simulations reveal that in trying to reduce the negative effect of salinity, the management interventions have created an environment favouring C. obesa by intensifying the climate-induced trend that the wetland has been experiencing of lower water availability and higher root-zone salinity. By testing alternative scenarios, we show that interventions that improve M. strobophylla biomass are possible by promoting hydrologic conditions that are less specific to the niche requirements of C. obesa. Modelling uncertainties were explored via a Markov Chain Monte Carlo (MCMC) algorithm. Overall, the study demonstrates the importance of including species differentiation and competition in ecohydrological models that form the basis for wetland management.

  6. Estimates of the topographic uplift of the Southern African Plateau from the African Superswell through petrologically-consistent thermo-chemical modelling of the geoid, SHF, Rayleigh and Love dispersion curves and MT data

    NASA Astrophysics Data System (ADS)

    Jones, Alan G.; Afonso, Juan Carlos; Fullea, Javier

    2015-04-01

    The deep mantle African Superswell is thought to cause up to 500 m of the uplift of the Southern African Plateau. We investigate this phenomenon through stochastic thermo-chemical inversion modelling of the geoid, surface heat flow, Rayleigh and Love dispersion curves and MT data, in a manner that is fully petrologically-consistent. We invert for a three layer crustal velocity, density and thermal structure, but assume the resistivity layering (based on prior inversion of the MT data alone). Inversions are performed using an improved Delayed Rejection and Adaptive Metropolis (DRAM) type Markov chain Monte Carlo (MCMC) algorithm. We demonstrate that a single layer lithosphere can fit most of the data, but not the MT responses. We further demonstrate that modelling the seismic data alone, without the constraint of requiring reasonable oxide chemistry or of fitting the geoid, permits wildly acceptable elevations and with very poorly defined lithosphere-asthenosphere boundary (LAB). We parameterise the lithosphere into three layers, and bound the permitted oxide chemistry of each layer consistent with known chemical layering. We find acceptable models, from 5 million tested in each case, that fit all responses and yield a posteriori elevation distributions centred on 900-950 m, suggesting dynamic support from the lower mantle of some 400 m.

  7. Anisotropic Lithospheric layering in the North American craton, revealed by Bayesian inversion of short and long period data

    NASA Astrophysics Data System (ADS)

    Roy, Corinna; Calo, Marco; Bodin, Thomas; Romanowicz, Barbara

    2016-04-01

    Competing hypotheses for the formation and evolution of continents are highly under debate, including the theory of underplating by hot plumes or accretion by shallow subduction in continental or arc settings. In order to support these hypotheses, documenting structural layering in the cratonic lithosphere becomes especially important. Recent studies of seismic-wave receiver function data have detected a structural boundary under continental cratons at 100-140 km depths, which is too shallow to be consistent with the lithosphere-asthenosphere boundary, as inferred from seismic tomography and other geophysical studies. This leads to the conclusion that 1) the cratonic lithosphere may be thinner than expected, contradicting tomographic and other geophysical or geochemical inferences, or 2) that the receiver function studies detect a mid-lithospheric discontinuity rather than the LAB. On the other hand, several recent studies documented significant changes in the direction of azimuthal anisotropy with depth that suggest layering in the anisotropic structure of the stable part of the North American continent. In particular, Yuan and Romanowicz (2010) combined long period surface wave and overtone data with core refracted shear wave (SKS) splitting measurements in a joint tomographic inversion. A question that arises is whether the anisotropic layering observed coincides with that obtained from receiver function studies. To address this question, we use a trans-dimensional Markov-chain Monte Carlo (MCMC) algorithm to generate probabilistic 1D radially and azimuthal anisotropic shear wave velocity profiles for selected stations in North America. In the algorithm we jointly invert short period (Ps Receiver Functions, surface wave dispersion for Love and Rayleigh waves) and long period data (SKS waveforms). By including three different data types, which sample different volumes of the Earth and have different sensitivities to 
structure, we overcome the problem of incompatible interpretations of models provided by only one data set. The resulting 1D profiles include both isotropic and anisotropic discontinuities in the upper mantle (above 350 km depth). The huge advantage of our procedure is the avoidance of any intermediate processing steps such as numerical deconvolution or the calculation of splitting parameters, which can be very sensitive to noise. Additionally, the number of layers, as well as the data noise and the presence of anisotropy are treated as unknowns in the transdimensional Monte Carlo Markov chain algorithm. We recently demonstrated the power of this approach in the case of two stations located in different tectonic settings (Bodin et al., 2015, submitted). Here we extend this approach to a broader range of settings within the North American continent.

  8. Molecular Monte Carlo Simulations Using Graphics Processing Units: To Waste Recycle or Not?

    PubMed

    Kim, Jihan; Rodgers, Jocelyn M; Athènes, Manuel; Smit, Berend

    2011-10-11

    In the waste recycling Monte Carlo (WRMC) algorithm, multiple trial states may be simultaneously generated and utilized during Monte Carlo moves to improve the statistical accuracy of the simulations, suggesting that such an algorithm may be well posed for implementation in parallel on graphics processing units (GPUs). In this paper, we implement two waste recycling Monte Carlo algorithms in CUDA (Compute Unified Device Architecture) using uniformly distributed random trial states and trial states based on displacement random-walk steps, and we test the methods on a methane-zeolite MFI framework system to evaluate their utility. We discuss the specific implementation details of the waste recycling GPU algorithm and compare the methods to other parallel algorithms optimized for the framework system. We analyze the relationship between the statistical accuracy of our simulations and the CUDA block size to determine the efficient allocation of the GPU hardware resources. We make comparisons between the GPU and the serial CPU Monte Carlo implementations to assess speedup over conventional microprocessors. Finally, we apply our optimized GPU algorithms to the important problem of determining free energy landscapes, in this case for molecular motion through the zeolite LTA.
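
    The idea of using trial states that would otherwise be discarded can be sketched in its simplest form: a serial Metropolis sampler with a single trial state per move, where both the current and the trial state contribute to the average, weighted by the acceptance probability. This is a toy CPU sketch under stated assumptions (a 1D Gaussian target, one trial per move, hypothetical function names); it is not the multi-trial CUDA implementation described in the paper.

```python
import math
import random

def log_target(x):
    """Log-density of a standard normal, up to an additive constant (toy target)."""
    return -0.5 * x * x

def metropolis_with_waste_recycling(f, n_steps, step=1.0, seed=0):
    """Return (standard estimate, waste-recycling estimate) of E[f(x)]."""
    rng = random.Random(seed)
    x = 0.0
    standard_sum = 0.0
    recycled_sum = 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)          # symmetric proposal
        p_acc = min(1.0, math.exp(log_target(trial) - log_target(x)))
        # Waste recycling: the rejected candidate is not thrown away.
        # Each move contributes the expectation of f over the accept/reject outcome.
        recycled_sum += (1.0 - p_acc) * f(x) + p_acc * f(trial)
        if rng.random() < p_acc:
            x = trial
        standard_sum += f(x)                          # conventional estimator
    return standard_sum / n_steps, recycled_sum / n_steps

if __name__ == "__main__":
    standard, recycled = metropolis_with_waste_recycling(lambda x: x * x, 50000, step=2.0, seed=1)
    print(standard, recycled)   # both should be near E[x^2] = 1
```

    With several trial states per move, the same weighting generalizes to a sum over all candidates, which is the part that maps naturally onto parallel hardware.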

  9. Off-diagonal expansion quantum Monte Carlo

    NASA Astrophysics Data System (ADS)

    Albash, Tameem; Wagenbreth, Gene; Hen, Itay

    2017-12-01

    We propose a Monte Carlo algorithm designed to simulate quantum as well as classical systems at equilibrium, bridging the algorithmic gap between quantum and classical thermal simulation algorithms. The method is based on a decomposition of the quantum partition function that can be viewed as a series expansion about its classical part. We argue that the algorithm not only provides a theoretical advancement in the field of quantum Monte Carlo simulations, but is optimally suited to tackle quantum many-body systems that exhibit a range of behaviors from "fully quantum" to "fully classical," in contrast to many existing methods. We demonstrate the advantages, sometimes by orders of magnitude, of the technique by comparing it against existing state-of-the-art schemes such as path integral quantum Monte Carlo and stochastic series expansion. We also illustrate how our method allows for the unification of quantum and classical thermal parallel tempering techniques into a single algorithm and discuss its practical significance.

  10. Off-diagonal expansion quantum Monte Carlo.

    PubMed

    Albash, Tameem; Wagenbreth, Gene; Hen, Itay

    2017-12-01

    We propose a Monte Carlo algorithm designed to simulate quantum as well as classical systems at equilibrium, bridging the algorithmic gap between quantum and classical thermal simulation algorithms. The method is based on a decomposition of the quantum partition function that can be viewed as a series expansion about its classical part. We argue that the algorithm not only provides a theoretical advancement in the field of quantum Monte Carlo simulations, but is optimally suited to tackle quantum many-body systems that exhibit a range of behaviors from "fully quantum" to "fully classical," in contrast to many existing methods. We demonstrate the advantages, sometimes by orders of magnitude, of the technique by comparing it against existing state-of-the-art schemes such as path integral quantum Monte Carlo and stochastic series expansion. We also illustrate how our method allows for the unification of quantum and classical thermal parallel tempering techniques into a single algorithm and discuss its practical significance.

  11. Maximal compression of the redshift-space galaxy power spectrum and bispectrum

    NASA Astrophysics Data System (ADS)

    Gualdi, Davide; Manera, Marc; Joachimi, Benjamin; Lahav, Ofer

    2018-05-01

    We explore two methods of compressing the redshift-space galaxy power spectrum and bispectrum with respect to a chosen set of cosmological parameters. Both methods involve reducing the dimension of the original data vector (e.g. 1000 elements) to the number of cosmological parameters considered (e.g. seven) using the Karhunen-Loève algorithm. In the first case, we run MCMC sampling on the compressed data vector in order to recover the 1D and 2D posterior distributions. The second option, approximately 2000 times faster, works by orthogonalizing the parameter space through diagonalization of the Fisher information matrix before the compression, obtaining the posterior distributions without the need for MCMC sampling. Using these methods for future spectroscopic redshift surveys like DESI, Euclid, and PFS would drastically reduce the number of simulations needed to compute accurate covariance matrices with minimal loss of constraining power. We consider a redshift bin of a DESI-like experiment. Using the power spectrum combined with the bispectrum as a data vector, both compression methods on average recover the 68 per cent credible regions to within 0.7 per cent and 2 per cent of those resulting from standard MCMC sampling, respectively. These confidence intervals are also smaller than the ones obtained using only the power spectrum by 81 per cent, 80 per cent, and 82 per cent respectively, for the bias parameter b1, the growth rate f, and the scalar amplitude parameter As.
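
    The claim that a long data vector can be reduced to one number per parameter with minimal loss of constraining power can be illustrated in the simplest setting. The sketch below assumes a Gaussian likelihood with a diagonal, parameter-independent covariance and a single parameter; the weights are the inverse covariance applied to the derivative of the model mean. All names and numbers are hypothetical, and this is not the authors' pipeline, only a check that the Fisher information of the compressed statistic equals that of the full vector in this setting.

```python
def compression_weights(dmu_dtheta, variances):
    """Weights b_i = (C^-1 dmu/dtheta)_i for a diagonal covariance C."""
    return [d / v for d, v in zip(dmu_dtheta, variances)]

def compress(data, weights):
    """Compressed statistic t = b . x (one number for one parameter)."""
    return sum(w * x for w, x in zip(weights, data))

def fisher_full(dmu_dtheta, variances):
    """Fisher information of the full data vector."""
    return sum(d * d / v for d, v in zip(dmu_dtheta, variances))

def fisher_compressed(dmu_dtheta, variances, weights):
    """Fisher information carried by t alone: (d<t>/dtheta)^2 / Var(t)."""
    dt_dtheta = sum(w * d for w, d in zip(weights, dmu_dtheta))
    var_t = sum(w * w * v for w, v in zip(weights, variances))
    return dt_dtheta * dt_dtheta / var_t

if __name__ == "__main__":
    dmu = [0.5, 1.0, 2.0, 4.0]        # made-up derivatives of the model mean
    var = [1.0, 0.5, 2.0, 4.0]        # made-up data variances
    b = compression_weights(dmu, var)
    print(fisher_full(dmu, var), fisher_compressed(dmu, var, b))
```

    With several parameters, one weight vector is built per parameter, which is how a data vector of about 1000 elements can be reduced to about seven numbers.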

  12. Assessment of parameter uncertainty in hydrological model using a Markov-Chain-Monte-Carlo-based multilevel-factorial-analysis method

    NASA Astrophysics Data System (ADS)

    Zhang, Junlong; Li, Yongping; Huang, Guohe; Chen, Xi; Bao, Anming

    2016-07-01

    Without a realistic assessment of parameter uncertainty, decision makers may encounter difficulties in accurately describing hydrologic processes and assessing relationships between model parameters and watershed characteristics. In this study, a Markov-Chain-Monte-Carlo-based multilevel-factorial-analysis (MCMC-MFA) method is developed, which can not only generate samples of parameters from a well-constructed Markov chain and assess parameter uncertainties with straightforward Bayesian inference, but also investigate the individual and interactive effects of multiple parameters on model output through measuring the specific variations of hydrological responses. A case study is conducted for addressing parameter uncertainties in the Kaidu watershed of northwest China. Effects of multiple parameters and their interactions are quantitatively investigated using the MCMC-MFA with a three-level factorial experiment (81 runs in total). A variance-based sensitivity analysis method is used to validate the results of parameters' effects. Results disclose that (i) soil conservation service runoff curve number for moisture condition II (CN2) and fraction of snow volume corresponding to 50% snow cover (SNO50COV) are the most significant factors for hydrological responses, implying that infiltration-excess overland flow and snow water equivalent represent important water input to the hydrological system of the Kaidu watershed; (ii) saturated hydraulic conductivity (SOL_K) and soil evaporation compensation factor (ESCO) have obvious effects on hydrological responses; this implies that the processes of percolation and evaporation would impact hydrological process in this watershed; (iii) the interactions of ESCO and SNO50COV as well as CN2 and SNO50COV have an obvious effect, implying that snow cover can impact the generation of runoff on land surface and the extraction of soil evaporative demand in lower soil layers.
These findings can help enhance the hydrological model's capability for simulating/predicting water resources.

  13. Multiscale Monte Carlo equilibration: Pure Yang-Mills theory

    DOE PAGES

    Endres, Michael G.; Brower, Richard C.; Orginos, Kostas; ...

    2015-12-29

    In this study, we present a multiscale thermalization algorithm for lattice gauge theory, which enables efficient parallel generation of uncorrelated gauge field configurations. The algorithm combines standard Monte Carlo techniques with ideas drawn from real space renormalization group and multigrid methods. We demonstrate the viability of the algorithm for pure Yang-Mills gauge theory for both heat bath and hybrid Monte Carlo evolution, and show that it ameliorates the problem of topological freezing up to controllable lattice spacing artifacts.

  14. Mining geographic variations of Plasmodium vivax for active surveillance: a case study in China.

    PubMed

    Shi, Benyun; Tan, Qi; Zhou, Xiao-Nong; Liu, Jiming

    2015-05-27

    Geographic variations of an infectious disease characterize the spatial differentiation of disease incidences caused by various impact factors, such as environmental, demographic, and socioeconomic factors. Some factors may directly determine the force of infection of the disease (namely, explicit factors), while many other factors may indirectly affect the number of disease incidences via certain unmeasurable processes (namely, implicit factors). In this study, the impact of heterogeneous factors on geographic variations of Plasmodium vivax incidences is systematically investigated in Tengchong, Yunnan province, China. A space-time model that resembles a P. vivax transmission model and a hidden time-dependent process is presented by taking into consideration both explicit and implicit factors. Specifically, the transmission model is built upon relevant demographic, environmental, and biophysical factors to describe the local infections of P. vivax, while the hidden time-dependent process is assessed by several socioeconomic factors to account for the imported cases of P. vivax. To quantitatively assess the impact of heterogeneous factors on geographic variations of P. vivax infections, a Markov chain Monte Carlo (MCMC) simulation method is developed to estimate the model parameters by fitting the space-time model to the reported spatial-temporal disease incidences. Since there is no ground-truth information available, the performance of the MCMC method is first evaluated against a synthetic dataset. The results show that the model parameters can be well estimated using the proposed MCMC method. Then, the proposed model is applied to investigate the geographic variations of P. vivax incidences among all 18 towns in Tengchong, Yunnan province, China. Based on the geographic variations, the 18 towns can be further classified into five groups with similar socioeconomic causality for P. vivax incidences. Although this study focuses mainly on the transmission of P. vivax, the proposed space-time model is general and can readily be extended to investigate geographic variations of other diseases. Practically, such a computational model will offer new insights into active surveillance and strategic planning for disease surveillance and control.
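
    The validation strategy described in this abstract (generate synthetic data from known parameters, then check that MCMC recovers them) can be sketched with a deliberately simple stand-in model. The example below assumes a one-parameter linear model y = a·x with Gaussian noise and a flat prior, sampled with random-walk Metropolis; it is not the authors' space-time transmission model, and all names and values are hypothetical.

```python
import math
import random

def make_synthetic(a_true, sigma, n, rng):
    """Synthetic observations y_i = a_true * x_i + Gaussian noise."""
    xs = [0.1 * (i + 1) for i in range(n)]
    ys = [a_true * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

def log_likelihood(a, xs, ys, sigma):
    return -0.5 * sum(((y - a * x) / sigma) ** 2 for x, y in zip(xs, ys))

def metropolis(xs, ys, sigma, n_steps, step, rng, a_start=0.0):
    """Random-walk Metropolis over the single parameter a (flat prior)."""
    a = a_start
    log_l = log_likelihood(a, xs, ys, sigma)
    chain = []
    for _ in range(n_steps):
        a_new = a + rng.gauss(0.0, step)
        log_l_new = log_likelihood(a_new, xs, ys, sigma)
        delta = log_l_new - log_l
        if delta >= 0.0 or rng.random() < math.exp(delta):
            a, log_l = a_new, log_l_new
        chain.append(a)
    return chain

def recover_parameter(a_true=2.0, sigma=0.5, seed=3):
    rng = random.Random(seed)
    xs, ys = make_synthetic(a_true, sigma, 50, rng)
    chain = metropolis(xs, ys, sigma, 20000, 0.05, rng)
    kept = chain[2000:]                      # discard burn-in
    return sum(kept) / len(kept)

if __name__ == "__main__":
    print(recover_parameter())               # should be close to a_true = 2.0
```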

  15. An MCMC determination of the primordial helium abundance

    NASA Astrophysics Data System (ADS)

    Aver, Erik; Olive, Keith A.; Skillman, Evan D.

    2012-04-01

    Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these parameter determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and continue to better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, & Stasińska (2007). To improve the reliability of the determination, a high quality dataset is needed. In pursuit of this, a variety of cuts are explored. The efficacy of the He I λ4026 emission line as a constraint on the solutions is first examined, revealing the introduction of systematic bias through its absence. As a clear measure of the quality of the physical solution, a χ² analysis proves instrumental in the selection of data compatible with the theoretical model. Nearly two-thirds of the observations fall outside a standard 95% confidence level cut, which highlights the care necessary in selecting systems and warrants further investigation into potential deficiencies of the model or data. In addition, the method also allows us to exclude systems for which parameter estimations are statistical outliers. As a result, the final selected dataset gains in reliability and exhibits improved consistency.
Regression to zero metallicity yields Yp = 0.2534 ± 0.0083, in broad agreement with the WMAP result. The inclusion of more observations shows promise for further reducing the uncertainty, but more high quality spectra are required.

  16. A Bayesian approach to modeling 2D gravity data using polygon states

    NASA Astrophysics Data System (ADS)

    Titus, W. J.; Titus, S.; Davis, J. R.

    2015-12-01

    We present a Bayesian Markov chain Monte Carlo (MCMC) method for the 2D gravity inversion of a localized subsurface object with constant density contrast. Our models have four parameters: the density contrast, the number of vertices in a polygonal approximation of the object, an upper bound on the ratio of the perimeter squared to the area, and the vertices of a polygon container that bounds the object. Reasonable parameter values can be estimated prior to inversion using a forward model and geologic information. In addition, we assume that the field data have a common random uncertainty that lies between two bounds but that it has no systematic uncertainty. Finally, we assume that there is no uncertainty in the spatial locations of the measurement stations. For any set of model parameters, we use MCMC methods to generate an approximate probability distribution of polygons for the object. We then compute various probability distributions for the object, including the variance between the observed and predicted fields (an important quantity in the MCMC method), the area, the center of area, and the occupancy probability (the probability that a spatial point lies within the object). In addition, we compare probabilities of different models using parallel tempering, a technique which also mitigates trapping in local optima that can occur in certain model geometries. We apply our method to several synthetic data sets generated from objects of varying shape and location. We also analyze a natural data set collected across the Rio Grande Gorge Bridge in New Mexico, where the object (i.e. the air below the bridge) is known and the canyon is approximately 2D. Although there are many ways to view results, the occupancy probability proves quite powerful. We also find that the choice of the container is important. 
In particular, large containers should be avoided, because the more closely a container confines the object, the better the predictions match properties of the object.
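
    The occupancy probability defined in this abstract (the probability that a spatial point lies within the object) can be estimated from a set of sampled polygons as the fraction of samples that contain the point. The sketch below assumes the MCMC output is available as a list of vertex lists and uses a standard ray-casting point-in-polygon test; the function names and the toy samples are hypothetical.

```python
def point_in_polygon(px, py, polygon):
    """Ray-casting test: count edge crossings of a ray going in +x from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def occupancy_probability(px, py, polygon_samples):
    """Fraction of sampled polygons that contain the point (px, py)."""
    hits = sum(1 for poly in polygon_samples if point_in_polygon(px, py, poly))
    return hits / len(polygon_samples)

if __name__ == "__main__":
    # Toy "posterior samples": axis-aligned squares of side 1, 2, 3 and 4.
    samples = [[(0, 0), (s, 0), (s, s), (0, s)] for s in (1, 2, 3, 4)]
    print(occupancy_probability(1.5, 1.5, samples))
```

    Evaluating this on a grid of points gives an occupancy map of the kind the abstract describes as a useful way to view results.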

  17. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. 
This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  18. An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling

    NASA Astrophysics Data System (ADS)

    Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton

    2013-10-01

    Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with numerous model executions required by exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for the PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating high-dimensional posteriori distribution. 
The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.

  19. Collision of Physics and Software in the Monte Carlo Application Toolkit (MCATK)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sweezy, Jeremy Ed

    2016-01-21

    The topic is presented in a series of slides organized as follows: MCATK overview, development strategy, available algorithms, problem modeling (sources, geometry, data, tallies), parallelism, miscellaneous tools/features, example MCATK application, recent areas of research, and summary and future work. MCATK is a C++ component-based Monte Carlo neutron-gamma transport software library with continuous energy neutron and photon transport. Designed to build specialized applications and to provide new functionality in existing general-purpose Monte Carlo codes like MCNP, it reads ACE formatted nuclear data generated by NJOY. The motivation behind MCATK was to reduce costs. MCATK physics involves continuous energy neutron & gamma transport with multi-temperature treatment, static eigenvalue (k_eff and α) algorithms, time-dependent algorithm, and fission chain algorithms. MCATK geometry includes mesh geometries and solid body geometries. MCATK provides verified, unit-tested Monte Carlo components, flexibility in Monte Carlo application development, and numerous tools such as geometry and cross section plotters.

  20. Evaluation of an analytic linear Boltzmann transport equation solver for high-density inhomogeneities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lloyd, S. A. M.; Ansbacher, W.; Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia V8W 3P6

    2013-01-15

    Purpose: Acuros external beam (Acuros XB) is a novel dose calculation algorithm implemented through the ECLIPSE treatment planning system. The algorithm finds a deterministic solution to the linear Boltzmann transport equation, the same equation commonly solved stochastically by Monte Carlo methods. This work is an evaluation of Acuros XB, by comparison with Monte Carlo, for dose calculation applications involving high-density materials. Existing non-Monte Carlo clinical dose calculation algorithms, such as the analytic anisotropic algorithm (AAA), do not accurately model dose perturbations due to increased electron scatter within high-density volumes. Methods: Acuros XB, AAA, and EGSnrc based Monte Carlo are used to calculate dose distributions from 18 MV and 6 MV photon beams delivered to a cubic water phantom containing a rectangular high density (4.0-8.0 g/cm³) volume at its center. The algorithms are also used to recalculate a clinical prostate treatment plan involving a unilateral hip prosthesis, originally evaluated using AAA. These results are compared graphically and numerically using gamma-index analysis. Radio-chromic film measurements are presented to augment Monte Carlo and Acuros XB dose perturbation data. Results: Using a 2% and 1 mm gamma-analysis, between 91.3% and 96.8% of Acuros XB dose voxels containing greater than 50% of the normalized dose were in agreement with Monte Carlo data for virtual phantoms involving 18 MV and 6 MV photons, stainless steel and titanium alloy implants and for on-axis and oblique field delivery. A similar gamma-analysis of AAA against Monte Carlo data showed between 80.8% and 87.3% agreement. Comparing Acuros XB and AAA evaluations of a clinical prostate patient plan involving a unilateral hip prosthesis, Acuros XB showed good overall agreement with Monte Carlo while AAA underestimated dose on the upstream medial surface of the prosthesis due to electron scatter from the high-density material. Film measurements support the dose perturbations demonstrated by Monte Carlo and Acuros XB data. Conclusions: Acuros XB is shown to perform as well as Monte Carlo methods and better than existing clinical algorithms for dose calculations involving high-density volumes.
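
    The gamma-index comparison mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It assumes a global dose normalisation to the maximum reference dose and uses the 2% / 1 mm criteria and the 50% dose threshold quoted above; the function names, the sample profile, and the choice of which distribution counts as the reference are hypothetical and are not taken from the paper.

```python
import math


def gamma_index_1d(positions_mm, ref_dose, eval_dose, dose_tol=0.02, dist_tol_mm=1.0):
    """Gamma value at every reference point of a 1D dose profile.

    Both profiles are assumed to be sampled at the same positions, and the
    dose criterion is taken as a fraction of the maximum reference dose
    (global normalisation, an assumption of this sketch).
    """
    d_norm = max(ref_dose)
    gammas = []
    for x_ref, d_ref in zip(positions_mm, ref_dose):
        # Smallest combined distance/dose deviation over all evaluated points.
        best = min(
            math.hypot((x_ev - x_ref) / dist_tol_mm,
                       (d_ev - d_ref) / (dose_tol * d_norm))
            for x_ev, d_ev in zip(positions_mm, eval_dose)
        )
        gammas.append(best)
    return gammas


def pass_rate(gammas, ref_dose, threshold=0.5):
    """Fraction of points above the dose threshold with gamma <= 1."""
    d_norm = max(ref_dose)
    selected = [g for g, d in zip(gammas, ref_dose) if d > threshold * d_norm]
    if not selected:
        raise ValueError("no points above the dose threshold")
    return sum(g <= 1.0 for g in selected) / len(selected)


# Hypothetical profiles on a 1 mm grid; the evaluated peak is 1% too high.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
reference = [10.0, 50.0, 100.0, 50.0, 10.0]
evaluated = [10.0, 50.0, 101.0, 50.0, 10.0]
gammas = gamma_index_1d(positions, reference, evaluated)
rate = pass_rate(gammas, reference)
```

    A point passes when its gamma value is at most 1, so the reported agreement percentages correspond to `pass_rate` evaluated over the selected voxels.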

  1. SU-E-T-188: Film Dosimetry Verification of Monte Carlo Generated Electron Treatment Plans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Enright, S; Asprinio, A; Lu, L

    2014-06-01

    Purpose: The purpose of this study was to compare dose distributions from film measurements to Monte Carlo generated electron treatment plans. Irradiation with electrons offers the advantages of dose uniformity in the target volume and of minimizing the dose to deeper healthy tissue. Using the Monte Carlo algorithm will improve dose accuracy in regions with heterogeneities and irregular surfaces. Methods: Dose distributions from GafChromic™ EBT3 films were compared to dose distributions from the Electron Monte Carlo algorithm in the Eclipse™ radiotherapy treatment planning system. These measurements were obtained for 6 MeV, 9 MeV and 12 MeV electrons at two depths. All phantoms studied were imported into Eclipse by CT scan. A 1 cm thick solid water template with holes for bone-like and lung-like plugs was used. Different configurations were used with the different plugs inserted into the holes. Configurations with solid-water plugs stacked on top of one another were also used to create an irregular surface. Results: The dose distributions measured from the film agreed with those from the Electron Monte Carlo treatment plan. Accuracy of the Electron Monte Carlo algorithm was also compared to that of Pencil Beam. Dose distributions from Monte Carlo had much higher pass rates than distributions from Pencil Beam when compared to the film. The pass rate for Monte Carlo was in the 80%–99% range, whereas the pass rate for Pencil Beam was as low as 10.76%. Conclusion: The dose distribution from Monte Carlo agreed with the measured dose from the film. When compared to the Pencil Beam algorithm, pass rates for Monte Carlo were much higher. Monte Carlo should be used over Pencil Beam for regions with heterogeneities and irregular surfaces.

  2. A probabilistic model framework for evaluating year-to-year variation in crop productivity

    NASA Astrophysics Data System (ADS)

    Yokozawa, M.; Iizumi, T.; Tao, F.

    2008-12-01

    Most models describing the relation between crop productivity and weather conditions have so far focused on mean changes in crop yield. To keep the food supply stable against abnormal weather as well as climate change, evaluating the year-to-year variations in crop productivity is more essential than evaluating the mean changes. Here we propose a new probabilistic modelling framework based on Bayesian inference and Monte Carlo simulation. As an example, we first introduce a model of paddy rice production in Japan, called PRYSBI (Process-based Regional rice Yield Simulator with Bayesian Inference; Iizumi et al., 2008). The model structure is the same as that of SIMRIW, which was developed and used widely in Japan. The model includes three sub-models describing phenological development, biomass accumulation and maturing of the rice crop. These processes are formulated to include the response of the rice plant to weather conditions. The model was originally developed to predict rice growth and yield at the scale of a paddy plot. We applied it to evaluate large-scale rice production while keeping the same model structure. Instead, we treated the parameters as stochastic variables. To let the model match actual yields at the larger scale, model parameters were determined from agricultural statistical data for each prefecture of Japan together with weather data averaged over the region. The posterior probability distribution functions (PDFs) of the model parameters were obtained using Bayesian inference. A Markov Chain Monte Carlo (MCMC) algorithm was used to evaluate Bayes' theorem numerically. To evaluate the year-to-year changes in rice growth and yield under this framework, we first iterate simulations with sets of parameter values sampled from the estimated posterior PDF of each parameter and then take the ensemble mean weighted with the posterior PDFs. We will also present another example for maize productivity in China. The framework proposed here also provides information on uncertainties, possibilities and limitations for future improvements in crop models.

  3. Harnessing the theoretical foundations of the exponential and beta-Poisson dose-response models to quantify parameter uncertainty using Markov Chain Monte Carlo.

    PubMed

    Schmidt, Philip J; Pintar, Katarina D M; Fazil, Aamir M; Topp, Edward

    2013-09-01

    Dose-response models are the essential link between exposure assessment and computed risk values in quantitative microbial risk assessment, yet the uncertainty that is inherent to computed risks because the dose-response model parameters are estimated using limited epidemiological data is rarely quantified. Second-order risk characterization approaches incorporating uncertainty in dose-response model parameters can provide more complete information to decisionmakers by separating variability and uncertainty to quantify the uncertainty in computed risks. Therefore, the objective of this work is to develop procedures to sample from posterior distributions describing uncertainty in the parameters of exponential and beta-Poisson dose-response models using Bayes's theorem and Markov Chain Monte Carlo (in OpenBUGS). The theoretical origins of the beta-Poisson dose-response model are used to identify a decomposed version of the model that enables Bayesian analysis without the need to evaluate Kummer confluent hypergeometric functions. Herein, it is also established that the beta distribution in the beta-Poisson dose-response model cannot address variation among individual pathogens, criteria to validate use of the conventional approximation to the beta-Poisson model are proposed, and simple algorithms to evaluate actual beta-Poisson probabilities of infection are investigated. The developed MCMC procedures are applied to analysis of a case study data set, and it is demonstrated that an important region of the posterior distribution of the beta-Poisson dose-response model parameters is attributable to the absence of low-dose data. This region includes beta-Poisson models for which the conventional approximation is especially invalid and in which many beta distributions have an extreme shape with questionable plausibility. © Her Majesty the Queen in Right of Canada 2013. Reproduced with the permission of the Minister of the Public Health Agency of Canada.

  4. What are hierarchical models and how do we analyze them?

    USGS Publications Warehouse

    Royle, Andy

    2016-01-01

    In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10).

  5. INFERENCE FOR INDIVIDUAL-LEVEL MODELS OF INFECTIOUS DISEASES IN LARGE POPULATIONS.

    PubMed

    Deardon, Rob; Brooks, Stephen P; Grenfell, Bryan T; Keeling, Matthew J; Tildesley, Michael J; Savill, Nicholas J; Shaw, Darren J; Woolhouse, Mark E J

    2010-01-01

    Individual Level Models (ILMs), a new class of models, are being applied to infectious epidemic data to aid in the understanding of the spatio-temporal dynamics of infectious diseases. These models are highly flexible and intuitive, and can be parameterised under a Bayesian framework via Markov chain Monte Carlo (MCMC) methods. Unfortunately, this parameterisation can be difficult to implement due to intense computational requirements when calculating the full posterior for large, or even moderately large, susceptible populations, or when missing data are present. Here we detail a methodology that can be used to estimate parameters for such large, and/or incomplete, data sets. This is done in the context of a study of the UK 2001 foot-and-mouth disease (FMD) epidemic.

  6. On the validity of cosmological Fisher matrix forecasts

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolz, Laura; Kilbinger, Martin; Weller, Jochen

    2012-09-01

    We present a comparison of Fisher matrix forecasts for cosmological probes with Monte Carlo Markov Chain (MCMC) posterior likelihood estimation methods. We analyse the performance of future Dark Energy Task Force (DETF) stage-III and stage-IV dark-energy surveys using supernovae, baryon acoustic oscillations and weak lensing as probes. We concentrate in particular on the dark-energy equation of state parameters w_0 and w_a. For purely geometrical probes, and especially when marginalising over w_a, we find considerable disagreement between the two methods, since in this case the Fisher matrix cannot reproduce the highly non-elliptical shape of the likelihood function. More quantitatively, the Fisher method underestimates the marginalized errors for purely geometrical probes by 30%-70%. For cases including structure formation such as weak lensing, we find that the posterior probability contours from the Fisher matrix estimation are in good agreement with the MCMC contours, with the forecast errors changing only at the 5% level. We then explore non-linear transformations resulting in physically-motivated parameters and investigate whether these parameterisations exhibit a Gaussian behaviour. We conclude that for the purely geometrical probes and, more generally, in cases where it is not known whether the likelihood is close to Gaussian, the Fisher matrix is not the appropriate tool to produce reliable forecasts.
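
    As a reminder of what a Fisher forecast computes, the sketch below derives marginalised and conditional 1-sigma errors for two parameters from a 2x2 Fisher matrix, using the standard Gaussian-approximation results (marginalised error: square root of the diagonal of the inverse; conditional error: inverse square root of the diagonal). The matrix entries are hypothetical illustration values, not numbers from the paper, and the calculation is only as reliable as the Gaussian assumption that the abstract calls into question.

```python
import math


def marginalized_errors(fisher):
    """1-sigma errors on each parameter after marginalising over the other.

    For a 2x2 Fisher matrix F these are sqrt((F^-1)_ii), written out by hand.
    """
    (a, b), (c, d) = fisher
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("Fisher matrix must be positive definite")
    # The diagonal of the inverse of [[a, b], [c, d]] is (d/det, a/det).
    return math.sqrt(d / det), math.sqrt(a / det)


def conditional_errors(fisher):
    """1-sigma errors with the other parameter held fixed: 1/sqrt(F_ii)."""
    return 1.0 / math.sqrt(fisher[0][0]), 1.0 / math.sqrt(fisher[1][1])


# Hypothetical Fisher matrix for (w_0, w_a) with a strong degeneracy.
fisher_w0_wa = [[40.0, 12.0], [12.0, 4.0]]
sigma_marg = marginalized_errors(fisher_w0_wa)
sigma_cond = conditional_errors(fisher_w0_wa)
```

    The marginalised errors are never smaller than the conditional ones; the gap grows with the degeneracy between the two parameters.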

  7. Using Bayesian neural networks to classify forest scenes

    NASA Astrophysics Data System (ADS)

    Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni

    1998-10-01

    We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. The classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects and diverse lighting conditions under operation. This makes it difficult to determine which image features are optimal for the classification. A natural way to proceed is to extract many different types of potentially suitable features, and to evaluate their usefulness in later processing stages. One approach to coping with a large number of features is to use Bayesian methods to control the model complexity. Bayesian learning uses a prior on model parameters, combines this with evidence from training data, and then integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) The evidence framework of MacKay uses a Gaussian approximation to the posterior weight distribution and maximizes with respect to hyperparameters. (2) In a Markov Chain Monte Carlo (MCMC) method due to Neal, the posterior distribution of the network parameters is numerically integrated using the MCMC method. As baseline classifiers for comparison we use (3) MLP early stop committee, (4) K-nearest-neighbor and (5) Classification And Regression Tree.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stroeer, Alexander; Veitch, John

    The Laser Interferometer Space Antenna (LISA) defines new demands on data analysis efforts in its all-sky gravitational wave survey, recording simultaneously thousands of galactic compact object binary foreground sources and tens to hundreds of background sources like binary black hole mergers and extreme-mass ratio inspirals. We approach this problem with an adaptive and fully automatic Reversible Jump Markov Chain Monte Carlo sampler, able to sample from the joint posterior density function (as established by Bayes theorem) for a given mixture of signals "out of the box", handling the total number of signals as an additional unknown parameter besides the unknown parameters of each individual source and the noise floor. We show in examples from the LISA Mock Data Challenge implementing the full response of LISA in its TDI description that this sampler is able to extract monochromatic Double White Dwarf signals out of colored instrumental noise and additional foreground and background noise successfully in a global fitting approach. We introduce 2 examples with fixed number of signals (MCMC sampling), and 1 example with unknown number of signals (RJ-MCMC), the latter further promoting the idea behind an experimental adaptation of the model indicator proposal densities in the main sampling stage. We note that the experienced runtimes and degeneracies in parameter extraction limit the shown examples to the extraction of a low but realistic number of signals.

  9. Renyi entanglement entropy of interacting fermions calculated using the continuous-time quantum Monte Carlo method.

    PubMed

    Wang, Lei; Troyer, Matthias

    2014-09-12

    We present a new algorithm for calculating the Renyi entanglement entropy of interacting fermions using the continuous-time quantum Monte Carlo method. The algorithm only samples the interaction correction of the entanglement entropy, which by design ensures the efficient calculation of weakly interacting systems. Combined with Monte Carlo reweighting, the algorithm also performs well for systems with strong interactions. We demonstrate the potential of this method by studying the quantum entanglement signatures of the charge-density-wave transition of interacting fermions on a square lattice.

  10. A Bayesian destructive weighted Poisson cure rate model and an application to a cutaneous melanoma data.

    PubMed

    Rodrigues, Josemar; Cancho, Vicente G; de Castro, Mário; Balakrishnan, N

    2012-12-01

    In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis--latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de São Carlos, São Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de São Carlos, São Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.

  11. Event-chain Monte Carlo algorithms for three- and many-particle interactions

    NASA Astrophysics Data System (ADS)

    Harland, J.; Michel, M.; Kampmann, T. A.; Kierfeld, J.

    2017-02-01

    We generalize the rejection-free event-chain Monte Carlo algorithm from many-particle systems with pairwise interactions to systems with arbitrary three- or many-particle interactions. We introduce generalized lifting probabilities between particles and obtain a general set of equations for lifting probabilities, the solution of which guarantees maximal global balance. We validate the resulting three-particle event-chain Monte Carlo algorithms on three different systems by comparison with conventional local Monte Carlo simulations: i) a test system of three particles with a three-particle interaction that depends on the enclosed triangle area; ii) a hard-needle system in two dimensions, where needle interactions constitute three-particle interactions of the needle end points; iii) a semiflexible polymer chain with a bending energy, which constitutes a three-particle interaction of neighboring chain beads. The examples demonstrate that the generalization to many-particle interactions broadens the applicability of event-chain algorithms considerably.

  12. Scalable Domain Decomposed Monte Carlo Particle Transport

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    O'Brien, Matthew Joseph

    2013-12-05

    In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.

  13. Spatial analysis of paediatric swimming pool submersions by housing type.

    PubMed

    Shenoi, Rohit P; Levine, Ned; Jones, Jennifer L; Frost, Mary H; Koerner, Christine E; Fraser, John J

    2015-08-01

    Drowning is a major cause of unintentional childhood death. The relationship between childhood swimming pool submersions, neighbourhood sociodemographics, housing type and swimming pool location was examined in Harris County, Texas. Childhood pool submersion incidents were examined for spatial clustering using the Nearest Neighbor Hierarchical Cluster (Nnh) algorithm. To relate submersions to predictive factors, a Markov Chain Monte Carlo (MCMC) Poisson-Lognormal-Conditional Autoregressive (CAR) spatial regression model was tested at the census tract level. There were 260 submersions; 49 were fatal. Forty-two per cent occurred at single-family residences and 36% at multifamily residential buildings. The risk of a submersion was 2.7 times higher for a child at a multifamily than a single-family residence and 28 times more likely in a multifamily swimming pool than a single family pool. However, multifamily submersions were clustered because of the concentration of such buildings with pools. Spatial clustering did not occur in single-family residences. At the tract level, submersions in single-family and multifamily residences were best predicted by the number of pools by housing type and the number of children aged 0-17 by housing type. Paediatric swimming pool submersions in multifamily buildings are spatially clustered. The likelihood of submersions is higher for children who live in multifamily buildings with pools than those who live in single-family homes with pools. Published by the BMJ Publishing Group Limited.

  14. Modelling malaria incidence by an autoregressive distributed lag model with spatial component.

    PubMed

    Laguna, Francisco; Grillet, María Eugenia; León, José R; Ludeña, Carenne

    2017-08-01

    The influence of climatic variables on the dynamics of human malaria has been widely highlighted. Also, it is known that this mosquito-borne infection varies in space and time. However, when the data are spatially incomplete, most popular spatio-temporal methods of analysis cannot be applied directly. In this paper, we develop a two-step methodology to model the spatio-temporal dependence of malaria incidence on local rainfall, temperature, and humidity as well as the regional sea surface temperatures (SST) on the northern coast of Venezuela. First, we fit an autoregressive distributed lag model (ARDL) to the weekly data, and then we fit a linear separable spatial vector autoregressive model (VAR) to the residuals of the ARDL. Finally, the model parameters are tuned using a Markov Chain Monte Carlo (MCMC) procedure derived from the Metropolis-Hastings algorithm. Our results show that the best model to account for the variations of malaria incidence from 2001 to 2008 in 10 endemic Municipalities in North-Eastern Venezuela is a logit model that includes the accumulated local precipitation in combination with the local maximum temperature of the preceding month as positive regressors. Additionally, we show that although malaria dynamics are highly heterogeneous in space, a detailed analysis of the estimated spatial parameters in our model yields important insights regarding the joint behavior of the disease incidence across the different counties in our study. Copyright © 2017 Elsevier Ltd. All rights reserved.
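
    The abstract refers to a procedure derived from the Metropolis-Hastings algorithm without giving details. As a generic illustration only, here is a minimal random-walk Metropolis-Hastings sketch for a one-dimensional target. The standard-normal target, the step size, the chain length and the burn-in are arbitrary assumptions chosen for the demonstration and have nothing to do with the malaria model.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings for a one-dimensional target density."""
    x = x0
    log_p = log_target(x)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the Hastings correction cancels.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        log_ratio = log_p_new - log_p
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


def log_standard_normal(x):
    # Unnormalised log-density; the normalising constant cancels in the ratio.
    return -0.5 * x * x


rng = random.Random(0)
draws, acceptance_rate = metropolis_hastings(log_standard_normal, 0.0, 20000, 1.0, rng)
kept = draws[2000:]  # discard an arbitrary burn-in period
posterior_mean = sum(kept) / len(kept)
```

    In practice the step size is tuned so that the acceptance rate is neither close to 0 nor close to 1, and convergence is checked with several chains rather than assumed.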

  15. Baking a mass-spectrometry data PIE with McMC and simulated annealing: predicting protein post-translational modifications from integrated top-down and bottom-up data.

    PubMed

    Jefferys, Stuart R; Giddings, Morgan C

    2011-03-15

    Post-translational modifications are vital to the function of proteins, but are hard to study, especially since several modified isoforms of a protein may be present simultaneously. Mass spectrometers are a great tool for investigating modified proteins, but the data they provide is often incomplete, ambiguous and difficult to interpret. Combining data from multiple experimental techniques, especially bottom-up and top-down mass spectrometry, provides complementary information. When integrated with background knowledge this allows a human expert to interpret what modifications are present and where on a protein they are located. However, the process is arduous and for high-throughput applications needs to be automated. This article explores a data integration methodology based on Markov chain Monte Carlo and simulated annealing. Our software, the Protein Inference Engine (the PIE), applies these algorithms using a modular approach, allowing multiple types of data to be considered simultaneously and for new data types to be added as needed. Even for complicated data representing multiple modifications and several isoforms, the PIE generates accurate modification predictions, including location. When applied to experimental data collected on the L7/L12 ribosomal protein the PIE was able to make predictions consistent with manual interpretation for several different L7/L12 isoforms using a combination of bottom-up data with experimentally identified intact masses. Software, demo projects and source can be downloaded from http://pie.giddingslab.org/

  16. Merging parallel tempering with sequential geostatistical resampling for improved posterior exploration of high-dimensional subsurface categorical fields

    NASA Astrophysics Data System (ADS)

    Laloy, Eric; Linde, Niklas; Jacques, Diederik; Mariethoz, Grégoire

    2016-04-01

    The sequential geostatistical resampling (SGR) algorithm is a Markov chain Monte Carlo (MCMC) scheme for sampling from possibly non-Gaussian, complex spatially-distributed prior models such as geologic facies or categorical fields. In this work, we highlight the limits of standard SGR for posterior inference of high-dimensional categorical fields with realistically complex likelihood landscapes and benchmark a parallel tempering implementation (PT-SGR). Our proposed PT-SGR approach is demonstrated using synthetic (error corrupted) data from steady-state flow and transport experiments in categorical 7575- and 10,000-dimensional 2D conductivity fields. In both case studies, every SGR trial gets trapped in a local optimum while PT-SGR maintains a higher diversity in the sampled model states. The advantage of PT-SGR is most apparent in an inverse transport problem where the posterior distribution is made bimodal by construction. PT-SGR then converges towards the appropriate data misfit much faster than SGR and partly recovers the two modes. In contrast, for the same computational resources SGR does not fit the data to the appropriate error level and hardly produces a locally optimal solution that looks visually similar to one of the two reference modes. Although PT-SGR clearly surpasses SGR in performance, our results also indicate that using a small number (16-24) of temperatures (and thus parallel cores) may not permit complete sampling of the posterior distribution by PT-SGR within a reasonable computational time (less than 1-2 weeks).
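
    The core of any parallel tempering scheme is the rule for swapping states between chains at different temperatures. The sketch below shows the standard swap acceptance probability under the assumption that only the likelihood is tempered (each chain targets prior times likelihood raised to 1/T), so the prior terms cancel. Whether PT-SGR tempers exactly this way is an assumption of the sketch, and the function name and numbers are hypothetical.

```python
import math


def swap_acceptance(loglik_i, loglik_j, temp_i, temp_j):
    """Probability of swapping the states of two tempered chains.

    Chain i sits at temperature temp_i with log-likelihood loglik_i, and
    likewise for chain j. With likelihood tempering the acceptance ratio is
    exp((1/T_i - 1/T_j) * (loglik_j - loglik_i)), capped at 1.
    """
    beta_i, beta_j = 1.0 / temp_i, 1.0 / temp_j
    log_ratio = (beta_i - beta_j) * (loglik_j - loglik_i)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


# A hot chain (T = 4) that has found a better fit than the cold chain (T = 1)
# always hands its state down; the reverse swap is accepted only rarely.
p_down = swap_acceptance(loglik_i=-10.0, loglik_j=-2.0, temp_i=1.0, temp_j=4.0)
p_up = swap_acceptance(loglik_i=-2.0, loglik_j=-10.0, temp_i=1.0, temp_j=4.0)
```

    Swaps are usually proposed between neighbouring temperatures only, since the acceptance probability drops quickly as the temperature gap widens.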

  17. The behavior of Metropolis-coupled Markov chains when sampling rugged phylogenetic distributions.

    PubMed

    Brown, Jeremy M; Thomson, Robert C

    2018-02-15

    Bayesian phylogenetic inference involves sampling from posterior distributions of trees, which sometimes exhibit local optima, or peaks, separated by regions of low posterior density. Markov chain Monte Carlo (MCMC) algorithms are the most widely used numerical method for generating samples from these posterior distributions, but they are susceptible to entrapment on individual optima in rugged distributions when they are unable to easily cross through or jump across regions of low posterior density. Ruggedness of posterior distributions can result from a variety of factors, including unmodeled variation in evolutionary processes and unrecognized variation in the true topology across sites or genes. Ruggedness can also become exaggerated when constraints are placed on topologies that require the presence or absence of particular bipartitions (often referred to as positive or negative constraints, respectively). These types of constraints are frequently employed when conducting tests of topological hypotheses (Bergsten et al. 2013; Brown and Thomson 2017). Negative constraints can lead to particularly rugged distributions when the data strongly support a forbidden clade, because monophyly of the clade can be disrupted by inserting outgroup taxa in many different ways. However, topological moves between the alternative disruptions are very difficult, because they require swaps between the inserted outgroup taxa while the data constrain taxa from the forbidden clade to remain close together on the tree. While this precise form of ruggedness is particular to negative constraints, trees with high posterior density can be separated by similarly complicated topological rearrangements, even in the absence of constraints.

  18. Conditional Spectral Analysis of Replicated Multiple Time Series with Application to Nocturnal Physiology.

    PubMed

    Krafty, Robert T; Rosen, Ori; Stoffer, David S; Buysse, Daniel J; Hall, Martica H

    2017-01-01

    This article considers the problem of analyzing associations between power spectra of multiple time series and cross-sectional outcomes when data are observed from multiple subjects. The motivating application comes from sleep medicine, where researchers are able to non-invasively record physiological time series signals during sleep. The frequency patterns of these signals, which can be quantified through the power spectrum, contain interpretable information about biological processes. An important problem in sleep research is drawing connections between power spectra of time series signals and clinical characteristics; these connections are key to understanding biological pathways through which sleep affects, and can be treated to improve, health. Such analyses are challenging as they must overcome the complicated structure of a power spectrum from multiple time series as a complex positive-definite matrix-valued function. This article proposes a new approach to such analyses based on a tensor-product spline model of Cholesky components of outcome-dependent power spectra. The approach flexibly models power spectra as nonparametric functions of frequency and outcome while preserving geometric constraints. Formulated in a fully Bayesian framework, a Whittle likelihood based Markov chain Monte Carlo (MCMC) algorithm is developed for automated model fitting and for conducting inference on associations between outcomes and spectral measures. The method is used to analyze data from a study of sleep in older adults and uncovers new insights into how stress and arousal are connected to the amount of time one spends in bed.

  19. Bayesian Population Genomic Inference of Crossing Over and Gene Conversion

    PubMed Central

    Padhukasahasram, Badri; Rannala, Bruce

    2011-01-01

    Meiotic recombination is a fundamental cellular mechanism in sexually reproducing organisms, and its different forms, crossing over and gene conversion, both play an important role in shaping genetic variation in populations. Here, we describe a coalescent-based full-likelihood Markov chain Monte Carlo (MCMC) method for jointly estimating the crossing-over, gene-conversion, and mean tract length parameters from population genomic data under a Bayesian framework. Although computationally more expensive than methods that use approximate likelihoods, the relative efficiency of our method is expected to be optimal in theory. Furthermore, it is also possible to obtain a posterior sample of genealogies for the data using this method. We first check the performance of the new method on simulated data and verify its correctness. We also extend the method for inference under models with variable gene-conversion and crossing-over rates and demonstrate its ability to identify recombination hotspots. Then, we apply the method to two empirical data sets that were sequenced in the telomeric regions of the X chromosome of Drosophila melanogaster. Our results indicate that gene conversion occurs more frequently than crossing over in the su-w and su-s gene sequences while the local rates of crossing over as inferred by our program are not low. The mean tract lengths for gene-conversion events are estimated to be ∼70 bp and 430 bp, respectively, for these data sets. Finally, we discuss ideas and optimizations for reducing the execution time of our algorithm. PMID:21840857

  20. Convergence analysis of surrogate-based methods for Bayesian inverse problems

    NASA Astrophysics Data System (ADS)

    Yan, Liang; Zhang, Yuan-Xiang

    2017-12-01

    The major challenges in Bayesian inverse problems arise from the need for repeated evaluations of the forward model, as required by Markov chain Monte Carlo (MCMC) methods for posterior sampling. Many attempts at accelerating Bayesian inference have relied on surrogates for the forward model, typically constructed through repeated forward simulations that are performed in an offline phase. Although such approaches can be quite effective at reducing computation cost, there has been little analysis of the effect of the approximation on posterior inference. In this work, we prove error bounds on the Kullback-Leibler (KL) distance between the true posterior distribution and the approximation based on surrogate models. Our rigorous error analysis shows that if the forward model approximation converges at a certain rate in the prior-weighted L² norm, then the posterior distribution generated by the approximation converges to the true posterior at least twice as fast in the KL sense. The error bound on the Hellinger distance is also provided. To provide concrete examples focusing on the use of surrogate model based methods, we present an efficient technique for constructing stochastic surrogate models to accelerate the Bayesian inference approach. The Christoffel least squares algorithms, based on generalized polynomial chaos, are used to construct a polynomial approximation of the forward solution over the support of the prior distribution. The numerical strategy and the predicted convergence rates are then demonstrated on nonlinear inverse problems involving the inference of parameters appearing in partial differential equations.

  1. Stochastic approach for radionuclides quantification

    NASA Astrophysics Data System (ADS)

    Clement, A.; Saurel, N.; Perrin, G.

    2018-01-01

    Gamma spectrometry is a passive non-destructive assay used to quantify radionuclides present in more or less complex objects. Basic methods that use empirical calibration with a standard to quantify the activity of nuclear materials by determining the calibration coefficient are useless on non-reproducible, complex and single nuclear objects such as waste packages. Package specifications such as composition or geometry change from one package to another and involve a high variability of objects. The current quantification process uses numerical modelling of the measured scene with the few available data such as geometry or composition. These data are density, material, screen, geometric shape, matrix composition, matrix and source distribution. Some of them are strongly dependent on package data knowledge and operator backgrounds. The French Commissariat à l'Energie Atomique (CEA) is developing a new methodology to quantify nuclear materials in waste packages and waste drums without operator adjustment and internal package configuration knowledge. This method suggests combining a global stochastic approach which uses, among others, surrogate models available to simulate the gamma attenuation behaviour, a Bayesian approach which considers conditional probability densities of problem inputs, and Markov chain Monte Carlo (MCMC) algorithms which solve inverse problems, with gamma ray emission radionuclide spectrum, and outside dimensions of interest objects. The methodology is being tested to quantify actinide activity in different kinds of matrix, composition, and configuration of standard sources in terms of actinide masses, locations and distributions. Activity uncertainties are taken into account by this adjustment methodology.

  2. Spreaders and Sponges define metastasis in lung cancer: A Markov chain Monte Carlo Mathematical Model

    PubMed Central

    Newton, Paul K.; Mason, Jeremy; Bethel, Kelly; Bazhenova, Lyudmila; Nieva, Jorge; Norton, Larry; Kuhn, Peter

    2013-01-01

    The classic view of metastatic cancer progression is that it is a unidirectional process initiated at the primary tumor site, progressing to variably distant metastatic sites in a fairly predictable, though not perfectly understood, fashion. A Markov chain Monte Carlo mathematical approach can determine a pathway diagram that classifies metastatic tumors as ‘spreaders’ or ‘sponges’ and orders the timescales of progression from site to site. In light of recent experimental evidence highlighting the potential significance of self-seeding of primary tumors, we use a Markov chain Monte Carlo (MCMC) approach, based on large autopsy data sets, to quantify the stochastic, systemic, and often multi-directional aspects of cancer progression. We quantify three types of multi-directional mechanisms of progression: (i) self-seeding of the primary tumor; (ii) re-seeding of the primary tumor from a metastatic site (primary re-seeding); and (iii) re-seeding of metastatic tumors (metastasis re-seeding). The model shows that the combined characteristics of the primary and the first metastatic site to which it spreads largely determine the future pathways and timescales of systemic disease. For lung cancer, the main ‘spreaders’ of systemic disease are the adrenal gland and kidney, whereas the main ‘sponges’ are regional lymph nodes, liver, and bone. Lung is a significant self-seeder, although it is a ‘sponge’ site with respect to progression characteristics. PMID:23447576

  3. Implementation of Monte Carlo Dose calculation for CyberKnife treatment planning

    NASA Astrophysics Data System (ADS)

    Ma, C.-M.; Li, J. S.; Deng, J.; Fan, J.

    2008-02-01

    Accurate dose calculation is essential to advanced stereotactic radiosurgery (SRS) and stereotactic radiotherapy (SRT) especially for treatment planning involving heterogeneous patient anatomy. This paper describes the implementation of a fast Monte Carlo dose calculation algorithm in SRS/SRT treatment planning for the CyberKnife® SRS/SRT system. A superposition Monte Carlo algorithm is developed for this application. Photon mean free paths and interaction types for different materials and energies as well as the tracks of secondary electrons are pre-simulated using the MCSIM system. Photon interaction forcing and splitting are applied to the source photons in the patient calculation and the pre-simulated electron tracks are repeated with proper corrections based on the tissue density and electron stopping powers. Electron energy is deposited along the tracks and accumulated in the simulation geometry. Scattered and bremsstrahlung photons are transported, after applying the Russian roulette technique, in the same way as the primary photons. Dose calculations are compared with full Monte Carlo simulations performed using EGS4/MCSIM and the CyberKnife treatment planning system (TPS) for lung, head & neck and liver treatments. Comparisons with full Monte Carlo simulations show excellent agreement (within 0.5%). More than 10% differences in the target dose are found between Monte Carlo simulations and the CyberKnife TPS for SRS/SRT lung treatment while negligible differences are shown in head and neck and liver for the cases investigated. The calculation time using our superposition Monte Carlo algorithm is reduced up to 62 times (46 times on average for 10 typical clinical cases) compared to full Monte Carlo simulations. SRS/SRT dose distributions calculated by simple dose algorithms may be significantly overestimated for small lung target volumes, which can be improved by accurate Monte Carlo dose calculations.
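
    The Russian roulette step mentioned above is a standard variance-reduction device: a low-weight photon is either terminated or kept with its weight scaled up, so that the expected weight is unchanged. A minimal Python sketch follows; the function name, the weight threshold and the survival probability are illustrative assumptions, not values taken from the MCSIM implementation.

```python
import random


def russian_roulette(weight, threshold=0.1, survival_prob=0.5, rng=random):
    """Apply Russian roulette to one particle weight.

    Returns the (possibly increased) weight, or 0.0 if the particle is
    terminated. threshold and survival_prob are hypothetical settings.
    """
    if weight >= threshold:
        # Heavy enough: the particle is transported unchanged.
        return weight
    if rng.random() < survival_prob:
        # Survivor carries the weight of the particles that were killed,
        # which keeps the expected weight equal to the input weight.
        return weight / survival_prob
    return 0.0
```

    Averaged over many histories, the returned weight equals the input weight, which is why the technique saves computing time on unimportant particles without biasing the tallied dose.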

  4. Hybrid dose calculation: a dose calculation algorithm for microbeam radiation therapy

    NASA Astrophysics Data System (ADS)

    Donzelli, Mattia; Bräuer-Krisch, Elke; Oelfke, Uwe; Wilkens, Jan J.; Bartzsch, Stefan

    2018-02-01

    Microbeam radiation therapy (MRT) is still a preclinical approach in radiation oncology that uses planar micrometre wide beamlets with extremely high peak doses, separated by a few hundred micrometre wide low dose regions. Abundant preclinical evidence demonstrates that MRT spares normal tissue more effectively than conventional radiation therapy, at equivalent tumour control. In order to launch first clinical trials, accurate and efficient dose calculation methods are an inevitable prerequisite. In this work a hybrid dose calculation approach is presented that is based on a combination of Monte Carlo and kernel based dose calculation. In various examples the performance of the algorithm is compared to purely Monte Carlo and purely kernel based dose calculations. The accuracy of the developed algorithm is comparable to conventional pure Monte Carlo calculations. In particular for inhomogeneous materials the hybrid dose calculation algorithm out-performs purely convolution based dose calculation approaches. It is demonstrated that the hybrid algorithm can efficiently calculate even complicated pencil beam and cross firing beam geometries. The required calculation times are substantially lower than for pure Monte Carlo calculations.

  5. Building an ACT-R Reader for Eye-Tracking Corpus Data.

    PubMed

    Dotlačil, Jakub

    2018-01-01

    Cognitive architectures have often been applied to data from individual experiments. In this paper, I develop an ACT-R reader that can model a much larger set of data, eye-tracking corpus data. It is shown that the resulting model has a good fit to the data for the considered low-level processes. Unlike previous related works (most prominently, Engelmann, Vasishth, Engbert & Kliegl, ), the model achieves the fit by estimating free parameters of ACT-R using Bayesian estimation and Markov-Chain Monte Carlo (MCMC) techniques, rather than by relying on the mix of manual selection + default values. The method used in the paper is generalizable beyond this particular model and data set and could be used on other ACT-R models. Copyright © 2017 Cognitive Science Society, Inc.

  6. Event-chain algorithm for the Heisenberg model: Evidence for z≃1 dynamic scaling.

    PubMed

    Nishikawa, Yoshihiko; Michel, Manon; Krauth, Werner; Hukushima, Koji

    2015-12-01

    We apply the event-chain Monte Carlo algorithm to the three-dimensional ferromagnetic Heisenberg model. The algorithm is rejection-free and also realizes an irreversible Markov chain that satisfies global balance. The autocorrelation functions of the magnetic susceptibility and the energy indicate a dynamical critical exponent z≈1 at the critical temperature, while that of the magnetization does not measure the performance of the algorithm. We show that the event-chain Monte Carlo algorithm substantially reduces the dynamical critical exponent from the conventional value of z≃2.

  7. Data decomposition of Monte Carlo particle transport simulations via tally servers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romano, Paul K.; Siegel, Andrew R.; Forget, Benoit

    An algorithm for decomposing large tally data in Monte Carlo particle transport simulations is developed, analyzed, and implemented in a continuous-energy Monte Carlo code, OpenMC. The algorithm is based on a non-overlapping decomposition of compute nodes into tracking processors and tally servers. The former are used to simulate the movement of particles through the domain while the latter continuously receive and update tally data. A performance model for this approach is developed, suggesting that, for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead on contemporary supercomputers. An implementation of the algorithm in OpenMC is then tested on the Intrepid and Titan supercomputers, supporting the key predictions of the model over a wide range of parameters. We thus conclude that the tally server algorithm is a successful approach to circumventing classical on-node memory constraints en route to unprecedentedly detailed Monte Carlo reactor simulations.

  8. A novel algorithm for solving the true coincident counting issues in Monte Carlo simulations for radiation spectroscopy.

    PubMed

    Guan, Fada; Johns, Jesse M; Vasudevan, Latha; Zhang, Guoqing; Tang, Xiaobin; Poston, John W; Braby, Leslie A

    2015-06-01

    Coincident counts can be observed in experimental radiation spectroscopy. Accurate quantification of the radiation source requires the detection efficiency of the spectrometer, which is often experimentally determined. However, Monte Carlo analysis can be used to supplement experimental approaches to determine the detection efficiency a priori. The traditional Monte Carlo method overestimates the detection efficiency as a result of omitting coincident counts caused mainly by multiple cascade source particles. In this study, a novel "multi-primary coincident counting" algorithm was developed using the Geant4 Monte Carlo simulation toolkit. A high-purity Germanium detector for ⁶⁰Co gamma-ray spectroscopy problems was accurately modeled to validate the developed algorithm. The simulated pulse height spectrum agreed well qualitatively with the measured spectrum obtained using the high-purity Germanium detector. The developed algorithm can be extended to other applications, with a particular emphasis on challenging radiation fields, such as counting multiple types of coincident radiations released from nuclear fission or used nuclear fuel.

  9. Mosaicing of airborne LiDAR bathymetry strips based on Monte Carlo matching

    NASA Astrophysics Data System (ADS)

    Yang, Fanlin; Su, Dianpeng; Zhang, Kai; Ma, Yue; Wang, Mingwei; Yang, Anxiu

    2017-09-01

    This study proposes a new methodology for mosaicing airborne light detection and ranging (LiDAR) bathymetry (ALB) data based on Monte Carlo matching. Various errors occur in ALB data due to imperfect system integration and other interference factors. To account for these errors, a Monte Carlo matching algorithm based on a nonlinear least-squares adjustment model is proposed. First, the raw data of strip overlap areas were filtered according to their relative drift of depths. Second, a Monte Carlo model and nonlinear least-squares adjustment model were combined to obtain seven transformation parameters. Then, the multibeam bathymetric data were used to correct the initial strip during strip mosaicing. Finally, to evaluate the proposed method, the experimental results were compared with the results of the Iterative Closest Points (ICP) and three-dimensional Normal Distributions Transform (3D-NDT) algorithms. The results demonstrate that the algorithm proposed in this study is more robust and effective. When the quality of the raw data is poor, the Monte Carlo matching algorithm can still achieve centimeter-level accuracy for overlapping areas, which meets the accuracy of bathymetry required by IHO Standards for Hydrographic Surveys Special Publication No.44.

  10. Scalable Domain Decomposed Monte Carlo Particle Transport

    NASA Astrophysics Data System (ADS)

    O'Brien, Matthew Joseph

    In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are: • Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node. • Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently. • Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain. • Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition. These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.

  11. Diffusion Monte Carlo approach versus adiabatic computation for local Hamiltonians

    NASA Astrophysics Data System (ADS)

    Bringewatt, Jacob; Dorland, William; Jordan, Stephen P.; Mink, Alan

    2018-02-01

    Most research regarding quantum adiabatic optimization has focused on stoquastic Hamiltonians, whose ground states can be expressed with only real non-negative amplitudes and thus for whom destructive interference is not manifest. This raises the question of whether classical Monte Carlo algorithms can efficiently simulate quantum adiabatic optimization with stoquastic Hamiltonians. Recent results have given counterexamples in which path-integral and diffusion Monte Carlo fail to do so. However, most adiabatic optimization algorithms, such as for solving MAX-k -SAT problems, use k -local Hamiltonians, whereas our previous counterexample for diffusion Monte Carlo involved n -body interactions. Here we present a 6-local counterexample which demonstrates that even for these local Hamiltonians there are cases where diffusion Monte Carlo cannot efficiently simulate quantum adiabatic optimization. Furthermore, we perform empirical testing of diffusion Monte Carlo on a standard well-studied class of permutation-symmetric tunneling problems and similarly find large advantages for quantum optimization over diffusion Monte Carlo.

  12. The Impact of Monte Carlo Dose Calculations on Intensity-Modulated Radiation Therapy

    NASA Astrophysics Data System (ADS)

    Siebers, J. V.; Keall, P. J.; Mohan, R.

    The effect of dose calculation accuracy for IMRT was studied by comparing different dose calculation algorithms. A head and neck IMRT plan was optimized using a superposition dose calculation algorithm. Dose was re-computed for the optimized plan using both Monte Carlo and pencil beam dose calculation algorithms to generate patient and phantom dose distributions. Tumor control probabilities (TCP) and normal tissue complication probabilities (NTCP) were computed to estimate the plan outcome. For the treatment plan studied, Monte Carlo best reproduces phantom dose measurements, the TCP was slightly lower than the superposition and pencil beam results, and the NTCP values differed little.

  13. Crustal structure across the lateral edge of the Southern Tyrrhenian slab

    NASA Astrophysics Data System (ADS)

    Pio Lucente, Francesco; Piana Agostinetti, Nicola; Di Bona, Massimo; Govoni, Aladino; Bianchi, Irene

    2015-04-01

    In the southeastern corner of the Tyrrhenian basin, in the central Mediterranean Sea, a tight alignment of earthquakes along a well-defined Benioff zone reveals the presence of one of the narrowest active trenches worldwide, where one of the last fragments of the former Tethys ocean is consumed. Seismic tomography furnishes snapshot images of the present-day position and shape of this slab. Through receiver function analysis we investigate the layered structures overlying the slab. We compute receiver functions from the P-coda of teleseismic events at 13 temporary stations deployed during the "Messina 1908-2008" research project (Margheriti, 2008), and operating for an average period of 15 months each. The crustal and uppermost mantle structure has been investigated using a trans-dimensional McMC algorithm developed by Piana Agostinetti and Malinverno (2010), obtaining a 1D S-wave velocity profile for each station. At three of the stations, operating for a longer period of time, the number and the azimuthal distribution of teleseisms allowed us to stack the RF data-set with back azimuth and to compute the harmonic expansion. The analysis of the back-azimuthal harmonics gave us insight on the presence of dipping interfaces and anisotropic layers at depth. The strike and the dip of interfaces and the anisotropic parameters have been quantified using the Neighbourhood Algorithm (Sambridge, 1999). Preliminary results highlight: (1) a neat differentiation of the isotropic S-wave velocity structure passing through the slab edge, from the tip of the Calabrian arc to the Peloritani Range, and (2) the presence of crustal complexities, such as dipping interfaces and anisotropic layers, both in the upper and lower crust. Margheriti, L. (2008), Understanding Crust Dynamics and Subduction in Southern Italy, Eos Trans. AGU, 89(25), 225-226, doi:10.1029/2008EO250002. Piana Agostinetti, N. and A. Malinverno (2010), Receiver Function inversion by trans-dimensional Monte Carlo sampling, Geophys. J. Int., 181(2), 858-872, doi:10.1111/j.1365-246X.2010.04530.x. Sambridge, M. (1999), Geophysical inversion with a neighbourhood algorithm-I. Searching a parameter space, Geophys. J. Int., 138, 479-494, doi:10.1046/j.1365-246X.1999.00876.x.

  14. A Bayesian approach for parameter estimation and prediction using a computationally intensive model

    DOE PAGES

    Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...

    2015-02-05

    Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $\eta(\theta)$, where θ denotes the uncertain, best input setting. Hence the statistical model is of the form $y = \eta(\theta) + \epsilon$, where $\epsilon$ accounts for measurement, and possibly other, error sources. When nonlinearity is present in $\eta(\cdot)$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model $\eta(\cdot)$. This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.

  15. Bayesian hierarchical modelling of continuous non‐negative longitudinal data with a spike at zero: An application to a study of birds visiting gardens in winter

    PubMed Central

    Buckland, Stephen T.; King, Ruth; Toms, Mike P.

    2015-01-01

    The development of methods for dealing with continuous data with a spike at zero has lagged behind those for overdispersed or zero‐inflated count data. We consider longitudinal ecological data corresponding to an annual average of 26 weekly maximum counts of birds, and are hence effectively continuous, bounded below by zero but also with a discrete mass at zero. We develop a Bayesian hierarchical Tweedie regression model that can directly accommodate the excess number of zeros common to this type of data, whilst accounting for both spatial and temporal correlation. Implementation of the model is conducted in a Markov chain Monte Carlo (MCMC) framework, using reversible jump MCMC to explore uncertainty across both parameter and model spaces. This regression modelling framework is very flexible and removes the need to make strong assumptions about mean‐variance relationships a priori. It can also directly account for the spike at zero, whilst being easily applicable to other types of data and other model formulations. Whilst a correlative study such as this cannot prove causation, our results suggest that an increase in an avian predator may have led to an overall decrease in the number of one of its prey species visiting garden feeding stations in the United Kingdom. This may reflect a change in behaviour of house sparrows to avoid feeding stations frequented by sparrowhawks, or a reduction in house sparrow population size as a result of sparrowhawk increase. PMID:25737026

  16. Single neuron modeling and data assimilation in BNST neurons

    NASA Astrophysics Data System (ADS)

    Farsian, Reza

    Neurons, although tiny in size, are vastly complicated systems, which are responsible for the most basic yet essential functions of any nervous system. Even the simplest models of single neurons are usually high dimensional, nonlinear, and contain many parameters and states which are unobservable in a typical neurophysiological experiment. One of the most fundamental problems in experimental neurophysiology is the estimation of these parameters and states, since knowing their values is essential in identification, model construction, and forward prediction of biological neurons. Common methods of parameter and state estimation do not perform well for neural models due to their high dimensionality and nonlinearity. In this dissertation, two alternative approaches for parameter and state estimation of biological neurons have been demonstrated: dynamical parameter estimation (DPE) and a Markov Chain Monte Carlo (MCMC) method. The first method uses elements of chaos control and synchronization theory for parameter and state estimation. MCMC is a statistical approach which uses a path integral formulation to evaluate a mean and an error bound for these unobserved parameters and states. These methods have been applied to a biological system of neurons in the Bed Nucleus of Stria Terminalis (BNST) of rats. States and parameters of neurons in both systems were estimated, and their values were used for recreating a realistic model and predicting the behavior of the neurons successfully. The knowledge of biological parameters can ultimately provide a better understanding of the internal dynamics of a neuron in order to build robust models of neuron networks.

  17. A Christoffel function weighted least squares algorithm for collocation approximations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Narayan, Akil; Jakeman, John D.; Zhou, Tao

    Here, we propose, theoretically investigate, and numerically validate an algorithm for the Monte Carlo solution of least-squares polynomial approximation problems in a collocation framework. Our investigation is motivated by applications in the collocation approximation of parametric functions, which frequently entails construction of surrogates via orthogonal polynomials. A standard Monte Carlo approach would draw samples according to the density defining the orthogonal polynomial family. Our proposed algorithm instead samples with respect to the (weighted) pluripotential equilibrium measure of the domain, and subsequently solves a weighted least-squares problem, with weights given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest.

  18. A Christoffel function weighted least squares algorithm for collocation approximations

    DOE PAGES

    Narayan, Akil; Jakeman, John D.; Zhou, Tao

    2016-11-28

    Here, we propose, theoretically investigate, and numerically validate an algorithm for the Monte Carlo solution of least-squares polynomial approximation problems in a collocation framework. Our investigation is motivated by applications in the collocation approximation of parametric functions, which frequently entails construction of surrogates via orthogonal polynomials. A standard Monte Carlo approach would draw samples according to the density defining the orthogonal polynomial family. Our proposed algorithm instead samples with respect to the (weighted) pluripotential equilibrium measure of the domain, and subsequently solves a weighted least-squares problem, with weights given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest.
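
    As a rough one-dimensional illustration of the sampling-and-weighting idea described above (not the authors' implementation), the sketch below assumes the domain [-1, 1] with the uniform density. In that setting the orthonormal family is the scaled Legendre polynomials and the equilibrium measure is the arcsine (Chebyshev) distribution. The function names, and the choice of weights as (degree + 1) divided by the sum of squared basis values, are assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre(x, degree):
    """Legendre basis, orthonormal w.r.t. the uniform density on [-1, 1]."""
    V = legendre.legvander(x, degree)  # columns P_0 .. P_degree
    return V * np.sqrt(2 * np.arange(degree + 1) + 1)


def christoffel_wls(f, degree, n_samples, seed=0):
    """Weighted least-squares fit of f in the orthonormal Legendre basis."""
    rng = np.random.default_rng(seed)
    # Draw samples from the arcsine (equilibrium) measure on [-1, 1].
    x = np.cos(np.pi * rng.random(n_samples))
    Phi = orthonormal_legendre(x, degree)
    # Weights from the Christoffel function: N / sum_k phi_k(x)^2.
    w = (degree + 1) / np.sum(Phi**2, axis=1)
    sw = np.sqrt(w)
    # Solve the weighted problem min || sqrt(W) (Phi c - f(x)) ||_2.
    coef, *_ = np.linalg.lstsq(Phi * sw[:, None], f(x) * sw, rcond=None)
    return coef
```

    A call such as christoffel_wls(np.cos, degree=8, n_samples=40) returns the expansion coefficients; multiplying orthonormal_legendre(x_new, 8) by them evaluates the surrogate at new points.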

  19. Unfolding neutron spectrum with Markov Chain Monte Carlo at MIT research Reactor with He-3 Neutral Current Detectors [Measuring neutron spectrum at MIT research reactor utilizing He-3 Bonner Cylinder Approach with an unfolding analysis]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leder, A.; Anderson, A. J.; Billard, J.

    Here, the Ricochet experiment seeks to measure Coherent (neutral-current) Elastic Neutrino-Nucleus Scattering (CEνNS) using dark-matter-style detectors with sub-keV thresholds placed near a neutrino source, such as the MIT (research) Reactor (MITR), which operates at 5.5 MW generating approximately 2.2 × 10^18 ν/second in its core. Currently, Ricochet is characterizing the backgrounds at MITR, the main component of which comes in the form of neutrons emitted from the core simultaneous with the neutrino signal. To characterize this background, we wrapped Bonner cylinders around a ³He thermal neutron detector, whose data was then unfolded via a Markov Chain Monte Carlo (MCMC) method to produce a neutron energy spectrum across several orders of magnitude. We discuss the resulting spectrum and its implications for deploying Ricochet at the MITR site as well as the feasibility of reducing this background level via the addition of polyethylene shielding around the detector setup.

  20. Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.

    PubMed

    Hack, C Eric

    2006-04-17

    Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least squared error approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach.
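
    To make the prior-to-posterior updating concrete, here is a minimal random-walk Metropolis sketch in Python. It is not a PBTK model: the one-parameter decay model exp(-k t), the lognormal prior, the noise level and all names are hypothetical stand-ins chosen so that the example stays short.

```python
import math
import random


def log_posterior(k, data, sigma=0.1):
    """Log prior plus log likelihood for a hypothetical decay-rate parameter k."""
    if k <= 0:
        return -math.inf
    # Assumed lognormal prior on k (median 1, log-scale sd 1).
    log_prior = -0.5 * math.log(k) ** 2 - math.log(k)
    # Gaussian measurement error around the model prediction exp(-k t).
    log_lik = sum(-0.5 * ((y - math.exp(-k * t)) / sigma) ** 2 for t, y in data)
    return log_prior + log_lik


def metropolis(data, n_steps=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler; returns the chain of k values."""
    rng = random.Random(seed)
    k = 1.0
    lp = log_posterior(k, data)
    samples = []
    for _ in range(n_steps):
        proposal = k + rng.gauss(0.0, step)
        lp_new = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            k, lp = proposal, lp_new
        samples.append(k)
    return samples
```

    Given a list of (time, measurement) pairs, metropolis(data) returns a chain whose values, after discarding an initial burn-in portion, approximate draws from the posterior distribution of k; their mean and spread give a parameter estimate and its uncertainty.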

  1. Unfolding neutron spectrum with Markov Chain Monte Carlo at MIT research Reactor with He-3 Neutral Current Detectors [Measuring neutron spectrum at MIT research reactor utilizing He-3 Bonner Cylinder Approach with an unfolding analysis]

    DOE PAGES

    Leder, A.; Anderson, A. J.; Billard, J.; ...

    2018-02-02

    Here, the Ricochet experiment seeks to measure Coherent (neutral-current) Elastic Neutrino-Nucleus Scattering (CEνNS) using dark-matter-style detectors with sub-keV thresholds placed near a neutrino source, such as the MIT (research) Reactor (MITR), which operates at 5.5 MW generating approximately 2.2 × 10^18 ν/second in its core. Currently, Ricochet is characterizing the backgrounds at MITR, the main component of which comes in the form of neutrons emitted from the core simultaneous with the neutrino signal. To characterize this background, we wrapped Bonner cylinders around a ³He thermal neutron detector, whose data was then unfolded via a Markov Chain Monte Carlo (MCMC) to produce a neutron energy spectrum across several orders of magnitude. We discuss the resulting spectrum and its implications for deploying Ricochet at the MITR site as well as the feasibility of reducing this background level via the addition of polyethylene shielding around the detector setup.

  2. Assessment of myocardial metabolic rate of glucose by means of Bayesian ICA and Markov Chain Monte Carlo methods in small animal PET imaging

    NASA Astrophysics Data System (ADS)

    Berradja, Khadidja; Boughanmi, Nabil

    2016-09-01

    In dynamic cardiac PET FDG studies the assessment of myocardial metabolic rate of glucose (MMRG) requires the knowledge of the blood input function (IF). IF can be obtained by manual or automatic blood sampling and cross calibrated with PET. These procedures are cumbersome, invasive and generate uncertainties. The IF is contaminated by spillover of radioactivity from the adjacent myocardium and this could cause important error in the estimated MMRG. In this study, we show that the IF can be extracted from the images in a rat heart study with 18F-fluorodeoxyglucose (18F-FDG) by means of Independent Component Analysis (ICA) based on Bayesian theory and Markov Chain Monte Carlo (MCMC) sampling method (BICA). Images of the heart from rats were acquired with the Sherbrooke small animal PET scanner. A region of interest (ROI) was drawn around the rat image and decomposed into blood and tissue using BICA. The Statistical study showed that there is a significant difference (p < 0.05) between MMRG obtained with IF extracted by BICA with respect to IF extracted from measured images corrupted with spillover.

  3. Snow Water Equivalent Retrieval By Markov Chain Monte Carlo Based on Memls and Hut Snow Emission Model

    NASA Astrophysics Data System (ADS)

    Pan, J.; Durand, M. T.; Vanderjagt, B. J.

    2014-12-01

    The Markov chain Monte Carlo (MCMC) method had been proved to be successful in snow water equivalent retrieval based on synthetic point-scale passive microwave brightness temperature (TB) observations. This method needs only general prior information about distribution of snow parameters, and could estimate layered snow properties, including the thickness, temperature, density and snow grain size (or exponential correlation length) of each layer. In this study, the multi-layer HUT (Helsinki University of Technology) model and the MEMLS (Microwave Emission Model of Layered Snowpacks) will be used as observation models to assimilate the observed TB into snow parameter prediction. Previous studies had shown that the multi-layer HUT model tends to underestimate TB at 37 GHz for deep snow, while the MEMLS does not show sensitivity of model bias to snow depth. Therefore, results using HUT model and MEMLS will be compared to see how the observation model will influence the retrieval of snow parameters. The radiometric measurements at 10.65, 18.7, 36.5 and 90 GHz at Sodankyla, Finland will be used as MCMC input, and the statistics of all snow property measurement will be used to calculate the prior information. 43 dry snowpits with complete measurements of all snow parameters will be used for validation. The entire dataset are from NorSREx (Nordic Snow Radar Experiment) experiments carried out by Juha Lemmetyinen, Anna Kontu and Jouni Pulliainen in FMI in 2009-2011 winters, and continued two more winters from 2011 to Spring of 2013. Besides the snow thickness and snow density that are directly related to snow water equivalent, other parameters will be compared with observations, too. For thin snow, the previous studies showed that influence of underlying soil is considerable, especially when the soil is half frozen with part of unfrozen liquid water and part of ice. 
    Therefore, this study will also try to employ a simple frozen soil permittivity model to improve the performance of retrieval. The behavior of the Markov chain in soil parameters will be studied.

  4. Monte Carlo evaluation of Acuros XB dose calculation Algorithm for intensity modulated radiation therapy of nasopharyngeal carcinoma

    NASA Astrophysics Data System (ADS)

    Yeh, Peter C. Y.; Lee, C. C.; Chao, T. C.; Tung, C. J.

    2017-11-01

    Intensity-modulated radiation therapy is an effective treatment modality for the nasopharyngeal carcinoma. One important aspect of this cancer treatment is the need to have an accurate dose algorithm dealing with the complex air/bone/tissue interface in the head-neck region to achieve the cure without radiation-induced toxicities. The Acuros XB algorithm explicitly solves the linear Boltzmann transport equation in voxelized volumes to account for the tissue heterogeneities such as lungs, bone, air, and soft tissues in the treatment field receiving radiotherapy. With the single beam setup in phantoms, this algorithm has already been demonstrated to achieve the comparable accuracy with Monte Carlo simulations. In the present study, five nasopharyngeal carcinoma patients treated with the intensity-modulated radiation therapy were examined for their dose distributions calculated using the Acuros XB in the planning target volume and the organ-at-risk. Corresponding results of Monte Carlo simulations were computed from the electronic portal image data and the BEAMnrc/DOSXYZnrc code. Analysis of dose distributions in terms of the clinical indices indicated that the Acuros XB was in comparable accuracy with Monte Carlo simulations and better than the anisotropic analytical algorithm for dose calculations in real patients.

  5. Two new methods to fit models for network meta-analysis with random inconsistency effects.

    PubMed

    Law, Martin; Jackson, Dan; Turner, Rebecca; Rhodes, Kirsty; Viechtbauer, Wolfgang

    2016-07-28

    Meta-analysis is a valuable tool for combining evidence from multiple studies. Network meta-analysis is becoming more widely used as a means to compare multiple treatments in the same analysis. However, a network meta-analysis may exhibit inconsistency, whereby the treatment effect estimates do not agree across all trial designs, even after taking between-study heterogeneity into account. We propose two new estimation methods for network meta-analysis models with random inconsistency effects. The model we consider is an extension of the conventional random-effects model for meta-analysis to the network meta-analysis setting and allows for potential inconsistency using random inconsistency effects. Our first new estimation method uses a Bayesian framework with empirically-based prior distributions for both the heterogeneity and the inconsistency variances. We fit the model using importance sampling and thereby avoid some of the difficulties that might be associated with using Markov Chain Monte Carlo (MCMC). However, we confirm the accuracy of our importance sampling method by comparing the results to those obtained using MCMC as the gold standard. The second new estimation method we describe uses a likelihood-based approach, implemented in the metafor package, which can be used to obtain (restricted) maximum-likelihood estimates of the model parameters and profile likelihood confidence intervals of the variance components. We illustrate the application of the methods using two contrasting examples. The first uses all-cause mortality as an outcome, and shows little evidence of between-study heterogeneity or inconsistency. The second uses "ear discharge" as an outcome, and exhibits substantial between-study heterogeneity and inconsistency. Both new estimation methods give results similar to those obtained using MCMC. The extent of heterogeneity and inconsistency should be assessed and reported in any network meta-analysis. 
    Our two new methods can be used to fit models for network meta-analysis with random inconsistency effects. They are easily implemented using the accompanying R code in Additional file 1. Using these estimation methods, the extent of inconsistency can be assessed and reported.

  6. Inclusion of historical information in flood frequency analysis using a Bayesian MCMC technique: a case study for the power dam Orlík, Czech Republic

    NASA Astrophysics Data System (ADS)

    Gaál, Ladislav; Szolgay, Ján; Kohnová, Silvia; Hlavčová, Kamila; Viglione, Alberto

    2010-01-01

    The paper deals with at-site flood frequency estimation in the case when also information on hydrological events from the past with extraordinary magnitude are available. For the joint frequency analysis of systematic observations and historical data, respectively, the Bayesian framework is chosen, which, through adequately defined likelihood functions, allows for incorporation of different sources of hydrological information, e.g., maximum annual flood peaks, historical events as well as measurement errors. The distribution of the parameters of the fitted distribution function and the confidence intervals of the flood quantiles are derived by means of the Markov chain Monte Carlo simulation (MCMC) technique. The paper presents a sensitivity analysis related to the choice of the most influential parameters of the statistical model, which are the length of the historical period h and the perception threshold X0. These are involved in the statistical model under the assumption that except for the events termed as ‘historical’ ones, none of the (unknown) peak discharges from the historical period h should have exceeded the threshold X0. Both higher values of h and lower values of X0 lead to narrower confidence intervals of the estimated flood quantiles; however, it is emphasized that one should be prudent of selecting those parameters, in order to avoid making inferences with wrong assumptions on the unknown hydrological events having occurred in the past. The Bayesian MCMC methodology is presented on the example of the maximum discharges observed during the warm half year at the station Vltava-Kamýk (Czech Republic) in the period 1877-2002. 
    Although the 2002 flood peak, which is related to the vast flooding that affected a large part of Central Europe at that time, occurred in the near past, in the analysis it is treated virtually as a ‘historical’ event in order to illustrate some crucial aspects of including information on extreme historical floods into at-site flood frequency analyses.

  7. Species delimitation using Bayes factors: simulations and application to the Sceloporus scalaris species group (Squamata: Phrynosomatidae).

    PubMed

    Grummer, Jared A; Bryson, Robert W; Reeder, Tod W

    2014-03-01

    Current molecular methods of species delimitation are limited by the types of species delimitation models and scenarios that can be tested. Bayes factors allow for more flexibility in testing non-nested species delimitation models and hypotheses of individual assignment to alternative lineages. Here, we examined the efficacy of Bayes factors in delimiting species through simulations and empirical data from the Sceloporus scalaris species group. Marginal-likelihood scores of competing species delimitation models, from which Bayes factor values were compared, were estimated with four different methods: harmonic mean estimation (HME), smoothed harmonic mean estimation (sHME), path-sampling/thermodynamic integration (PS), and stepping-stone (SS) analysis. We also performed model selection using a posterior simulation-based analog of the Akaike information criterion through Markov chain Monte Carlo analysis (AICM). Bayes factor species delimitation results from the empirical data were then compared with results from the reversible-jump MCMC (rjMCMC) coalescent-based species delimitation method Bayesian Phylogenetics and Phylogeography (BP&P). Simulation results show that HME and sHME perform poorly compared with PS and SS marginal-likelihood estimators when identifying the true species delimitation model. Furthermore, Bayes factor delimitation (BFD) of species showed improved performance when species limits are tested by reassigning individuals between species, as opposed to either lumping or splitting lineages. In the empirical data, BFD through PS and SS analyses, as well as the rjMCMC method, each provide support for the recognition of all scalaris group taxa as independent evolutionary lineages. Bayes factor species delimitation and BP&P also support the recognition of three previously undescribed lineages. 
    In both simulated and empirical data sets, harmonic and smoothed harmonic mean marginal-likelihood estimators provided much higher marginal-likelihood estimates than PS and SS estimators. The AICM displayed poor repeatability in both simulated and empirical data sets, and produced inconsistent model rankings across replicate runs with the empirical data. Our results suggest that species delimitation through the use of Bayes factors with marginal-likelihood estimates via PS or SS analyses provide a useful and complementary alternative to existing species delimitation methods.

  8. Estimation of interplate coupling along Nankai trough considering the block motion model based on onland GNSS and seafloor GPS/A observation data using MCMC method

    NASA Astrophysics Data System (ADS)

    Kimura, H.; Ito, T.; Tadokoro, K.

    2017-12-01

    Introduction: In southwest Japan, the Philippine Sea plate is subducting beneath overriding plates such as the Amurian plate, and mega interplate earthquakes have occurred at intervals of about 100 years. No mega interplate earthquake has occurred in southwest Japan in the roughly 70 years since the last ones, in 1944 and 1946 along the Nankai trough, meaning that strain has been accumulating at the plate interface. It is therefore essential to reveal the interplate coupling more precisely in order to predict, or understand the mechanism of, the next mega interplate earthquake. Recently, seafloor geodetic observation revealed the detailed interplate coupling distribution in the expected source region of the Nankai trough earthquake (e.g., Yokota et al. [2016]). In this study, we estimated interplate coupling in southwest Japan, considering a block motion model and using seafloor geodetic observation data as well as onland GNSS observation data, based on the Markov Chain Monte Carlo (MCMC) method.
    Method: Observed crustal deformation is assumed to be the sum of rigid block motion and elastic deformation due to coupling at block boundaries. We modeled this relationship as a non-linear inverse problem in which the unknown parameters are the Euler pole of each block and the coupling at each subfault, and solved for them simultaneously using the MCMC method. The input data used in this study are 863 onland GNSS observations and 24 seafloor GPS/A observations. We made several block division models based on a map of active fault traces and selected the best model, which consists of 12 blocks, based on Akaike's Information Criterion (AIC).
    Result: We find that the interplate coupling along the Nankai trough has a heterogeneous spatial distribution, strong at depths of 0 to 20 km off the Tokai region and 0 to 30 km off the Shikoku region. Moreover, we find that observed crustal deformation off the Tokai region is well explained by elastic deformation due to the subducting Izu Micro Plate. We will present more details of our results and discuss not only interplate coupling but also rigid block motion, elastic deformation due to inland fault coupling, and the resolution of the estimated parameters.

  9. X-Ray Luminosity Functions of Normal Galaxies in the Great Observatories Origins Deep Survey

    NASA Astrophysics Data System (ADS)

    Ptak, Andrew; Mobasher, Bahram; Hornschemeier, Ann; Bauer, Franz; Norman, Colin

    2007-10-01

    We present soft (0.5-2 keV) X-ray luminosity functions (XLFs) in the Great Observatories Origins Deep Survey (GOODS) fields derived for galaxies at z~0.25 and 0.75. SED fitting was used to estimate photometric redshifts and separate galaxy types, resulting in a sample of 40 early-type galaxies and 46 late-type galaxies. We estimate k-corrections for both the X-ray/optical and X-ray/NIR flux ratios, which facilitates the separation of AGNs from the normal/starburst galaxies. We fit the XLFs with a power-law model using both traditional and Markov-Chain Monte Carlo (MCMC) procedures. A key advantage of the MCMC approach is that it explicitly takes into account upper limits and allows errors on "derived" quantities, such as luminosity densities, to be computed directly (i.e., without potentially questionable assumptions concerning the propagation of errors). The slopes of the early-type galaxy XLFs tend to be slightly flatter than the late-type galaxy XLFs, although the effect is significant at only the 90% and 97% levels for z~0.25 and 0.75. The XLFs differ between z<0.5 and z>0.5 at >99% significance levels for early-type, late-type, and all (early- and late-type) galaxies. We also fit Schechter and lognormal models to the XLFs, fitting the low- and high-redshift XLFs for a given sample simultaneously assuming only pure luminosity evolution. In the case of lognormal fits, the results of MCMC fitting of the local FIR luminosity function were used as priors for the faint- and bright-end slopes (similar to "fixing" these parameters at the FIR values, except here the FIR uncertainty is included). The best-fit values of the change in logL* with redshift were ΔlogL* = 0.23 ± 0.16 dex (for early-type galaxies) and 0.34 ± 0.12 dex (for late-type galaxies), corresponding to (1+z)^1.6 and (1+z)^2.3. These results were insensitive to whether the Schechter or lognormal function was adopted.

  10. Automatic mesh adaptivity for hybrid Monte Carlo/deterministic neutronics modeling of difficult shielding problems

    DOE PAGES

    Ibrahim, Ahmad M.; Wilson, Paul P.H.; Sawan, Mohamed E.; ...

    2015-06-30

    The CADIS and FW-CADIS hybrid Monte Carlo/deterministic techniques dramatically increase the efficiency of neutronics modeling, but their use in the accurate design analysis of very large and geometrically complex nuclear systems has been limited by the large number of processors and memory requirements for their preliminary deterministic calculations and final Monte Carlo calculation. Three mesh adaptivity algorithms were developed to reduce the memory requirements of CADIS and FW-CADIS without sacrificing their efficiency improvement. First, a macromaterial approach enhances the fidelity of the deterministic models without changing the mesh. Second, a deterministic mesh refinement algorithm generates meshes that capture as much geometric detail as possible without exceeding a specified maximum number of mesh elements. Finally, a weight window coarsening algorithm decouples the weight window mesh and energy bins from the mesh and energy group structure of the deterministic calculations in order to remove the memory constraint of the weight window map from the deterministic mesh resolution. The three algorithms were used to enhance an FW-CADIS calculation of the prompt dose rate throughout the ITER experimental facility. Using these algorithms resulted in a 23.3% increase in the number of mesh tally elements in which the dose rates were calculated in a 10-day Monte Carlo calculation and, additionally, increased the efficiency of the Monte Carlo simulation by a factor of at least 3.4. The three algorithms enabled this difficult calculation to be accurately solved using an FW-CADIS simulation on a regular computer cluster, eliminating the need for a world-class super computer.

  11. A probabilistic seismic model for the European Arctic

    NASA Astrophysics Data System (ADS)

    Hauser, Juerg; Dyer, Kathleen M.; Pasyanos, Michael E.; Bungum, Hilmar; Faleide, Jan I.; Clark, Stephen A.; Schweitzer, Johannes

    2011-01-01

    The development of three-dimensional seismic models for the crust and upper mantle has traditionally focused on finding one model that provides the best fit to the data while observing some regularization constraints. In contrast to this, the inversion employed here fits the data in a probabilistic sense and thus provides a quantitative measure of model uncertainty. Our probabilistic model is based on two sources of information: (1) prior information, which is independent from the data, and (2) different geophysical data sets, including thickness constraints, velocity profiles, gravity data, surface wave group velocities, and regional body wave traveltimes. We use a Markov chain Monte Carlo (MCMC) algorithm to sample models from the prior distribution, the set of plausible models, and test them against the data to generate the posterior distribution, the ensemble of models that fit the data with assigned uncertainties. While being computationally more expensive, such a probabilistic inversion provides a more complete picture of solution space and allows us to combine various data sets. The complex geology of the European Arctic, encompassing oceanic crust, continental shelf regions, rift basins and old cratonic crust, as well as the nonuniform coverage of the region by data with varying degrees of uncertainty, makes it a challenging setting for any imaging technique and, therefore, an ideal environment for demonstrating the practical advantages of a probabilistic approach. Maps of depth to basement and depth to Moho derived from the posterior distribution are in good agreement with previously published maps and interpretations of the regional tectonic setting. The predicted uncertainties, which are as important as the absolute values, correlate well with the variations in data coverage and quality in the region. A practical advantage of our probabilistic model is that it can provide estimates for the uncertainties of observables due to model uncertainties. 
    We will demonstrate how this can be used for the formulation of earthquake location algorithms that take model uncertainties into account when estimating location uncertainties.

  12. On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang

    2015-02-01

    The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of linear model of Coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and multiphase flow computer experiment in a full scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.

  13. Monte Carlo tests of the ELIPGRID-PC algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Davidson, J.R.

    1995-04-01

    The standard tool for calculating the probability of detecting pockets of contamination called hot spots has been the ELIPGRID computer code of Singer and Wickman. The ELIPGRID-PC program has recently made this algorithm available for an IBM® PC. However, no known independent validation of the ELIPGRID algorithm exists. This document describes a Monte Carlo simulation-based validation of a modified version of the ELIPGRID-PC code. The modified ELIPGRID-PC code is shown to match Monte Carlo-calculated hot-spot detection probabilities to within ±0.5% for 319 out of 320 test cases. The one exception, a very thin elliptical hot spot located within a rectangular sampling grid, differed from the Monte Carlo-calculated probability by about 1%. These results provide confidence in the ability of the modified ELIPGRID-PC code to accurately predict hot-spot detection probabilities within an acceptable range of error.
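
    The kind of Monte Carlo check described in this abstract can be sketched for a much simpler case than the one validated in the report. The sketch below assumes a circular hot spot and a square sampling grid (the report treats elliptical hot spots and other grid shapes), and the function name and parameters are hypothetical; it does not reproduce ELIPGRID or ELIPGRID-PC.

```python
import math
import random

def detection_probability(radius, spacing, n_trials=200_000, seed=1):
    """Monte Carlo estimate of the chance that a square sampling grid
    hits a circular hot spot dropped at a uniformly random position."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # By symmetry, place the hot-spot centre uniformly in one grid cell.
        x = rng.uniform(0.0, spacing)
        y = rng.uniform(0.0, spacing)
        # Offset from the centre to the nearest grid node.
        dx = x - spacing * round(x / spacing)
        dy = y - spacing * round(y / spacing)
        # The hot spot is detected if that node lies inside the circle.
        if math.hypot(dx, dy) <= radius:
            hits += 1
    return hits / n_trials

estimate = detection_probability(radius=0.4, spacing=1.0)
# Analytic value for this simplified case; valid only while radius <= spacing / 2.
exact = math.pi * 0.4 ** 2 / 1.0 ** 2
print(f"Monte Carlo: {estimate:.4f}, analytic: {exact:.4f}")
```

    For a circle no wider than one grid cell the detection probability equals the hot-spot area divided by the cell area, which gives an independent value to compare the simulation against.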

  14. A Monte Carlo Approach for Adaptive Testing with Content Constraints

    ERIC Educational Resources Information Center

    Belov, Dmitry I.; Armstrong, Ronald D.; Weissman, Alexander

    2008-01-01

    This article presents a new algorithm for computerized adaptive testing (CAT) when content constraints are present. The algorithm is based on shadow CAT methodology to meet content constraints but applies Monte Carlo methods and provides the following advantages over shadow CAT: (a) lower maximum item exposure rates, (b) higher utilization of the…

  15. Optimization of the Monte Carlo code for modeling of photon migration in tissue.

    PubMed

    Zołek, Norbert S; Liebert, Adam; Maniewski, Roman

    2006-10-01

    The Monte Carlo method is frequently used to simulate light transport in turbid media because of its simplicity and flexibility, allowing to analyze complicated geometrical structures. Monte Carlo simulations are, however, time consuming because of the necessity to track the paths of individual photons. The time consuming computation is mainly associated with the calculation of the logarithmic and trigonometric functions as well as the generation of pseudo-random numbers. In this paper, the Monte Carlo algorithm was developed and optimized, by approximation of the logarithmic and trigonometric functions. The approximations were based on polynomial and rational functions, and the errors of these approximations are less than 1% of the values of the original functions. The proposed algorithm was verified by simulations of the time-resolved reflectance at several source-detector separations. The results of the calculation using the approximated algorithm were compared with those of the Monte Carlo simulations obtained with an exact computation of the logarithm and trigonometric functions as well as with the solution of the diffusion equation. The errors of the moments of the simulated distributions of times of flight of photons (total number of photons, mean time of flight and variance) are less than 2% for a range of optical properties, typical of living tissues. The proposed approximated algorithm allows to speed up the Monte Carlo simulations by a factor of 4. The developed code can be used on parallel machines, allowing for further acceleration.

  16. Arbitrage and Volatility in Chinese Stock's Markets

    NASA Astrophysics Data System (ADS)

    Lu, Shu Quan; Ito, Takao; Zhang, Jianbo

    From the point of view of no-arbitrage pricing, what matters is how much volatility the stock has, for volatility measures the amount of profit that can be made from shorting stocks and purchasing options. With short-sales constraints or in the absence of options, however, high volatility is likely to mean arbitrage from the stock market. As China's stock markets are emerging markets, investors are increasingly concerned about the volatilities of the two Chinese stock markets. We estimate volatility models for Chinese stock market indexes using the Markov chain Monte Carlo (MCMC) method and GARCH. We find that the estimated values of the volatility parameters are very high for all data frequencies. This suggests that stock returns are extremely volatile even at long-term intervals in Chinese markets. Furthermore, this result suggests that there may be arbitrage opportunities in Chinese stock markets.

  17. A High-Order Low-Order Algorithm with Exponentially Convergent Monte Carlo for Thermal Radiative Transfer

    DOE PAGES

    Bolding, Simon R.; Cleveland, Mathew Allen; Morel, Jim E.

    2016-10-21

    In this paper, we have implemented a new high-order low-order (HOLO) algorithm for solving thermal radiative transfer problems. The low-order (LO) system is based on the spatial and angular moments of the transport equation and a linear-discontinuous finite-element spatial representation, producing equations similar to the standard S₂ equations. The LO solver is fully implicit in time and efficiently resolves the nonlinear temperature dependence at each time step. The high-order (HO) solver utilizes exponentially convergent Monte Carlo (ECMC) to give a globally accurate solution for the angular intensity to a fixed-source pure-absorber transport problem. This global solution is used to compute consistency terms, which require the HO and LO solutions to converge toward the same solution. The use of ECMC allows for the efficient reduction of statistical noise in the Monte Carlo solution, reducing inaccuracies introduced through the LO consistency terms. Finally, we compare results with an implicit Monte Carlo code for one-dimensional gray test problems and demonstrate the efficiency of ECMC over standard Monte Carlo in this HOLO algorithm.

  18. Percentage depth dose evaluation in heterogeneous media using thermoluminescent dosimetry

    PubMed Central

    da Rosa, L.A.R.; Campos, L.T.; Alves, V.G.L.; Batista, D.V.S.; Facure, A.

    2010-01-01

    The purpose of this study is to investigate the influence of lung heterogeneity inside a soft tissue phantom on percentage depth dose (PDD). PDD curves were obtained experimentally using LiF:Mg,Ti (TLD‐100) thermoluminescent detectors and applying Eclipse treatment planning system algorithms Batho, modified Batho (M‐Batho or BMod), equivalent TAR (E‐TAR or EQTAR), and anisotropic analytical algorithm (AAA) for a 15 MV photon beam and field sizes of 1×1, 2×2, 5×5, and 10×10 cm². Monte Carlo simulations were performed using the DOSRZnrc user code of EGSnrc. The experimental results agree with Monte Carlo simulations for all irradiation field sizes. Comparisons with Monte Carlo calculations show that the AAA algorithm provides the best simulations of PDD curves for all field sizes investigated. However, even this algorithm cannot accurately predict PDD values in the lung for field sizes of 1×1 and 2×2 cm². An overdosage in the lung of about 40% and 20% is calculated by the AAA algorithm close to the interface soft tissue/lung for 1×1 and 2×2 cm² field sizes, respectively. It was demonstrated that differences of 100% between Monte Carlo results and the algorithms Batho, modified Batho, and equivalent TAR responses may exist inside the lung region for the 1×1 cm² field. PACS number: 87.55.kd

  19. Self-learning Monte Carlo method

    DOE PAGES

    Liu, Junwei; Qi, Yang; Meng, Zi Yang; ...

    2017-01-04

    Monte Carlo simulation is an unbiased numerical tool for studying classical and quantum many-body systems. One of its bottlenecks is the lack of a general and efficient update algorithm for large size systems close to the phase transition, for which local updates perform badly. In this Rapid Communication, we propose a general-purpose Monte Carlo method, dubbed self-learning Monte Carlo (SLMC), in which an efficient update algorithm is first learned from the training data generated in trial simulations and then used to speed up the actual simulation. Lastly, we demonstrate the efficiency of SLMC in a spin model at the phase transition point, achieving a 10–20 times speedup.

  20. Random Numbers and Monte Carlo Methods

    NASA Astrophysics Data System (ADS)

    Scherer, Philipp O. J.

    Many-body problems often involve the calculation of integrals of very high dimension which cannot be treated by standard methods. For the calculation of thermodynamic averages Monte Carlo methods are very useful which sample the integration volume at randomly chosen points. After summarizing some basic statistics, we discuss algorithms for the generation of pseudo-random numbers with given probability distribution which are essential for all Monte Carlo methods. We show how the efficiency of Monte Carlo integration can be improved by sampling preferentially the important configurations. Finally the famous Metropolis algorithm is applied to classical many-particle systems. Computer experiments visualize the central limit theorem and apply the Metropolis method to the traveling salesman problem.
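
    The Metropolis algorithm mentioned in this abstract can be illustrated with a minimal sketch. The code below is not taken from the cited chapter; the function name `metropolis`, the uniform random-walk proposal, and the one-dimensional target are assumptions chosen only to show the accept/reject rule.

```python
import math
import random


def metropolis(log_density, x0, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density: function returning the log of the (unnormalized) target.
    Returns the list of visited states and the acceptance rate.
    """
    x = x0
    logp = log_density(x)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Symmetric proposal: uniform step around the current state.
        proposal = x + rng.uniform(-step_size, step_size)
        logp_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old).
        # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
            accepted += 1
        # A rejected move repeats the current state in the chain.
        samples.append(x)
    return samples, accepted / n_steps
```

    Calling `metropolis(lambda x: -0.5 * x * x, 0.0, 50000, 2.5, random.Random(0))` draws from a standard normal target; the sample mean and variance should then be close to 0 and 1, within Monte Carlo error.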

  1. Monte Carlo uncertainty analysis of dose estimates in radiochromic film dosimetry with single-channel and multichannel algorithms.

    PubMed

    Vera-Sánchez, Juan Antonio; Ruiz-Morales, Carmen; González-López, Antonio

    2018-03-01

    To provide a multi-stage model to calculate uncertainty in radiochromic film dosimetry with Monte-Carlo techniques. This new approach is applied to single-channel and multichannel algorithms. Two lots of Gafchromic EBT3 are exposed in two different Varian linacs. They are read with an EPSON V800 flatbed scanner. The Monte-Carlo techniques in uncertainty analysis provide a numerical representation of the probability density functions of the output magnitudes. From this numerical representation, traditional parameters of uncertainty analysis, such as the standard deviations and bias, are calculated. Moreover, these numerical representations are used to investigate the shape of the probability density functions of the output magnitudes. Also, another calibration film is read in four EPSON scanners (two V800 and two 10000XL) and the uncertainty analysis is carried out with the four images. The dose estimates of single-channel and multichannel algorithms show a Gaussian behavior and low bias. The multichannel algorithms lead to less uncertainty in the final dose estimates when the EPSON V800 is employed as reading device. In the case of the EPSON 10000XL, the single-channel algorithms provide less uncertainty in the dose estimates for doses higher than 4 Gy. A multi-stage model has been presented. With the aid of this model and the use of the Monte-Carlo techniques, the uncertainty of dose estimates for single-channel and multichannel algorithms is estimated. The application of the model together with Monte-Carlo techniques leads to a complete characterization of the uncertainties in radiochromic film dosimetry. Copyright © 2018 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  2. Log-Concavity and Strong Log-Concavity: a review

    PubMed Central

    Saumard, Adrien; Wellner, Jon A.

    2016-01-01

    We review and formulate results concerning log-concavity and strong-log-concavity in both discrete and continuous settings. We show how preservation of log-concavity and strongly log-concavity on ℝ under convolution follows from a fundamental monotonicity result of Efron (1969). We provide a new proof of Efron's theorem using the recent asymmetric Brascamp-Lieb inequality due to Otto and Menz (2013). Along the way we review connections between log-concavity and other areas of mathematics and statistics, including concentration of measure, log-Sobolev inequalities, convex geometry, MCMC algorithms, Laplace approximations, and machine learning. PMID:27134693

  3. Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bates, Cameron Russell; Mckigney, Edward Allen

    The use of Bayesian inference in data analysis has become the standard for large scientific experiments [1, 2]. The Monte Carlo Codes Group (XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform flat-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on specific application requirements. This document describes the algorithmic choices made and presents two use cases.

  4. Bayesian inversion of surface-wave data for radial and azimuthal shear-wave anisotropy, with applications to central Mongolia and west-central Italy

    NASA Astrophysics Data System (ADS)

    Ravenna, Matteo; Lebedev, Sergei

    2018-04-01

    Seismic anisotropy provides important information on the deformation history of the Earth's interior. Rayleigh and Love surface-waves are sensitive to and can be used to determine both radial and azimuthal shear-wave anisotropies at depth, but parameter trade-offs give rise to substantial model non-uniqueness. Here, we explore the trade-offs between isotropic and anisotropic structure parameters and present a suite of methods for the inversion of surface-wave, phase-velocity curves for radial and azimuthal anisotropies. One Markov chain Monte Carlo (McMC) implementation inverts Rayleigh and Love dispersion curves for a radially anisotropic shear velocity profile of the crust and upper mantle. Another McMC implementation inverts Rayleigh phase velocities and their azimuthal anisotropy for profiles of vertically polarized shear velocity and its depth-dependent azimuthal anisotropy. The azimuthal anisotropy inversion is fully non-linear, with the forward problem solved numerically at different azimuths for every model realization, which ensures that any linearization biases are avoided. The computations are performed in parallel, in order to reduce the computing time. The often challenging issue of data noise estimation is addressed by means of a Hierarchical Bayesian approach, with the variance of the noise treated as an unknown during the radial anisotropy inversion. In addition to the McMC inversions, we also present faster, non-linear gradient-search inversions for the same anisotropic structure. The results of the two approaches are mutually consistent; the advantage of the McMC inversions is that they provide a measure of uncertainty of the models. Applying the method to broad-band data from the Baikal-central Mongolia region, we determine radial anisotropy from the crust down to the transition-zone depths. 
    Robust negative anisotropy (Vsh < Vsv) in the asthenosphere, at 100-300 km depths, presents strong new evidence for a vertical component of asthenospheric flow. This is consistent with an upward flow from below the thick lithosphere of the Siberian Craton to below the thinner lithosphere of central Mongolia, likely to give rise to decompression melting and the scattered, sporadic volcanism observed in the Baikal Rift area, as proposed previously. Inversion of phase-velocity data from west-central Italy for azimuthal anisotropy reveals a clear change in the shear-wave fast-propagation direction at 70-100 km depths, near the lithosphere-asthenosphere boundary. The orientation of the fabric in the lithosphere is roughly E-W, parallel to the direction of stretching over the last 10 m.y. The orientation of the fabric in the asthenosphere is NW-SE, matching the fast directions inferred from shear-wave splitting and probably indicating the direction of the asthenospheric flow.

  5. A histogram-free multicanonical Monte Carlo algorithm for the construction of analytical density of states

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eisenbach, Markus; Li, Ying Wai

    We report a new multicanonical Monte Carlo (MC) algorithm to obtain the density of states (DOS) for physical systems with continuous state variables in statistical mechanics. Our algorithm is able to obtain an analytical form for the DOS expressed in a chosen basis set, instead of a numerical array of finite resolution as in previous variants of this class of MC methods such as the multicanonical (MUCA) sampling and Wang-Landau (WL) sampling. This is enabled by storing the visited states directly in a data set and avoiding the explicit collection of a histogram. This practice also has the advantage of avoiding undesirable artificial errors caused by the discretization and binning of continuous state variables. Our results show that this scheme is capable of obtaining converged results with a much reduced number of Monte Carlo steps, leading to a significant speedup over existing algorithms.

  6. Project INTEGRATE: An integrative study of brief alcohol interventions for college students.

    PubMed

    Mun, Eun-Young; de la Torre, Jimmy; Atkins, David C; White, Helene R; Ray, Anne E; Kim, Su-Young; Jiao, Yang; Clarke, Nickeisha; Huo, Yan; Larimer, Mary E; Huh, David

    2015-03-01

    This article provides an overview of a study that synthesizes multiple, independently collected alcohol intervention studies for college students into a single, multisite longitudinal data set. This research embraced innovative analytic strategies (i.e., integrative data analysis or meta-analysis using individual participant-level data), with the overall goal of answering research questions that are difficult to address in individual studies such as moderation analysis, while providing a built-in replication for the reported efficacy of brief motivational interventions for college students. Data were pooled across 24 intervention studies, of which 21 included a comparison or control condition and all included one or more treatment conditions. This yielded a sample of 12,630 participants (42% men; 58% first-year or incoming students). The majority of the sample identified as White (74%), with 12% Asian, 7% Hispanic, 2% Black, and 5% other/mixed ethnic groups. Participants were assessed 2 or more times from baseline up to 12 months, with varying assessment schedules across studies. This article describes how we combined individual participant-level data from multiple studies, and discusses the steps taken to develop commensurate measures across studies via harmonization and newly developed Markov chain Monte Carlo (MCMC) algorithms for 2-parameter logistic item response theory models and a generalized partial credit model. This innovative approach has intriguing promises, but significant barriers exist. To lower the barriers, there is a need to increase overlap in measures and timing of follow-up assessments across studies, better define treatment and control groups, and improve transparency and documentation in future single intervention studies. (c) 2015 APA, all rights reserved.

  7. How Much Can We Learn from a Single Chromatographic Experiment? A Bayesian Perspective.

    PubMed

    Wiczling, Paweł; Kaliszan, Roman

    2016-01-05

    In this work, we proposed and investigated a Bayesian inference procedure to find the desired chromatographic conditions based on known analyte properties (lipophilicity, pKa, and polar surface area) using one preliminary experiment. A previously developed nonlinear mixed effect model was used to specify the prior information about a new analyte with known physicochemical properties. Further, the prior (no preliminary data) and posterior predictive distribution (prior + one experiment) were determined sequentially to search towards the desired separation. The following isocratic high-performance reversed-phase liquid chromatographic conditions were sought: (1) retention time of a single analyte within the range of 4-6 min and (2) baseline separation of two analytes with retention times within the range of 4-10 min. The empirical posterior Bayesian distribution of parameters was estimated using the "slice sampling" Markov Chain Monte Carlo (MCMC) algorithm implemented in Matlab. The simulations with artificial analytes and experimental data of ketoprofen and papaverine were used to test the proposed methodology. The simulation experiment showed that for a single and two randomly selected analytes, there is 97% and 74% probability of obtaining a successful chromatogram using none or one preliminary experiment. The desired separation for ketoprofen and papaverine was established based on a single experiment. It was confirmed that the search for a desired separation rarely requires a large number of chromatographic analyses at least for a simple optimization problem. The proposed Bayesian-based optimization scheme is a powerful method of finding a desired chromatographic separation based on a small number of preliminary experiments.
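
    The abstract above relies on slice sampling as its MCMC method. As a rough illustration only, here is a Python sketch of a univariate slice sampler with stepping-out and shrinkage. It is not the authors' Matlab implementation; the function names, the initial `width` parameter, and the Gamma-shaped toy target are all hypothetical.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width, rng):
    """Univariate slice sampler (stepping-out, then shrinkage)."""
    x = x0
    samples = []
    for _ in range(n_samples):
        # Draw the slice level: log(f(x) * U) with U uniform on (0, 1].
        log_y = log_density(x) - rng.expovariate(1.0)
        # Stepping out: place an interval of size `width` randomly around x,
        # then extend each end until it lies outside the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Shrinkage: sample uniformly from the interval and shrink it
        # towards x whenever the candidate falls outside the slice.
        while True:
            candidate = rng.uniform(left, right)
            if log_density(candidate) > log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        samples.append(x)
    return samples


def log_gamma3(x):
    """Toy target: unnormalized log density of a Gamma(shape=3, scale=1)."""
    if x <= 0.0:
        return -math.inf
    return 2.0 * math.log(x) - x
```

    With `slice_sample(log_gamma3, 1.0, 20000, 2.0, random.Random(1))`, the sample mean should approach the Gamma(3, 1) mean of 3. The stepping-out loops here have no step limit, which assumes a target whose density decays in both tails.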

  8. MCMC Signal Extraction for 21-cm Global Signal Experiments

    NASA Astrophysics Data System (ADS)

    Harker, Geraint

    2012-05-01

    Measurements of the highly redshifted 21-cm line promise to provide a great deal of information about the dark ages of the Universe, the cosmic dawn and the epoch of reionization. It is generally accepted that strong astrophysical foregrounds are a major obstacle to overcome before this promise is realised, largely because of the way they are filtered through a complicated instrumental response. A great deal of work has therefore been devoted to studying foreground removal for observations with the low-frequency radio arrays which are starting to collect data. The case of so-called 'global signal' experiments has received less attention, however. I will compare the foreground fitting problem in these two types of experiments, and describe a foreground fitting methodology which has been developed for a proposed global signal experiment, the Dark Ages Radio Explorer (DARE), which will make use of the pristine radio-frequency environment over the far side of the Moon. The method, a fully Bayesian technique based on a Markov Chain Monte Carlo code will, however, be applicable more generally to other space- and ground-based experiments, including the prototype DARE antenna being deployed in Western Australia. For ground-based experiments, we must also contend with effects from the Earth's ionosphere and low-level radio-frequency interference. I will show early results from applying our algorithm to data from the prototype and the EDGES experiment. GH is a member of the LUNAR consortium, which is funded by the NASA Lunar Science Institute (via Cooperative Agreement NNA09DB30A) to investigate concepts for astrophysical observatories on the Moon.

  9. Estimation of spatially varying heat transfer coefficient from a flat plate with flush mounted heat sources using Bayesian inference

    NASA Astrophysics Data System (ADS)

    Jakkareddy, Pradeep S.; Balaji, C.

    2016-09-01

    This paper employs the Bayesian Metropolis-Hastings Markov chain Monte Carlo (MH-MCMC) algorithm to solve the inverse heat transfer problem of determining the spatially varying heat transfer coefficient from a flat plate with flush-mounted discrete heat sources, using temperatures measured at the bottom of the plate. The Nusselt number is assumed to be of the form Nu = a Re^b (x/l)^c. To input reasonable values of 'a' and 'b' into the inverse problem, limited two-dimensional conjugate convection simulations were first done with Comsol. Based on the guidance from these, different values of 'a' and 'b' are input to a computationally less complex problem of conjugate conduction in the flat plate (15 mm thickness), and temperature distributions are obtained at the bottom of the plate, which is a more convenient location for measuring temperatures without disturbing the flow. Since the goal of this work is to demonstrate the efficacy of the Bayesian approach to accurately retrieve 'a' and 'b', numerically generated temperatures with known values of 'a' and 'b' are treated as 'surrogate' experimental data. The inverse problem is then solved by repeatedly using the forward solutions together with the MH-MCMC approach. To speed up the estimation, the forward model is replaced by an artificial neural network. The mean, maximum a posteriori, and standard deviation of the estimated parameters 'a' and 'b' are reported. The robustness of the proposed method is examined by synthetically adding noise to the temperatures.

  10. MultiNest: Efficient and Robust Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Feroz, F.; Hobson, M. P.; Bridges, M.

    2011-09-01

    We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm are demonstrated by application to two toy problems and to a cosmological inference problem focusing on the extension of the vanilla LambdaCDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software is fully parallelized using MPI and includes an interface to CosmoMC. It will also be released as part of the SuperBayeS package, for the analysis of supersymmetric theories of particle physics.

  11. Learning an Eddy Viscosity Model Using Shrinkage and Bayesian Calibration: A Jet-in-Crossflow Case Study

    DOE PAGES

    Ray, Jaideep; Lefantzi, Sophia; Arunajatesan, Srinivasan; ...

    2017-09-07

    In this paper, we demonstrate a statistical procedure for learning a high-order eddy viscosity model (EVM) from experimental data and using it to improve the predictive skill of a Reynolds-averaged Navier–Stokes (RANS) simulator. The method is tested in a three-dimensional (3D), transonic jet-in-crossflow (JIC) configuration. The process starts with a cubic eddy viscosity model (CEVM) developed for incompressible flows. It is fitted to limited experimental JIC data using shrinkage regression. The shrinkage process removes all the terms from the model, except an intercept, a linear term, and a quadratic one involving the square of the vorticity. The shrunk eddy viscosity model is implemented in a RANS simulator and calibrated, using vorticity measurements, to infer three parameters. The calibration is Bayesian and is solved using a Markov chain Monte Carlo (MCMC) method. A 3D probability density distribution for the inferred parameters is constructed, thus quantifying the uncertainty in the estimate. The phenomenal cost of using a 3D flow simulator inside an MCMC loop is mitigated by using surrogate models (“curve-fits”). A support vector machine classifier (SVMC) is used to impose our prior belief regarding parameter values, specifically to exclude nonphysical parameter combinations. The calibrated model is compared, in terms of its predictive skill, to simulations using uncalibrated linear and CEVMs. Finally, we find that the calibrated model, with one quadratic term, is more accurate than the uncalibrated simulator. The model is also checked at a flow condition at which the model was not calibrated.

  12. Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.

    PubMed

    dos Reis, Mario; Yang, Ziheng

    2011-07-01

    The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
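
    The idea of replacing the exact likelihood with a Taylor expansion around its peak, and of improving that expansion with a parameter transform, can be shown on a toy problem. The sketch below is not the phylogenetic likelihood from the paper: it assumes a hypothetical exponential-rate model with `n` observations summing to `total`, and compares a second-order expansion in the rate with one in the log of the rate.

```python
import math


def exact_loglik(rate, n, total):
    """Exact log-likelihood of an exponential rate: n*log(rate) - rate*total."""
    return n * math.log(rate) - rate * total


def taylor_in_rate(rate, n, total):
    """Second-order Taylor expansion around the MLE, in the rate itself."""
    mle = n / total
    curvature = -n / mle ** 2  # second derivative of the log-likelihood at the MLE
    return exact_loglik(mle, n, total) + 0.5 * curvature * (rate - mle) ** 2


def taylor_in_log_rate(rate, n, total):
    """Second-order Taylor expansion around the MLE, in u = log(rate)."""
    mle = n / total
    curvature = -float(n)  # d2/du2 of (n*u - total*exp(u)) at u = log(mle)
    du = math.log(rate) - math.log(mle)
    return exact_loglik(mle, n, total) + 0.5 * curvature * du ** 2
```

    In this toy case the first-order term vanishes because the expansion point is the likelihood peak. Evaluating all three functions away from the peak (for example at half or twice the maximum-likelihood rate) shows the log-transformed expansion tracking the exact curve more closely, which mirrors the role the abstract assigns to parameter transforms.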

  13. Learning an Eddy Viscosity Model Using Shrinkage and Bayesian Calibration: A Jet-in-Crossflow Case Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ray, Jaideep; Lefantzi, Sophia; Arunajatesan, Srinivasan

    In this paper, we demonstrate a statistical procedure for learning a high-order eddy viscosity model (EVM) from experimental data and using it to improve the predictive skill of a Reynolds-averaged Navier–Stokes (RANS) simulator. The method is tested in a three-dimensional (3D), transonic jet-in-crossflow (JIC) configuration. The process starts with a cubic eddy viscosity model (CEVM) developed for incompressible flows. It is fitted to limited experimental JIC data using shrinkage regression. The shrinkage process removes all the terms from the model, except an intercept, a linear term, and a quadratic one involving the square of the vorticity. The shrunk eddy viscosity model is implemented in a RANS simulator and calibrated, using vorticity measurements, to infer three parameters. The calibration is Bayesian and is solved using a Markov chain Monte Carlo (MCMC) method. A 3D probability density distribution for the inferred parameters is constructed, thus quantifying the uncertainty in the estimate. The phenomenal cost of using a 3D flow simulator inside an MCMC loop is mitigated by using surrogate models (“curve-fits”). A support vector machine classifier (SVMC) is used to impose our prior belief regarding parameter values, specifically to exclude nonphysical parameter combinations. The calibrated model is compared, in terms of its predictive skill, to simulations using uncalibrated linear and CEVMs. Finally, we find that the calibrated model, with one quadratic term, is more accurate than the uncalibrated simulator. The model is also checked at a flow condition at which the model was not calibrated.

  14. Quantitative identification of nitrate pollution sources and uncertainty analysis based on dual isotope approach in an agricultural watershed.

    PubMed

    Ji, Xiaoliang; Xie, Runting; Hao, Yun; Lu, Jun

    2017-10-01

    Quantitative identification of nitrate (NO₃⁻-N) sources is critical to the control of nonpoint source nitrogen pollution in an agricultural watershed. Combined with water quality monitoring, we adopted the environmental isotope (δD-H₂O, δ¹⁸O-H₂O, δ¹⁵N-NO₃⁻, and δ¹⁸O-NO₃⁻) analysis and the Markov Chain Monte Carlo (MCMC) mixing model to determine the proportions of riverine NO₃⁻-N inputs from four potential NO₃⁻-N sources, namely, atmospheric deposition (AD), chemical nitrogen fertilizer (NF), soil nitrogen (SN), and manure and sewage (M&S), in the ChangLe River watershed of eastern China. Results showed that NO₃⁻-N was the main form of nitrogen in this watershed, accounting for approximately 74% of the total nitrogen concentration. A strong hydraulic interaction existed between the surface and groundwater for NO₃⁻-N pollution. The variations of the isotopic composition in NO₃⁻-N suggested that microbial nitrification was the dominant nitrogen transformation process in surface water, whereas significant denitrification was observed in groundwater. MCMC mixing model outputs revealed that M&S was the predominant contributor to riverine NO₃⁻-N pollution (contributing 41.8% on average), followed by SN (34.0%), NF (21.9%), and AD (2.3%) sources. Finally, we constructed an uncertainty index, UI₉₀, to quantitatively characterize the uncertainties inherent in NO₃⁻-N source apportionment and discussed the reasons behind the uncertainties. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Bayesian Regression of Thermodynamic Models of Redox Active Materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnston, Katherine

    Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating effectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter fitting the model to the data, comparing the model using different numbers of parameters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by water splitting and the other of a material for thermal storage. Using Bayesian inference and Markov chain Monte Carlo (MCMC), parameter estimation was performed on the three data sets. Good results were achieved, except that there were some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with different numbers of parameters. It was believed that at least one of the parameters was unnecessary, and comparing evidence values demonstrated that the parameter was needed on one data set and not significantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another, and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quantification Toolkit (UQTk).

  16. Comprehensive cosmographic analysis by Markov chain method

    NASA Astrophysics Data System (ADS)

    Capozziello, S.; Lazkoz, R.; Salzano, V.

    2011-12-01

    We study the possibility of extracting model independent information about the dynamics of the Universe by using cosmography. We intend to explore it systematically, to learn about its limitations and its real possibilities. Here we are sticking to the series expansion approach on which cosmography is based. We apply it to different data sets: Supernovae type Ia (SNeIa), Hubble parameter extracted from differential galaxy ages, gamma ray bursts, and the baryon acoustic oscillations data. We go beyond past results in the literature extending the series expansion up to the fourth order in the scale factor, which implies the analysis of the deceleration q0, the jerk j0, and the snap s0. We use the Markov chain Monte Carlo method (MCMC) to analyze the data statistically. We also try to relate direct results from cosmography to dark energy (DE) dynamical models parametrized by the Chevallier-Polarski-Linder model, extracting clues about the matter content and the dark energy parameters. The main results are: (a) even if relying on a mathematical approximate assumption such as the scale factor series expansion in terms of time, cosmography can be extremely useful in assessing dynamical properties of the Universe; (b) the deceleration parameter clearly confirms the present acceleration phase; (c) the MCMC method can help giving narrower constraints in parameter estimation, in particular for higher order cosmographic parameters (the jerk and the snap), with respect to the literature; and (d) both the estimation of the jerk and the DE parameters reflect the possibility of a deviation from the ΛCDM cosmological model.

  17. Bayesian random-effect model for predicting outcome fraught with heterogeneity--an illustration with episodes of 44 patients with intractable epilepsy.

    PubMed

    Yen, A M-F; Liou, H-H; Lin, H-L; Chen, T H-H

    2006-01-01

    The study aimed to develop a predictive model to deal with data fraught with heterogeneity that cannot be explained by sampling variation or measured covariates. The random-effect Poisson regression model was first proposed to deal with over-dispersion for data fraught with heterogeneity after making allowance for measured covariates. Bayesian acyclic graphic model in conjunction with Markov Chain Monte Carlo (MCMC) technique was then applied to estimate the parameters of both relevant covariates and random effect. Predictive distribution was then generated to compare the predicted with the observed for the Bayesian model with and without random effect. Data from repeated measurement of episodes among 44 patients with intractable epilepsy were used as an illustration. The application of Poisson regression without taking heterogeneity into account to epilepsy data yielded a large value of heterogeneity (heterogeneity factor = 17.90, deviance = 1485, degree of freedom (df) = 83). After taking the random effect into account, the value of heterogeneity factor was greatly reduced (heterogeneity factor = 0.52, deviance = 42.5, df = 81). The Pearson chi2 for the comparison between the expected seizure frequencies and the observed ones at two and three months of the model with and without random effect were 34.27 (p = 1.00) and 1799.90 (p < 0.0001), respectively. The Bayesian acyclic model using the MCMC method was demonstrated to have great potential for disease prediction while data show over-dispersion attributed either to correlated property or to subject-to-subject variability.
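
    The heterogeneity factors quoted in this abstract appear consistent with the usual dispersion estimate, deviance divided by degrees of freedom. That reading is an assumption about how the authors computed them, but the arithmetic matches the reported values:

```latex
\hat{\phi} = \frac{D}{\mathrm{df}}, \qquad
\hat{\phi}_{\text{no random effect}} = \frac{1485}{83} \approx 17.9, \qquad
\hat{\phi}_{\text{random effect}} = \frac{42.5}{81} \approx 0.52
```

    A value of \hat{\phi} far above 1 indicates over-dispersion relative to the Poisson assumption, which is what the random effect is introduced to absorb.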

  18. Bayesian hierarchical modelling of continuous non-negative longitudinal data with a spike at zero: An application to a study of birds visiting gardens in winter.

    PubMed

    Swallow, Ben; Buckland, Stephen T; King, Ruth; Toms, Mike P

    2016-03-01

    The development of methods for dealing with continuous data with a spike at zero has lagged behind those for overdispersed or zero-inflated count data. We consider longitudinal ecological data corresponding to an annual average of 26 weekly maximum counts of birds, and are hence effectively continuous, bounded below by zero but also with a discrete mass at zero. We develop a Bayesian hierarchical Tweedie regression model that can directly accommodate the excess number of zeros common to this type of data, whilst accounting for both spatial and temporal correlation. Implementation of the model is conducted in a Markov chain Monte Carlo (MCMC) framework, using reversible jump MCMC to explore uncertainty across both parameter and model spaces. This regression modelling framework is very flexible and removes the need to make strong assumptions about mean-variance relationships a priori. It can also directly account for the spike at zero, whilst being easily applicable to other types of data and other model formulations. Whilst a correlative study such as this cannot prove causation, our results suggest that an increase in an avian predator may have led to an overall decrease in the number of one of its prey species visiting garden feeding stations in the United Kingdom. This may reflect a change in behaviour of house sparrows to avoid feeding stations frequented by sparrowhawks, or a reduction in house sparrow population size as a result of sparrowhawk increase. © 2015 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Stochastic Simulation and Forecast of Hydrologic Time Series Based on Probabilistic Chaos Expansion

    NASA Astrophysics Data System (ADS)

    Li, Z.; Ghaith, M.

    2017-12-01

    Hydrological processes are characterized by many complex features, such as nonlinearity, dynamics and uncertainty. How to quantify and address such complexities and uncertainties has been a challenging task for water engineers and managers for decades. To support robust uncertainty analysis, an innovative approach for the stochastic simulation and forecast of hydrologic time series is developed in this study. Probabilistic Chaos Expansions (PCEs) are established through probabilistic collocation to tackle uncertainties associated with the parameters of traditional hydrological models. The uncertainties are quantified in model outputs as Hermite polynomials with regard to standard normal random variables. Sequentially, multivariate analysis techniques are used to analyze the complex nonlinear relationships between meteorological inputs (e.g., temperature, precipitation, evapotranspiration) and the coefficients of the Hermite polynomials. With the established relationships between model inputs and PCE coefficients, forecasts of hydrologic time series can be generated and the uncertainties in the future time series can be further tackled. The proposed approach is demonstrated using a case study in China and is compared to a traditional stochastic simulation technique, the Markov-Chain Monte-Carlo (MCMC) method. Results show that the proposed approach can serve as a reliable proxy to complicated hydrological models. It can provide probabilistic forecasting in a more computationally efficient manner than the traditional MCMC method. This work provides technical support for addressing uncertainties associated with hydrological modeling and for enhancing the reliability of hydrological modeling results. Applications of the developed approach can be extended to many other complicated geophysical and environmental modeling systems to support the associated uncertainty quantification and risk analysis.

  20. Evaluating impacts using a BACI design, ratios, and a Bayesian approach with a focus on restoration.

    PubMed

    Conner, Mary M; Saunders, W Carl; Bouwes, Nicolaas; Jordan, Chris

    2015-10-01

    Before-after-control-impact (BACI) designs are an effective method to evaluate natural and human-induced perturbations on ecological variables when treatment sites cannot be randomly chosen. While effect sizes of interest can be tested with frequentist methods, using Bayesian Markov chain Monte Carlo (MCMC) sampling methods, probabilities of effect sizes, such as a ≥20% increase in density after restoration, can be directly estimated. Although BACI and Bayesian methods are used widely for assessing natural and human-induced impacts for field experiments, the application of hierarchical Bayesian modeling with MCMC sampling to BACI designs is less common. Here, we combine these approaches and extend the typical presentation of results with an easy-to-interpret ratio, which provides an answer to the main study question: "How much impact did a management action or natural perturbation have?" As an example of this approach, we evaluate the impact of a restoration project, which implemented beaver dam analogs, on survival and density of juvenile steelhead. Results indicated the probabilities of a ≥30% increase were high for survival and density after the dams were installed, 0.88 and 0.99, respectively, while probabilities for a higher increase of ≥50% were variable, 0.17 and 0.82, respectively. This approach demonstrates a useful extension of Bayesian methods that can easily be generalized to other study designs from simple (e.g., single factor ANOVA, paired t test) to more complicated block designs (e.g., crossover, split-plot). This approach is valuable for estimating the probabilities of restoration impacts or other management actions.
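The idea of reading effect-size probabilities directly off MCMC output can be sketched as follows. This is a minimal illustration, not the authors' analysis: the log-normal draws stand in for posterior samples of an after/before ratio, and the name `prob_increase_at_least` is hypothetical.

```python
import random

def prob_increase_at_least(ratio_draws, threshold_pct):
    """Fraction of posterior draws implying an increase of at least threshold_pct percent."""
    cutoff = 1.0 + threshold_pct / 100.0
    return sum(r >= cutoff for r in ratio_draws) / len(ratio_draws)

# Hypothetical stand-in for MCMC output: draws of an after/before ratio.
rng = random.Random(0)
ratio_draws = [rng.lognormvariate(0.4, 0.15) for _ in range(20000)]

p30 = prob_increase_at_least(ratio_draws, 30)  # P(increase >= 30 %)
p50 = prob_increase_at_least(ratio_draws, 50)  # P(increase >= 50 %)
```

Because each probability is just a proportion of draws above a cutoff, any threshold of management interest can be evaluated from the same set of samples.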

  1. Model averaging in linkage analysis.

    PubMed

    Matthysse, Steven

    2006-06-05

    Methods for genetic linkage analysis are traditionally divided into "model-dependent" and "model-independent," but there may be a useful place for an intermediate class, in which a broad range of possible models is considered as a parametric family. It is possible to average over model space with an empirical Bayes prior that weights models according to their goodness of fit to epidemiologic data, such as the frequency of the disease in the population and in first-degree relatives (and correlations with other traits in the pleiotropic case). For averaging over high-dimensional spaces, Markov chain Monte Carlo (MCMC) has great appeal, but it has a near-fatal flaw: it is not possible, in most cases, to provide rigorous sufficient conditions to permit the user safely to conclude that the chain has converged. A way of overcoming the convergence problem, if not of solving it, rests on a simple application of the principle of detailed balance. If the starting point of the chain has the equilibrium distribution, so will every subsequent point. The first point is chosen according to the target distribution by rejection sampling, and subsequent points by an MCMC process that has the target distribution as its equilibrium distribution. Model averaging with an empirical Bayes prior requires rapid estimation of likelihoods at many points in parameter space. Symbolic polynomials are constructed before the random walk over parameter space begins, to make the actual likelihood computations at each step of the random walk very fast. Power analysis in an illustrative case is described. (c) 2006 Wiley-Liss, Inc.
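The detailed-balance argument above (an exact first draw, then ordinary MCMC) can be sketched on a toy problem. The target density, the bound `BOUND`, and all function names below are assumptions chosen for illustration; they are not taken from the paper.

```python
import random

def target(x):
    """Unnormalised toy target on [0, 1] with a Beta(2, 5) shape (illustrative assumption)."""
    return x * (1.0 - x) ** 4 if 0.0 <= x <= 1.0 else 0.0

BOUND = 0.082  # upper bound on target(x); its maximum is 0.2 * 0.8**4 = 0.08192

def rejection_draw(rng):
    """Draw one exact sample from the target using a uniform proposal on [0, 1]."""
    while True:
        x = rng.random()
        if rng.random() * BOUND < target(x):
            return x

def metropolis_step(x, rng, step=0.2):
    """One random-walk Metropolis step that leaves the target invariant."""
    proposal = x + rng.uniform(-step, step)
    if rng.random() * target(x) < target(proposal):
        return proposal
    return x

def chain_started_at_equilibrium(n_steps, rng):
    """First point by rejection sampling, later points by Metropolis updates."""
    x = rejection_draw(rng)
    samples = [x]
    for _ in range(n_steps):
        x = metropolis_step(x, rng)
        samples.append(x)
    return samples

rng = random.Random(1)
samples = chain_started_at_equilibrium(5000, rng)
```

Since the first point already follows the target distribution, every later point does too, so no burn-in period is discarded in this sketch.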

  2. Forward and inverse uncertainty quantification using multilevel Monte Carlo algorithms for an elliptic non-local equation

    DOE PAGES

    Jasra, Ajay; Law, Kody J. H.; Zhou, Yan

    2016-01-01

    Our paper considers uncertainty quantification for an elliptic nonlocal equation. In particular, it is assumed that the parameters which define the kernel in the nonlocal operator are uncertain and a priori distributed according to a probability measure. It is shown that the induced probability measure on some quantities of interest arising from functionals of the solution to the equation with random inputs is well-defined, as is the posterior distribution on parameters given observations. As the elliptic nonlocal equation cannot be solved, approximate posteriors are constructed. The multilevel Monte Carlo (MLMC) and multilevel sequential Monte Carlo (MLSMC) sampling algorithms are used for a priori and a posteriori estimation, respectively, of quantities of interest. Furthermore, these algorithms reduce the amount of work to estimate posterior expectations, for a given level of error, relative to Monte Carlo and i.i.d. sampling from the posterior at a given level of approximation of the solution of the elliptic nonlocal equation.

  3. Forward and inverse uncertainty quantification using multilevel Monte Carlo algorithms for an elliptic non-local equation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jasra, Ajay; Law, Kody J. H.; Zhou, Yan

    Our paper considers uncertainty quantification for an elliptic nonlocal equation. In particular, it is assumed that the parameters which define the kernel in the nonlocal operator are uncertain and a priori distributed according to a probability measure. It is shown that the induced probability measure on some quantities of interest arising from functionals of the solution to the equation with random inputs is well-defined, as is the posterior distribution on parameters given observations. As the elliptic nonlocal equation cannot be solved, approximate posteriors are constructed. The multilevel Monte Carlo (MLMC) and multilevel sequential Monte Carlo (MLSMC) sampling algorithms are used for a priori and a posteriori estimation, respectively, of quantities of interest. Furthermore, these algorithms reduce the amount of work to estimate posterior expectations, for a given level of error, relative to Monte Carlo and i.i.d. sampling from the posterior at a given level of approximation of the solution of the elliptic nonlocal equation.

  4. New Multi-objective Uncertainty-based Algorithm for Water Resource Models' Calibration

    NASA Astrophysics Data System (ADS)

    Keshavarz, Kasra; Alizadeh, Hossein

    2017-04-01

    Water resource models are powerful tools for supporting water management decision making and are developed to deal with a broad range of issues, including analysis of land use and climate change impacts, water allocation, systems design and operation, and waste load control and allocation. These models fall into two categories, simulation and optimization models. Their calibration has been addressed extensively in the literature, and efforts in recent decades have led to two main categories of auto-calibration methods: uncertainty-based algorithms such as GLUE, MCMC and PEST, and optimization-based algorithms, including single-objective optimization such as SCE-UA and multi-objective optimization such as MOCOM-UA and MOSCEM-UA. Although algorithms that benefit from the capabilities of both types, such as SUFI-2, have been developed, this paper proposes a new auto-calibration algorithm that is capable both of finding optimal parameter values with respect to multiple objectives, like optimization-based algorithms, and of providing interval estimates of parameters, like uncertainty-based algorithms. The algorithm is developed to improve the quality of SUFI-2 results. Based on a single objective, e.g. NSE or RMSE, SUFI-2 proposes a routine to find the best point and interval estimates of parameters and the corresponding prediction intervals (95PPU) of the time series of interest. To assess the goodness of calibration, final results are presented using two uncertainty measures: the p-factor, which quantifies the percentage of observations covered by the 95PPU, and the r-factor, which quantifies the degree of uncertainty. The analyst has to select the point and interval estimates of parameters that are non-dominated with respect to both uncertainty measures.
    Based on the described properties of SUFI-2, two important questions are raised, and answering them is our research motivation. Given that the final selection in SUFI-2 is based on these two measures or objectives, and that SUFI-2 has no multi-objective optimization mechanism, are the final estimates Pareto-optimal? Can systematic methods be applied to select the final estimates? To address these questions, a new auto-calibration algorithm was proposed in which the uncertainty measures were treated as two objectives in order to find non-dominated interval estimates of parameters by coupling Monte Carlo simulation and Multi-Objective Particle Swarm Optimization. Both the proposed algorithm and SUFI-2 were applied to calibrate the parameters of a water resources planning model of the Helleh river basin, Iran. The model is a comprehensive water quantity-quality model developed in previous research using the WEAP software in order to analyze the impacts of different water resources management strategies, including dam construction, increasing the cultivated area, use of more efficient irrigation technologies, changing crop patterns, etc. Comparing the Pareto frontier produced by the proposed auto-calibration algorithm with the SUFI-2 results revealed that the new algorithm leads to a better and also continuous Pareto frontier, even though it is more computationally expensive. Finally, Nash and Kalai-Smorodinsky bargaining methods were used to choose a compromise interval estimate from the Pareto frontier.

  5. Exact and Monte Carlo resampling procedures for the Wilcoxon-Mann-Whitney and Kruskal-Wallis tests.

    PubMed

    Berry, K J; Mielke, P W

    2000-12-01

    Exact and Monte Carlo resampling FORTRAN programs are described for the Wilcoxon-Mann-Whitney rank sum test and the Kruskal-Wallis one-way analysis of variance for ranks test. The program algorithms compensate for tied values and do not depend on asymptotic approximations for probability values, unlike most algorithms contained in PC-based statistical software packages.
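A Monte Carlo resampling version of the rank sum test with tie handling can be sketched as follows. This is an illustrative Python sketch, not the FORTRAN programs described above; the function names, the midrank treatment of ties, and the two-sided statistic are assumptions.

```python
import random

def midranks(values):
    """Ranks 1..n, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_mc_pvalue(x, y, n_resamples=10000, seed=0):
    """Two-sided Monte Carlo p-value for the rank sum of sample x."""
    rng = random.Random(seed)
    ranks = midranks(list(x) + list(y))
    n_x = len(x)
    expected = n_x * (len(ranks) + 1) / 2.0  # null mean of the rank sum
    observed = abs(sum(ranks[:n_x]) - expected)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(ranks)  # random reassignment of ranks to the two groups
        if abs(sum(ranks[:n_x]) - expected) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_resamples + 1)

p_separated = rank_sum_mc_pvalue([1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12])
```

The p-value comes from the resampled distribution of the statistic itself, so no asymptotic normal approximation is involved; the exact version would enumerate all group assignments instead of sampling them.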

  6. Pattern Recognition for a Flight Dynamics Monte Carlo Simulation

    NASA Technical Reports Server (NTRS)

    Restrepo, Carolina; Hurtado, John E.

    2011-01-01

    The design, analysis, and verification and validation of a spacecraft relies heavily on Monte Carlo simulations. Modern computational techniques are able to generate large amounts of Monte Carlo data but flight dynamics engineers lack the time and resources to analyze it all. The growing amounts of data combined with the diminished available time of engineers motivates the need to automate the analysis process. Pattern recognition algorithms are an innovative way of analyzing flight dynamics data efficiently. They can search large data sets for specific patterns and highlight critical variables so analysts can focus their analysis efforts. This work combines a few tractable pattern recognition algorithms with basic flight dynamics concepts to build a practical analysis tool for Monte Carlo simulations. Current results show that this tool can quickly and automatically identify individual design parameters, and most importantly, specific combinations of parameters that should be avoided in order to prevent specific system failures. The current version uses a kernel density estimation algorithm and a sequential feature selection algorithm combined with a k-nearest neighbor classifier to find and rank important design parameters. This provides an increased level of confidence in the analysis and saves a significant amount of time.

  7. High order methods for the integration of the Bateman equations and other problems of the form of y′ = F(y,t)y

    NASA Astrophysics Data System (ADS)

    Josey, C.; Forget, B.; Smith, K.

    2017-12-01

    This paper introduces two families of A-stable algorithms for the integration of y′ = F(y, t)y: the extended predictor-corrector (EPC) and the exponential-linear (EL) methods. The structure of the algorithm families is described, and the method of derivation of the coefficients presented. The new algorithms are then tested on a simple deterministic problem and a Monte Carlo isotopic evolution problem. The EPC family is shown to be only second order for systems of ODEs. However, the EPC-RK45 algorithm had the highest accuracy on the Monte Carlo test, requiring at least a factor of 2 fewer function evaluations to achieve a given accuracy than a second order predictor-corrector method (center extrapolation / center midpoint method) with regard to Gd-157 concentration. Members of the EL family can be derived to at least fourth order. The EL3 and the EL4 algorithms presented are shown to be third and fourth order respectively on the systems of ODE test. In the Monte Carlo test, these methods did not overtake the accuracy of EPC methods before statistical uncertainty dominated the error. The statistical properties of the algorithms were also analyzed during the Monte Carlo problem. The new methods are shown to yield smaller standard deviations on final quantities as compared to the reference predictor-corrector method, by up to a factor of 1.4.

  8. Exploring nonlinear feature space dimension reduction and data representation in breast Cadx with Laplacian eigenmaps and t-SNE.

    PubMed

    Jamieson, Andrew R; Giger, Maryellen L; Drukker, Karen; Li, Hui; Yuan, Yading; Bhooshan, Neha

    2010-01-01

    In this preliminary study, recently developed unsupervised nonlinear dimension reduction (DR) and data representation techniques were applied to computer-extracted breast lesion feature spaces across three separate imaging modalities: Ultrasound (U.S.) with 1126 cases, dynamic contrast enhanced magnetic resonance imaging with 356 cases, and full-field digital mammography with 245 cases. Two methods for nonlinear DR were explored: Laplacian eigenmaps [M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Comput. 15, 1373-1396 (2003)] and t-distributed stochastic neighbor embedding (t-SNE) [L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res. 9, 2579-2605 (2008)]. These methods attempt to map originally high dimensional feature spaces to more human interpretable lower dimensional spaces while preserving both local and global information. The properties of these methods as applied to breast computer-aided diagnosis (CADx) were evaluated in the context of malignancy classification performance as well as in the visual inspection of the sparseness within the two-dimensional and three-dimensional mappings. Classification performance was estimated by using the reduced dimension mapped feature output as input into both linear and nonlinear classifiers: Markov chain Monte Carlo based Bayesian artificial neural network (MCMC-BANN) and linear discriminant analysis. The new techniques were compared to previously developed breast CADx methodologies, including automatic relevance determination and linear stepwise (LSW) feature selection, as well as a linear DR method based on principal component analysis. Using ROC analysis and 0.632+bootstrap validation, 95% empirical confidence intervals were computed for each classifier's AUC performance. In the large U.S. data set, sample high performance results include AUC0.632+ = 0.88 with 95% empirical bootstrap interval [0.787;0.895] for 13 ARD selected features and AUC0.632+ = 0.87 with interval [0.817;0.906] for four LSW selected features compared to 4D t-SNE mapping (from the original 81D feature space) giving AUC0.632+ = 0.90 with interval [0.847;0.919], all using the MCMC-BANN. Preliminary results appear to indicate capability for the new methods to match or exceed classification performance of current advanced breast lesion CADx algorithms. While not appropriate as a complete replacement of feature selection in CADx problems, DR techniques offer a complementary approach, which can aid elucidation of additional properties associated with the data. Specifically, the new techniques were shown to possess the added benefit of delivering sparse lower dimensional representations for visual interpretation, revealing intricate data structure of the feature space.

  9. GPU-powered Shotgun Stochastic Search for Dirichlet process mixtures of Gaussian Graphical Models

    PubMed Central

    Mukherjee, Chiranjit; Rodriguez, Abel

    2016-01-01

    Gaussian graphical models are popular for modeling high-dimensional multivariate data with sparse conditional dependencies. A mixture of Gaussian graphical models extends this model to the more realistic scenario where observations come from a heterogeneous population composed of a small number of homogeneous sub-groups. In this paper we present a novel stochastic search algorithm for finding the posterior mode of high-dimensional Dirichlet process mixtures of decomposable Gaussian graphical models. Further, we investigate how to harness the massive thread-parallelization capabilities of graphical processing units to accelerate computation. The computational advantages of our algorithms are demonstrated with various simulated data examples in which we compare our stochastic search with a Markov chain Monte Carlo algorithm on moderate-dimensional data. These experiments show that our stochastic search largely outperforms the Markov chain Monte Carlo algorithm in terms of computing time and in terms of the quality of the posterior mode discovered. Finally, we analyze a gene expression dataset in which Markov chain Monte Carlo algorithms are too slow to be practically useful. PMID:28626348

  10. GPU-powered Shotgun Stochastic Search for Dirichlet process mixtures of Gaussian Graphical Models.

    PubMed

    Mukherjee, Chiranjit; Rodriguez, Abel

    2016-01-01

    Gaussian graphical models are popular for modeling high-dimensional multivariate data with sparse conditional dependencies. A mixture of Gaussian graphical models extends this model to the more realistic scenario where observations come from a heterogeneous population composed of a small number of homogeneous sub-groups. In this paper we present a novel stochastic search algorithm for finding the posterior mode of high-dimensional Dirichlet process mixtures of decomposable Gaussian graphical models. Further, we investigate how to harness the massive thread-parallelization capabilities of graphical processing units to accelerate computation. The computational advantages of our algorithms are demonstrated with various simulated data examples in which we compare our stochastic search with a Markov chain Monte Carlo algorithm on moderate-dimensional data. These experiments show that our stochastic search largely outperforms the Markov chain Monte Carlo algorithm in terms of computing time and in terms of the quality of the posterior mode discovered. Finally, we analyze a gene expression dataset in which Markov chain Monte Carlo algorithms are too slow to be practically useful.

  11. Randomizing Genome-Scale Metabolic Networks

    PubMed Central

    Samal, Areejit; Martin, Olivier C.

    2011-01-01

    Networks coming from protein-protein interactions, transcriptional regulation, signaling, or metabolism may appear to have “unusual” properties. To quantify this, it is appropriate to randomize the network and test the hypothesis that the network is not statistically different from expected in a motivated ensemble. However, when dealing with metabolic networks, the randomization of the network using edge exchange generates fictitious reactions that are biochemically meaningless. Here we provide several natural ensembles of randomized metabolic networks. A first constraint is to use valid biochemical reactions. Further constraints correspond to imposing appropriate functional constraints. We explain how to perform these randomizations with the help of Markov Chain Monte Carlo (MCMC) and show that they allow one to approach the properties of biological metabolic networks. The implication of the present work is that the observed global structural properties of real metabolic networks are likely to be the consequence of simple biochemical and functional constraints. PMID:21779409

  12. Forecast and analysis of the cosmological redshift drift.

    PubMed

    Lazkoz, Ruth; Leanizbarrutia, Iker; Salzano, Vincenzo

    2018-01-01

    The cosmological redshift drift could lead to the next step in high-precision cosmic geometric observations, becoming a direct and irrefutable test for cosmic acceleration. In order to test the viability and possible properties of this effect, also called Sandage-Loeb (SL) test, we generate a model-independent mock data set in order to compare its constraining power with that of the future mock data sets of Type Ia Supernovae (SNe) and Baryon Acoustic Oscillations (BAO). The performance of those data sets is analyzed by testing several cosmological models with the Markov chain Monte Carlo (MCMC) method, both independently as well as combining all data sets. Final results show that, in general, SL data sets allow for remarkable constraints on the matter density parameter today [Formula: see text] on every tested model, showing also a great complementarity with SNe and BAO data regarding dark energy parameters.

  13. Cosmological parameter estimation using Particle Swarm Optimization

    NASA Astrophysics Data System (ADS)

    Prasad, J.; Souradeep, T.

    2014-03-01

    Constraining parameters of a theoretical model from observational data is an important exercise in cosmology. There are many theoretically motivated models that demand a greater number of cosmological parameters than the standard model of cosmology uses, which makes parameter estimation challenging. It is common practice to employ the Bayesian formalism for parameter estimation, for which, in general, the likelihood surface is probed. For the standard cosmological model with six parameters, the likelihood surface is quite smooth and does not have local maxima, and sampling-based methods such as the Markov Chain Monte Carlo (MCMC) method are quite successful. However, when there are a large number of parameters or the likelihood surface is not smooth, other methods may be more effective. In this paper, we demonstrate the application of another method inspired by artificial intelligence, called Particle Swarm Optimization (PSO), to estimating cosmological parameters from Cosmic Microwave Background (CMB) data taken from the WMAP satellite.
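A basic global-best PSO loop for locating the minimum of a negative log-likelihood can be sketched as below. This is a generic textbook variant, not the authors' implementation; the two-parameter quadratic `toy_neg_log_like`, the coefficient values, and all names are assumptions for illustration and have no connection to CMB data.

```python
import random

def pso_minimize(objective, bounds, n_particles=30, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Global-best particle swarm search inside a box given by bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def toy_neg_log_like(theta):
    """Hypothetical smooth two-parameter negative log-likelihood."""
    a, b = theta
    return 0.5 * ((a - 0.3) / 0.05) ** 2 + 0.5 * ((b - 0.7) / 0.1) ** 2

best_theta, best_value = pso_minimize(toy_neg_log_like, [(0.0, 1.0), (0.0, 1.0)])
```

Unlike MCMC, this search returns a best-fit point rather than samples from the posterior, so interval estimates would need a separate step.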

  14. Transmission Parameters of the 2001 Foot and Mouth Epidemic in Great Britain

    PubMed Central

    Chis Ster, Irina; Ferguson, Neil M.

    2007-01-01

    Despite intensive ongoing research, key aspects of the spatial-temporal evolution of the 2001 foot and mouth disease (FMD) epidemic in Great Britain (GB) remain unexplained. Here we develop a Markov Chain Monte Carlo (MCMC) method for estimating epidemiological parameters of the 2001 outbreak for a range of simple transmission models. We make the simplifying assumption that infectious farms were completely observed in 2001, equivalent to assuming that farms that were proactively culled but not diagnosed with FMD were not infectious, even if some were infected. We estimate how transmission parameters varied through time, highlighting the impact of the control measures on the progression of the epidemic. We demonstrate statistically significant evidence for assortative contact patterns between animals of the same species. Predictive risk maps of the transmission potential in different geographic areas of GB are presented for the fitted models. PMID:17551582

  15. Analysing child mortality in Nigeria with geoadditive discrete-time survival models.

    PubMed

    Adebayo, Samson B; Fahrmeir, Ludwig

    2005-03-15

    Child mortality reflects a country's level of socio-economic development and quality of life. In developing countries, mortality rates are not only influenced by socio-economic, demographic and health variables but they also vary considerably across regions and districts. In this paper, we analysed child mortality in Nigeria with flexible geoadditive discrete-time survival models. This class of models allows us to measure small-area district-specific spatial effects simultaneously with possibly non-linear or time-varying effects of other factors. Inference is fully Bayesian and uses computationally efficient Markov chain Monte Carlo (MCMC) simulation techniques. The application is based on the 1999 Nigeria Demographic and Health Survey. Our method assesses effects at a high level of temporal and spatial resolution not available with traditional parametric models, and the results provide some evidence on how to reduce child mortality by improving socio-economic and public health conditions. Copyright (c) 2004 John Wiley & Sons, Ltd.

  16. Cultural Consensus Theory: Aggregating Continuous Responses in a Finite Interval

    NASA Astrophysics Data System (ADS)

    Batchelder, William H.; Strashny, Alex; Romney, A. Kimball

    Cultural consensus theory (CCT) consists of cognitive models for aggregating responses of "informants" to test items about some domain of their shared cultural knowledge. This paper develops a CCT model for items requiring bounded numerical responses, e.g. probability estimates, confidence judgments, or similarity judgments. The model assumes that each item generates a latent random representation in each informant, with mean equal to the consensus answer and variance depending jointly on the informant and the location of the consensus answer. The manifest responses may reflect biases of the informants. Markov Chain Monte Carlo (MCMC) methods were used to estimate the model, and simulation studies validated the approach. The model was applied to an existing cross-cultural dataset involving native Japanese and English speakers judging the similarity of emotion terms. The results sharpened earlier studies that showed that both cultures appear to have very similar cognitive representations of emotion terms.

  17. Bayesian comparison of protein structures using partial Procrustes distance.

    PubMed

    Ejlali, Nasim; Faghihi, Mohammad Reza; Sadeghi, Mehdi

    2017-09-26

    An important topic in bioinformatics is the protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on the global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures, by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated to the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through some simulation studies. Furthermore, we apply our model to a real dataset and assess the accuracy and convergence rate. Results show that our model is much more efficient than previous approaches.

  18. Selectivity curves of the capture of mangrove crab (Ucides cordatus) on the northern coast of Brazil using bayesian inference.

    PubMed

    Furtado-Junior, I; Abrunhosa, F A; Holanda, F C A F; Tavares, M C S

    2016-06-01

    Fishing selectivity of the mangrove crab Ucides cordatus on the north coast of Brazil can be defined as the fisherman's ability to capture and select individuals of a certain size or sex (or a combination of these factors), which suggests an empirical selectivity. Considering this hypothesis, we calculated the selectivity curves for male and female crabs using the logit function of the logistic model in the formulation. The Bayesian inference consisted of obtaining the posterior distribution by applying the Markov chain Monte Carlo (MCMC) method in the R software using the OpenBUGS, BRugs, and R2WinBUGS libraries. Comparing the estimated average carapace width at selection for males and females with previous studies reporting the average carapace width at sexual maturity allows us to confirm the hypothesis that most mature individuals do not suffer from fishing pressure, thus ensuring their sustainability.

  19. Self-Learning Monte Carlo Method

    NASA Astrophysics Data System (ADS)

    Liu, Junwei; Qi, Yang; Meng, Zi Yang; Fu, Liang

    Monte Carlo simulation is an unbiased numerical tool for studying classical and quantum many-body systems. One of its bottlenecks is the lack of a general and efficient update algorithm for large systems close to a phase transition or with strong frustration, for which local updates perform badly. In this work, we propose a new general-purpose Monte Carlo method, dubbed self-learning Monte Carlo (SLMC), in which an efficient update algorithm is first learned from the training data generated in trial simulations and then used to speed up the actual simulation. We demonstrate the efficiency of SLMC in a spin model at the phase transition point, achieving a 10-20 times speedup. This work is supported by the DOE Office of Basic Energy Sciences, Division of Materials Sciences and Engineering under Award DE-SC0010526.

  20. Visual improvement for bad handwriting based on Monte-Carlo method

    NASA Astrophysics Data System (ADS)

    Shi, Cao; Xiao, Jianguo; Xu, Canhui; Jia, Wenhua

    2014-03-01

    A visual improvement algorithm based on Monte Carlo simulation is proposed in this paper to enhance the visual appearance of bad handwriting. The improvement process uses a well-designed typeface to optimize the bad handwriting image. In this process, a series of linear operators for image transformation are defined to transform the typeface image so that it approaches the handwriting image, and the specific parameters of the linear operators are estimated by the Monte Carlo method. Visual improvement experiments illustrate that the proposed algorithm can effectively enhance the visual effect of the handwriting image while maintaining the original handwriting features, such as tilt, stroke order and drawing direction. The proposed algorithm has huge potential for application in tablet computers and the mobile Internet to improve the user experience of handwriting.

  1. Prospects of detection of the first sources with SKA using matched filters

    NASA Astrophysics Data System (ADS)

    Ghara, Raghunath; Choudhury, T. Roy; Datta, Kanan K.; Mellema, Garrelt; Choudhuri, Samir; Majumdar, Suman; Giri, Sambit K.

    2018-05-01

    The matched filtering technique is an efficient method to detect H ii bubbles and absorption regions in radio interferometric observations of the redshifted 21-cm signal from the epoch of reionization and the Cosmic Dawn. Here, we present an implementation of this technique for upcoming observations, such as those with SKA1-low, for a blind search of absorption regions at the Cosmic Dawn. The pipeline explores a four-dimensional parameter space on the simulated mock visibilities using an MCMC algorithm. The framework is able to efficiently determine the positions and sizes of the absorption/H ii regions in the field of view.

  2. Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics.

    PubMed

    Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel

    2012-09-25

    Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.

  3. Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics

    PubMed Central

    2012-01-01

    Background: Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results: Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions: Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363

  4. Simultaneous fitting of genomic-BLUP and Bayes-C components in a genomic prediction model.

    PubMed

    Iheshiulor, Oscar O M; Woolliams, John A; Svendsen, Morten; Solberg, Trygve; Meuwissen, Theo H E

    2017-08-24

    The rapid adoption of genomic selection is due to two key factors: availability of both high-throughput dense genotyping and statistical methods to estimate and predict breeding values. The development of such methods is still ongoing and, so far, there is no consensus on the best approach. Currently, the linear and non-linear methods for genomic prediction (GP) are treated as distinct approaches. The aim of this study was to evaluate the implementation of an iterative method (called GBC) that incorporates aspects of both linear [genomic-best linear unbiased prediction (G-BLUP)] and non-linear (Bayes-C) methods for GP. The iterative nature of GBC makes it less computationally demanding similar to other non-Markov chain Monte Carlo (MCMC) approaches. However, as a Bayesian method, GBC differs from both MCMC- and non-MCMC-based methods by combining some aspects of G-BLUP and Bayes-C methods for GP. Its relative performance was compared to those of G-BLUP and Bayes-C. We used an imputed 50 K single-nucleotide polymorphism (SNP) dataset based on the Illumina Bovine50K BeadChip, which included 48,249 SNPs and 3244 records. Daughter yield deviations for somatic cell count, fat yield, milk yield, and protein yield were used as response variables. GBC was frequently (marginally) superior to G-BLUP and Bayes-C in terms of prediction accuracy and was significantly better than G-BLUP only for fat yield. On average across the four traits, GBC yielded a 0.009 and 0.006 increase in prediction accuracy over G-BLUP and Bayes-C, respectively. Computationally, GBC was very much faster than Bayes-C and similar to G-BLUP. Our results show that incorporating some aspects of G-BLUP and Bayes-C in a single model can improve accuracy of GP over the commonly used method: G-BLUP. Generally, GBC did not statistically perform better than G-BLUP and Bayes-C, probably due to the close relationships between reference and validation individuals. 
Nevertheless, it is a flexible tool, in the sense, that it simultaneously incorporates some aspects of linear and non-linear models for GP, thereby exploiting family relationships while also accounting for linkage disequilibrium between SNPs and genes with large effects. The application of GBC in GP merits further exploration.

  5. Markov Chain Monte Carlo Inversion of Mantle Temperature and Composition, with Application to Iceland

    NASA Astrophysics Data System (ADS)

    Brown, Eric; Petersen, Kenni; Lesher, Charles

    2017-04-01

    Basalts are formed by adiabatic decompression melting of the asthenosphere, and thus provide records of the thermal, chemical and dynamical state of the upper mantle. However, uniquely constraining the importance of these factors through the lens of melting is challenging given the inevitability that primary basalts are the product of variable mixing of melts derived from distinct lithologies having different melting behaviors (e.g. peridotite vs. pyroxenite). Forward mantle melting models, such as REEBOX PRO [1], are useful tools in this regard, because they can account for differences in melting behavior and melt pooling processes, and provide estimates of bulk crust composition and volume that can be compared with geochemical and geophysical constraints, respectively. Nevertheless, these models require critical assumptions regarding mantle temperature, and lithologic abundance(s)/composition(s), all of which are poorly constrained. To provide better constraints on these parameters and their uncertainties, we have coupled a Markov Chain Monte Carlo (MCMC) sampling technique with the REEBOX PRO melting model. The MCMC method systematically samples distributions of key REEBOX PRO input parameters (mantle potential temperature, and initial abundances and compositions of the source lithologies) based on a likelihood function that describes the 'fit' of the model outputs (bulk crust composition and volume and end-member peridotite and pyroxenite melts) relative to geochemical and geophysical constraints and their associated uncertainties. As a case study, we have tested and applied the model to magmatism along Reykjanes Peninsula in Iceland, where pyroxenite has been inferred to be present in the mantle source. This locale is ideal because there exist sufficient geochemical and geophysical data to estimate bulk crust compositions and volumes, as well as the range of near-parental melts derived from the mantle. 
We find that for the case of passive upwelling, the models that best fit the geochemical and geophysical observables require elevated mantle potential temperatures ( 120 °C above ambient mantle), and 5% pyroxenite. The modeled peridotite source has a trace element composition similar to depleted MORB mantle, whereas the trace element composition of the pyroxenite is similar to enriched mid-ocean ridge basalt. These results highlight the promise of this method for efficiently exploring the range of mantle temperatures, lithologic abundances, and mantle source compositions that are most consistent with available observational constraints in individual volcanic systems. [1] Brown and Lesher (2016), G-cubed, 17, 3929-3968

  6. Loading relativistic Maxwell distributions in particle simulations

    NASA Astrophysics Data System (ADS)

    Zenitani, S.

    2015-12-01

    In order to study energetic plasma phenomena by using particle-in-cell (PIC) and Monte-Carlo simulations, we need to deal with relativistic velocity distributions in these simulations. However, numerical algorithms to deal with relativistic distributions are not well known. In this contribution, we overview basic algorithms to load relativistic Maxwell distributions in PIC and Monte-Carlo simulations. For stationary relativistic Maxwellian, the inverse transform method and the Sobol algorithm are reviewed. To boost particles to obtain relativistic shifted-Maxwellian, two rejection methods are newly proposed in a physically transparent manner. Their acceptance efficiencies are approximately 50% for generic cases and 100% for symmetric distributions. They can be combined with arbitrary base algorithms.
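
    As an illustration of the kind of base algorithm this abstract refers to, the following is a minimal sketch of a Sobol-type rejection sampler for the momentum magnitude of a stationary relativistic Maxwellian. It is not taken from the record itself: the function name `sample_juttner_momentum`, the dimensionless temperature convention T = k_B T / (m c^2), and the use of Python's standard `random` module are assumptions made for this example. The idea is to propose u from a Gamma(3, T) distribution (as the negative log of a product of three uniforms) and accept with probability exp(-(sqrt(1 + u^2) - u) / T), which is equivalent to the test eta^2 - u^2 > 1 below.

```python
import math
import random


def sample_juttner_momentum(temperature, rng):
    """Draw one momentum magnitude u = gamma * beta (in units of m c).

    `temperature` is the dimensionless T = k_B T / (m c^2); this
    convention is an assumption of the sketch, not of the source record.
    """
    while True:
        # 1 - random() lies in (0, 1], so every logarithm is finite.
        x1, x2, x3, x4 = (1.0 - rng.random() for _ in range(4))
        # Proposal: u ~ Gamma(shape=3, scale=T), density ~ u^2 exp(-u / T).
        u = -temperature * math.log(x1 * x2 * x3)
        # eta = u - T ln(x4); accepting when eta > sqrt(1 + u^2) turns the
        # proposal density into u^2 exp(-sqrt(1 + u^2) / T).
        eta = u - temperature * math.log(x4)
        if eta * eta - u * u > 1.0:
            return u


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_juttner_momentum(1.0, rng) for _ in range(5)]
    print(draws)
```

    The returned value is only the magnitude of the momentum; an isotropic direction would still have to be drawn separately, and the rejection step becomes inefficient when T is much smaller than 1.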

  7. An unbiased Hessian representation for Monte Carlo PDFs.

    PubMed

    Carrazza, Stefano; Forte, Stefano; Kassabov, Zahari; Latorre, José Ignacio; Rojo, Juan

    We develop a methodology for the construction of a Hessian representation of Monte Carlo sets of parton distributions, based on the use of a subset of the Monte Carlo PDF replicas as an unbiased linear basis, and of a genetic algorithm for the determination of the optimal basis. We validate the methodology by first showing that it faithfully reproduces a native Monte Carlo PDF set (NNPDF3.0), and then, that if applied to a Hessian PDF set (MMHT14) which was transformed into a Monte Carlo set, it gives back the starting PDFs with minimal information loss. We then show that, when applied to a large Monte Carlo PDF set obtained as a combination of several underlying sets, the methodology leads to a Hessian representation in terms of a rather smaller set of parameters (MC-H PDFs), thereby providing an alternative implementation of the recently suggested Meta-PDF idea and a Hessian version of the recently suggested PDF compression algorithm (CMC-PDFs). The mc2hessian conversion code is made publicly available together with (through LHAPDF6) Hessian representations of the NNPDF3.0 set and the MC-H PDF set.

  8. SU-F-T-444: Quality Improvement Review of Radiation Therapy Treatment Planning in the Presence of Dental Implants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parenica, H; Ford, J; Mavroidis, P

    Purpose: To quantify and compare the effect of metallic dental implants (MDI) on dose distributions calculated using the Collapsed Cone Convolution Superposition (CCCS) algorithm or a Monte Carlo algorithm (with and without correcting for the density of the MDI). Methods: Seven patients previously treated in the head and neck region were included in this study. The MDI and the streaking artifacts on the CT images were carefully contoured. For each patient a plan was optimized and calculated using the Pinnacle3 treatment planning system (TPS). For each patient two dose calculations were performed, a) with the densities of the MDI and CT artifacts overridden (12 g/cc and 1 g/cc respectively) and b) without density overrides. The plans were then exported to the Monaco TPS and recalculated using the Monte Carlo dose calculation algorithm. The changes in dose to PTVs and surrounding Regions of Interest (ROIs) were examined between all plans. Results: The Monte Carlo dose calculation indicated that PTVs received 6% lower dose than the CCCS algorithm predicted. In some cases, the Monte Carlo algorithm indicated that surrounding ROIs received higher dose (up to a factor of 2). Conclusion: Not properly accounting for dental implants can impact both the high dose regions (PTV) and the low dose regions (OAR). This study implies that if MDI and the artifacts are not appropriately contoured and given the correct density, there is a potentially significant impact on PTV coverage and OAR maximum doses.

  9. Bayesian estimation of realized stochastic volatility model by Hybrid Monte Carlo algorithm

    NASA Astrophysics Data System (ADS)

    Takaishi, Tetsuya

    2014-03-01

    The hybrid Monte Carlo algorithm (HMCA) is applied for Bayesian parameter estimation of the realized stochastic volatility (RSV) model. Using the 2nd order minimum norm integrator (2MNI) for the molecular dynamics (MD) simulation in the HMCA, we find that the 2MNI is more efficient than the conventional leapfrog integrator. We also find that the autocorrelation time of the volatility variables sampled by the HMCA is very short. Thus it is concluded that the HMCA with the 2MNI is an efficient algorithm for parameter estimations of the RSV model.
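
    For readers unfamiliar with the method, the following is a minimal sketch of hybrid (Hamiltonian) Monte Carlo with the conventional leapfrog integrator that the abstract uses as its baseline. It does not implement the 2MNI integrator or the RSV model; the function names `leapfrog` and `hmc_sample`, the one-dimensional target, and the unit-mass kinetic energy p^2 / 2 are assumptions made for this example.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate Hamilton's equations for H = U(q) + p^2 / 2."""
    p -= 0.5 * step * grad_u(q)          # initial half step in momentum
    for i in range(n_steps):
        q += step * p                    # full step in position
        if i < n_steps - 1:
            p -= step * grad_u(q)        # full step in momentum
    p -= 0.5 * step * grad_u(q)          # final half step in momentum
    return q, p


def hmc_sample(u, grad_u, q0, n_samples, step, n_steps, rng):
    """Sample from the density proportional to exp(-U(q)) in one dimension."""
    q = q0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)         # refresh the momentum
        q_new, p_new = leapfrog(q, p0, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p0 * p0
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis test corrects the integrator's energy error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    # Standard normal target: U(q) = q^2 / 2, dU/dq = q.
    draws = hmc_sample(lambda q: 0.5 * q * q, lambda q: q,
                       0.0, 1000, 0.3, 5, rng)
    print(sum(draws) / len(draws))
```

    A higher-order or minimum-norm integrator would replace only the `leapfrog` function; the momentum refresh and the Metropolis test stay the same.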

  10. Exploring cluster Monte Carlo updates with Boltzmann machines

    NASA Astrophysics Data System (ADS)

    Wang, Lei

    2017-11-01

    Boltzmann machines are physics informed generative models with broad applications in machine learning. They model the probability distribution of an input data set with latent variables and generate new samples accordingly. Applying the Boltzmann machines back to physics, they are ideal recommender systems to accelerate the Monte Carlo simulation of physical systems due to their flexibility and effectiveness. More intriguingly, we show that the generative sampling of the Boltzmann machines can even give different cluster Monte Carlo algorithms. The latent representation of the Boltzmann machines can be designed to mediate complex interactions and identify clusters of the physical system. We demonstrate these findings with concrete examples of the classical Ising model with and without four-spin plaquette interactions. In the future, automatic searches in the algorithm space parametrized by Boltzmann machines may discover more innovative Monte Carlo updates.

  11. Testing trivializing maps in the Hybrid Monte Carlo algorithm

    PubMed Central

    Engel, Georg P.; Schaefer, Stefan

    2011-01-01

    We test a recent proposal to use approximate trivializing maps in a field theory to speed up Hybrid Monte Carlo simulations. Simulating the CP^{N-1} model, we find a small improvement with the leading order transformation, which is however compensated by the additional computational overhead. The scaling of the algorithm towards the continuum is not changed. In particular, the effect of the topological modes on the autocorrelation times is studied. PMID:21969733

  12. Dynamic load balancing for petascale quantum Monte Carlo applications: The Alias method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sudheer, C. D.; Krishnan, S.; Srinivasan, A.

    Diffusion Monte Carlo is the most accurate widely used Quantum Monte Carlo method for the electronic structure of materials, but it requires frequent load balancing or population redistribution steps to maintain efficiency and avoid accumulation of systematic errors on parallel machines. The load balancing step can be a significant factor affecting performance, and will become more important as the number of processing elements increases. We propose a new dynamic load balancing algorithm, the Alias Method, and evaluate it theoretically and empirically. An important feature of the new algorithm is that the load can be perfectly balanced with each process receiving at most one message. It is also optimal in the maximum size of messages received by any process. We also optimize its implementation to reduce network contention, a process facilitated by the low messaging requirement of the algorithm. Empirical results on the petaflop Cray XT Jaguar supercomputer at ORNL show up to 30% improvement in performance on 120,000 cores. The load balancing algorithm may be straightforwardly implemented in existing codes. The algorithm may also be employed by any method with many near identical computational tasks that requires load balancing.

  13. Stochastic evaluation of second-order many-body perturbation energies.

    PubMed

    Willow, Soohaeng Yoo; Kim, Kwang S; Hirata, So

    2012-11-28

    With the aid of the Laplace transform, the canonical expression of the second-order many-body perturbation correction to an electronic energy is converted into the sum of two 13-dimensional integrals, the 12-dimensional parts of which are evaluated by Monte Carlo integration. Weight functions are identified that are analytically normalizable, are finite and non-negative everywhere, and share the same singularities as the integrands. They thus generate appropriate distributions of four-electron walkers via the Metropolis algorithm, yielding correlation energies of small molecules within a few mE_h of the correct values after 10^8 Monte Carlo steps. This algorithm does away with the integral transformation as the hotspot of the usual algorithms, has a far superior size dependence of cost, does not suffer from the sign problem of some quantum Monte Carlo methods, and is potentially easily parallelizable and extensible to other more complex electron-correlation theories.

  14. A novel Bayesian approach to quantify clinical variables and to determine their spectroscopic counterparts in 1H NMR metabonomic data

    PubMed Central

    Vehtari, Aki; Mäkinen, Ville-Petteri; Soininen, Pasi; Ingman, Petri; Mäkelä, Sanna M; Savolainen, Markku J; Hannuksela, Minna L; Kaski, Kimmo; Ala-Korpela, Mika

    2007-01-01

    Background: A key challenge in metabonomics is to uncover quantitative associations between multidimensional spectroscopic data and biochemical measures used for disease risk assessment and diagnostics. Here we focus on clinically relevant estimation of lipoprotein lipids by 1H NMR spectroscopy of serum. Results: A Bayesian methodology, with a biochemical motivation, is presented for a real 1H NMR metabonomics data set of 75 serum samples. Lipoprotein lipid concentrations were independently obtained for these samples via ultracentrifugation and specific biochemical assays. The Bayesian models were constructed by Markov chain Monte Carlo (MCMC) and they showed remarkably good quantitative performance, the predictive R-values being 0.985 for the very low density lipoprotein triglycerides (VLDL-TG), 0.787 for the intermediate, 0.943 for the low, and 0.933 for the high density lipoprotein cholesterol (IDL-C, LDL-C and HDL-C, respectively). The modelling produced a kernel-based reformulation of the data, the parameters of which coincided with the well-known biochemical characteristics of the 1H NMR spectra; particularly for VLDL-TG and HDL-C the Bayesian methodology was able to clearly identify the most characteristic resonances within the heavily overlapping information in the spectra. For IDL-C and LDL-C the resulting model kernels were more complex than those for VLDL-TG and HDL-C, probably reflecting the severe overlap of the IDL and LDL resonances in the 1H NMR spectra. Conclusion: The systematic use of Bayesian MCMC analysis is computationally demanding. Nevertheless, the combination of high-quality quantification and the biochemical rationale of the resulting models is expected to be useful in the field of metabonomics. PMID:17493257

  15. Inverse Modeling of Hydrologic Parameters Using Surface Flux and Runoff Observations in the Community Land Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sun, Yu; Hou, Zhangshuan; Huang, Maoyi

    2013-12-10

    This study demonstrates the possibility of inverting hydrologic parameters using surface flux and runoff observations in version 4 of the Community Land Model (CLM4). Previous studies showed that surface flux and runoff calculations are sensitive to major hydrologic parameters in CLM4 over different watersheds, and illustrated the necessity and possibility of parameter calibration. Two inversion strategies, the deterministic least-square fitting and stochastic Markov-Chain Monte-Carlo (MCMC) - Bayesian inversion approaches, are evaluated by applying them to CLM4 at selected sites. The unknowns to be estimated include surface and subsurface runoff generation parameters and vadose zone soil water parameters. We find that using model parameters calibrated by the least-square fitting provides little improvements in the model simulations but the sampling-based stochastic inversion approaches are consistent - as more information comes in, the predictive intervals of the calibrated parameters become narrower and the misfits between the calculated and observed responses decrease. In general, parameters that are identified to be significant through sensitivity analyses and statistical tests are better calibrated than those with weak or nonlinear impacts on flux or runoff observations. Temporal resolution of observations has larger impacts on the results of inverse modeling using heat flux data than runoff data. Soil and vegetation cover have important impacts on parameter sensitivities, leading to the different patterns of posterior distributions of parameters at different sites. Overall, the MCMC-Bayesian inversion approach effectively and reliably improves the simulation of CLM under different climates and environmental conditions. Bayesian model averaging of the posterior estimates with different reference acceptance probabilities can smooth the posterior distribution and provide more reliable parameter estimates, but at the expense of wider uncertainty bounds.

  16. Coronal loop seismology using damping of standing kink oscillations by mode coupling. II. additional physical effects and Bayesian analysis

    NASA Astrophysics Data System (ADS)

    Pascoe, D. J.; Anfinogentov, S.; Nisticò, G.; Goddard, C. R.; Nakariakov, V. M.

    2017-04-01

    Context. The strong damping of kink oscillations of coronal loops can be explained by mode coupling. The damping envelope depends on the transverse density profile of the loop. Observational measurements of the damping envelope have been used to determine the transverse loop structure which is important for understanding other physical processes such as heating. Aims: The general damping envelope describing the mode coupling of kink waves consists of a Gaussian damping regime followed by an exponential damping regime. Recent observational detection of these damping regimes has been employed as a seismological tool. We extend the description of the damping behaviour to account for additional physical effects, namely a time-dependent period of oscillation, the presence of additional longitudinal harmonics, and the decayless regime of standing kink oscillations. Methods: We examine four examples of standing kink oscillations observed by the Atmospheric Imaging Assembly (AIA) onboard the Solar Dynamics Observatory (SDO). We use forward modelling of the loop position and investigate the dependence on the model parameters using Bayesian inference and Markov chain Monte Carlo (MCMC) sampling. Results: Our improvements to the physical model combined with the use of Bayesian inference and MCMC produce improved estimates of model parameters and their uncertainties. Calculation of the Bayes factor also allows us to compare the suitability of different physical models. We also use a new method based on spline interpolation of the zeroes of the oscillation to accurately describe the background trend of the oscillating loop. Conclusions: This powerful and robust method allows for accurate seismology of coronal loops, in particular the transverse density profile, and potentially reveals additional physical effects.

  17. Accounting for model error in Bayesian solutions to hydrogeophysical inverse problems using a local basis approach

    NASA Astrophysics Data System (ADS)

    Köpke, Corinna; Irving, James; Elsheikh, Ahmed H.

    2018-06-01

    Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward model linking subsurface physical properties to measured data, which is typically assumed to be perfectly known in the inversion procedure. However, to make the stochastic solution of the inverse problem computationally tractable using methods such as Markov-chain-Monte-Carlo (MCMC), fast approximations of the forward model are commonly employed. This gives rise to model error, which has the potential to significantly bias posterior statistics if not properly accounted for. Here, we present a new methodology for dealing with the model error arising from the use of approximate forward solvers in Bayesian solutions to hydrogeophysical inverse problems. Our approach is geared towards the common case where this error cannot be (i) effectively characterized through some parametric statistical distribution; or (ii) estimated by interpolating between a small number of computed model-error realizations. To this end, we focus on identification and removal of the model-error component of the residual during MCMC using a projection-based approach, whereby the orthogonal basis employed for the projection is derived in each iteration from the K-nearest-neighboring entries in a model-error dictionary. The latter is constructed during the inversion and grows at a specified rate as the iterations proceed. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar travel-time data considering three different subsurface parameterizations of varying complexity. Synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed for their inversion. In each case, our developed approach enables us to remove posterior bias and obtain a more realistic characterization of uncertainty.

  18. Impact of Federal drug law enforcement on the supply of heroin in Australia.

    PubMed

    Smithson, Michael; McFadden, Michael; Mwesigye, Sue-Ellen

    2005-08-01

    To conduct an empirical investigation of the efficacy of law enforcement in reducing heroin supply in Australia. Specifically, this paper addresses the question of whether heroin purity levels in the Australian Capital Territory (ACT) could be predicted by heroin seizures at the national level by the Australian Federal Police (AFP) in the preceding year. We considered two forms of evidence. First, a Bayesian Markov Chain Monte Carlo (MCMC) change-point model was used to discover (a) if there was a substantial increase in heroin seizures by the AFP, (b) when the increase began and (c) whether it occurred after increased funding to the Australian Federal Police for the purpose of drug law enforcement. Second, standard time-series methods were used to ascertain whether fluctuations in heroin seizure weights or the frequency of large-scale seizures after the aforementioned changes in seizure levels predicted fluctuations in heroin purity levels in the ACT after autocorrelation had been removed from the purity series. A Bayesian MCMC change-point model supported the hypothesis that heroin seizures rapidly increased about a year before the estimated decline in heroin purity and after the increased funding of the AFP. The autoregression models suggested that 10-20% of the variance in the residuals of the heroin purity series was predicted by appropriately lagged residuals of the seizure-number and log-weight series, after autocorrelation had been removed. The overall results are consistent with the hypothesis that large-scale heroin seizures by the AFP reduce street-level heroin supply a year or so later, although the short-term dynamics suggest an 'opponent' response to residual fluctuations in seizures. To our knowledge, this is the first time a connection has been identified between large-scale heroin seizures and street-level supply.

  19. Inverse modeling of hydrologic parameters using surface flux and runoff observations in the Community Land Model

    NASA Astrophysics Data System (ADS)

    Sun, Y.; Hou, Z.; Huang, M.; Tian, F.; Leung, L. Ruby

    2013-12-01

    This study demonstrates the possibility of inverting hydrologic parameters using surface flux and runoff observations in version 4 of the Community Land Model (CLM4). Previous studies showed that surface flux and runoff calculations are sensitive to major hydrologic parameters in CLM4 over different watersheds, and illustrated the necessity and possibility of parameter calibration. Both deterministic least-square fitting and stochastic Markov-chain Monte Carlo (MCMC)-Bayesian inversion approaches are evaluated by applying them to CLM4 at selected sites with different climate and soil conditions. The unknowns to be estimated include surface and subsurface runoff generation parameters and vadose zone soil water parameters. We find that using model parameters calibrated by the sampling-based stochastic inversion approaches provides significant improvements in the model simulations compared to using default CLM4 parameter values, and that as more information comes in, the predictive intervals (ranges of posterior distributions) of the calibrated parameters become narrower. In general, parameters that are identified to be significant through sensitivity analyses and statistical tests are better calibrated than those with weak or nonlinear impacts on flux or runoff observations. Temporal resolution of observations has larger impacts on the results of inverse modeling using heat flux data than runoff data. Soil and vegetation cover have important impacts on parameter sensitivities, leading to different patterns of posterior distributions of parameters at different sites. Overall, the MCMC-Bayesian inversion approach effectively and reliably improves the simulation of CLM under different climates and environmental conditions. Bayesian model averaging of the posterior estimates with different reference acceptance probabilities can smooth the posterior distribution and provide more reliable parameter estimates, but at the expense of wider uncertainty bounds.

  20. WMAP7 constraints on oscillations in the primordial power spectrum

    NASA Astrophysics Data System (ADS)

    Meerburg, P. Daniel; Wijers, Ralph A. M. J.; van der Schaar, Jan Pieter

    2012-03-01

    We use the 7-year Wilkinson Microwave Anisotropy Probe (WMAP7) data to place constraints on oscillations supplementing an almost scale-invariant primordial power spectrum. Such oscillations are predicted by a variety of models, some of which amount to assuming that there is some non-trivial choice of the vacuum state at the onset of inflation. In this paper, we will explore data-driven constraints on two distinct models of initial state modifications. In both models, the frequency, phase and amplitude are degrees of freedom of the theory for which the theoretical bounds are rather weak: both the amplitude and frequency have allowed values ranging over several orders of magnitude. This requires many computationally expensive evaluations of the model cosmic microwave background (CMB) spectra and their goodness of fit, even in a Markov chain Monte Carlo (MCMC), normally the most efficient fitting method for such a problem. To search more efficiently, we first run a densely-spaced grid, with only three varying parameters: the frequency, the amplitude and the baryon density. We obtain the optimal frequency and run an MCMC at the best-fitting frequency, randomly varying all other relevant parameters. To reduce the computational time of each power spectrum computation, we adjust both comoving momentum integration and spline interpolation (in l) as a function of frequency and amplitude of the primordial power spectrum. Applying this to the WMAP7 data allows us to improve existing constraints on the presence of oscillations. We confirm earlier findings that certain frequencies can improve the fitting over a model without oscillations. For those frequencies we compute the posterior probability, allowing us to put some constraints on the primordial parameter space of both models.

  1. [Economic Evaluation of mFOLFOX6-based First-line Regimens for Unresectable Advanced or Recurrent Colorectal Cancer Using Clinical Decision Analysis].

    PubMed

    Shida, Toshihiro; Endo, Yuji; Shiraishi, Tadashi; Yoshioka, Takashi; Suzuki, Kaoru; Kobayashi, Yuka; Ono, Yuki; Ito, Toshinori; Inoue, Tadao

    2018-01-01

    We evaluated four representative chemotherapy regimens for unresectable advanced or recurrent KRAS-wild type colorectal cancer: mFOLFOX6, mFOLFOX6+bevacizumab (Bmab), cetuximab (Cmab), or panitumumab (Pmab). We employed a decision analysis method in combination with clinical and economic evidence. The health outcomes of the regimens were analyzed on the basis of overall and progression-free survival. The data were drawn from the literature on randomized controlled clinical trials of the above-mentioned drugs. The total costs of the regimens were calculated on the basis of direct costs obtained from the medical records of patients diagnosed with unresectable advanced or recurrent colorectal cancer at Yamagata University Hospital and Yamagata Prefecture Central Hospital. Cost effectiveness was analyzed using a Markov chain Monte Carlo (MCMC) method. The study was designed from the viewpoint of public medical care. The MCMC analysis revealed that expected life months and expected cost were 20 months/3,527,119 yen for mFOLFOX6, 27 months/8,270,625 yen for mFOLFOX6+Bmab, 29 months/13,174,6297 yen for mFOLFOX6+Cmab, and 6 months/12,613,445 yen for mFOLFOX6+Pmab. Incremental costs per effectiveness ratios per life month against mFOLFOX6 were 637,592 yen for mFOLFOX6+Bmab, 1,075,162 yen for mFOLFOX6+Cmab, and 587,455 yen for mFOLFOX6+Pmab. Compared to the conventional mFOLFOX6 regimen, molecular-targeted drug regimens provide better health outcomes, but the cost increases accordingly. mFOLFOX6+Pmab is the most cost-effective regimen among those surveyed in this study.

  2. A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks.

    PubMed

    Zhou, Xiaobo; Wang, Xiaodong; Pal, Ranadip; Ivanov, Ivan; Bittner, Michael; Dougherty, Edward R

    2004-11-22

    We have hypothesized that the construction of transcriptional regulatory networks using a method that optimizes connectivity would lead to regulation consistent with biological expectations. A key expectation is that the hypothetical networks should produce a few, very strong attractors, highly similar to the original observations, mimicking biological state stability and determinism. Another central expectation is that, since it is expected that the biological control is distributed and mutually reinforcing, interpretation of the observations should lead to a very small number of connection schemes. We propose a fully Bayesian approach to constructing probabilistic gene regulatory networks (PGRNs) that emphasizes network topology. The method computes the possible parent sets of each gene, the corresponding predictors and the associated probabilities based on a nonlinear perceptron model, using a reversible jump Markov chain Monte Carlo (MCMC) technique, and an MCMC method is employed to search the network configurations to find those with the highest Bayesian scores to construct the PGRN. The Bayesian method has been used to construct a PGRN based on the observed behavior of a set of genes whose expression patterns vary across a set of melanoma samples exhibiting two very different phenotypes with respect to cell motility and invasiveness. Key biological features have been faithfully reflected in the model. Its steady-state distribution contains attractors that are either identical or very similar to the states observed in the data, and many of the attractors are singletons, which mimics the biological propensity to stably occupy a given state. Most interestingly, the connectivity rules for the most optimal generated networks constituting the PGRN are remarkably similar, as would be expected for a network operating on a distributed basis, with strong interactions between the components.

  3. Training-Image Based Geostatistical Inversion Using a Spatial Generative Adversarial Neural Network

    NASA Astrophysics Data System (ADS)

    Laloy, Eric; Hérault, Romain; Jacques, Diederik; Linde, Niklas

    2018-01-01

    Probabilistic inversion within a multiple-point statistics framework is often computationally prohibitive for high-dimensional problems. To partly address this, we introduce and evaluate a new training-image based inversion approach for complex geologic media. Our approach relies on a deep neural network of the generative adversarial network (GAN) type. After training using a training image (TI), our proposed spatial GAN (SGAN) can quickly generate 2-D and 3-D unconditional realizations. A key characteristic of our SGAN is that it defines a (very) low-dimensional parameterization, thereby allowing for efficient probabilistic inversion using state-of-the-art Markov chain Monte Carlo (MCMC) methods. In addition, available direct conditioning data can be incorporated within the inversion. Several 2-D and 3-D categorical TIs are first used to analyze the performance of our SGAN for unconditional geostatistical simulation. Training our deep network can take several hours. After training, realizations containing a few millions of pixels/voxels can be produced in a matter of seconds. This makes it especially useful for simulating many thousands of realizations (e.g., for MCMC inversion) as the relative cost of the training per realization diminishes with the considered number of realizations. Synthetic inversion case studies involving 2-D steady state flow and 3-D transient hydraulic tomography with and without direct conditioning data are used to illustrate the effectiveness of our proposed SGAN-based inversion. For the 2-D case, the inversion rapidly explores the posterior model distribution. For the 3-D case, the inversion recovers model realizations that fit the data close to the target level and visually resemble the true model well.

  4. Forward and Inverse Modeling of Self-potential. A Tomography of Groundwater Flow and Comparison Between Deterministic and Stochastic Inversion Methods

    NASA Astrophysics Data System (ADS)

    Quintero-Chavarria, E.; Ochoa Gutierrez, L. H.

    2016-12-01

    Applications of the self-potential method in hydrogeology and environmental sciences have developed significantly during the last two decades, particularly for identifying groundwater flow. Although only a few authors address the solution of the forward problem, especially in the geophysics literature, various inversion procedures are currently being developed; in most cases, however, they are compared with unconventional groundwater velocity fields and restricted to structured meshes. This research solves the forward problem with the finite element method, using St. Venant's principle to transform a point dipole, which is the field generated by a single vector, into a distribution of electrical monopoles. Then, two simple aquifer models were generated with specific boundary conditions, and the head potentials, velocity fields and electric potentials in the medium were computed. With the model's surface electric potential, the inverse problem is solved to retrieve the source of electric potential (the vector field associated with groundwater flow) using deterministic and stochastic approaches. The first approach implemented a Tikhonov regularization with a stabilized operator adapted to the finite element mesh, while for the second a hierarchical Bayesian model based on Markov chain Monte Carlo (McMC) and Markov Random Fields (MRF) was constructed. For all implemented methods, the direct and inverse models were compared in two ways: 1) the shape and distribution of the vector field, and 2) the histogram of magnitudes. Finally, it was concluded that inversion procedures improve when the behavior of the velocity field is considered; thus, the deterministic method is more suitable for unconfined aquifers than for confined ones. McMC has restricted applications and requires a lot of information (particularly in potential fields), while MRF responds remarkably well, especially when dealing with confined aquifers.

  5. Assessment of parametric uncertainty for groundwater reactive transport modeling

    USGS Publications Warehouse

    Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun

    2014-01-01

    The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(ZS)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.

  6. Finding viable models in SUSY parameter spaces with signal specific discovery potential

    NASA Astrophysics Data System (ADS)

    Burgess, Thomas; Lindroos, Jan Øye; Lipniacka, Anna; Sandaker, Heidi

    2013-08-01

    Recent results from ATLAS giving a Higgs mass of 125.5 GeV further constrain already highly constrained supersymmetric models such as pMSSM or CMSSM/mSUGRA. As a consequence, finding potentially discoverable and non-excluded regions of model parameter space is becoming increasingly difficult. Several groups have invested large effort in studying the consequences of Higgs mass bounds, upper limits on rare B-meson decays, and limits on relic dark matter density on constrained models, aiming at predicting superpartner masses, and establishing likelihood of SUSY models compared to that of the Standard Model vis-à-vis experimental data. In this paper a framework for efficient search for discoverable, non-excluded regions of different SUSY spaces giving a specific experimental signature of interest is presented. The method employs an improved Markov Chain Monte Carlo (MCMC) scheme exploiting an iteratively updated likelihood function to guide search for viable models. Existing experimental and theoretical bounds as well as the LHC discovery potential are taken into account. This includes recent bounds on relic dark matter density, the Higgs sector and rare B-meson decays. A clustering algorithm is applied to classify selected models according to expected phenomenology, enabling automated choice of experimental benchmarks and regions to be used for optimizing searches. The aim is to provide experimentalists with a viable tool helping to target experimental signatures to search for, once a class of models of interest is established. As an example a search for viable CMSSM models with τ-lepton signatures observable with the 2012 LHC data set is presented. In the search 105209 unique models were probed. From these, ten reference benchmark points covering different ranges of phenomenological observables at the LHC were selected.

  7. Collective Poisson process with periodic rates: applications in physics from micro-to nanodevices.

    PubMed

    da Silva, Roberto; Lamb, Luis C; Wirth, Gilson Inacio

    2011-01-28

    Continuous reductions in the dimensions of semiconductor devices have led to an increasing number of noise sources, including random telegraph signals (RTS) due to the capture and emission of electrons by traps at random positions between oxide and semiconductor. The models traditionally used for microscopic devices become of limited validity in nano- and mesoscale systems since, in such systems, distributed quantities such as electron and trap densities, and concepts like electron mobility, become inadequate to model electrical behaviour. In addition, current experimental works have shown that RTS in semiconductor devices based on carbon nanotubes lead to giant current fluctuations. Therefore, the physics of this phenomenon and techniques to decrease the amplitudes of RTS need to be better understood. This problem can be described as a collective Poisson process under different, but time-independent, rates, τ(c) and τ(e), that control the capture and emission of electrons by traps distributed over the oxide. Thus, models that consider calculations performed under time-dependent periodic capture and emission rates should be of interest in order to model more efficient devices. We show a complete theoretical description of a model that is capable of showing a noise reduction of current fluctuations in the time domain, and a reduction of the power spectral density in the frequency domain, in semiconductor devices as predicted by previous experimental work. We do so through numerical integrations and a novel Monte Carlo Markov chain (MCMC) algorithm based on microscopic discrete values. The proposed model also handles the ballistic regime, relevant in nano- and mesoscale devices. Finally, we show that the ballistic regime leads to nonlinearity in the electrical behaviour.

  8. Hierarchical fractional-step approximations and parallel kinetic Monte Carlo algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arampatzis, Giorgos; Katsoulakis, Markos A.; Plechac, Petr

    2012-10-01

    We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physiochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing exactly the stochastic trajectories, our approach relies on approximating the evolution of observables, such as density, coverage, correlations and so on. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decomposition corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphical Processing Units (GPUs). Based on this operator decomposition, we formulate parallel Fractional step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes, (a) are partially asynchronous on each fractional step time-window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communicating schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example, in Ising-type systems and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.
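
    The operator-splitting idea behind the fractional step schemes above can be written compactly. The block below is only a sketch for a two-term split of a bounded generator; the symbols L_1, L_2 and Δt are notation introduced here for illustration, and the abstract does not give the paper's hierarchical decomposition or its randomized variants.

```latex
\[
  \mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2, \qquad
  e^{t\mathcal{L}} = \lim_{n\to\infty}
  \left( e^{\frac{t}{n}\mathcal{L}_1}\, e^{\frac{t}{n}\mathcal{L}_2} \right)^{n},
\]
\[
  e^{\Delta t\,\mathcal{L}} = e^{\Delta t\,\mathcal{L}_1}\, e^{\Delta t\,\mathcal{L}_2} + O(\Delta t^{2}).
\]
```

    Read this way, each factor advances only one part of the lattice dynamics over a time window Δt, which is what allows the parts to be simulated on separate processors between communication steps.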

  9. An improved target velocity sampling algorithm for free gas elastic scattering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romano, Paul K.; Walsh, Jonathan A.

    We present an improved algorithm for sampling the target velocity when simulating elastic scattering in a Monte Carlo neutron transport code that correctly accounts for the energy dependence of the scattering cross section. The algorithm samples the relative velocity directly, thereby avoiding a potentially inefficient rejection step based on the ratio of cross sections. Here, we have shown that this algorithm requires only one rejection step, whereas other methods of similar accuracy require two rejection steps. The method was verified against stochastic and deterministic reference results for upscattering percentages in 238U. Simulations of a light water reactor pin cell problem demonstrate that using this algorithm results in a 3% or less penalty in performance when compared with an approximate method that is used in most production Monte Carlo codes.

  10. An improved target velocity sampling algorithm for free gas elastic scattering

    DOE PAGES

    Romano, Paul K.; Walsh, Jonathan A.

    2018-02-03

    We present an improved algorithm for sampling the target velocity when simulating elastic scattering in a Monte Carlo neutron transport code that correctly accounts for the energy dependence of the scattering cross section. The algorithm samples the relative velocity directly, thereby avoiding a potentially inefficient rejection step based on the ratio of cross sections. Here, we have shown that this algorithm requires only one rejection step, whereas other methods of similar accuracy require two rejection steps. The method was verified against stochastic and deterministic reference results for upscattering percentages in 238U. Simulations of a light water reactor pin cell problem demonstrate that using this algorithm results in a 3% or less penalty in performance when compared with an approximate method that is used in most production Monte Carlo codes.

  11. Simulation of Nuclear Reactor Kinetics by the Monte Carlo Method

    NASA Astrophysics Data System (ADS)

    Gomin, E. A.; Davidenko, V. D.; Zinchenko, A. S.; Kharchenko, I. K.

    2017-12-01

    The KIR computer code intended for calculations of nuclear reactor kinetics using the Monte Carlo method is described. The algorithm implemented in the code is described in detail. Some results of test calculations are given.

  12. Hydrogeophysical Assessment of Aquifer Uncertainty Using Simulated Annealing driven MRF-Based Stochastic Joint Inversion

    NASA Astrophysics Data System (ADS)

    Oware, E. K.

    2017-12-01

    Geophysical quantification of hydrogeological parameters typically involves limited noisy measurements coupled with inadequate understanding of the target phenomenon. Hence, a deterministic solution is unrealistic in light of the largely uncertain inputs. Stochastic imaging (SI), in contrast, provides multiple equiprobable realizations that enable probabilistic assessment of aquifer properties in a realistic manner. Generation of geologically realistic prior models is central to SI frameworks. Higher-order statistics for representing prior geological features in SI are, however, usually borrowed from training images (TIs), which may produce undesirable outcomes if the TIs are unrepresentative of the target structures. The Markov random field (MRF)-based SI strategy provides a data-driven alternative to TI-based SI algorithms. In the MRF-based method, the simulation of spatial features is guided by Gibbs energy (GE) minimization. Local configurations with smaller GEs have higher likelihood of occurrence and vice versa. The parameters of the Gibbs distribution for computing the GE are estimated from the hydrogeophysical data, thereby enabling the generation of site-specific structures in the absence of reliable TIs. In Metropolis-like SI methods, the variance of the transition probability controls the jump-size. The procedure is a standard Markov chain Monte Carlo (McMC) method when a constant variance is assumed, and becomes simulated annealing (SA) when the variance (cooling temperature) is allowed to decrease gradually with time. We observe that in certain problems, the large variance typically employed at the beginning to hasten burn-in may not be ideal for sampling at the equilibrium state. The power of SA stems from its flexibility to adaptively scale the variance at different stages of the sampling. Degeneration of results was reported in a previous implementation of the MRF-based SI strategy based on a constant variance. Here, we present an updated version of the algorithm based on SA that appears to resolve the degeneration problem with seemingly improved results. We illustrate the performance of the SA version with a joint inversion of time-lapse concentration and electrical resistivity measurements in a hypothetical trinary hydrofacies aquifer characterization problem.
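
    The distinction drawn above between constant-variance McMC and simulated annealing can be illustrated with a generic random-walk Metropolis loop. This is a minimal sketch and not the authors' MRF-based algorithm: the 1-D `double_well` energy, the schedules, and the choice to scale the jump size with the square root of the temperature are all assumptions made here for illustration.

```python
import math
import random


def metropolis(energy, x0, n_steps, temperature, step_scale=1.0, seed=0):
    """Random-walk Metropolis on a 1-D state.

    `temperature` maps the step index to T > 0. A constant T gives an
    ordinary MCMC chain targeting exp(-energy(x) / T); a decreasing T
    turns the same loop into simulated annealing.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    chain = []
    for i in range(n_steps):
        t = temperature(i)
        # Assumption: the jump size shrinks together with the temperature.
        x_new = x + rng.gauss(0.0, step_scale * math.sqrt(t))
        e_new = energy(x_new)
        # Metropolis rule: always accept downhill moves, accept uphill
        # moves with probability exp(-(e_new - e) / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        chain.append(x)
    return chain, best_x


def double_well(x):
    """Hypothetical toy energy with two wells, the deeper one near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def constant(i):
    return 1.0  # fixed temperature: standard McMC


def cooling(i):
    return max(1e-3, 0.999 ** i)  # geometric cooling: simulated annealing


mcmc_chain, _ = metropolis(double_well, 2.0, 5000, constant)
sa_chain, sa_best = metropolis(double_well, 2.0, 5000, cooling)
```

    With the constant schedule the chain keeps exploring both wells, whereas the cooling schedule progressively freezes it into a low-energy state; only the `temperature` argument differs between the two runs.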

  13. Dose specification for radiation therapy: dose to water or dose to medium?

    NASA Astrophysics Data System (ADS)

    Ma, C.-M.; Li, Jinsheng

    2011-05-01

    The Monte Carlo method enables accurate dose calculation for radiation therapy treatment planning and has been implemented in some commercial treatment planning systems. Unlike conventional dose calculation algorithms that provide patient dose information in terms of dose to water with variable electron density, the Monte Carlo method calculates the energy deposition in different media and expresses dose to a medium. This paper discusses the differences in dose calculated using water with different electron densities and that calculated for different biological media and the clinical issues on dose specification including dose prescription and plan evaluation using dose to water and dose to medium. We will demonstrate that conventional photon dose calculation algorithms compute doses similar to those simulated by Monte Carlo using water with different electron densities, which are close (<4% differences) to doses to media but significantly different (up to 11%) from doses to water converted from doses to media following American Association of Physicists in Medicine (AAPM) Task Group 105 recommendations. Our results suggest that for consistency with previous radiation therapy experience Monte Carlo photon algorithms report dose to medium for radiotherapy dose prescription, treatment plan evaluation and treatment outcome analysis.

  14. High-efficiency wavefunction updates for large scale Quantum Monte Carlo

    NASA Astrophysics Data System (ADS)

    Kent, Paul; McDaniel, Tyler; Li, Ying Wai; D'Azevedo, Ed

    Within ab initio Quantum Monte Carlo (QMC) simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunctions. The evaluation of each Monte Carlo move requires finding the determinant of a dense matrix, which is traditionally iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. For calculations with thousands of electrons, this operation dominates the execution profile. We propose a novel rank-k delayed update scheme. This strategy enables probability evaluation for multiple successive Monte Carlo moves, with application of accepted moves to the matrices delayed until after a predetermined number of moves, k. Accepted events grouped in this manner are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency. This procedure does not change the underlying Monte Carlo sampling or the sampling efficiency. For large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude speedups can be obtained on both multi-core CPU and on GPUs, making this algorithm highly advantageous for current petascale and future exascale computations.
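
    The traditional rank-1 scheme mentioned above can be sketched in a few lines. The code below is an illustration of the textbook Sherman-Morrison row update and the matrix determinant lemma, not the authors' rank-k delayed scheme; the function name `replace_row` and the 3x3 example are hypothetical, and plain Python lists are used only to keep the sketch self-contained.

```python
def replace_row(a_inv, k, new_row):
    """Update A^{-1} after row k of A is replaced by `new_row`.

    Returns (ratio, new_inverse), where ratio = det(A_new) / det(A_old).
    In a QMC move the ratio alone decides acceptance; the O(n^2) inverse
    update is only needed once the move is accepted.
    """
    n = len(a_inv)
    # Matrix determinant lemma: ratio = new_row . (column k of A^{-1}).
    ratio = sum(new_row[j] * a_inv[j][k] for j in range(n))
    # w = new_row @ A^{-1} - e_k, using (old row k) @ A^{-1} = e_k.
    w = [sum(new_row[i] * a_inv[i][j] for i in range(n)) for j in range(n)]
    w[k] -= 1.0
    col_k = [a_inv[i][k] for i in range(n)]
    # Sherman-Morrison: A_new^{-1} = A^{-1} - outer(col_k, w) / ratio.
    new_inv = [[a_inv[i][j] - col_k[i] * w[j] / ratio for j in range(n)]
               for i in range(n)]
    return ratio, new_inv


# Start from the identity (its own inverse, determinant 1) and replace
# one row at a time, accumulating the determinant from the ratios.
n = 3
a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
a_inv = [row[:] for row in a]
det = 1.0
target_rows = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
for k, row in enumerate(target_rows):
    ratio, a_inv = replace_row(a_inv, k, row)
    a[k] = row
    det *= ratio
```

    Each call touches every entry of the inverse once per accepted move, which is the memory-bound pattern that a delayed, blocked update is designed to avoid.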

  15. Quantifying the effect of air gap, depth, and range shifter thickness on TPS dosimetric accuracy in superficial PBS proton therapy.

    PubMed

    Shirey, Robert J; Wu, Hsinshun Terry

    2018-01-01

    This study quantifies the dosimetric accuracy of a commercial treatment planning system as functions of treatment depth, air gap, and range shifter thickness for superficial pencil beam scanning proton therapy treatments. The RayStation 6 pencil beam and Monte Carlo dose engines were each used to calculate the dose distributions for a single treatment plan with varying range shifter air gaps. Central axis dose values extracted from each of the calculated plans were compared to dose values measured with a calibrated PTW Markus chamber at various depths in RW3 solid water. Dose was measured at 12 depths, ranging from the surface to 5 cm, for each of the 18 different air gaps, which ranged from 0.5 to 28 cm. TPS dosimetric accuracy, defined as the ratio of calculated dose relative to the measured dose, was plotted as functions of depth and air gap for the pencil beam and Monte Carlo dose algorithms. The accuracy of the TPS pencil beam dose algorithm was found to be clinically unacceptable at depths shallower than 3 cm with air gaps wider than 10 cm, and increased range shifter thickness only added to the dosimetric inaccuracy of the pencil beam algorithm. Each configuration calculated with Monte Carlo was determined to be clinically acceptable. Further comparisons of the Monte Carlo dose algorithm to the measured spread-out Bragg Peaks of multiple fields used during machine commissioning verified the dosimetric accuracy of Monte Carlo in a variety of beam energies and field sizes. Discrepancies between measured and TPS calculated dose values can mainly be attributed to the ability (or lack thereof) of the TPS pencil beam dose algorithm to properly model secondary proton scatter generated in the range shifter. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.

  16. Hybrid Microgrid Configuration Optimization with Evolutionary Algorithms

    NASA Astrophysics Data System (ADS)

    Lopez, Nicolas

    This dissertation explores the Renewable Energy Integration Problem, and proposes a Genetic Algorithm embedded with a Monte Carlo simulation to solve large instances of the problem that are impractical to solve via full enumeration. The Renewable Energy Integration Problem is defined as finding the optimum set of components to supply the electric demand to a hybrid microgrid. The components considered are solar panels, wind turbines, diesel generators, electric batteries, connections to the power grid and converters, which can be inverters and/or rectifiers. The methodology developed is explained as well as the combinatorial formulation. In addition, two case studies of a single objective optimization version of the problem are presented, in order to minimize cost and to minimize global warming potential (GWP), followed by a multi-objective implementation of the offered methodology, by utilizing a non-sorting Genetic Algorithm embedded with a Monte Carlo simulation. The method is validated by solving a small instance of the problem with known solution via a full enumeration algorithm developed by NREL in their software HOMER. The dissertation concludes that the evolutionary algorithms embedded with Monte Carlo simulation, namely modified Genetic Algorithms, are an efficient form of solving the problem, by finding approximate solutions in the case of single objective optimization, and by approximating the true Pareto front in the case of multiple objective optimization of the Renewable Energy Integration Problem.

  17. SAChES: Scalable Adaptive Chain-Ensemble Sampling.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swiler, Laura Painton; Ray, Jaideep; Ebeida, Mohamed Salah

    We present the development of a parallel Markov Chain Monte Carlo (MCMC) method called SAChES, Scalable Adaptive Chain-Ensemble Sampling. This capability is targeted to Bayesian calibration of computationally expensive simulation models. SAChES involves a hybrid of two methods: Differential Evolution Monte Carlo followed by Adaptive Metropolis. Both methods involve parallel chains. Differential evolution allows one to explore high-dimensional parameter spaces using loosely coupled (i.e., largely asynchronous) chains. Loose coupling allows the use of large chain ensembles, with far more chains than the number of parameters to explore. This reduces per-chain sampling burden, enables high-dimensional inversions and the use of computationally expensive forward models. The large number of chains can also ameliorate the impact of silent errors, which may affect only a few chains. The chain ensemble can also be sampled to provide an initial condition when an aberrant chain is re-spawned. Adaptive Metropolis takes the best points from the differential evolution and efficiently hones in on the posterior density. The multitude of chains in SAChES is leveraged to (1) enable efficient exploration of the parameter space; and (2) ensure robustness to silent errors which may be unavoidable in extreme-scale computational platforms of the future. This report outlines SAChES, describes four papers that are the result of the project, and discusses some additional results.
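
    The differential evolution stage described above relies on chains proposing moves from the differences between other chains. The sketch below shows a generic, serial Differential Evolution Monte Carlo update of that kind; it is not the SAChES implementation, and the function name `de_mc`, the jitter size, the scaling 2.38/sqrt(2d) and the standard-normal target are assumptions chosen for illustration.

```python
import math
import random


def de_mc(log_density, chains, n_generations, seed=0, jitter=1e-4):
    """Serial sketch of a Differential Evolution Monte Carlo sampler.

    Each chain proposes x + gamma * (x_r1 - x_r2) + small noise, where r1
    and r2 are two other chains picked at random, and accepts the proposal
    with the usual Metropolis ratio.
    """
    rng = random.Random(seed)
    dim = len(chains[0])
    gamma = 2.38 / math.sqrt(2 * dim)  # commonly quoted default scaling
    chains = [list(c) for c in chains]
    logp = [log_density(c) for c in chains]
    samples = []
    for _ in range(n_generations):
        for i in range(len(chains)):
            others = [j for j in range(len(chains)) if j != i]
            r1, r2 = rng.sample(others, 2)
            proposal = [
                chains[i][d]
                + gamma * (chains[r1][d] - chains[r2][d])
                + rng.gauss(0.0, jitter)
                for d in range(dim)
            ]
            lp = log_density(proposal)
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < lp - logp[i]:
                chains[i], logp[i] = proposal, lp
            samples.append(list(chains[i]))
    return samples


def log_std_normal(x):
    """Hypothetical target: unnormalized log density of a standard normal."""
    return -0.5 * sum(v * v for v in x)


init_rng = random.Random(1)
start = [[init_rng.gauss(0.0, 2.0) for _ in range(2)] for _ in range(8)]
samples = de_mc(log_std_normal, start, 500)
```

    Because each proposal only reads the current positions of two other chains, the inner loop can in principle be distributed across workers, which is the property the abstract exploits with large, loosely coupled ensembles.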

  18. SU-F-T-148: Are the Approximations in Analytic Semi-Empirical Dose Calculation Algorithms for Intensity Modulated Proton Therapy for Complex Heterogeneities of Head and Neck Clinically Significant?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yepes, P; UT MD Anderson Cancer Center, Houston, TX; Titt, U

    2016-06-15

    Purpose: Evaluate the differences in dose distributions between the proton analytic semi-empirical dose calculation algorithm used in the clinic and Monte Carlo calculations for a sample of 50 head-and-neck (H&N) patients and estimate the potential clinical significance of the differences. Methods: A cohort of 50 H&N patients, treated at the University of Texas Cancer Center with Intensity Modulated Proton Therapy (IMPT), were selected for evaluation of clinical significance of approximations in computed dose distributions. H&N site was selected because of the highly inhomogeneous nature of the anatomy. The Fast Dose Calculator (FDC), a fast track-repeating accelerated Monte Carlo algorithm for proton therapy, was utilized for the calculation of dose distributions delivered during treatment plans. Because of its short processing time, FDC allows for the processing of large cohorts of patients. FDC has been validated versus GEANT4, a full Monte Carlo system, and measurements in water and for inhomogeneous phantoms. A gamma-index analysis, DVHs, EUDs, and TCP and NTCPs computed using published models were utilized to evaluate the differences between the Treatment Plan System (TPS) and FDC. Results: The Monte Carlo results systematically predict lower dose delivered in the target. The observed differences can be as large as 8 Gy, and should have a clinical impact. Gamma analysis also showed significant differences between both approaches, especially for the target volumes. Conclusion: Monte Carlo calculations with fast algorithms are practical and should be considered for the clinic, at least as a treatment plan verification tool.

  19. The Metropolis Monte Carlo method with CUDA enabled Graphic Processing Units

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hall, Clifford; School of Physics, Astronomy, and Computational Sciences, George Mason University, 4400 University Dr., Fairfax, VA 22030; Ji, Weixiao

    2014-02-01

    We present a CPU–GPU system for runtime acceleration of large molecular simulations using GPU computation and memory swaps. The memory architecture of the GPU can be used both as container for simulation data stored on the graphics card and as floating-point code target, providing an effective means for the manipulation of atomistic or molecular data on the GPU. To fully take advantage of this mechanism, efficient GPU realizations of algorithms used to perform atomistic and molecular simulations are essential. Our system implements a versatile molecular engine, including inter-molecule interactions and orientational variables for performing the Metropolis Monte Carlo (MMC) algorithm, which is one type of Markov chain Monte Carlo. By combining memory objects with floating-point code fragments we have implemented an MMC parallel engine that entirely avoids the communication time of molecular data at runtime. Our runtime acceleration system is a forerunner of a new class of CPU–GPU algorithms exploiting memory concepts combined with threading for avoiding bus bandwidth and communication. The testbed molecular system used here is a condensed phase system of oligopyrrole chains. A benchmark shows a size scaling speedup of 60 for systems with 210,000 pyrrole monomers. Our implementation can easily be combined with MPI to connect in parallel several CPU–GPU duets. -- Highlights: •We parallelize the Metropolis Monte Carlo (MMC) algorithm on one CPU–GPU duet. •The Adaptive Tempering Monte Carlo employs MMC and profits from this CPU–GPU implementation. •Our benchmark shows a size scaling-up speedup of 62 for systems with 225,000 particles. •The testbed involves a polymeric system of oligopyrroles in the condensed phase. •The CPU–GPU parallelization includes dipole-dipole and Mie-Jones classic potentials.
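
    For readers unfamiliar with the Metropolis Monte Carlo step that the engine above parallelizes, a minimal single-particle CPU sketch follows. It is an illustration under stated assumptions, not the authors' GPU code: the harmonic potential U(x) = x^2/2, the trial-move width, and the function name `metropolis_harmonic` are all hypothetical.

```python
import math
import random


def metropolis_harmonic(n_steps=50000, beta=1.0, step=2.0, seed=1):
    """Sample x from exp(-beta * U(x)) with U(x) = x**2 / 2; return the mean of U."""
    rng = random.Random(seed)
    x = 0.0
    energy = 0.5 * x * x
    total = 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # symmetric trial move
        e_new = 0.5 * x_new * x_new
        delta = e_new - energy
        # Accept downhill moves always, uphill moves with probability exp(-beta * delta).
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            x, energy = x_new, e_new
        total += energy  # a rejected move counts the current state again
    return total / n_steps
```

    For this potential the exact thermal average of U is 1/(2 beta), which gives a simple check on the sampler.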

  20. A point kernel algorithm for microbeam radiation therapy

    NASA Astrophysics Data System (ADS)

    Debus, Charlotte; Oelfke, Uwe; Bartzsch, Stefan

    2017-11-01

    Microbeam radiation therapy (MRT) is a treatment approach in radiation therapy where the treatment field is spatially fractionated into arrays of a few tens of micrometre wide planar beams of unusually high peak doses separated by low dose regions of several hundred micrometre width. In preclinical studies, this treatment approach has proven to spare normal tissue more effectively than conventional radiation therapy, while being equally efficient in tumour control. So far, dose calculations in MRT, a prerequisite for future clinical applications, are based on Monte Carlo simulations. However, they are computationally expensive, since scoring volumes have to be small. In this article a kernel based dose calculation algorithm is presented that splits the calculation into photon and electron mediated energy transport, and performs the calculation of peak and valley doses in typical MRT treatment fields within a few minutes. Kernels are analytically calculated depending on the energy spectrum and material composition. In various homogeneous materials, peak and valley doses and microbeam profiles are calculated and compared to Monte Carlo simulations. For a microbeam exposure of an anthropomorphic head phantom, calculated dose values are compared to measurements and Monte Carlo calculations. Except for regions close to material interfaces, calculated peak dose values match Monte Carlo results within 4% and valley dose values within 8% deviation. No significant differences are observed between profiles calculated by the kernel algorithm and Monte Carlo simulations. Measurements in the head phantom agree within 4% in the peak and within 10% in the valley region. The presented algorithm is attached to the treatment planning platform VIRTUOS. It was and is used for dose calculations in preclinical and pet-clinical trials at the biomedical beamline ID17 of the European Synchrotron Radiation Facility in Grenoble, France.

  1. Monte Carlo Simulations of Radiative and Neutrino Transport under Astrophysical Conditions

    NASA Astrophysics Data System (ADS)

    Krivosheyev, Yu. M.; Bisnovatyi-Kogan, G. S.

    2018-05-01

    Monte Carlo simulations are utilized to model radiative and neutrino transfer in astrophysics. An algorithm that can be used to study radiative transport in astrophysical plasma based on simulations of photon trajectories in a medium is described. Formation of the hard X-ray spectrum of the Galactic microquasar SS 433 is considered in detail as an example. Specific requirements for applying such simulations to neutrino transport in a dense medium and algorithmic differences compared to its application to photon transport are discussed.

  2. Monte Carlo sampling of Wigner functions and surface hopping quantum dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kube, Susanna; Lasser, Caroline; Weber, Marcus

    2009-04-01

    The article addresses the achievable accuracy for a Monte Carlo sampling of Wigner functions in combination with a surface hopping algorithm for non-adiabatic quantum dynamics. The approximation of Wigner functions is realized by an adaption of the Metropolis algorithm for real-valued functions with disconnected support. The integration, which is necessary for computing values of the Wigner function, uses importance sampling with a Gaussian weight function. The numerical experiments agree with theoretical considerations and show an error of 2-3%.
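
    The importance-sampling integration with a Gaussian weight function mentioned above can be sketched generically. The example below does not compute a Wigner function; the integrand exp(-x^2) cos(x), the proposal width `sigma`, and the function names are assumptions chosen because the exact integral, sqrt(pi) exp(-1/4), is known.

```python
import math
import random


def importance_sample_integral(f, n=100000, sigma=1.0, seed=2):
    """Estimate the integral of f over the real line with a Gaussian proposal."""
    rng = random.Random(seed)
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        q = math.exp(-0.5 * (x / sigma) ** 2) / norm  # proposal density at x
        total += f(x) / q  # importance weight times integrand
    return total / n


def integrand(x):
    # Hypothetical test integrand with a known closed-form integral.
    return math.exp(-x * x) * math.cos(x)
```

    The estimator stays well behaved here because the Gaussian proposal has heavier tails than the integrand, so the ratio f/q is bounded.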

  3. An efficient Monte Carlo-based algorithm for scatter correction in keV cone-beam CT

    NASA Astrophysics Data System (ADS)

    Poludniowski, G.; Evans, P. M.; Hansen, V. N.; Webb, S.

    2009-06-01

    A new method is proposed for scatter-correction of cone-beam CT images. A coarse reconstruction is used in initial iteration steps. Modelling of the x-ray tube spectra and detector response are included in the algorithm. Photon diffusion inside the imaging subject is calculated using the Monte Carlo method. Photon scoring at the detector is calculated using forced detection to a fixed set of node points. The scatter profiles are then obtained by linear interpolation. The algorithm is referred to as the coarse reconstruction and fixed detection (CRFD) technique. Scatter predictions are quantitatively validated against a widely used general-purpose Monte Carlo code: BEAMnrc/EGSnrc (NRCC, Canada). Agreement is excellent. The CRFD algorithm was applied to projection data acquired with a Synergy XVI CBCT unit (Elekta Limited, Crawley, UK), using RANDO and Catphan phantoms (The Phantom Laboratory, Salem NY, USA). The algorithm was shown to be effective in removing scatter-induced artefacts from CBCT images, and took as little as 2 min on a desktop PC. Image uniformity was greatly improved as was CT-number accuracy in reconstructions. This latter improvement was less marked where the expected CT-number of a material was very different to the background material in which it was embedded.

  4. Constant-pressure nested sampling with atomistic dynamics

    NASA Astrophysics Data System (ADS)

    Baldock, Robert J. N.; Bernstein, Noam; Salerno, K. Michael; Pártay, Lívia B.; Csányi, Gábor

    2017-10-01

    The nested sampling algorithm has been shown to be a general method for calculating the pressure-temperature-composition phase diagrams of materials. While the previous implementation used single-particle Monte Carlo moves, these are inefficient for condensed systems with general interactions where single-particle moves cannot be evaluated faster than the energy of the whole system. Here we enhance the method by using all-particle moves: either Galilean Monte Carlo or the total enthalpy Hamiltonian Monte Carlo algorithm, introduced in this paper. We show that these algorithms enable the determination of phase transition temperatures with equivalent accuracy to the previous method at 1/N of the cost for an N-particle system with general interactions, or at equal cost when single-particle moves can be done in 1/N of the cost of a full N-particle energy evaluation. We demonstrate this speed-up for the freezing and condensation transitions of the Lennard-Jones system and show the utility of the algorithms by calculating the order-disorder phase transition of a binary Lennard-Jones model alloy, the eutectic of copper-gold, the density anomaly of water, and the condensation and solidification of bead-spring polymers. The nested sampling method with all three algorithms is implemented in the pymatnest software.

  5. A new moving strategy for the sequential Monte Carlo approach in optimizing the hydrological model parameters

    NASA Astrophysics Data System (ADS)

    Zhu, Gaofeng; Li, Xin; Ma, Jinzhu; Wang, Yunquan; Liu, Shaomin; Huang, Chunlin; Zhang, Kun; Hu, Xiaoli

    2018-04-01

    Sequential Monte Carlo (SMC) samplers have become increasingly popular for estimating the posterior parameter distribution with the non-linear dependency structures and multiple modes often present in hydrological models. However, the explorative capabilities and efficiency of the sampler depend strongly on the efficiency of the move step of the SMC sampler. In this paper we present a new SMC sampler entitled the Particle Evolution Metropolis Sequential Monte Carlo (PEM-SMC) algorithm, which is well suited to handling unknown static parameters of hydrologic models. The PEM-SMC sampler is inspired by the works of Liang and Wong (2001) and operates by incorporating the strengths of the genetic algorithm, differential evolution algorithm and Metropolis-Hastings algorithm into the framework of SMC. We also prove that the sampler admits the target distribution as a stationary distribution. Two case studies, a multi-dimensional bimodal normal distribution and a conceptual rainfall-runoff hydrologic model (first considering only parameter uncertainty, then considering parameter and input uncertainty simultaneously), show that the PEM-SMC sampler is generally superior to other popular SMC algorithms in handling high dimensional problems. The study also indicates that it may be important to account for model structural uncertainty by using multiple different hydrological models in the SMC framework in future studies.

  6. Hypothesis testing of scientific Monte Carlo calculations.

    PubMed

    Wallerberger, Markus; Gull, Emanuel

    2017-11-01

    The steadily increasing size of scientific Monte Carlo simulations and the desire for robust, correct, and reproducible results necessitates rigorous testing procedures for scientific simulations in order to detect numerical problems and programming bugs. However, the testing paradigms developed for deterministic algorithms have proven to be ill suited for stochastic algorithms. In this paper we demonstrate explicitly how the technique of statistical hypothesis testing, which is in wide use in other fields of science, can be used to devise automatic and reliable tests for Monte Carlo methods, and we show that these tests are able to detect some of the common problems encountered in stochastic scientific simulations. We argue that hypothesis testing should become part of the standard testing toolkit for scientific simulations.
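
    The idea of testing a Monte Carlo code by statistical hypothesis testing can be sketched with a simple two-sided z-test against a known exact value. This is a generic illustration, not the procedure of the paper: it assumes independent samples, and the toy simulation (estimating E[U^2] = 1/3 for uniform U), the deliberate `scale` bug, and the function names are hypothetical.

```python
import math
import random


def z_test(samples, expected):
    """Two-sided z-test of the sample mean against a known expected value."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)  # unbiased sample variance
    z = (mean - expected) / math.sqrt(var / n)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return z, p_value


def simulate(n, scale, seed):
    # Toy "simulation": scale=1.0 is correct; any other value mimics a programming bug.
    rng = random.Random(seed)
    return [(scale * rng.random()) ** 2 for _ in range(n)]
```

    A test harness would reject the code when the p-value falls below a chosen significance level; correlated samples, as produced by Markov chain methods, would first need binning or an autocorrelation correction before this test applies.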

  7. Hypothesis testing of scientific Monte Carlo calculations

    NASA Astrophysics Data System (ADS)

    Wallerberger, Markus; Gull, Emanuel

    2017-11-01

    The steadily increasing size of scientific Monte Carlo simulations and the desire for robust, correct, and reproducible results necessitates rigorous testing procedures for scientific simulations in order to detect numerical problems and programming bugs. However, the testing paradigms developed for deterministic algorithms have proven to be ill suited for stochastic algorithms. In this paper we demonstrate explicitly how the technique of statistical hypothesis testing, which is in wide use in other fields of science, can be used to devise automatic and reliable tests for Monte Carlo methods, and we show that these tests are able to detect some of the common problems encountered in stochastic scientific simulations. We argue that hypothesis testing should become part of the standard testing toolkit for scientific simulations.

  8. Online sequential Monte Carlo smoother for partially observed diffusion processes

    NASA Astrophysics Data System (ADS)

    Gloaguen, Pierre; Étienne, Marie-Pierre; Le Corff, Sylvain

    2018-12-01

    This paper introduces a new algorithm to approximate smoothed additive functionals of partially observed diffusion processes. This method relies on a new sequential Monte Carlo method which allows such approximations to be computed online, i.e., as the observations are received, and with a computational complexity growing linearly with the number of Monte Carlo samples. The original algorithm cannot be used in the case of partially observed stochastic differential equations since the transition density of the latent data is usually unknown. We prove that it may be extended to partially observed continuous processes by replacing this unknown quantity by an unbiased estimator obtained for instance using general Poisson estimators. This estimator is proved to be consistent and its performance is illustrated using data from two models.

  9. A surrogate accelerated multicanonical Monte Carlo method for uncertainty quantification

    NASA Astrophysics Data System (ADS)

    Wu, Keyi; Li, Jinglai

    2016-09-01

    In this work we consider a class of uncertainty quantification problems where the system performance or reliability is characterized by a scalar parameter y. The performance parameter y is random due to the presence of various sources of uncertainty in the system, and our goal is to estimate the probability density function (PDF) of y. We propose to use the multicanonical Monte Carlo (MMC) method, a special type of adaptive importance sampling algorithm, to compute the PDF of interest. Moreover, we develop an adaptive algorithm to construct local Gaussian process surrogates to further accelerate the MMC iterations. With numerical examples we demonstrate that the proposed method can achieve several orders of magnitude of speedup over the standard Monte Carlo methods.

  10. Path integral Monte Carlo ground state approach: formalism, implementation, and applications

    NASA Astrophysics Data System (ADS)

    Yan, Yangqian; Blume, D.

    2017-11-01

    Monte Carlo techniques have played an important role in understanding strongly correlated systems across many areas of physics, covering a wide range of energy and length scales. Among the many Monte Carlo methods applicable to quantum mechanical systems, the path integral Monte Carlo approach with its variants has been employed widely. Since semi-classical or classical approaches will not be discussed in this review, path integral based approaches can for our purposes be divided into two categories: approaches applicable to quantum mechanical systems at zero temperature and approaches applicable to quantum mechanical systems at finite temperature. While these two approaches are related to each other, the underlying formulation and aspects of the algorithm differ. This paper reviews the path integral Monte Carlo ground state (PIGS) approach, which solves the time-independent Schrödinger equation. Specifically, the PIGS approach allows for the determination of expectation values with respect to eigenstates of the few- or many-body Schrödinger equation provided the system Hamiltonian is known. The theoretical framework behind the PIGS algorithm, implementation details, and sample applications for fermionic systems are presented.

  11. Multigroup Monte Carlo on GPUs: Comparison of history- and event-based algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamilton, Steven P.; Slattery, Stuart R.; Evans, Thomas M.

    This article presents an investigation of the performance of different multigroup Monte Carlo transport algorithms on GPUs with a discussion of both history-based and event-based approaches. Several algorithmic improvements are introduced for both approaches. By modifying the history-based algorithm that is traditionally favored in CPU-based MC codes to occasionally filter out dead particles to reduce thread divergence, performance exceeds that of either the pure history-based or event-based approaches. The impacts of several algorithmic choices are discussed, including performance studies on Kepler and Pascal generation NVIDIA GPUs for fixed source and eigenvalue calculations. Single-device performance equivalent to 20–40 CPU cores on the K40 GPU and 60–80 CPU cores on the P100 GPU is achieved. In addition, nearly perfect multi-device parallel weak scaling is demonstrated on more than 16,000 nodes of the Titan supercomputer.

  12. Multigroup Monte Carlo on GPUs: Comparison of history- and event-based algorithms

    DOE PAGES

    Hamilton, Steven P.; Slattery, Stuart R.; Evans, Thomas M.

    2017-12-22

    This article presents an investigation of the performance of different multigroup Monte Carlo transport algorithms on GPUs with a discussion of both history-based and event-based approaches. Several algorithmic improvements are introduced for both approaches. By modifying the history-based algorithm that is traditionally favored in CPU-based MC codes to occasionally filter out dead particles to reduce thread divergence, performance exceeds that of either the pure history-based or event-based approaches. The impacts of several algorithmic choices are discussed, including performance studies on Kepler and Pascal generation NVIDIA GPUs for fixed source and eigenvalue calculations. Single-device performance equivalent to 20–40 CPU cores on the K40 GPU and 60–80 CPU cores on the P100 GPU is achieved. In addition, nearly perfect multi-device parallel weak scaling is demonstrated on more than 16,000 nodes of the Titan supercomputer.

  13. Comments on "Including the effects of temperature-dependent opacities in the implicit Monte Carlo algorithm" by N.A. Gentile [J. Comput. Phys. 230 (2011) 5100-5114]

    NASA Astrophysics Data System (ADS)

    Ghosh, Karabi

    2017-02-01

    We briefly comment on a paper by N.A. Gentile [J. Comput. Phys. 230 (2011) 5100-5114] in which the Fleck factor has been modified to include the effects of temperature-dependent opacities in the implicit Monte Carlo algorithm developed by Fleck and Cummings [1,2]. Instead of the Fleck factor, $f = 1/(1 + \beta c \Delta t \sigma_P)$, the author derived the modified Fleck factor $g = 1/(1 + \beta c \Delta t \sigma_P - \min[\sigma_P' (a T_r^4 - a T^4) c \Delta t/(\rho C_V), 0])$ to be used in the Implicit Monte Carlo (IMC) algorithm in order to obtain more accurate solutions with much larger time steps. Here $\beta = 4 a T^3/(\rho C_V)$, $\sigma_P$ is the Planck opacity, and the derivative of the Planck opacity with respect to the material temperature is $\sigma_P' = d\sigma_P/dT$.
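
    The two factors quoted above are straightforward to evaluate side by side. The sketch below transcribes the formulas from the abstract only; the function name `fleck_factors`, its argument names, and the dimensionless defaults a = c = 1 are assumptions for illustration and do not come from the paper.

```python
def fleck_factors(T, T_r, sigma_p, dsigma_p_dT, rho, c_v, dt, a=1.0, c=1.0):
    """Return (f, g): the standard and the modified Fleck factor.

    T is the material temperature, T_r the radiation temperature,
    sigma_p the Planck opacity and dsigma_p_dT its temperature derivative.
    """
    beta = 4.0 * a * T**3 / (rho * c_v)
    base = 1.0 + beta * c * dt * sigma_p
    # The extra term only contributes when it is negative (the min[..., 0] clamp).
    correction = min(dsigma_p_dT * (a * T_r**4 - a * T**4) * c * dt / (rho * c_v), 0.0)
    f = 1.0 / base
    g = 1.0 / (base - correction)
    return f, g
```

    Because the clamped term is never positive, g is always less than or equal to f, and the two coincide when the opacity does not depend on temperature.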

  14. Parallel Algorithms for Monte Carlo Particle Transport Simulation on Exascale Computing Architectures

    NASA Astrophysics Data System (ADS)

    Romano, Paul Kollath

    Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency. The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, and implemented/tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters. (Copies available exclusively from MIT Libraries, libraries.mit.edu/docs - docs mit.edu)

  15. Exploring nonlinear feature space dimension reduction and data representation in breast CADx with Laplacian eigenmaps and t-SNE

    PubMed Central

    Jamieson, Andrew R.; Giger, Maryellen L.; Drukker, Karen; Li, Hui; Yuan, Yading; Bhooshan, Neha

    2010-01-01

    Purpose: In this preliminary study, recently developed unsupervised nonlinear dimension reduction (DR) and data representation techniques were applied to computer-extracted breast lesion feature spaces across three separate imaging modalities: Ultrasound (U.S.) with 1126 cases, dynamic contrast enhanced magnetic resonance imaging with 356 cases, and full-field digital mammography with 245 cases. Two methods for nonlinear DR were explored: Laplacian eigenmaps [M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Comput. 15, 1373–1396 (2003)] and t-distributed stochastic neighbor embedding (t-SNE) [L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” J. Mach. Learn. Res. 9, 2579–2605 (2008)]. Methods: These methods attempt to map originally high dimensional feature spaces to more human interpretable lower dimensional spaces while preserving both local and global information. The properties of these methods as applied to breast computer-aided diagnosis (CADx) were evaluated in the context of malignancy classification performance as well as in the visual inspection of the sparseness within the two-dimensional and three-dimensional mappings. Classification performance was estimated by using the reduced dimension mapped feature output as input into both linear and nonlinear classifiers: Markov chain Monte Carlo based Bayesian artificial neural network (MCMC-BANN) and linear discriminant analysis. The new techniques were compared to previously developed breast CADx methodologies, including automatic relevance determination and linear stepwise (LSW) feature selection, as well as a linear DR method based on principal component analysis. Using ROC analysis and 0.632+ bootstrap validation, 95% empirical confidence intervals were computed for each classifier’s AUC performance. Results: In the large U.S. data set, sample high performance results include AUC0.632+ = 0.88 with 95% empirical bootstrap interval [0.787;0.895] for 13 ARD selected features and AUC0.632+ = 0.87 with interval [0.817;0.906] for four LSW selected features, compared to 4D t-SNE mapping (from the original 81D feature space) giving AUC0.632+ = 0.90 with interval [0.847;0.919], all using the MCMC-BANN. Conclusions: Preliminary results appear to indicate capability for the new methods to match or exceed classification performance of current advanced breast lesion CADx algorithms. While not appropriate as a complete replacement of feature selection in CADx problems, DR techniques offer a complementary approach, which can aid elucidation of additional properties associated with the data. Specifically, the new techniques were shown to possess the added benefit of delivering sparse lower dimensional representations for visual interpretation, revealing intricate data structure of the feature space. PMID:20175497

  16. A Technology Solution Strengthens Comprehensive Environmental Management

    DTIC Science & Technology

    2012-05-23

    General Navigation  Chemical Approval Example  NEPA Coordination Example  Safety PPE Example  Summary Marine Corps Support Facility...coordination, completion and documentation through automated workflows of various business processes  Chemical Approval  NEPA Coordination  Safety ...Completion Diagram Government Employee/M CMC MCMC Chemical Manager MCMC HS&E Specialist IMO Chemical Safety Specialist IMO Chemical Environmental

  17. Using SAS PROC MCMC for Item Response Theory Models

    ERIC Educational Resources Information Center

    Ames, Allison J.; Samonte, Kelli

    2015-01-01

    Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian…

  18. Using SAS PROC MCMC for Item Response Theory Models

    PubMed Central

    Samonte, Kelli

    2014-01-01

    Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian methods in the context of item response theory to serve as a useful guide for practitioners in estimating and interpreting item response theory (IRT) models. Included is a description of the estimation procedure used by SAS PROC MCMC. Syntax is provided for estimation of both dichotomous and polytomous IRT models, as well as a discussion on how to extend the syntax to accommodate more complex IRT models. PMID:29795834

  19. MDTS: automatic complex materials design using Monte Carlo tree search.

    PubMed

    M Dieb, Thaer; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji

    2017-01-01

    Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel Python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in the game of computer Go. Unlike evolutionary algorithms that require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously in various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.

  20. MDTS: automatic complex materials design using Monte Carlo tree search

    NASA Astrophysics Data System (ADS)

    Dieb, Thaer M.; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji

    2017-12-01

    Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel Python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in the game of computer Go. Unlike evolutionary algorithms that require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously in various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.

  1. Efficient kinetic Monte Carlo method for reaction-diffusion problems with spatially varying annihilation rates

    NASA Astrophysics Data System (ADS)

    Schwarz, Karsten; Rieger, Heiko

    2013-03-01

    We present an efficient Monte Carlo method to simulate reaction-diffusion processes with spatially varying particle annihilation or transformation rates, as occurs, for instance, in the context of motor-driven intracellular transport. Like Green's function reaction dynamics and first-passage time methods, our algorithm avoids small diffusive hops by propagating sufficiently distant particles in large hops to the boundaries of protective domains. Since for spatially varying annihilation or transformation rates the single particle diffusion propagator is not known analytically, we present an algorithm that efficiently generates either particle displacements or annihilations with the correct statistics, as we prove rigorously. The numerical efficiency of the algorithm is demonstrated with an illustrative example.

  2. Note: A pure-sampling quantum Monte Carlo algorithm with independent Metropolis.

    PubMed

    Vrbik, Jan; Ospadov, Egor; Rothstein, Stuart M

    2016-07-14

    Recently, Ospadov and Rothstein published a pure-sampling quantum Monte Carlo algorithm (PSQMC) that features an auxiliary Path Z that connects the midpoints of the current and proposed Paths X and Y, respectively. When sufficiently long, Path Z provides statistical independence of Paths X and Y. Under those conditions, the Metropolis decision used in PSQMC is done without any approximation, i.e., not requiring microscopic reversibility and without having to introduce any G(x → x'; τ) factors into its decision function. This is a unique feature that contrasts with all competing reptation algorithms in the literature. An example illustrates that dependence of Paths X and Y has adverse consequences for pure sampling.

  3. Note: A pure-sampling quantum Monte Carlo algorithm with independent Metropolis

    NASA Astrophysics Data System (ADS)

    Vrbik, Jan; Ospadov, Egor; Rothstein, Stuart M.

    2016-07-01

    Recently, Ospadov and Rothstein published a pure-sampling quantum Monte Carlo algorithm (PSQMC) that features an auxiliary Path Z that connects the midpoints of the current and proposed Paths X and Y, respectively. When sufficiently long, Path Z provides statistical independence of Paths X and Y. Under those conditions, the Metropolis decision used in PSQMC is done without any approximation, i.e., not requiring microscopic reversibility and without having to introduce any G(x → x'; τ) factors into its decision function. This is a unique feature that contrasts with all competing reptation algorithms in the literature. An example illustrates that dependence of Paths X and Y has adverse consequences for pure sampling.

  4. Event-driven Monte Carlo: Exact dynamics at all time scales for discrete-variable models

    NASA Astrophysics Data System (ADS)

    Mendoza-Coto, Alejandro; Díaz-Méndez, Rogelio; Pupillo, Guido

    2016-06-01

    We present an algorithm for the simulation of the exact real-time dynamics of classical many-body systems with discrete energy levels. In the same spirit of kinetic Monte Carlo methods, a stochastic solution of the master equation is found, with no need to define any other phase-space construction. However, unlike existing methods, the present algorithm does not assume any particular statistical distribution to perform moves or to advance the time, and thus is a unique tool for the numerical exploration of fast and ultra-fast dynamical regimes. By decomposing the problem in a set of two-level subsystems, we find a natural variable step size, that is well defined from the normalization condition of the transition probabilities between the levels. We successfully test the algorithm with known exact solutions for non-equilibrium dynamics and equilibrium thermodynamical properties of Ising-spin models in one and two dimensions, and compare to standard implementations of kinetic Monte Carlo methods. The present algorithm is directly applicable to the study of the real-time dynamics of a large class of classical Markovian chains, and particularly to short-time situations where the exact evolution is relevant.
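
    For orientation, the "standard implementations of kinetic Monte Carlo" that the abstract compares against can be sketched roughly as follows. This is not the event-driven algorithm of the paper; it is a minimal textbook-style baseline for a single two-level system, in which waiting times are drawn from an exponential distribution (precisely the kind of assumption the abstract says the new method avoids). The function name `kmc_two_level` and the rate parameters are hypothetical and chosen only for illustration.

```python
import random


def kmc_two_level(rate_up, rate_down, t_end, rng):
    """Standard (rejection-free) kinetic Monte Carlo for one two-level system.

    Assumption for illustration: the system hops 0 -> 1 with rate `rate_up`
    and 1 -> 0 with rate `rate_down`. Returns the fraction of the simulated
    time spent in the upper level (state 1).
    """
    t = 0.0
    state = 0
    time_in_upper = 0.0
    while t < t_end:
        # Only one transition is possible from each state, so the total
        # escape rate is simply the rate of that transition.
        rate = rate_up if state == 0 else rate_down
        # Exponentially distributed waiting time, truncated at t_end.
        dt = min(rng.expovariate(rate), t_end - t)
        if state == 1:
            time_in_upper += dt
        t += dt
        state = 1 - state  # perform the hop
    return time_in_upper / t_end
```

    In the long-time limit the occupation of the upper level should approach rate_up / (rate_up + rate_down), which gives a simple sanity check for the sketch.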

  5. Geometrically Constructed Markov Chain Monte Carlo Study of Quantum Spin-phonon Complex Systems

    NASA Astrophysics Data System (ADS)

    Suwa, Hidemaro

    2013-03-01

    We have developed novel Monte Carlo methods for precisely calculating quantum spin-boson models and investigated the critical phenomena of the spin-Peierls systems. Three significant methods are presented. The first is a new optimization algorithm of the Markov chain transition kernel based on the geometric weight allocation. This algorithm, for the first time, satisfies the total balance generally without imposing the detailed balance and always minimizes the average rejection rate, being better than the Metropolis algorithm. The second is the extension of the worm (directed-loop) algorithm to non-conserved particles, which cannot be treated efficiently by the conventional methods. The third is the combination with the level spectroscopy. Proposing a new gap estimator, we are successful in eliminating the systematic error of the conventional moment method. Then we have elucidated the phase diagram and the universality class of the one-dimensional XXZ spin-Peierls system. The criticality is totally consistent with the J1-J2 model, an effective model in the antiadiabatic limit. Through this research, we have succeeded in investigating the critical phenomena of the effectively frustrated quantum spin system by the quantum Monte Carlo method without the negative sign. JSPS Postdoctoral Fellow for Research Abroad

  6. Technical Note: A direct ray-tracing method to compute integral depth dose in pencil beam proton radiography with a multilayer ionization chamber.

    PubMed

    Farace, Paolo; Righetto, Roberto; Deffet, Sylvain; Meijers, Arturs; Vander Stappen, Francois

    2016-12-01

    To introduce a fast ray-tracing algorithm in pencil beam proton radiography (PR) with a multilayer ionization chamber (MLIC) for in vivo range error mapping. Pencil beam PR was obtained by delivering spots uniformly positioned in a square (45 × 45 mm² field-of-view) of 9 × 9 spots capable of crossing the phantoms (210 MeV). The exit beam was collected by a MLIC to sample the integral depth dose (IDD_MLIC). PRs of an electron-density and of a head phantom were acquired by moving the couch to obtain multiple 45 × 45 mm² frames. To map the corresponding range errors, the two-dimensional set of IDD_MLIC was compared with (i) the integral depth dose computed by the treatment planning system (TPS) by both analytic (IDD_TPS) and Monte Carlo (IDD_MC) algorithms in a volume of water simulating the MLIC at the CT, and (ii) the integral depth dose directly computed by a simple ray-tracing algorithm (IDD_direct) through the same CT data. The exact spatial position of the spot pattern was numerically adjusted testing different in-plane positions and selecting the one that minimized the range differences between IDD_direct and IDD_MLIC. Range error mapping was feasible by both the TPS and the ray-tracing methods, but very sensitive to even small misalignments. In homogeneous regions, the range errors computed by the direct ray-tracing algorithm matched the results obtained by both the analytic and the Monte Carlo algorithms. In both phantoms, lateral heterogeneities were better modeled by the ray-tracing and the Monte Carlo algorithms than by the analytic TPS computation. Accordingly, when the pencil beam crossed lateral heterogeneities, the range errors mapped by the direct algorithm matched the Monte Carlo maps better than those obtained by the analytic algorithm. Finally, the simplicity of the ray-tracing algorithm made it possible to implement a prototype procedure for automated spatial alignment. The ray-tracing algorithm can reliably replace the TPS method in MLIC PR for in vivo range verification and it can be a key component to develop software tools for spatial alignment and correction of CT calibration.

  7. Markov switching multinomial logit model: An application to accident-injury severities.

    PubMed

    Malyshkina, Nataliya V; Mannering, Fred L

    2009-07-01

    In this study, two-state Markov switching multinomial logit models are proposed for statistical modeling of accident-injury severities. These models assume Markov switching over time between two unobserved states of roadway safety as a means of accounting for potential unobserved heterogeneity. The states are distinct in the sense that in different states accident-severity outcomes are generated by separate multinomial logit processes. To demonstrate the applicability of the approach, two-state Markov switching multinomial logit models are estimated for severity outcomes of accidents occurring on Indiana roads over a four-year time period. Bayesian inference methods and Markov Chain Monte Carlo (MCMC) simulations are used for model estimation. The estimated Markov switching models result in a superior statistical fit relative to the standard (single-state) multinomial logit models for a number of roadway classes and accident types. It is found that the more frequent state of roadway safety is correlated with better weather conditions and that the less frequent state is correlated with adverse weather conditions.

  8. Quantum Enhanced Inference in Markov Logic Networks

    NASA Astrophysics Data System (ADS)

    Wittek, Peter; Gogolin, Christian

    2017-04-01

    Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
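
    The classical Gibbs sampling that the abstract takes as its baseline can be illustrated on a deliberately tiny example. The sketch below is not taken from the paper. It assumes the usual textbook form of an MLN, in which a world x has probability proportional to exp(w · n(x)), with n(x) counting satisfied groundings of a weighted formula, and it uses a single hypothetical formula "A ⇔ B" over two Boolean atoms. The function name `gibbs_agreement` is made up for illustration.

```python
import math
import random


def gibbs_agreement(weight, n_sweeps, rng):
    """Gibbs sampling in a two-atom Markov network.

    Assumed model (illustrative): P(a, b) is proportional to
    exp(weight * [a == b]), i.e. one formula "A <=> B" with the given weight.
    Returns the Monte Carlo estimate of P(A == B).
    """
    # Conditional probability that one atom copies the other:
    # exp(w) / (exp(w) + exp(0)).
    p_same = math.exp(weight) / (math.exp(weight) + 1.0)
    a, b = 0, 0
    agree = 0
    for _ in range(n_sweeps):
        # Resample each atom from its full conditional given the other.
        a = b if rng.random() < p_same else 1 - b
        b = a if rng.random() < p_same else 1 - a
        agree += int(a == b)
    return agree / n_sweeps
```

    For this toy model the exact answer is exp(w) / (exp(w) + 1), so the estimate can be checked directly. Real MLN inference involves many more ground atoms and formulas, which is where the cost of each sweep becomes significant.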

  9. Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula

    NASA Astrophysics Data System (ADS)

    Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María; Wiper, Michael P.

    2016-03-01

    A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.

  10. Multivariate Copula Analysis Toolbox (MvCAT): Describing dependence and underlying uncertainty using a Bayesian framework

    NASA Astrophysics Data System (ADS)

    Sadegh, Mojtaba; Ragno, Elisa; AghaKouchak, Amir

    2017-06-01

    We present a newly developed Multivariate Copula Analysis Toolbox (MvCAT) which includes a wide range of copula families with different levels of complexity. MvCAT employs a Bayesian framework with a residual-based Gaussian likelihood function for inferring copula parameters and estimating the underlying uncertainties. The contribution of this paper is threefold: (a) providing a Bayesian framework to approximate the predictive uncertainties of fitted copulas, (b) introducing a hybrid-evolution Markov Chain Monte Carlo (MCMC) approach designed for numerical estimation of the posterior distribution of copula parameters, and (c) enabling the community to explore a wide range of copulas and evaluate them relative to the fitting uncertainties. We show that the commonly used local optimization methods for copula parameter estimation often get trapped in local minima. The proposed method, however, addresses this limitation and improves describing the dependence structure. MvCAT also enables evaluation of uncertainties relative to the length of record, which is fundamental to a wide range of applications such as multivariate frequency analysis.

  11. Observational constraints on holographic dark energy with varying gravitational constant

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Jianbo; Xu, Lixin; Saridakis, Emmanuel N.

    2010-03-01

    We use observational data from Type Ia Supernovae (SN), Baryon Acoustic Oscillations (BAO), Cosmic Microwave Background (CMB) and observational Hubble data (OHD), and the Markov Chain Monte Carlo (MCMC) method, to constrain the cosmological scenario of holographic dark energy with varying gravitational constant. We consider both flat and non-flat background geometry, and we present the corresponding constraints and contour-plots of the model parameters. We conclude that the scenario is compatible with observations. In 1σ we find $\Omega_{\Lambda 0} = 0.72^{+0.03}_{-0.03}$, $\Omega_{k0} = -0.0013^{+0.0130}_{-0.0040}$, $c = 0.80^{+0.19}_{-0.14}$ and $\Delta_G \equiv G'/G = -0.0025^{+0.0080}_{-0.0050}$, while for the present value of the dark energy equation-of-state parameter we obtain $w_0 = -1.04^{+0.15}_{-0.20}$.

  12. Learning the ideal observer for SKE detection tasks by use of convolutional neural networks (Cum Laude Poster Award)

    NASA Astrophysics Data System (ADS)

    Zhou, Weimin; Anastasio, Mark A.

    2018-03-01

    It has been advocated that task-based measures of image quality (IQ) should be employed to evaluate and optimize imaging systems. Task-based measures of IQ quantify the performance of an observer on a medically relevant task. The Bayesian Ideal Observer (IO), which employs complete statistical information of the object and noise, achieves the upper limit of the performance for a binary signal classification task. However, computing the IO performance is generally analytically intractable and can be computationally burdensome when Markov-chain Monte Carlo (MCMC) techniques are employed. In this paper, supervised learning with convolutional neural networks (CNNs) is employed to approximate the IO test statistics for a signal-known-exactly and background-known-exactly (SKE/BKE) binary detection task. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are compared to those produced by the analytically computed IO. The advantages of the proposed supervised learning approach for approximating the IO are demonstrated.

  13. A Bayesian Poisson-lognormal Model for Count Data for Multiple-Trait Multiple-Environment Genomic-Enabled Prediction

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Toledo, Fernando H.; Montesinos-López, José C.; Singh, Pawan; Juliana, Philomin; Salinas-Ruiz, Josafhat

    2017-01-01

    When a plant scientist wishes to make genomic-enabled predictions of multiple traits measured in multiple individuals in multiple environments, the most common strategy for performing the analysis is to use a single trait at a time taking into account genotype × environment interaction (G × E), because there is a lack of comprehensive models that simultaneously take into account the correlated counting traits and G × E. For this reason, in this study we propose a multiple-trait and multiple-environment model for count data. The proposed model was developed under the Bayesian paradigm for which we developed a Markov Chain Monte Carlo (MCMC) with noninformative priors. This allows obtaining all required full conditional distributions of the parameters leading to an exact Gibbs sampler for the posterior distribution. Our model was tested with simulated data and a real data set. Results show that the proposed multi-trait, multi-environment model is an attractive alternative for modeling multiple count traits measured in multiple environments. PMID:28364037

  14. Phylogenetic analysis and victim contact tracing of rabies virus from humans and dogs in Bali, Indonesia.

    PubMed

    Mahardika, G N K; Dibia, N; Budayanti, N S; Susilawathi, N M; Subrata, K; Darwinata, A E; Wignall, F S; Richt, J A; Valdivia-Granda, W A; Sudewi, A A R

    2014-06-01

    The emergence of human and animal rabies in Bali since November 2008 has attracted local, national and international interest. The potential origin and time of introduction of rabies virus to Bali is described. The nucleoprotein (N) gene of rabies virus from dog brain and human clinical specimens was sequenced using an automated DNA sequencer. Phylogenetic inference with Bayesian Markov Chain Monte Carlo (MCMC) analysis using the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) v. 1.7.5 software confirmed that the outbreak of rabies in Bali was caused by an Indonesian lineage virus following a single introduction. The ancestor of Bali viruses was the descendant of a virus from Kalimantan. Contact tracing showed that the event most likely occurred in early 2008. The introduction of rabies into a large unvaccinated dog population in Bali clearly demonstrates the risk of disease transmission for government agencies and should lead to an increased preparedness and efforts for sustained risk reduction to prevent such events from occurring in future.

  15. Phenotypic constraints promote latent versatility and carbon efficiency in metabolic networks.

    PubMed

    Bardoscia, Marco; Marsili, Matteo; Samal, Areejit

    2015-07-01

    System-level properties of metabolic networks may be the direct product of natural selection or arise as a by-product of selection on other properties. Here we study the effect of direct selective pressure for growth or viability in particular environments on two properties of metabolic networks: latent versatility to function in additional environments and carbon usage efficiency. Using a Markov chain Monte Carlo (MCMC) sampling based on flux balance analysis (FBA), we sample from a known biochemical universe random viable metabolic networks that differ in the number of directly constrained environments. We find that the latent versatility of sampled metabolic networks increases with the number of directly constrained environments and with the size of the networks. We then show that the average carbon wastage of sampled metabolic networks across the constrained environments decreases with the number of directly constrained environments and with the size of the networks. Our work expands the growing body of evidence about nonadaptive origins of key functional properties of biological networks.

  16. Quantum Enhanced Inference in Markov Logic Networks.

    PubMed

    Wittek, Peter; Gogolin, Christian

    2017-04-19

    Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.

  17. Quantum Enhanced Inference in Markov Logic Networks

    PubMed Central

    Wittek, Peter; Gogolin, Christian

    2017-01-01

    Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning. PMID:28422093

  18. A robust algorithm for automated target recognition using precomputed radar cross sections

    NASA Astrophysics Data System (ADS)

    Ehrman, Lisa M.; Lanterman, Aaron D.

    2004-09-01

    Passive radar is an emerging technology that offers a number of unique benefits, including covert operation. Many such systems are already capable of detecting and tracking aircraft. The goal of this work is to develop a robust algorithm for adding automated target recognition (ATR) capabilities to existing passive radar systems. In previous papers, we proposed conducting ATR by comparing the precomputed RCS of known targets to that of detected targets. To make the precomputed RCS as accurate as possible, a coordinated flight model is used to estimate aircraft orientation. Once the aircraft's position and orientation are known, it is possible to determine the incident and observed angles on the aircraft, relative to the transmitter and receiver. This makes it possible to extract the appropriate radar cross section (RCS) from our simulated database. This RCS is then scaled to account for propagation losses and the receiver's antenna gain. A Rician likelihood model compares these expected signals from different targets to the received target profile. We have previously employed Monte Carlo runs to gauge the probability of error in the ATR algorithm; however, generation of a statistically significant set of Monte Carlo runs is computationally intensive. As an alternative to Monte Carlo runs, we derive the relative entropy (also known as Kullback-Leibler distance) between two Rician distributions. Since the probability of Type II error in our hypothesis testing problem can be expressed as a function of the relative entropy via Stein's Lemma, this provides us with a computationally efficient method for determining an upper bound on our algorithm's performance. It also provides great insight into the types of classification errors we can expect from our algorithm. This paper compares the numerically approximated probability of Type II error with the results obtained from a set of Monte Carlo runs.
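
    The relative entropy between two Rician distributions mentioned above can be approximated numerically, as a rough illustration. The sketch below does not reproduce the authors' derivation: it assumes the standard Rician density with noncentrality ν and a shared scale σ, evaluates the modified Bessel function I0 by its power series, and integrates with a simple midpoint rule. All function names and default parameters are hypothetical.

```python
import math


def bessel_i0(z, terms=60):
    """Modified Bessel function I0 via its power series (moderate z only)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # term_{k+1} = term_k * (z^2 / 4) / (k + 1)^2
        term *= (z * z / 4.0) / ((k + 1) ** 2)
    return total


def rician_pdf(x, nu, sigma):
    """Standard Rician density with noncentrality nu and scale sigma."""
    s2 = sigma * sigma
    return (x / s2) * math.exp(-(x * x + nu * nu) / (2.0 * s2)) * bessel_i0(x * nu / s2)


def rician_kl(nu_p, nu_q, sigma, n=4000):
    """Midpoint-rule estimate of KL(p || q) for two Rician densities.

    Assumption: both densities share the same sigma, and the integration
    range [0, max(nu) + 10 * sigma] captures essentially all of the mass.
    """
    upper = max(nu_p, nu_q) + 10.0 * sigma
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        p = rician_pdf(x, nu_p, sigma)
        q = rician_pdf(x, nu_q, sigma)
        if p > 0.0 and q > 0.0:
            total += p * math.log(p / q) * h
    return total
```

    In the usual statement of Stein's Lemma, with the Type I error held fixed, the best achievable Type II error after n independent observations decays roughly like exp(-n · KL), with the relative entropy taken from the null to the alternative distribution, so a value such as `rician_kl(1.0, 2.0, 1.0)` gives a quick feel for how separable two hypothesized targets are.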

  19. Fungible Correlation Matrices: A Method for Generating Nonsingular, Singular, and Improper Correlation Matrices for Monte Carlo Research.

    PubMed

    Waller, Niels G

    2016-01-01

    For a fixed set of standardized regression coefficients and a fixed coefficient of determination (R-squared), an infinite number of predictor correlation matrices will satisfy the implied quadratic form. I call such matrices fungible correlation matrices. In this article, I describe an algorithm for generating positive definite (PD), positive semidefinite (PSD), or indefinite (ID) fungible correlation matrices that have a random or fixed smallest eigenvalue. The underlying equations of this algorithm are reviewed from both algebraic and geometric perspectives. Two simulation studies illustrate that fungible correlation matrices can be profitably used in Monte Carlo research. The first study uses PD fungible correlation matrices to compare penalized regression algorithms. The second study uses ID fungible correlation matrices to compare matrix-smoothing algorithms. R code for generating fungible correlation matrices is presented in the supplemental materials.

  20. Vertically Integrated Seismological Analysis II : Inference

    NASA Astrophysics Data System (ADS)

    Arora, N. S.; Russell, S.; Sudderth, E.

    2009-12-01

    Methods for automatically associating detected waveform features with hypothesized seismic events, and localizing those events, are a critical component of efforts to verify the Comprehensive Test Ban Treaty (CTBT). As outlined in our companion abstract, we have developed a hierarchical model which views detection, association, and localization as an integrated probabilistic inference problem. In this abstract, we provide more details on the Markov chain Monte Carlo (MCMC) methods used to solve this inference task. MCMC generates samples from a posterior distribution π(x) over possible worlds x by defining a Markov chain whose states are the worlds x, and whose stationary distribution is π(x). In the Metropolis-Hastings (M-H) method, transitions in the Markov chain are constructed in two steps. First, given the current state x, a candidate next state x′ is generated from a proposal distribution q(x′ | x), which may be (more or less) arbitrary. Second, the transition to x′ is not automatic, but occurs with an acceptance probability α(x′ | x) = min(1, [π(x′) q(x | x′)] / [π(x) q(x′ | x)]). The seismic event model outlined in our companion abstract is quite similar to those used in multitarget tracking, for which MCMC has proved very effective. In this model, each world x is defined by a collection of events, a list of properties characterizing those events (times, locations, magnitudes, and types), and the association of each event to a set of observed detections. The target distribution is π(x) = P(x | y), the posterior distribution over worlds x given the observed waveform data y at all stations. Proposal distributions then implement several types of moves between worlds. For example, birth moves create new events; death moves delete existing events; split moves partition the detections for an event into two new events; merge moves combine event pairs; swap moves modify the properties and associations for pairs of events. Importantly, the rules for accepting such complex moves need not be hand-designed. Instead, they are automatically determined by the underlying probabilistic model, which is in turn calibrated via historical data and scientific knowledge. Consider a small seismic event which generates weak signals at several different stations, which might independently be mistaken for noise. A birth move may nevertheless hypothesize an event jointly explaining these detections. If the corresponding waveform data then aligns with the seismological knowledge encoded in the probabilistic model, the event may be detected even though no single station observes it unambiguously. Alternatively, if a large outlier reading is produced at a single station, moves which instantiate a corresponding (false) event would be rejected because of the absence of plausible detections at other sensors. More broadly, one of the main advantages of our MCMC approach is its consistent handling of the relative uncertainties in different information sources. By avoiding low-level thresholds, we expect to improve accuracy and robustness. At the conference, we will present results quantitatively validating our approach, using ground-truth associations and locations provided either by simulation or human analysts.
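
    The two-step Metropolis-Hastings transition described in this abstract (propose from q, then accept with probability α) can be sketched generically as follows. This is a minimal illustration, not the authors' seismic sampler: the birth, death, split, merge and swap moves are not modeled, and the toy target (a standard normal density) and Gaussian random-walk proposal are assumptions chosen only so the sketch is checkable. All names are hypothetical.

```python
import math
import random


def metropolis_hastings(log_pi, propose, log_q, x0, n_steps, rng):
    """Generic Metropolis-Hastings chain.

    log_pi(x)      : log of the (unnormalized) target density pi(x)
    propose(x, rng): draws a candidate x' from q(x' | x)
    log_q(a, b)    : log q(a | b), the log proposal density of a given b
    """
    x = x0
    chain = []
    for _ in range(n_steps):
        x_new = propose(x, rng)
        # log of pi(x') q(x | x') / (pi(x) q(x' | x))
        log_ratio = (log_pi(x_new) + log_q(x, x_new)) - (log_pi(x) + log_q(x_new, x))
        alpha = math.exp(min(0.0, log_ratio))  # acceptance probability
        if rng.random() < alpha:
            x = x_new  # accept; otherwise the chain stays at x
        chain.append(x)
    return chain


# Toy ingredients (assumptions for illustration only).
STEP = 1.0


def toy_log_pi(x):
    """Unnormalized log density of a standard normal."""
    return -0.5 * x * x


def toy_propose(x, rng):
    """Gaussian random-walk proposal centered on the current state."""
    return x + rng.gauss(0.0, STEP)


def toy_log_q(a, b):
    """Log proposal density of a given b, up to an additive constant."""
    return -0.5 * ((a - b) / STEP) ** 2
```

    Calling `metropolis_hastings(toy_log_pi, toy_propose, toy_log_q, 0.0, 20000, random.Random(0))` yields a chain whose sample mean and variance should be close to 0 and 1. Because the toy proposal is symmetric, the q terms cancel and the rule reduces to plain Metropolis; they are kept in the code to mirror the general formula in the abstract.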
