Markov chain Monte Carlo inference for Markov jump processes via the linear noise approximation.
Stathopoulos, Vassilios; Girolami, Mark A
2013-02-13
Bayesian analysis for Markov jump processes (MJPs) is a non-trivial and challenging problem. Although exact inference is theoretically possible, it is computationally demanding, thus its applicability is limited to a small class of problems. In this paper, we describe the application of Riemann manifold Markov chain Monte Carlo (MCMC) methods using an approximation to the likelihood of the MJP that is valid when the system modelled is near its thermodynamic limit. The proposed approach is both statistically and computationally efficient, and the convergence rate and mixing of the chains allow for fast MCMC inference. The methodology is evaluated using numerical simulations on two problems from chemical kinetics and one from systems biology.
Strelioff, Christopher C; Crutchfield, James P; Hübler, Alfred W
2007-07-01
Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k, from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
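As a rough illustration of the parameter-estimation step mentioned in this abstract, the sketch below computes posterior-mean transition probabilities for a kth-order chain under a symmetric Dirichlet prior on each row. The function name, the prior strength `alpha`, and the toy sequence are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

def posterior_mean_transitions(seq, k, alphabet, alpha=1.0):
    """Posterior mean of P(next symbol | previous k symbols).

    Assumes an independent symmetric Dirichlet(alpha) prior on each row
    (alpha is a hypothetical choice). Only contexts that actually occur
    in `seq` are returned.
    """
    counts = Counter()          # (context, next symbol) -> count
    context_totals = Counter()  # context -> count
    for i in range(k, len(seq)):
        context = tuple(seq[i - k:i])
        counts[(context, seq[i])] += 1
        context_totals[context] += 1
    size = len(alphabet)
    return {
        (context, s): (counts[(context, s)] + alpha) / (total + alpha * size)
        for context, total in context_totals.items()
        for s in alphabet
    }

# Toy first-order example on a binary alphabet.
probs = posterior_mean_transitions("0101010101", k=1, alphabet="01")
```

With `alpha=1.0` the estimate never assigns zero probability to an unseen transition, which is one reason a Dirichlet prior is a common default for multinomial rows.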
Golightly, Andrew; Wilkinson, Darren J.
2011-01-01
Computational systems biology is concerned with the development of detailed mechanistic models of biological processes. Such models are often stochastic and analytically intractable, containing uncertain parameters that must be estimated from time course data. In this article, we consider the task of inferring the parameters of a stochastic kinetic model defined as a Markov (jump) process. Inference for the parameters of complex nonlinear multivariate stochastic process models is a challenging problem, but we find here that algorithms based on particle Markov chain Monte Carlo turn out to be a very effective computationally intensive approach to the problem. Approximations to the inferential model based on stochastic differential equations (SDEs) are considered, as well as improvements to the inference scheme that exploit the SDE structure. We apply the methodology to a Lotka–Volterra system and a prokaryotic auto-regulatory network. PMID:23226583
Markov Chain Monte Carlo Used in Parameter Inference of Magnetic Resonance Spectra
Hock, Kiel; Earle, Keith
2016-02-06
In this paper, we use Boltzmann statistics and the maximum likelihood distribution derived from Bayes’ Theorem to infer parameter values for a Pake Doublet Spectrum, a lineshape of historical significance and contemporary relevance for determining distances between interacting magnetic dipoles. A Metropolis Hastings Markov Chain Monte Carlo algorithm is implemented and designed to find the optimum parameter set and to estimate parameter uncertainties. In conclusion, the posterior distribution allows us to define a metric on parameter space that induces a geometry with negative curvature that affects the parameter uncertainty estimates, particularly for spectra with low signal to noise.
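For readers unfamiliar with the Metropolis-Hastings step named in this abstract, a minimal random-walk sketch follows. It samples a one-dimensional standard normal rather than a Pake doublet lineshape; the target, step size, and function names are assumptions chosen only to keep the example short.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step, rng):
    """Random-walk Metropolis-Hastings for a one-dimensional target."""
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples

def log_standard_normal(x):
    # Unnormalised log density; the normalising constant cancels in the ratio.
    return -0.5 * x * x

rng = random.Random(0)
samples = metropolis_hastings(log_standard_normal, x0=0.0,
                              n_samples=20000, step=1.0, rng=rng)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The spread of the retained samples is what provides the parameter uncertainty estimate; in practice an initial burn-in portion would usually be discarded.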
PHAISTOS: a framework for Markov chain Monte Carlo simulation and inference of protein structure.
Boomsma, Wouter; Frellsen, Jes; Harder, Tim; Bottaro, Sandro; Johansson, Kristoffer E; Tian, Pengfei; Stovgaard, Kasper; Andreetta, Christian; Olsson, Simon; Valentin, Jan B; Antonov, Lubomir D; Christensen, Anders S; Borg, Mikael; Jensen, Jan H; Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper; Hamelryck, Thomas
2013-07-15
We present a new software framework for Markov chain Monte Carlo sampling for simulation, prediction, and inference of protein structure. The software package contains implementations of recent advances in Monte Carlo methodology, such as efficient local updates and sampling from probabilistic models of local protein structure. These models form a probabilistic alternative to the widely used fragment and rotamer libraries. Combined with an easily extendible software architecture, this makes PHAISTOS well suited for Bayesian inference of protein structure from sequence and/or experimental data. Currently, two force-fields are available within the framework: PROFASI and OPLS-AA/L, the latter including the generalized Born surface area solvent model. A flexible command-line and configuration-file interface allows users quickly to set up simulations with the desired configuration. PHAISTOS is released under the GNU General Public License v3.0. Source code and documentation are freely available from http://phaistos.sourceforge.net. The software is implemented in C++ and has been tested on Linux and OSX platforms.
Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo.
Pagel, Mark; Meade, Andrew
2008-12-27
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon--known as heterotachy--can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Murakami, Yohei; Takada, Shoji
2013-01-01
When model parameters in systems biology are not available from experiments, they need to be inferred so that the resulting simulation reproduces the experimentally known phenomena. For this purpose, Bayesian statistics with Markov chain Monte Carlo (MCMC) is a useful method. Conventional MCMC needs a likelihood to evaluate the posterior distribution of acceptable parameters, while approximate Bayesian computation (ABC) MCMC evaluates the posterior distribution using a qualitative fitness measure. However, neither of these algorithms can deal with a mixture of quantitative (i.e., likelihood) and qualitative fitness measures simultaneously. Here, to deal with this mixture, we formulated a Bayesian formula for hybrid fitness measures (HFM). Then we implemented it in MCMC (MCMC-HFM). We tested MCMC-HFM first on a kinetic toy model with a positive feedback. Inferring kinetic parameters mainly related to the positive feedback, we found that MCMC-HFM reliably inferred them using both qualitative and quantitative fitness measures. Then, we applied MCMC-HFM to a previously proposed apoptosis signal transduction network. For kinetic parameters related to implicit positive feedbacks, which are important for bistability and irreversibility of the output, MCMC-HFM reliably inferred these kinetic parameters. In particular, some kinetic parameters that have experimental estimates were inferred without using these data and the results were consistent with experiments. Moreover, for some parameters, the mixed use of quantitative and qualitative fitness measures narrowed down the acceptable range of parameters.
Wu, Chieh-Hsi; Drummond, Alexei J
2011-05-01
We provide a framework for Bayesian coalescent inference from microsatellite data that enables inference of population history parameters averaged over microsatellite mutation models. To achieve this we first implemented a rich family of microsatellite mutation models and related components in the software package BEAST. BEAST is a powerful tool that performs Bayesian MCMC analysis on molecular data to make coalescent and evolutionary inferences. Our implementation permits the application of existing nonparametric methods to microsatellite data. The implemented microsatellite models are based on the replication slippage mechanism and focus on three properties of microsatellite mutation: length dependency of mutation rate, mutational bias toward expansion or contraction, and number of repeat units changed in a single mutation event. We develop a new model that facilitates microsatellite model averaging and Bayesian model selection by transdimensional MCMC. With Bayesian model averaging, the posterior distributions of population history parameters are integrated across a set of microsatellite models and thus account for model uncertainty. Simulated data are used to evaluate our method in terms of accuracy and precision of estimation and also identification of the true mutation model. Finally we apply our method to a red colobus monkey data set as an example.
NASA Astrophysics Data System (ADS)
Volchenkov, Dima; Dawin, Jean René
A system for using dice to compose music randomly is known as the musical dice game. The discrete time MIDI models of 804 pieces of classical music written by 29 composers have been encoded into the transition matrices and studied by Markov chains. Contrary to human languages, entropy dominates over redundancy, in the musical dice games based on the compositions of classical music. The maximum complexity is achieved on the blocks consisting of just a few notes (8 notes, for the musical dice games generated over Bach's compositions). First passage times to notes can be used to resolve tonality and feature a composer.
Markov Chain Estimation of Avian Seasonal Fecundity
To explore the consequences of modeling decisions on inference about avian seasonal fecundity we generalize previous Markov chain (MC) models of avian nest success to formulate two different MC models of avian seasonal fecundity that represent two different ways to model renestin...
Markov Chains and Chemical Processes
ERIC Educational Resources Information Center
Miller, P. J.
1972-01-01
Views as important the relating of abstract ideas of modern mathematics now being taught in the schools to situations encountered in the sciences. Describes use of matrices and Markov chains to study first-order processes. (Author/DF)
Observation uncertainty in reversible Markov chains.
Metzner, Philipp; Weber, Marcus; Schütte, Christof
2010-09-01
In many applications one is interested in finding a simplified model which captures the essential dynamical behavior of a real life process. If the essential dynamics can be assumed to be (approximately) memoryless then a reasonable choice for a model is a Markov model whose parameters are estimated by means of Bayesian inference from an observed time series. We propose an efficient Monte Carlo Markov chain framework to assess the uncertainty of the Markov model and related observables. The derived Gibbs sampler allows for sampling distributions of transition matrices subject to reversibility and/or sparsity constraints. The performance of the suggested sampling scheme is demonstrated and discussed for a variety of model examples. The uncertainty analysis of functions of the Markov model under investigation is discussed in application to the identification of conformations of the trialanine molecule via Robust Perron Cluster Analysis (PCCA+).
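To make the idea of "uncertainty of the Markov model" concrete, the sketch below draws transition matrices from the posterior given observed transition counts. It covers only the unconstrained case, where each row has an independent Dirichlet posterior; the reversibility and sparsity constraints handled by the Gibbs sampler in the paper are not implemented here. The count matrix and the uniform prior are assumptions.

```python
import random

def sample_transition_matrix(counts, rng, prior=1.0):
    """Draw one transition matrix from the row-wise Dirichlet posterior.

    `prior` is a hypothetical symmetric Dirichlet pseudo-count.
    """
    matrix = []
    for row in counts:
        # A Dirichlet draw is a vector of independent Gamma draws, normalised.
        gammas = [rng.gammavariate(c + prior, 1.0) for c in row]
        total = sum(gammas)
        matrix.append([g / total for g in gammas])
    return matrix

# Hypothetical transition counts from an observed two-state time series.
counts = [[90, 10], [20, 80]]
rng = random.Random(1)
draws = [sample_transition_matrix(counts, rng) for _ in range(2000)]

# Posterior spread of one observable: the 0 -> 1 transition probability.
p01 = [m[0][1] for m in draws]
mean_p01 = sum(p01) / len(p01)
```

Any function of the transition matrix (a stationary probability, a relaxation time) can be evaluated on each draw in the same way to obtain its posterior distribution.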
McNally, Kevin; Cotton, Richard; Cocker, John; Jones, Kate; Bartels, Mike; Rick, David; Price, Paul; Loizou, George
2012-01-01
There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure. PMID:22719759
On a Result for Finite Markov Chains
ERIC Educational Resources Information Center
Kulathinal, Sangita; Ghosh, Lagnojita
2006-01-01
In an undergraduate course on stochastic processes, Markov chains are discussed in great detail. Textbooks on stochastic processes provide interesting properties of finite Markov chains. This note discusses one such property regarding the number of steps in which a state is reachable or accessible from another state in a finite Markov chain with M…
Markov chain Monte Carlo without likelihoods.
Marjoram, Paul; Molitor, John; Plagnol, Vincent; Tavare, Simon
2003-12-23
Many stochastic simulation approaches for generating observations from a posterior distribution depend on knowing a likelihood function. However, for many complex probability models, such likelihoods are either impossible or computationally prohibitive to obtain. Here we present a Markov chain Monte Carlo method for generating observations from a posterior distribution without the use of likelihoods. It can also be used in frequentist applications, in particular for maximum-likelihood estimation. The approach is illustrated by an example of ancestral inference in population genetics. A number of open problems are highlighted in the discussion.
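A minimal sketch of likelihood-free MCMC in the spirit of this abstract is given below: a proposed parameter is kept only if data simulated from it land close to the observation, followed by a Metropolis-Hastings correction that involves the prior alone. The toy model (inferring a normal mean from a sample mean), the tolerance `eps`, and all names are assumptions for illustration.

```python
import random

def abc_mcmc(observed_stat, simulate_stat, prior_density,
             n_iter, eps, step, theta0, rng):
    """Likelihood-free MCMC with a symmetric random-walk proposal."""
    theta = theta0
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        if prior_density(proposal) > 0.0:
            simulated = simulate_stat(proposal, rng)
            # Simulation replaces the likelihood evaluation.
            if abs(simulated - observed_stat) <= eps:
                ratio = prior_density(proposal) / prior_density(theta)
                if rng.random() < min(1.0, ratio):
                    theta = proposal
        chain.append(theta)  # rejected proposals repeat the current value
    return chain

def simulate_mean(theta, rng, n=20):
    # Toy simulator: sample mean of n normal draws with unit variance.
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n

def uniform_prior(theta):
    return 0.1 if -5.0 <= theta <= 5.0 else 0.0

rng = random.Random(2)
chain = abc_mcmc(1.5, simulate_mean, uniform_prior,
                 n_iter=5000, eps=0.2, step=0.5, theta0=1.5, rng=rng)
posterior_mean = sum(chain) / len(chain)
```

Shrinking `eps` brings the sampled distribution closer to the true posterior at the cost of a lower acceptance rate.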
Using Games to Teach Markov Chains
ERIC Educational Resources Information Center
Johnson, Roger W.
2003-01-01
Games are promoted as examples for classroom discussion of stationary Markov chains. In a game context Markov chain terminology and results are made concrete, interesting, and entertaining. Game length for several-player games such as "Hi Ho! Cherry-O" and "Chutes and Ladders" is investigated and new, simple formulas are given. Slight…
Markov chain Monte Carlo simulation for Bayesian Hidden Markov Models
NASA Astrophysics Data System (ADS)
Chan, Lay Guat; Ibrahim, Adriana Irawati Nur Binti
2016-10-01
A hidden Markov model (HMM) is a mixture model which has a Markov chain with finite states as its mixing distribution. HMMs have been applied to a variety of fields, such as speech and face recognition. The main purpose of this study is to investigate the Bayesian approach to HMMs. Using this approach, we can simulate from the parameters' posterior distribution using some Markov chain Monte Carlo (MCMC) sampling methods. HMMs seem to be useful, but there are some limitations. Therefore, by using the Mixture of Dirichlet processes Hidden Markov Model (MDPHMM) based on Yau et al. (2011), we hope to overcome these limitations. We shall conduct a simulation study using MCMC methods to investigate the performance of this model.
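Any MCMC scheme over HMM parameters needs the likelihood of the observations for a given parameter value. The sketch below computes it with the standard forward recursion for a discrete-emission HMM; the two-state parameters are made up for illustration, and the MDPHMM extension is not covered.

```python
from itertools import product

def hmm_likelihood(initial, transition, emission, observations):
    """Forward algorithm: P(observations) summed over all hidden paths."""
    n = len(initial)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n)) * emission[j][obs]
            for j in range(n)
        ]
    return sum(alpha)

# Hypothetical two-state HMM with binary emissions.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
emission = [[0.9, 0.1], [0.3, 0.7]]

single = hmm_likelihood(initial, transition, emission, [0])
# Sanity check: likelihoods of all length-3 sequences should sum to one.
total = sum(hmm_likelihood(initial, transition, emission, seq)
            for seq in product([0, 1], repeat=3))
```

For long sequences the recursion is normally run with rescaling or in log space to avoid underflow; that refinement is omitted here.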
Handling target obscuration through Markov chain observations
NASA Astrophysics Data System (ADS)
Kouritzin, Michael A.; Wu, Biao
2008-04-01
Target Obscuration, including foliage or building obscuration of ground targets and landscape or horizon obscuration of airborne targets, plagues many real world filtering problems. In particular, ground moving target identification Doppler radar, mounted on a surveillance aircraft or unattended airborne vehicle, is used to detect motion consistent with targets of interest. However, these targets try to obscure themselves (at least partially) by, for example, traveling along the edge of a forest or around buildings. This has the effect of creating random blockages in the Doppler radar image that move dynamically and somewhat randomly through this image. Herein, we address tracking problems with target obscuration by building memory into the observations, eschewing the usual corrupted, distorted partial measurement assumptions of filtering in favor of dynamic Markov chain assumptions. In particular, we assume the observations are a Markov chain whose transition probabilities depend upon the signal. The state of the observation Markov chain attempts to depict the current obscuration and the Markov chain dynamics are used to handle the evolution of the partially obscured radar image. Modifications of the classical filtering equations that allow observation memory (in the form of a Markov chain) are given. We use particle filters to estimate the position of the moving targets. Moreover, positive proof-of-concept simulations are included.
Markov chains for testing redundant software
NASA Technical Reports Server (NTRS)
White, Allan L.; Sjogren, Jon A.
1988-01-01
A preliminary design for a validation experiment has been developed that addresses several problems unique to assuring the extremely high quality of multiple-version programs in process-control software. The procedure uses Markov chains to model the error states of the multiple version programs. The programs are observed during simulated process-control testing, and estimates are obtained for the transition probabilities between the states of the Markov chain. The experimental Markov chain model is then expanded into a reliability model that takes into account the inertia of the system being controlled. The reliability of the multiple version software is computed from this reliability model at a given confidence level using confidence intervals obtained for the transition probabilities during the experiment. An example demonstrating the method is provided.
Entropy production fluctuations of finite Markov chains
NASA Astrophysics Data System (ADS)
Jiang, Da-Quan; Qian, Min; Zhang, Fu-Xi
2003-09-01
For almost every trajectory segment over a finite time span of a finite Markov chain with any given initial distribution, the logarithm of the ratio of its probability to that of its time-reversal converges exponentially to the entropy production rate of the Markov chain. The large deviation rate function has a symmetry of Gallavotti-Cohen type, which is called the fluctuation theorem. Moreover, similar symmetries also hold for the rate functions of the joint distributions of general observables and the logarithmic probability ratio.
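The quantity this abstract refers to can be computed directly for a small chain. The sketch below uses the standard expression for the entropy production rate of a stationary chain, the sum over i, j of pi_i p_ij log(pi_i p_ij / (pi_j p_ji)), and takes the stationary distribution as an input. The three-state biased cycle is an assumed example.

```python
import math

def entropy_production_rate(P, pi):
    """Entropy production rate of a stationary finite Markov chain.

    P is a row-stochastic matrix (list of lists) and pi its stationary
    distribution. Returns math.inf if some transition has no reverse.
    """
    rate = 0.0
    n = len(P)
    for i in range(n):
        for j in range(n):
            forward = pi[i] * P[i][j]
            backward = pi[j] * P[j][i]
            if forward > 0.0:
                if backward == 0.0:
                    return math.inf
                rate += forward * math.log(forward / backward)
    return rate

uniform = [1.0 / 3.0] * 3
# Biased cycle: clockwise with probability 0.75, counterclockwise 0.25.
biased = [[0.0, 0.75, 0.25], [0.25, 0.0, 0.75], [0.75, 0.25, 0.0]]
# Unbiased cycle satisfies detailed balance, so the rate should vanish.
balanced = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]

ep_biased = entropy_production_rate(biased, uniform)
ep_balanced = entropy_production_rate(balanced, uniform)
```

For the biased cycle the result is (0.75 - 0.25) * log(0.75 / 0.25), which is strictly positive, while the reversible chain gives zero.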
Parallel Markov chain Monte Carlo simulations.
Ren, Ruichao; Orkoulas, G
2007-06-07
With strict detailed balance, parallel Monte Carlo simulation through domain decomposition cannot be validated with conventional Markov chain theory, which describes an intrinsically serial stochastic process. In this work, the parallel version of Markov chain theory and its role in accelerating Monte Carlo simulations via cluster computing is explored. It is shown that sequential updating is the key to improving efficiency in parallel simulations through domain decomposition. A parallel scheme is proposed to reduce interprocessor communication or synchronization, which slows down parallel simulation with increasing number of processors. Parallel simulation results for the two-dimensional lattice gas model show substantial reduction of simulation time for systems of moderate and large size.
Likelihood free inference for Markov processes: a comparison.
Owen, Jamie; Wilkinson, Darren J; Gillespie, Colin S
2015-04-01
Approaches to Bayesian inference for problems with intractable likelihoods have become increasingly important in recent years. Approximate Bayesian computation (ABC) and "likelihood free" Markov chain Monte Carlo techniques are popular methods for tackling inference in these scenarios but such techniques are computationally expensive. In this paper we compare the two approaches to inference, with a particular focus on parameter inference for stochastic kinetic models, widely used in systems biology. Discrete time transition kernels for models of this type are intractable for all but the most trivial systems yet forward simulation is usually straightforward. We discuss the relative merits and drawbacks of each approach whilst considering the computational cost implications and efficiency of these techniques. In order to explore the properties of each approach we examine a range of observation regimes using two example models. We use a Lotka-Volterra predator-prey model to explore the impact of full or partial species observations using various time course observations under the assumption of known and unknown measurement error. Further investigation into the impact of observation error is then made using a Schlögl system, a test case which exhibits bi-modal state stability in some regions of parameter space.
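The abstract notes that forward simulation of stochastic kinetic models is usually straightforward even when the transition kernel is intractable. A minimal sketch of such a simulator for a Lotka-Volterra model is shown below, using Gillespie's direct method. The three-reaction formulation, rate constants, and initial populations are assumptions, not values quoted from the paper.

```python
import random

def gillespie_lotka_volterra(x, y, rates, t_max, rng):
    """Simulate prey (x) and predator (y) counts up to time t_max.

    Reactions (assumed): prey birth, predation, predator death,
    with hazards c1*x, c2*x*y and c3*y respectively.
    """
    c1, c2, c3 = rates
    t = 0.0
    path = [(t, x, y)]
    while True:
        hazards = [c1 * x, c2 * x * y, c3 * y]
        total = sum(hazards)
        if total == 0.0:
            break                      # both species extinct
        t += rng.expovariate(total)    # waiting time to the next reaction
        if t > t_max:
            break
        u = rng.random() * total       # choose which reaction fires
        if u < hazards[0]:
            x += 1
        elif u < hazards[0] + hazards[1]:
            x -= 1
            y += 1
        else:
            y -= 1
        path.append((t, x, y))
    return path

rng = random.Random(3)
path = gillespie_lotka_volterra(50, 100, (1.0, 0.005, 0.6), t_max=5.0, rng=rng)
```

Both ABC and "likelihood free" MCMC call a simulator of this kind many times per parameter value, which is where the computational cost discussed in the abstract comes from.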
Finite Markov Chains and Random Discrete Structures
1994-07-26
arrays with fixed margins 4. Persi Diaconis and Susan Holmes, Three Examples of Monte-Carlo Markov Chains: at the Interface between Statistical Computing...solutions for a mathematical model of thermomechanical phase transitions in shape memory materials with Landau-Ginzburg free energy 1168 Angelo Favini
The cutoff phenomenon in finite Markov chains.
Diaconis, P
1996-01-01
Natural mixing processes modeled by Markov chains often show a sharp cutoff in their convergence to long-time behavior. This paper presents problems where the cutoff can be proved (card shuffling, the Ehrenfests' urn). It shows that chains with polynomial growth (drunkard's walk) do not show cutoffs. The best general understanding of such cutoffs (high multiplicity of second eigenvalues due to symmetry) is explored. Examples are given where the symmetry is broken but the cutoff phenomenon persists. PMID:11607633
Numerical methods in Markov chain modeling
NASA Technical Reports Server (NTRS)
Philippe, Bernard; Saad, Youcef; Stewart, William J.
1989-01-01
Several methods for computing stationary probability distributions of Markov chains are described and compared. The main linear algebra problem consists of computing an eigenvector of a sparse, usually nonsymmetric, matrix associated with a known eigenvalue. It can also be cast as a problem of solving a homogeneous singular linear system. Several methods based on combinations of Krylov subspace techniques are presented. The performance of these methods on some realistic problems is compared.
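The eigenvector problem described here can be illustrated with plain power iteration, which repeatedly applies the transition matrix until the distribution stops changing. This is a deliberately simpler technique than the Krylov subspace methods the report compares, and it is shown only to make the problem statement concrete; the two-state matrix is an assumed example.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100000):
    """Left eigenvector of P for eigenvalue 1, found by power iteration.

    P is a row-stochastic matrix given as a list of lists. Convergence
    assumes the chain is irreducible and aperiodic.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of pi <- pi P.
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")

P = [[0.9, 0.1], [0.5, 0.5]]
pi = stationary_distribution(P)
```

Power iteration converges at a rate set by the second-largest eigenvalue modulus, which is why faster methods matter for large, slowly mixing chains.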
On Measures Driven by Markov Chains
NASA Astrophysics Data System (ADS)
Heurteaux, Yanick; Stos, Andrzej
2014-12-01
We study measures on which are driven by a finite Markov chain and which generalize the famous Bernoulli products. We propose a hands-on approach to determine the structure function and to prove that the multifractal formalism is satisfied. Formulas for the dimension of the measures and for the Hausdorff dimension of their supports are also provided. Finally, we identify the measures with maximal dimension.
Markov Chain Monte Carlo and Irreversibility
NASA Astrophysics Data System (ADS)
Ottobre, Michela
2016-06-01
Markov Chain Monte Carlo (MCMC) methods are statistical methods designed to sample from a given measure π by constructing a Markov chain that has π as invariant measure and that converges to π. Most MCMC algorithms make use of chains that satisfy the detailed balance condition with respect to π; such chains are therefore reversible. On the other hand, recent work [18, 21, 28, 29] has stressed several advantages of using irreversible processes for sampling. Roughly speaking, irreversible diffusions converge to equilibrium faster (and lead to smaller asymptotic variance as well). In this paper we discuss some of the recent progress in the study of nonreversible MCMC methods. In particular: i) we explain some of the difficulties that arise in the analysis of nonreversible processes and we discuss some analytical methods to approach the study of continuous-time irreversible diffusions; ii) most of the rigorous results on irreversible diffusions are available for continuous-time processes; however, for computational purposes one needs to discretize such dynamics. It is well known that the resulting discretized chain will not, in general, retain all the good properties of the process that it is obtained from. In particular, if we want to preserve the invariance of the target measure, the chain might no longer be reversible. Therefore iii) we conclude by presenting an MCMC algorithm, the SOL-HMC algorithm [23], which results from a nonreversible discretization of a nonreversible dynamics.
Growth and Dissolution of Macromolecular Markov Chains
NASA Astrophysics Data System (ADS)
Gaspard, Pierre
2016-07-01
The kinetics and thermodynamics of free living copolymerization are studied for processes with rates depending on k monomeric units of the macromolecular chain behind the unit that is attached or detached. In this case, the sequence of monomeric units in the growing copolymer is a kth-order Markov chain. In the regime of steady growth, the statistical properties of the sequence are determined analytically in terms of the attachment and detachment rates. In this way, the mean growth velocity as well as the thermodynamic entropy production and the sequence disorder can be calculated systematically. These different properties are also investigated in the regime of depolymerization where the macromolecular chain is dissolved by the surrounding solution. In this regime, the entropy production is shown to satisfy Landauer's principle.
Stochastic seismic tomography by interacting Markov chains
NASA Astrophysics Data System (ADS)
Bottero, Alexis; Gesret, Alexandrine; Romary, Thomas; Noble, Mark; Maisons, Christophe
2016-10-01
Markov chain Monte Carlo sampling methods are widely used for non-linear Bayesian inversion where no analytical expression for the forward relation between data and model parameters is available. Contrary to the linear(ized) approaches, they naturally allow the uncertainties on the model found to be evaluated. Nevertheless, their use is problematic in high-dimensional model spaces, especially when the computational cost of the forward problem is significant and/or the a posteriori distribution is multimodal. In this case, the chain can stay stuck in one of the modes and hence not provide an exhaustive sampling of the distribution of interest. We present here a still relatively unknown algorithm that allows interaction between several Markov chains at different temperatures. These interactions (based on importance resampling) ensure a robust sampling of any posterior distribution and thus provide a way to efficiently tackle complex fully non-linear inverse problems. The algorithm is easy to implement and is well adapted to run on parallel supercomputers. In this paper, the algorithm is first introduced and applied to a synthetic multimodal distribution in order to demonstrate its robustness and efficiency compared to a simulated annealing method. It is then applied in the framework of first arrival traveltime seismic tomography on real data recorded in the context of hydraulic fracturing. To carry out this study a wavelet-based adaptive model parametrization has been used. This allows the a priori information provided by sonic logs to be integrated and the dimension of the problem to be reduced optimally.
Markov Chain Analysis of Musical Dice Games
NASA Astrophysics Data System (ADS)
Volchenkov, D.; Dawin, J. R.
2012-07-01
A system for using dice to compose music randomly is known as the musical dice game. The discrete time MIDI models of 804 pieces of classical music written by 29 composers have been encoded into the transition matrices and studied by Markov chains. Contrary to human languages, entropy dominates over redundancy, in the musical dice games based on the compositions of classical music. The maximum complexity is achieved on the blocks consisting of just a few notes (8 notes, for the musical dice games generated over Bach's compositions). First passage times to notes can be used to resolve tonality and feature a composer.
Dynamic Bandwidth Provisioning Using Markov Chain Based on RSVP
2013-09-01
Cambridge University Press, 2008. [20] P. Bremaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation and Queues, New York, NY, Springer Science...is successful. Qualnet, a simulation platform for the wireless environment is used to simulate the algorithm (integration of Markov chain ...in Qualnet, the simulation platform used. III. GENERAL DISCUSSION OF MARKOV CHAIN ALGORITHM AND RSVP
Equilibrium Control Policies for Markov Chains
Malikopoulos, Andreas
2011-01-01
The average cost criterion has held great intuitive appeal and has attracted considerable attention. It is widely employed when controlling dynamic systems that evolve stochastically over time by means of formulating an optimization problem to achieve long-term goals efficiently. The average cost criterion is especially appealing when the decision-making process is long compared to other timescales involved, and there is no compelling motivation to select short-term optimization. This paper addresses the problem of controlling a Markov chain so as to minimize the average cost per unit time. Our approach treats the problem as a dual constrained optimization problem. We derive conditions guaranteeing that a saddle point exists for the new dual problem and we show that this saddle point is an equilibrium control policy for each state of the Markov chain. For practical situations with constraints consistent with those we study here, our results imply that recognition of such saddle points may be of value in deriving in real time an optimal control policy.
Lifting—A nonreversible Markov chain Monte Carlo algorithm
NASA Astrophysics Data System (ADS)
Vucelja, Marija
2016-12-01
Markov chain Monte Carlo algorithms are invaluable tools for exploring stationary properties of physical systems, especially in situations where direct sampling is unfeasible. Common implementations of Monte Carlo algorithms employ reversible Markov chains. Reversible chains obey detailed balance and thus ensure that the system will eventually relax to equilibrium, though detailed balance is not necessary for convergence to equilibrium. We review nonreversible Markov chains, which violate detailed balance and yet still relax to a given target stationary distribution. In particular cases, nonreversible Markov chains are substantially better at sampling than the conventional reversible Markov chains with up to a square root improvement in the convergence time to the steady state. One kind of nonreversible Markov chain is constructed from the reversible ones by enlarging the state space and by modifying and adding extra transition rates to create non-reversible moves. Because of the augmentation of the state space, such chains are often referred to as lifted Markov Chains. We illustrate the use of lifted Markov chains for efficient sampling on several examples. The examples include sampling on a ring, sampling on a torus, the Ising model on a complete graph, and the one-dimensional Ising model. We also provide a pseudocode implementation, review related work, and discuss the applicability of such methods.
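As an illustration of the ring example mentioned in this abstract, the following minimal sketch builds a lifted walk on a ring by attaching a direction to each site. The function names, the state indexing, and the flip probability of 1/n are assumptions made here for illustration and are not taken from the paper. The check confirms that the uniform distribution is stationary even though detailed balance fails.

```python
from fractions import Fraction


def lifted_ring_matrix(n, flip):
    """Transition matrix of a lifted walk on a ring of n sites.

    States are (site, direction) pairs stored at index 2*site + d,
    with d = 0 for clockwise and d = 1 for counter-clockwise motion.
    """
    size = 2 * n
    P = [[Fraction(0)] * size for _ in range(size)]
    for site in range(n):
        for d, step in ((0, 1), (1, -1)):
            here = 2 * site + d
            ahead = 2 * ((site + step) % n) + d  # keep moving the same way
            flipped = 2 * site + (1 - d)         # stay put, reverse direction
            P[here][ahead] += 1 - flip
            P[here][flipped] += flip
    return P


def is_stationary(P, pi):
    """True if pi P == pi exactly."""
    size = len(P)
    return all(
        sum(pi[i] * P[i][j] for i in range(size)) == pi[j] for j in range(size)
    )


def satisfies_detailed_balance(P, pi):
    """True if pi_i P_ij == pi_j P_ji for every pair of states."""
    size = len(P)
    return all(
        pi[i] * P[i][j] == pi[j] * P[j][i]
        for i in range(size)
        for j in range(size)
    )


n = 8
P = lifted_ring_matrix(n, Fraction(1, n))  # flip probability is illustrative
uniform = [Fraction(1, 2 * n)] * (2 * n)
print(is_stationary(P, uniform), satisfies_detailed_balance(P, uniform))
```

Exact rational arithmetic is used so that the stationarity check is an equality test rather than a floating-point comparison.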
Multivariate Markov chain modeling for stock markets
NASA Astrophysics Data System (ADS)
Maskawa, Jun-ichi
2003-06-01
We study a multivariate Markov chain model as a stochastic model of the price changes of portfolios in the framework of the mean field approximation. The time series of price changes are coded into sequences of up and down spins according to their signs. We start with a discussion of small portfolios consisting of two stock issues. The generalization of our model to portfolios of arbitrary size is constructed by a recurrence relation. The resultant form of the joint probability of the stationary state coincides with the Gibbs measure assigned to each configuration of a spin glass model. Through the analysis of actual portfolios, it has been shown that the synchronization of the direction of the price changes is well described by the model.
ERIC Educational Resources Information Center
Kim, Jee-Seon; Bolt, Daniel M.
2007-01-01
The purpose of this ITEMS module is to provide an introduction to Markov chain Monte Carlo (MCMC) estimation for item response models. A brief description of Bayesian inference is followed by an overview of the various facets of MCMC algorithms, including discussion of prior specification, sampling procedures, and methods for evaluating chain…
Maximally reliable Markov chains under energy constraints.
Escola, Sean; Eisele, Michael; Miller, Kenneth; Paninski, Liam
2009-07-01
Signal-to-noise ratios in physical systems can be significantly degraded if the outputs of the systems are highly variable. Biological processes for which highly stereotyped signal generations are necessary features appear to have reduced their signal variabilities by employing multiple processing steps. To better understand why this multistep cascade structure might be desirable, we prove that the reliability of a signal generated by a multistate system with no memory (i.e., a Markov chain) is maximal if and only if the system topology is such that the process steps irreversibly through each state, with transition rates chosen such that an equal fraction of the total signal is generated in each state. Furthermore, our result indicates that by increasing the number of states, it is possible to arbitrarily increase the reliability of the system. In a physical system, however, an energy cost is associated with maintaining irreversible transitions, and this cost increases with the number of such transitions (i.e., the number of states). Thus, an infinite-length chain, which would be perfectly reliable, is infeasible. To model the effects of energy demands on the maximally reliable solution, we numerically optimize the topology under two distinct energy functions that penalize either irreversible transitions or incommunicability between states, respectively. In both cases, the solutions are essentially irreversible linear chains, but with upper bounds on the number of states set by the amount of available energy. We therefore conclude that a physical system for which signal reliability is important should employ a linear architecture, with the number of states (and thus the reliability) determined by the intrinsic energy constraints of the system.
Manpower planning using Markov Chain model
NASA Astrophysics Data System (ADS)
Saad, Syafawati Ab; Adnan, Farah Adibah; Ibrahim, Haslinda; Rahim, Rahela
2014-07-01
Manpower planning is a planning model that describes the flow of manpower in response to policy changes. Researchers have made numerous attempts to develop models that track the movement of lecturers at various universities. Because a university employs a large number of lecturers, their movement is difficult to track, and no quantitative method is currently used to do so. This research aims to determine an appropriate manpower model for understanding the flow of lecturers in a university in Malaysia by determining the probability and mean time that lecturers remain in the same status rank. In addition, this research is intended to estimate the number of lecturers in each status rank (lecturer, senior lecturer and associate professor). Previous studies have applied several methods to manpower planning; the method adopted in this research is the Markov chain model. The results indicate that the manpower planning model is validated by comparison with the actual data. A smaller margin of error gives a better result, meaning that the projection is closer to the actual data. These results offer the university some suggestions for planning the hiring of lecturers and its future budget.
Differential evolution Markov chain with snooker updater and fewer chains
Vrugt, Jasper A; Ter Braak, Cajo J F
2008-01-01
Differential Evolution Markov Chain (DE-MC) is an adaptive MCMC algorithm, in which multiple chains are run in parallel. Standard DE-MC requires at least N=2d chains to be run in parallel, where d is the dimensionality of the posterior. This paper extends DE-MC with a snooker updater and shows by simulation and real examples that DE-MC can work for d up to 50-100 with fewer parallel chains (e.g. N=3) by exploiting information from their past by generating jumps from differences of pairs of past states. This approach extends the practical applicability of DE-MC and is shown to be about 5-26 times more efficient than the optimal Normal random walk Metropolis sampler for the 97.5% point of a variable from a 25-50 dimensional Student t_3 distribution. In a nonlinear mixed effects model example the approach outperformed a block-updater geared to the specific features of the model.
Inferring phenomenological models of Markov processes from data
NASA Astrophysics Data System (ADS)
Rivera, Catalina; Nemenman, Ilya
Microscopically accurate modeling of stochastic dynamics of biochemical networks is hard due to the extremely high dimensionality of the state space of such networks. Here we propose an algorithm for inference of phenomenological, coarse-grained models of Markov processes describing the network dynamics directly from data, without the intermediate step of microscopically accurate modeling. The approach relies on the linear nature of the Chemical Master Equation and uses Bayesian Model Selection for identification of parsimonious models that fit the data. When applied to synthetic data from the Kinetic Proofreading process (KPR), a common mechanism used by cells for increasing specificity of molecular assembly, the algorithm successfully uncovers the known coarse-grained description of the process. This phenomenological description has been noticed previously, but this time it is derived in an automated manner by the algorithm. James S. McDonnell Foundation Grant No. 220020321.
Assessing significance in a Markov chain without mixing.
Chikina, Maria; Frieze, Alan; Pegden, Wesley
2017-03-14
We present a statistical test to detect that a presented state of a reversible Markov chain was not chosen from a stationary distribution. In particular, given a value function for the states of the Markov chain, we would like to show rigorously that the presented state is an outlier with respect to the values, by establishing a [Formula: see text] value under the null hypothesis that it was chosen from a stationary distribution of the chain. A simple heuristic used in practice is to sample ranks of states from long random trajectories on the Markov chain and compare these with the rank of the presented state; if the presented state is a [Formula: see text] outlier compared with the sampled ranks (its rank is in the bottom [Formula: see text] of sampled ranks), then this observation should correspond to a [Formula: see text] value of [Formula: see text]. This significance is not rigorous, however, without good bounds on the mixing time of the Markov chain. Our test is the following: Given the presented state in the Markov chain, take a random walk from the presented state for any number of steps. We prove that observing that the presented state is an [Formula: see text]-outlier on the walk is significant at [Formula: see text] under the null hypothesis that the state was chosen from a stationary distribution. We assume nothing about the Markov chain beyond reversibility and show that significance at [Formula: see text] is best possible in general. We illustrate the use of our test with a potential application to the rigorous detection of gerrymandering in Congressional districting.
Markov chains and semi-Markov models in time-to-event analysis
Abner, Erin L.; Charnigo, Richard J.; Kryscio, Richard J.
2014-01-01
A variety of statistical methods are available to investigators for analysis of time-to-event data, often referred to as survival analysis. Kaplan-Meier estimation and Cox proportional hazards regression are commonly employed tools but are not appropriate for all studies, particularly in the presence of competing risks and when multiple or recurrent outcomes are of interest. Markov chain models can accommodate censored data, competing risks (informative censoring), multiple outcomes, recurrent outcomes, frailty, and non-constant survival probabilities. Markov chain models, though often overlooked by investigators in time-to-event analysis, have long been used in clinical studies and have widespread application in other fields. PMID:24818062
Herbei, Radu; Kubatko, Laura
2013-03-26
Markov chains are widely used for modeling in many areas of molecular biology and genetics. As the complexity of such models advances, it becomes increasingly important to assess the rate at which a Markov chain converges to its stationary distribution in order to carry out accurate inference. A common measure of convergence to the stationary distribution is the total variation distance, but this measure can be difficult to compute when the state space of the chain is large. We propose a Monte Carlo method to estimate the total variation distance that can be applied in this situation, and we demonstrate how the method can be efficiently implemented by taking advantage of GPU computing techniques. We apply the method to two Markov chains on the space of phylogenetic trees, and discuss the implications of our findings for the development of algorithms for phylogenetic inference.
Dynamical Systems Based Non Equilibrium Statistical Mechanics for Markov Chains
NASA Astrophysics Data System (ADS)
Prevost, Mireille
We introduce an abstract framework concerning non-equilibrium statistical mechanics in the specific context of Markov chains. This framework encompasses both the Evans-Searles and the Gallavotti-Cohen fluctuation theorems. To support and expand on these concepts, several results are proven, among which a central limit theorem and a large deviation principle. The interest for Markov chains is twofold. First, they model a great variety of physical systems. Secondly, their simplicity allows for an easy introduction to an otherwise complicated field encompassing the statistical mechanics of Anosov and Axiom A diffeomorphisms. We give two examples relating the present framework to physical cases modelled by Markov chains. One of these concerns chemical reactions and links key concepts from the framework to their well known physical counterpart.
Markov chain solution of photon multiple scattering through turbid slabs.
Lin, Ying; Northrop, William F; Li, Xuesong
2016-11-14
This work introduces a Markov Chain solution to model photon multiple scattering through turbid slabs via an anisotropic scattering process, i.e., Mie scattering. Results show that the proposed Markov Chain model agrees with the commonly used Monte Carlo simulation for various media, such as media with non-uniform phase functions and absorbing media. The proposed Markov Chain solution method successfully converts the complex multiple scattering problem with practical phase functions into a matrix form and solves for the transmitted/reflected photon angular distributions by matrix multiplications. Such characteristics would potentially allow practical inversions by matrix manipulation or stochastic algorithms where widely applied stochastic methods such as Monte Carlo simulations usually fail, and thus enable practical diagnostic reconstructions in fields such as medical diagnosis, spray analysis, and atmospheric science.
Stochastic Dynamics through Hierarchically Embedded Markov Chains
NASA Astrophysics Data System (ADS)
Vasconcelos, Vítor V.; Santos, Fernando P.; Santos, Francisco C.; Pacheco, Jorge M.
2017-02-01
Studying dynamical phenomena in finite populations often involves Markov processes of significant mathematical and/or computational complexity, which rapidly becomes prohibitive with increasing population size or an increasing number of individual configuration states. Here, we develop a framework that allows us to define a hierarchy of approximations to the stationary distribution of general systems that can be described as discrete Markov processes with time invariant transition probabilities and (possibly) a large number of states. This results in an efficient method for studying social and biological communities in the presence of stochastic effects—such as mutations in evolutionary dynamics and a random exploration of choices in social systems—including situations where the dynamics encompasses the existence of stable polymorphic configurations, thus overcoming the limitations of existing methods. The present formalism is shown to be general in scope, widely applicable, and of relevance to a variety of interdisciplinary problems.
Harmonic Oscillator Model for Radin's Markov-Chain Experiments
NASA Astrophysics Data System (ADS)
Sheehan, D. P.; Wright, J. H.
2006-10-01
The conscious observer stands as a central figure in the measurement problem of quantum mechanics. Recent experiments by Radin involving linear Markov chains driven by random number generators illuminate the role and temporal dynamics of observers interacting with quantum mechanically labile systems. In this paper a Lagrangian interpretation of these experiments indicates that the evolution of Markov chain probabilities can be modeled as damped harmonic oscillators. The results are best interpreted in terms of symmetric equicausal determinism rather than strict retrocausation, as posited by Radin. Based on the present analysis, suggestions are made for more advanced experiments.
Markov chain Monte Carlo linkage analysis of complex quantitative phenotypes.
Hinrichs, A; Reich, T
2001-01-01
We report a Markov chain Monte Carlo analysis of the five simulated quantitative traits in Genetic Analysis Workshop 12 using the Loki software. Our objectives were to determine the efficacy of the Markov chain Monte Carlo method and to test a new scoring technique. Our initial blind analysis, on replicate 42 (the "best replicate"), successfully detected four out of the five disease loci and found no false positives. A power analysis shows that the software could usually detect 4 of the 10 trait/gene combinations at an empirical point-wise p-value of 1.5 × 10^-4.
Markov chain Monte Carlo methods: an introductory example
NASA Astrophysics Data System (ADS)
Klauenberg, Katy; Elster, Clemens
2016-02-01
When the Guide to the Expression of Uncertainty in Measurement (GUM) and methods from its supplements are not applicable, the Bayesian approach may be a valid and welcome alternative. Evaluating the posterior distribution, estimates or uncertainties involved in Bayesian inferences often requires numerical methods to avoid high-dimensional integrations. Markov chain Monte Carlo (MCMC) sampling is such a method: powerful, flexible and widely applied. Here, a concise introduction is given, illustrated by a simple, typical example from metrology. The Metropolis-Hastings algorithm is the most basic and yet flexible MCMC method. Its underlying concepts are explained and the algorithm is given step by step. The few lines of software code required for its implementation invite interested readers to get started. Diagnostics to evaluate the performance and common algorithmic choices are illustrated to calibrate the Metropolis-Hastings algorithm for efficiency. Routine application of MCMC algorithms may currently be hindered by the difficulty of assessing the convergence of MCMC output and thus of assuring the validity of results. An example points to the importance of convergence and initiates discussion about advantages as well as areas of research. Available software tools are mentioned throughout.
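To make the "few lines of software code" concrete, here is a minimal random-walk Metropolis sketch. It is not the authors' code: the standard normal target, the step size of 2.4, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step, rng):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_target returns the log density up to an additive constant.
    Returns the list of samples and the observed acceptance rate.
    """
    x = x0
    log_p = log_target(x)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # The Gaussian proposal is symmetric, so the Hastings ratio
        # reduces to the ratio of target densities.
        u = 1.0 - rng.random()  # in (0, 1], keeps log(u) finite
        if math.log(u) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


def log_std_normal(x):
    """Log density of N(0, 1), up to a constant (illustrative target)."""
    return -0.5 * x * x


rng = random.Random(0)
samples, rate = metropolis_hastings(log_std_normal, 0.0, 20000, 2.4, rng)
kept = samples[2000:]  # discard an initial burn-in segment
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"acceptance rate {rate:.2f}, mean {mean:.3f}, variance {var:.3f}")
```

Working with log densities avoids underflow for sharply peaked posteriors, and the acceptance rate is returned because it is the quantity usually tuned through the step size.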
Searching for efficient Markov chain Monte Carlo proposal kernels.
Yang, Ziheng; Rodríguez, Carlos E
2013-11-26
Markov chain Monte Carlo (MCMC) or the Metropolis-Hastings algorithm is a simulation algorithm that has made modern Bayesian statistical inference possible. Nevertheless, the efficiency of different Metropolis-Hastings proposal kernels has rarely been studied except for the Gaussian proposal. Here we propose a unique class of Bactrian kernels, which avoid proposing values that are very close to the current value, and compare their efficiency with a number of proposals for simulating different target distributions, with efficiency measured by the asymptotic variance of a parameter estimate. The uniform kernel is found to be more efficient than the Gaussian kernel, whereas the Bactrian kernel is even better. When optimal scales are used for both, the Bactrian kernel is at least 50% more efficient than the Gaussian. Implementation in a Bayesian program for molecular clock dating confirms the general applicability of our results to generic MCMC algorithms. Our results refute a previous claim that all proposals had nearly identical performance and will prompt further research into efficient MCMC proposals.
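The abstract does not give the formula for the Bactrian kernel. The sketch below assumes one form that has the stated property: an equal mixture of two normal humps centred at +m and -m, scaled so that the increment has overall standard deviation `scale`. The value m = 0.95 and the function name are illustrative assumptions.

```python
import math
import random


def bactrian_step(rng, scale, m=0.95):
    """Draw a two-humped proposal increment.

    Assumed form: with probability 1/2 each, sample from a normal hump
    centred at +m or -m with variance 1 - m*m, then multiply by scale.
    The mixture has mean 0 and variance scale**2, and it is symmetric,
    so it can replace a Gaussian increment in a Metropolis sampler.
    """
    centre = m if rng.random() < 0.5 else -m
    return scale * rng.gauss(centre, math.sqrt(1.0 - m * m))


rng = random.Random(1)
draws = [bactrian_step(rng, 1.0) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
near_zero = sum(1 for d in draws if abs(d) < 0.1) / len(draws)
print(f"mean {mean:.3f}, variance {var:.3f}, fraction near zero {near_zero:.4f}")
```

A standard normal increment would fall within 0.1 of zero about 8% of the time. Under the assumed mixture that fraction is far smaller, which is the behaviour the abstract describes.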
Influence of credit scoring on the dynamics of Markov chain
NASA Astrophysics Data System (ADS)
Galina, Timofeeva
2015-11-01
Markov processes are widely used to model the dynamics of a credit portfolio and to forecast the portfolio risk and profitability. In the Markov chain model the loan portfolio is divided into several groups of different quality, which is determined by the presence of indebtedness and its terms. It is proposed that the dynamics of the portfolio shares be described by a multistage controlled system. The article outlines a mathematical formalization of controls that reflect the actions of the bank's management to improve the loan portfolio quality. The most important control is the organization of the approval procedure for loan applications. Credit scoring is studied as a control affecting the dynamic system. Different formalizations of "good" and "bad" consumers are proposed in connection with the Markov chain model.
Markov chain for estimating human mitochondrial DNA mutation pattern
NASA Astrophysics Data System (ADS)
Vantika, Sandy; Pasaribu, Udjianna S.
2015-12-01
The Markov chain was proposed to estimate the human mitochondrial DNA mutation pattern. One DNA sequence was taken randomly from 100 sequences in Genbank. The nucleotide transition matrix and mutation transition matrix were estimated from this sequence. We determined whether the states (mutation/normal) are recurrent or transient. The results showed that both of them are recurrent.
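Estimating a nucleotide transition matrix from a single sequence amounts to counting adjacent pairs and normalising each row. The sketch below shows this on a made-up toy sequence, not on GenBank data; the function name and the state alphabet are assumptions.

```python
from collections import Counter


def transition_matrix(sequence, states="ACGT"):
    """Maximum-likelihood first-order transition probabilities.

    Counts each adjacent pair (a, b) in the sequence and divides by the
    number of transitions leaving a. Rows with no observations are left
    as zeros.
    """
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return matrix


toy = "ACGTACGGTACA"  # hypothetical sequence for illustration only
P = transition_matrix(toy)
for a in "ACGT":
    print(a, [round(P[a][b], 3) for b in "ACGT"])
```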
Operations and support cost modeling using Markov chains
NASA Technical Reports Server (NTRS)
Unal, Resit
1989-01-01
Systems for future missions will be selected with life cycle costs (LCC) as a primary evaluation criterion. This reflects the current realization that only systems which are considered affordable will be built in the future due to the national budget constraints. Such an environment calls for innovative cost modeling techniques which address all of the phases a space system goes through during its life cycle, namely: design and development, fabrication, operations and support, and retirement. A significant portion of the LCC for reusable systems is generated during the operations and support phase (OS). Typically, OS costs can account for 60 to 80 percent of the total LCC. Clearly, OS costs are wholly determined or at least strongly influenced by decisions made during the design and development phases of the project. As a result, OS costs need to be considered and estimated early in the conceptual phase. To be effective, an OS cost estimating model needs to account for actual instead of ideal processes by associating cost elements with probabilities. One approach that may be suitable for OS cost modeling is the use of the Markov Chain Process. Markov chains are an important method of probabilistic analysis for operations research analysts but they are rarely used for life cycle cost analysis. This research effort evaluates the use of Markov Chains in LCC analysis by developing an OS cost model for a hypothetical reusable space transportation vehicle (HSTV) and suggests further uses of the Markov Chain process as a design-aid tool.
Bayesian internal dosimetry calculations using Markov Chain Monte Carlo.
Miller, G; Martz, H F; Little, T T; Guilmette, R
2002-01-01
A new numerical method for solving the inverse problem of internal dosimetry is described. The new method uses Markov Chain Monte Carlo and the Metropolis algorithm. Multiple intake amounts, biokinetic types, and times of intake are determined from bioassay data by integrating over the Bayesian posterior distribution. The method appears definitive, but its application requires a large amount of computing time.
Exact goodness-of-fit tests for Markov chains.
Besag, J; Mondal, D
2013-06-01
Goodness-of-fit tests are useful in assessing whether a statistical model is consistent with available data. However, the usual χ² asymptotics often fail, either because of the paucity of the data or because a nonstandard test statistic is of interest. In this article, we describe exact goodness-of-fit tests for first- and higher order Markov chains, with particular attention given to time-reversible ones. The tests are obtained by conditioning on the sufficient statistics for the transition probabilities and are implemented by simple Monte Carlo sampling or by Markov chain Monte Carlo. They apply both to single and to multiple sequences and allow a free choice of test statistic. Three examples are given. The first concerns multiple sequences of dry and wet January days for the years 1948-1983 at Snoqualmie Falls, Washington State, and suggests that standard analysis may be misleading. The second one is for a four-state DNA sequence and lends support to the original conclusion that a second-order Markov chain provides an adequate fit to the data. The last one is six-state atomistic data arising in molecular conformational dynamics simulation of solvated alanine dipeptide and points to strong evidence against a first-order reversible Markov chain at 6 picosecond time steps.
Building Higher-Order Markov Chain Models with EXCEL
ERIC Educational Resources Information Center
Ching, Wai-Ki; Fung, Eric S.; Ng, Michael K.
2004-01-01
Categorical data sequences occur in many applications such as forecasting, data mining and bioinformatics. In this note, we present higher-order Markov chain models for modelling categorical data sequences with an efficient algorithm for solving the model parameters. The algorithm can be implemented easily in a Microsoft EXCEL worksheet. We give a…
Exploring Mass Perception with Markov Chain Monte Carlo
ERIC Educational Resources Information Center
Cohen, Andrew L.; Ross, Michael G.
2009-01-01
Several previous studies have examined the ability to judge the relative mass of objects in idealized collisions. With a newly developed technique of psychological Markov chain Monte Carlo sampling (A. N. Sanborn & T. L. Griffiths, 2008), this work explores participants' perceptions of different collision mass ratios. The results reveal…
Using Markov Chain Analyses in Counselor Education Research
ERIC Educational Resources Information Center
Duys, David K.; Headrick, Todd C.
2004-01-01
This study examined the efficacy of an infrequently used statistical analysis in counselor education research. A Markov chain analysis was used to examine hypothesized differences between students' use of counseling skills in an introductory course. Thirty graduate students participated in the study. Independent raters identified the microskills…
Students' Progress throughout Examination Process as a Markov Chain
ERIC Educational Resources Information Center
Hlavatý, Robert; Dömeová, Ludmila
2014-01-01
The paper is focused on students of Mathematical methods in economics at the Czech university of life sciences (CULS) in Prague. The idea is to create a model of students' progress throughout the whole course using the Markov chain approach. Each student has to go through various stages of the course requirements where his success depends on the…
Fuzzy Markov random fields versus chains for multispectral image segmentation.
Salzenstein, Fabien; Collet, Christophe
2006-11-01
This paper deals with a comparison of recent statistical models based on fuzzy Markov random fields and chains for multispectral image segmentation. The fuzzy scheme takes into account discrete and continuous classes which model the imprecision of the hidden data. In this framework, we assume the dependence between bands and we express the general model for the covariance matrix. A fuzzy Markov chain model is developed in an unsupervised way. This method is compared with the fuzzy Markovian field model previously proposed by one of the authors. The segmentation task is processed with Bayesian tools, such as the well-known MPM (Mode of Posterior Marginals) criterion. Our goal is to compare the robustness and rapidity for both methods (fuzzy Markov fields versus fuzzy Markov chains). Indeed, such fuzzy-based procedures seem to be a good answer, e.g., for astronomical observations when the patterns present diffuse structures. Moreover, these approaches allow us to process missing data in one or several spectral bands which correspond to specific situations in astronomy. To validate both models, we perform and compare the segmentation on synthetic images and raw multispectral astronomical data.
Time operator of Markov chains and mixing times. Applications to financial data
NASA Astrophysics Data System (ADS)
Gialampoukidis, I.; Gustafson, K.; Antoniou, I.
2014-12-01
We extend the notion of Time Operator from Kolmogorov Dynamical Systems and Bernoulli processes to Markov processes. The general methodology is presented and illustrated in the simple case of binary processes. We present a method to compute the eigenfunctions of the Time Operator. Internal Ages are related to other characteristic times of Markov chains, namely the Kemeny time, the convergence rate and Goodman’s intrinsic time. We clarify the concept of mixing time by providing analytic formulas for two-state Markov chains. Explicit formulas for mixing times are presented for any two-state regular Markov chain. The mixing time of a Markov chain is determined also by the Time Operator of the Markov chain, within its Age computation. We illustrate these results in terms of two realistic examples: A Markov chain from US GNP data and a Markov chain from Dow Jones closing prices. We propose moreover a representation for the Kemeny constant, in terms of internal Ages.
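For a two-state chain, the stationary distribution and convergence rate have closed forms. The sketch below uses the standard total-variation definition of mixing time, which is an assumption here and not necessarily the formula derived in the paper. The parameter names `a` and `b`, the threshold `eps`, and the function names are illustrative.

```python
def two_state_summary(a, b):
    """Stationary distribution and second eigenvalue of
    P = [[1 - a, a], [b, 1 - b]], assuming 0 < a + b < 2.
    """
    if not 0.0 < a + b < 2.0:
        raise ValueError("need 0 < a + b < 2 for a regular chain")
    pi = (b / (a + b), a / (a + b))
    lam = 1.0 - a - b
    return pi, lam


def mixing_time(a, b, eps=0.25):
    """Smallest n with worst-case total variation distance <= eps.

    Starting from state 0 the distance after n steps is |lam|**n * pi[1],
    and from state 1 it is |lam|**n * pi[0], so the worst case is
    |lam|**n * max(pi).
    """
    pi, lam = two_state_summary(a, b)
    worst = max(pi)
    n = 0
    while worst * abs(lam) ** n > eps:
        n += 1
    return n


pi, lam = two_state_summary(0.1, 0.3)
print(pi, lam, mixing_time(0.1, 0.3))
```

With a = 0.1 and b = 0.3 the distance is 0.75 × 0.6^n, which first drops to 0.25 or below at n = 3.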
An Overview of Markov Chain Methods for the Study of Stage-Sequential Developmental Processes
ERIC Educational Resources Information Center
Kaplan, David
2008-01-01
This article presents an overview of quantitative methodologies for the study of stage-sequential development based on extensions of Markov chain modeling. Four methods are presented that exemplify the flexibility of this approach: the manifest Markov model, the latent Markov model, latent transition analysis, and the mixture latent Markov model.…
Singer, Philipp; Helic, Denis; Taraghi, Behnam; Strohmaier, Markus
2014-01-01
One of the most frequently used models for understanding human navigation on the Web is the Markov chain model, where Web pages are represented as states and hyperlinks as probabilities of navigating from one page to another. Predominantly, human navigation on the Web has been thought to satisfy the memoryless Markov property stating that the next page a user visits only depends on her current page and not on previously visited ones. This idea has found its way in numerous applications such as Google's PageRank algorithm and others. Recently, new studies suggested that human navigation may better be modeled using higher order Markov chain models, i.e., the next page depends on a longer history of past clicks. Yet, this finding is preliminary and does not account for the higher complexity of higher order Markov chain models which is why the memoryless model is still widely used. In this work we thoroughly present a diverse array of advanced inference methods for determining the appropriate Markov chain order. We highlight strengths and weaknesses of each method and apply them for investigating memory and structure of human navigation on the Web. Our experiments reveal that the complexity of higher order models grows faster than their utility, and thus we confirm that the memoryless model represents a quite practical model for human navigation on a page level. However, when we expand our analysis to a topical level, where we abstract away from specific page transitions to transitions between topics, we find that the memoryless assumption is violated and specific regularities can be observed. We report results from experiments with two types of navigational datasets (goal-oriented vs. free form) and observe interesting structural differences that make a strong argument for more contextual studies of human navigation in future work.
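The trade-off between fit and complexity described in this abstract can be illustrated with one of the simplest order-selection criteria, the Bayesian information criterion (BIC). This is a sketch under stated assumptions, not the authors' full suite of methods: the function names are hypothetical, the likelihood conditions on the first `max_order` symbols so that all orders are scored on the same observations, and the parameter count assumes every context is possible.

```python
import math
from collections import Counter


def max_log_likelihood(seq, k, start):
    """Maximized log-likelihood of an order-k chain on seq[start:]."""
    ctx_counts = Counter()
    trans_counts = Counter()
    for i in range(start, len(seq)):
        ctx = tuple(seq[i - k:i])          # previous k symbols (empty for k = 0)
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    return sum(n * math.log(n / ctx_counts[ctx])
               for (ctx, _), n in trans_counts.items())


def bic(seq, k, n_states, start):
    """BIC = -2 log L + (number of free parameters) * log(number of observations)."""
    n_params = (n_states ** k) * (n_states - 1)
    n_obs = len(seq) - start
    return -2.0 * max_log_likelihood(seq, k, start) + n_params * math.log(n_obs)


def select_order(seq, max_order=3):
    """Return the order in 0..max_order with the smallest BIC."""
    n_states = len(set(seq))
    return min(range(max_order + 1),
               key=lambda k: bic(seq, k, n_states, max_order))
```

Because the number of parameters grows as n_states^k, the penalty term quickly dominates for page-level data with many states, which is consistent with the abstract's remark that complexity grows faster than utility.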
Constructing 1/ω^α noise from reversible Markov chains
NASA Astrophysics Data System (ADS)
Erland, Sveinung; Greenwood, Priscilla E.
2007-09-01
This paper gives sufficient conditions for the output of 1/ω^α noise from reversible Markov chains on finite state spaces. We construct several examples exhibiting this behavior in a specified range of frequencies. We apply simple representations of the covariance function and the spectral density in terms of the eigendecomposition of the probability transition matrix. The results extend to hidden Markov chains. We generalize the results for aggregations of AR(1) processes of C. W. J. Granger [J. Econometrics 14, 227 (1980)]. Given the eigenvalue function, there is a variety of ways to assign values to the states such that the 1/ω^α condition is satisfied. We show that a random walk on a certain state space is complementary to the point process model of 1/ω noise of B. Kaulakys and T. Meskauskas [Phys. Rev. E 58, 7013 (1998)]. Passing to a continuous state space, we construct 1/ω^α noise which also has a long memory.
Statistical significance test for transition matrices of atmospheric Markov chains
NASA Technical Reports Server (NTRS)
Vautard, Robert; Mo, Kingtse C.; Ghil, Michael
1990-01-01
Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
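A Monte Carlo significance test of this general kind can be sketched as a permutation test. The abstract does not specify the null model, so the choice made here (shuffling the regime sequence, which preserves how often each regime occurs but destroys temporal order) is an assumption, as are the function names.

```python
import random


def transition_count(seq, i, j):
    """Number of one-step transitions i -> j in the sequence."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == i and b == j)


def mc_p_value(seq, i, j, n_sim=2000, seed=0):
    """One-sided Monte Carlo p-value for 'i -> j occurs more often than by chance'.

    Null hypothesis (an assumption of this sketch): regime labels are
    exchangeable, i.e. the sequence has no temporal dependence.
    """
    rng = random.Random(seed)
    observed = transition_count(seq, i, j)
    shuffled = list(seq)
    at_least_as_large = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if transition_count(shuffled, i, j) >= observed:
            at_least_as_large += 1
    # The +1 terms keep the estimate away from an impossible p-value of zero.
    return (at_least_as_large + 1) / (n_sim + 1)
```

Testing for unusually rare transitions works the same way with the inequality reversed. Small clusters need no special treatment, because the simulated counts automatically reflect how seldom a small regime appears.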
Liouville equation and Markov chains: epistemological and ontological probabilities
NASA Astrophysics Data System (ADS)
Costantini, D.; Garibaldi, U.
2006-06-01
The greatest difficulty of a probabilistic approach to the foundations of Statistical Mechanics lies in the fact that for a system ruled by classical or quantum mechanics a basic description exists, whose evolution is deterministic. For such a system any kind of irreversibility is impossible in principle. The probability used in this approach is epistemological. On the contrary for irreducible aperiodic Markov chains the invariant measure is reached with probability one whatever the initial conditions. Almost surely the uniform distributions, on which the equilibrium treatment of quantum and classical perfect gases is based, are reached when time goes by. The transition probability for binary collision, deduced by the Ehrenfest-Brillouin model, points out an irreducible aperiodic Markov chain and thus an equilibrium distribution. This means that we are describing the temporal probabilistic evolution of the system. The probability involved in this evolution is ontological.
Parallel algorithms for simulating continuous time Markov chains
NASA Technical Reports Server (NTRS)
Nicol, David M.; Heidelberger, Philip
1992-01-01
We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
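The uniformization idea itself (not the parallel synchronization schemes compared in the paper) can be shown in a short serial sketch. A continuous-time chain with generator Q is replaced by a discrete-time chain P = I + Q/rate driven by a Poisson clock, so the transient distribution is a Poisson-weighted sum of powers of P. The function name and the truncation rule are assumptions made for this illustration.

```python
import math


def uniformized_transient(Q, p0, t, tol=1e-12, max_terms=100000):
    """Distribution at time t of a CTMC with generator Q and initial law p0.

    Sketch only: exp(-rate * t) underflows for very large rate * t, where a
    more careful weight computation would be needed.
    """
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))      # uniformization rate
    if rate == 0:
        return list(p0)                          # no transitions are possible
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)                 # Poisson probability of 0 jumps
    total = weight
    v = list(p0)                                 # p0 * P^k, starting with k = 0
    result = [weight * x for x in v]
    for k in range(1, max_terms + 1):
        if 1.0 - total <= tol:                   # remaining Poisson mass is negligible
            break
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k
        total += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result
```

For a two-state generator [[-a, a], [b, -b]] the result can be checked against the closed form b/(a+b) + a/(a+b) * exp(-(a+b) t) for the probability of remaining in the initial state's class.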
Space system operations and support cost analysis using Markov chains
NASA Technical Reports Server (NTRS)
Unal, Resit; Dean, Edwin B.; Moore, Arlene A.; Fairbairn, Robert E.
1990-01-01
This paper evaluates the use of Markov chain process in probabilistic life cycle cost analysis and suggests further uses of the process as a design aid tool. A methodology is developed for estimating operations and support cost and expected life for reusable space transportation systems. Application of the methodology is demonstrated for the case of a hypothetical space transportation vehicle. A sensitivity analysis is carried out to explore the effects of uncertainty in key model inputs.
Searching for convergence in phylogenetic Markov chain Monte Carlo.
Beiko, Robert G; Keith, Jonathan M; Harlow, Timothy J; Ragan, Mark A
2006-08-01
Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, delta and epsilon, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a "metachain" to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely.
Markov Chain evaluation of acute postoperative pain transition states
Tighe, Patrick J.; Bzdega, Matthew; Fillingim, Roger B.; Rashidi, Parisa; Aytug, Haldun
2016-01-01
Prior investigations on acute postoperative pain dynamicity have focused on daily pain assessments, and so were unable to examine intra-day variations in acute pain intensity. We analyzed 476,108 postoperative acute pain intensity ratings clinically documented on postoperative days 1 to 7 from 8,346 surgical patients using Markov Chain modeling to describe how patients are likely to transition from one pain state to another in a probabilistic fashion. The Markov Chain was found to be irreducible and positive recurrent, with no absorbing states. Transition probabilities ranged from 0.0031 for the transition from state 10 to state 1, to 0.69 for the transition from state zero to state zero. The greatest density of transitions was noted in the diagonal region of the transition matrix, suggesting that patients were generally most likely to transition to the same pain state as their current state. There were also slightly increased probability densities in transitioning to a state of asleep or zero from the current state. Examination of the number of steps required to traverse from a particular first pain score to a target state suggested that overall, fewer steps were required to reach a state of zero (range 6.1–8.8 steps) or asleep (range 9.1–11) than were required to reach a mild pain intensity state. Our results suggest that Markov Chains are a feasible method for describing probabilistic postoperative pain trajectories, pointing toward the possibility of using Markov decision processes to model sequential interactions between pain intensity ratings and postoperative analgesic interventions. PMID:26588689
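The two computations described here, estimating a transition matrix from a sequence of states and finding the expected number of steps to reach a target state, can be sketched as follows. The abstract does not say how the step counts were obtained; this sketch assumes the standard mean hitting time equations h_i = 1 + sum over j != target of P[i][j] * h_j, and all function names are hypothetical. States are assumed to be coded as integers 0..n_states-1.

```python
def estimate_transition_matrix(seq, n_states):
    """Row-normalized transition counts from a single state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (A must be non-singular)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def expected_steps_to(P, target):
    """Expected number of steps to first reach `target` from every state.

    Assumes the target is reachable from every other state.
    """
    others = [i for i in range(len(P)) if i != target]
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in others] for i in others]
    h = solve_linear(A, [1.0] * len(others))
    steps = {target: 0.0}
    steps.update(zip(others, h))
    return steps
```

With P = [[0.5, 0.5], [0.2, 0.8]], state 1 moves to state 0 with probability 0.2 per step, so the expected number of steps from 1 to 0 is 1/0.2 = 5.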
Markov chain decision model for urinary incontinence procedures.
Kumar, Sameer; Ghildayal, Nidhi; Ghildayal, Neha
2017-03-13
Purpose: Urinary incontinence (UI) is a common chronic health condition, a problem specifically among elderly women that impacts quality of life negatively. However, UI is usually viewed as a likely result of old age, and as such is generally not evaluated or even managed appropriately. Many treatments are available to manage incontinence, such as bladder training and numerous surgical procedures such as Burch colposuspension and Sling for UI which have high success rates. The purpose of this paper is to analyze which of these popular surgical procedures for UI is more effective. Design/methodology/approach: This research employs randomized, prospective studies to obtain robust cost and utility data used in the Markov chain decision model for examining which of these surgical interventions is more effective in treating women with stress UI based on two measures: number of quality adjusted life years (QALY) and cost per QALY. Treeage Pro Healthcare software was employed in Markov decision analysis. Findings: Results showed the Sling procedure is a more effective surgical intervention than the Burch. However, if a utility greater than a certain utility value, for which both procedures are equally effective, is assigned to persistent incontinence, the Burch procedure is more effective than the Sling procedure. Originality/value: This paper demonstrates the efficacy of a Markov chain decision modeling approach to study the comparative effectiveness analysis of available treatments for patients with UI, an important public health issue, widely prevalent among elderly women in developed and developing countries. This research also improves upon other analyses using a Markov chain decision modeling process to analyze various strategies for treating UI.
Exact likelihood-free Markov chain Monte Carlo for elliptically contoured distributions.
Muchmore, Patrick; Marjoram, Paul
2015-08-01
Recent results in Markov chain Monte Carlo (MCMC) show that a chain based on an unbiased estimator of the likelihood can have a stationary distribution identical to that of a chain based on exact likelihood calculations. In this paper we develop such an estimator for elliptically contoured distributions, a large family of distributions that includes and generalizes the multivariate normal. We then show how this estimator, combined with pseudorandom realizations of an elliptically contoured distribution, can be used to run MCMC in a way that replicates the stationary distribution of a likelihood based chain, but does not require explicit likelihood calculations. Because many elliptically contoured distributions do not have closed form densities, our simulation based approach enables exact MCMC based inference in a range of cases where previously it was impossible.
Exact Likelihood-free Markov Chain Monte Carlo for Elliptically Contoured Distributions
Marjoram, Paul
2015-01-01
Recent results in Markov chain Monte Carlo (MCMC) show that a chain based on an unbiased estimator of the likelihood can have a stationary distribution identical to that of a chain based on exact likelihood calculations. In this paper we develop such an estimator for elliptically contoured distributions, a large family of distributions that includes and generalizes the multivariate normal. We then show how this estimator, combined with pseudorandom realizations of an elliptically contoured distribution, can be used to run MCMC in a way that replicates the stationary distribution of a likelihood based chain, but does not require explicit likelihood calculations. Because many elliptically contoured distributions do not have closed form densities, our simulation based approach enables exact MCMC based inference in a range of cases where previously it was impossible. PMID:26167984
Markov chain Monte Carlo methods for state-space models with point process observations.
Yuan, Ke; Girolami, Mark; Niranjan, Mahesan
2012-06-01
This letter considers how a number of modern Markov chain Monte Carlo (MCMC) methods can be applied for parameter estimation and inference in state-space models with point process observations. We quantified the efficiencies of these MCMC methods on synthetic data, and our results suggest that the Riemannian manifold Hamiltonian Monte Carlo method offers the best performance. We further compared such a method with a previously tested variational Bayes method on two experimental data sets. Results indicate similar performance on the large data sets and superior performance on small ones. The work offers an extensive suite of MCMC algorithms evaluated on an important class of models for physiological signal analysis.
A Markov chain representation of the multiple testing problem.
Cabras, Stefano
2016-03-16
The problem of multiple hypothesis testing can be represented as a Markov process where a new alternative hypothesis is accepted in accordance with its relative evidence to the currently accepted one. This virtual and not formally observed process provides the most probable set of non null hypotheses given the data; it plays the same role as Markov Chain Monte Carlo in approximating a posterior distribution. To apply this representation and obtain the posterior probabilities over all alternative hypotheses, it is enough to have, for each test, barely defined Bayes Factors, e.g. Bayes Factors obtained up to an unknown constant. Such Bayes Factors may either arise from using default and improper priors or from calibrating p-values with respect to their corresponding Bayes Factor lower bound. Both sources of evidence are used to form a Markov transition kernel on the space of hypotheses. The approach leads to easily interpretable results and involves very simple formulas suitable for analyzing large datasets such as those arising from gene expression data (microarray or RNA-seq experiments).
Topological Charge Evolution in the Markov-Chain of QCD
Derek Leinweber; Anthony Williams; Jian-bo Zhang; Frank Lee
2004-04-01
The topological charge is studied on lattices of large physical volume and fine lattice spacing. We illustrate how a parity transformation on the SU(3) link-variables of lattice gauge configurations reverses the sign of the topological charge and leaves the action invariant. Random applications of the parity transformation are proposed to traverse from one topological charge sign to the other. The transformation provides an improved unbiased estimator of the ensemble average and is essential in improving the ergodicity of the Markov chain process.
Markov chain Monte Carlo method without detailed balance.
Suwa, Hidemaro; Todo, Synge
2010-09-17
We present a specific algorithm that generally satisfies the balance condition without imposing detailed balance in Markov chain Monte Carlo. In our algorithm, the average rejection rate is minimized, and even reduced to zero in many relevant cases. The absence of detailed balance also introduces a net stochastic flow in configuration space, which further accelerates convergence. We demonstrate that the autocorrelation time of the Potts model becomes more than 6 times shorter than that of the conventional Metropolis algorithm. Based on the same concept, a bounce-free worm algorithm for generic quantum spin models is formulated as well.
Exploring mass perception with Markov chain Monte Carlo.
Cohen, Andrew L; Ross, Michael G
2009-12-01
Several previous studies have examined the ability to judge the relative mass of objects in idealized collisions. With a newly developed technique of psychological Markov chain Monte Carlo sampling (A. N. Sanborn & T. L. Griffiths, 2008), this work explores participants' perceptions of different collision mass ratios. The results reveal interparticipant differences and a qualitative distinction between the perception of 1:1 and 1:2 ratios. The results strongly suggest that participants' perceptions of 1:1 collisions are described by simple heuristics. The evidence for 1:2 collisions favors heuristic perception models that are sensitive to the sign but not the magnitude of perceived mass differences.
A Markov chain model for reliability growth and decay
NASA Technical Reports Server (NTRS)
Siegrist, K.
1982-01-01
A mathematical model is developed to describe a complex system undergoing a sequence of trials in which there is interaction between the internal states of the system and the outcomes of the trials. For example, the model might describe a system undergoing testing that is redesigned after each failure. The basic assumptions for the model are that the state of the system after a trial depends probabilistically only on the state before the trial and on the outcome of the trial and that the outcome of a trial depends probabilistically only on the state of the system before the trial. It is shown that under these basic assumptions, the successive states form a Markov chain and the successive states and outcomes jointly form a Markov chain. General results are obtained for the transition probabilities, steady-state distributions, etc. A special case studied in detail describes a system that has two possible states ('repaired' and 'unrepaired') undergoing trials that have three possible outcomes ('inherent failure', 'assignable-cause failure' and 'success'). For this model, the reliability function is computed explicitly and an optimal repair policy is obtained.
Radiative transfer calculated from a Markov chain formalism
NASA Technical Reports Server (NTRS)
Esposito, L. W.; House, L. L.
1978-01-01
The theory of Markov chains is used to formulate the radiative transport problem in a general way by modeling the successive interactions of a photon as a stochastic process. Under the minimal requirement that the stochastic process is a Markov chain, the determination of the diffuse reflection or transmission from a scattering atmosphere is equivalent to the solution of a system of linear equations. This treatment is mathematically equivalent to, and thus has many of the advantages of, Monte Carlo methods, but can be considerably more rapid than Monte Carlo algorithms for numerical calculations in particular applications. We have verified the speed and accuracy of this formalism for the standard problem of finding the intensity of scattered light from a homogeneous plane-parallel atmosphere with an arbitrary phase function for scattering. Accurate results over a wide range of parameters were obtained with computation times comparable to those of a standard 'doubling' routine. The generality of this formalism thus allows fast, direct solutions to problems that were previously soluble only by Monte Carlo methods. Some comparisons are made with respect to integral equation methods.
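The equivalence between photon random walks and a system of linear equations can be seen in a deliberately simplified toy model, which is an assumption of this sketch and not the authors' formalism. A slab is split into `n_layers` layers; at each interaction a photon is absorbed with probability 1 - albedo, and otherwise scatters to the layer above or below with equal probability. Leaving the top counts as reflection, leaving the bottom as transmission. The reflection probabilities r[i] then satisfy r[i] = (albedo / 2) * (r[i-1] + r[i+1]) with boundary values 1 above the slab and 0 below it, solved here by Gauss-Seidel sweeps.

```python
def slab_reflection(n_layers, albedo, tol=1e-13, max_sweeps=100000):
    """Probability that a photon starting in the top layer exits through the top.

    Toy 1D model (hypothetical): isotropic up/down scattering between adjacent
    layers, absorption with probability 1 - albedo at each interaction.
    """
    r = [0.0] * n_layers  # r[i] = P(exit via top | photon is in layer i)
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(n_layers):
            above = 1.0 if i == 0 else r[i - 1]              # escaped through the top
            below = 0.0 if i == n_layers - 1 else r[i + 1]   # escaped through the bottom
            new_value = 0.5 * albedo * (above + below)
            change = max(change, abs(new_value - r[i]))
            r[i] = new_value
        if change < tol:
            break
    return r[0]
```

With no absorption (albedo = 1) the model reduces to a symmetric random walk between two absorbing boundaries, so a photon entering a four-layer slab is reflected with probability 4/5. No photon histories are sampled, which is the source of the speed advantage over Monte Carlo noted in the abstract.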
Vrugt, Jasper A; Hyman, James M; Robinson, Bruce A; Higdon, Dave; Ter Braak, Cajo J F; Diks, Cees G H
2008-01-01
Markov chain Monte Carlo (MCMC) methods have found widespread use in many fields of study to estimate the average properties of complex systems, and for posterior inference in a Bayesian framework. Existing theory and experiments prove convergence of well-constructed MCMC schemes to the appropriate limiting distribution under a variety of different conditions. In practice, however, this convergence is often observed to be disturbingly slow. This is frequently caused by an inappropriate selection of the proposal distribution used to generate trial moves in the Markov Chain. Here we show that significant improvements to the efficiency of MCMC simulation can be made by using a self-adaptive Differential Evolution learning strategy within a population-based evolutionary framework. This scheme, entitled DiffeRential Evolution Adaptive Metropolis or DREAM, runs multiple different chains simultaneously for global exploration, and automatically tunes the scale and orientation of the proposal distribution in randomized subspaces during the search. Ergodicity of the algorithm is proved, and various examples involving nonlinearity, high-dimensionality, and multimodality show that DREAM is generally superior to other adaptive MCMC sampling approaches. The DREAM scheme significantly enhances the applicability of MCMC simulation to complex, multi-modal search problems.
2012-01-01
Background: Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from serial computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results: Parallel Markov chain Monte Carlo algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions: Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models; they not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363
2012-08-01
AFRL-RX-WP-TP-2012-0397. Inverse Problem for Electromagnetic Propagation in a Dielectric Medium Using Markov Chain Monte Carlo Method (Preprint). ... a stochastic inverse methodology arising in electromagnetic imaging. Nondestructive testing using guided microwaves covers a wide range of
Bayesian seismic tomography by parallel interacting Markov chains
NASA Astrophysics Data System (ADS)
Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas
2014-05-01
The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret the results quantitatively, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is highly related to the computational cost of the forward model. Although fast algorithms have been recently developed for computing first arrival traveltimes of seismic waves, a complete exploration of the posterior distribution of the velocity model is rarely achieved, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of classical single-chain MCMC, we propose to let several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters, without increasing the global computational cost with respect to classical MCMC and is therefore particularly suited for Bayesian inversion. The exchanges between the chains allow a precise sampling of the
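The idea of interacting chains at different temperatures can be sketched with a generic parallel tempering sampler on a one-dimensional bimodal target. This is an illustration under stated assumptions, not the authors' tomography code: the target (an equal mixture of unit-variance normals at -4 and +4), the temperature ladder, and the function names are all hypothetical, and the chains are updated serially rather than on a cluster.

```python
import math
import random


def log_target(x):
    """Log density (up to a constant) of 0.5 N(-4, 1) + 0.5 N(4, 1)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)                      # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m)) + math.log(0.5)


def parallel_tempering(n_iter=20000, temps=(1.0, 3.0, 9.0), step=1.0, seed=1):
    """Return the samples of the coldest chain (temperature temps[0] = 1)."""
    rng = random.Random(seed)
    xs = [-4.0] * len(temps)           # every chain starts in the left mode
    samples = []
    for _ in range(n_iter):
        # Metropolis update of each chain on its tempered target pi(x)^(1/T).
        for k, temp in enumerate(temps):
            proposal = xs[k] + rng.gauss(0.0, step)
            log_ratio = (log_target(proposal) - log_target(xs[k])) / temp
            if rng.random() < math.exp(min(0.0, log_ratio)):
                xs[k] = proposal
        # Propose exchanging the states of one random pair of adjacent chains.
        k = rng.randrange(len(temps) - 1)
        log_ratio = ((log_target(xs[k + 1]) - log_target(xs[k]))
                     * (1.0 / temps[k] - 1.0 / temps[k + 1]))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            xs[k], xs[k + 1] = xs[k + 1], xs[k]
        samples.append(xs[0])
    return samples
```

The hot chains cross the low-probability region between the modes easily, and the exchange moves pass those crossings down to the cold chain, which would otherwise tend to stay in the mode where it started.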
Denker-Sato type Markov chains and Harnack inequality
NASA Astrophysics Data System (ADS)
Deng, Qi-Rong; Wang, Xiang-Yang
2015-10-01
In ([DS1], [DS2], [DS3]), Denker and Sato studied a Markov chain on the finite words space of the Sierpinski gasket (SG). They showed that the Martin boundary is homeomorphic to the SG. Recently, Lau and Wang (2015 Math. Z. 280 401-20) showed that the homeomorphism holds for an iterated function system with the open set condition provided that the transition probability on the finite words space is of DS-type. In this work, we continue studying this kind of transition probability on the unit interval. Using matrix expressions, we obtain a formula to calculate the Green function. By the ergodic arguments for non-negative matrices, we find that the Martin boundary is homeomorphic to the unit interval or the union of the unit interval and a countable set. This gives a good illustration for the results in Lau and Wang (2015 Math. Z. 280 401-20).
Reversible jump Markov chain Monte Carlo for deconvolution.
Kang, Dongwoo; Verotta, Davide
2007-06-01
To solve the problem of estimating an unknown input function to a linear time invariant system we propose an adaptive non-parametric method based on reversible jump Markov chain Monte Carlo (RJMCMC). We use piecewise polynomial functions (splines) to represent the input function. The RJMCMC algorithm allows the exploration of a large space of competing models, in our case the collection of splines corresponding to alternative positions of breakpoints, and it is based on the specification of transition probabilities between the models. RJMCMC determines: the number and the position of the breakpoints, and the coefficients determining the shape of the spline, as well as the corresponding posterior distribution of breakpoints, number of breakpoints, coefficients and arbitrary statistics of interest associated with the estimation problem. Simulation studies show that the RJMCMC method can obtain accurate reconstructions of complex input functions, and obtains better results compared with standard non-parametric deconvolution methods. Applications to real data are also reported.
Uncovering mental representations with Markov chain Monte Carlo.
Sanborn, Adam N; Griffiths, Thomas L; Shiffrin, Richard M
2010-03-01
A key challenge for cognitive psychology is the investigation of mental representations, such as object categories, subjective probabilities, choice utilities, and memory traces. In many cases, these representations can be expressed as a non-negative function defined over a set of objects. We present a behavioral method for estimating these functions. Our approach uses people as components of a Markov chain Monte Carlo (MCMC) algorithm, a sophisticated sampling method originally developed in statistical physics. Experiments 1 and 2 verified the MCMC method by training participants on various category structures and then recovering those structures. Experiment 3 demonstrated that the MCMC method can be used to estimate the structures of the real-world animal shape categories of giraffes, horses, dogs, and cats. Experiment 4 combined the MCMC method with multidimensional scaling to demonstrate how different accounts of the structure of categories, such as prototype and exemplar models, can be tested, producing samples from the categories of apples, oranges, and grapes.
Markov Chain Monte Carlo Bayesian Learning for Neural Networks
NASA Technical Reports Server (NTRS)
Goodrich, Michael S.
2011-01-01
Conventional training methods for neural networks involve starting at a random location in the solution space of the network weights, navigating an error hypersurface to reach a minimum, and sometimes using stochastic-based techniques (e.g., genetic algorithms) to avoid entrapment in a local minimum. It is further typically necessary to preprocess the data (e.g., normalization) to keep the training algorithm on course. Conversely, Bayesian-based learning is an epistemological approach concerned with formally updating the plausibility of competing candidate hypotheses, thereby obtaining a posterior distribution for the network weights conditioned on the available data and a prior distribution. In this paper, we developed a powerful methodology for estimating the full residual uncertainty in network weights and therefore network predictions by using a modified Jeffreys prior combined with a Metropolis Markov Chain Monte Carlo method.
Kinetics and thermodynamics of first-order Markov chain copolymerization
NASA Astrophysics Data System (ADS)
Gaspard, P.; Andrieux, D.
2014-07-01
We report a theoretical study of stochastic processes modeling the growth of first-order Markov copolymers, as well as the reversed reaction of depolymerization. These processes are ruled by kinetic equations describing both the attachment and detachment of monomers. Exact solutions are obtained for these kinetic equations in the steady regimes of multicomponent copolymerization and depolymerization. Thermodynamic equilibrium is identified as the state at which the growth velocity is vanishing on average and where detailed balance is satisfied. Away from equilibrium, the analytical expression of the thermodynamic entropy production is deduced in terms of the Shannon disorder per monomer in the copolymer sequence. The Mayo-Lewis equation is recovered in the fully irreversible growth regime. The theory also applies to Bernoullian chains in the case where the attachment and detachment rates only depend on the reacting monomer.
Applying diffusion-based Markov chain Monte Carlo
Paul, Rajib; Berliner, L. Mark
2017-01-01
We examine the performance of a strategy for Markov chain Monte Carlo (MCMC) developed by simulating a discrete approximation to a stochastic differential equation (SDE). We refer to the approach as diffusion MCMC. A variety of motivations for the approach are reviewed in the context of Bayesian analysis. In particular, implementation of diffusion MCMC is very simple to set-up, even in the presence of nonlinear models and non-conjugate priors. Also, it requires comparatively little problem-specific tuning. We implement the algorithm and assess its performance for both a test case and a glaciological application. Our results demonstrate that in some settings, diffusion MCMC is a faster alternative to a general Metropolis-Hastings algorithm. PMID:28301529
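The idea of sampling by simulating a discretised SDE can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the SDE is the Langevin diffusion for a one-dimensional standard normal target, discretised with a plain Euler-Maruyama step and no accept/reject correction. The names `grad_log_target`, `diffusion_mcmc`, and the step size are hypothetical choices made for the example.

```python
import math
import random

def grad_log_target(x):
    # Assumed target for illustration: standard normal, so d/dx log p(x) = -x.
    return -x

def diffusion_mcmc(n_samples, step=0.1, x0=0.0, seed=0):
    """Euler-Maruyama discretisation of dX = 0.5 * grad log p(X) dt + dW."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        drift = 0.5 * step * grad_log_target(x)
        noise = math.sqrt(step) * rng.gauss(0.0, 1.0)
        x = x + drift + noise
        samples.append(x)
    return samples

samples = diffusion_mcmc(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

Because the discretisation is not corrected by a Metropolis step, the chain targets a slightly perturbed distribution; for this Gaussian example the stationary variance is 1 / (1 - step / 4) rather than exactly 1, so smaller steps trade speed for accuracy.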
On the multi-level solution algorithm for Markov chains
Horton, G.
1996-12-31
We discuss the recently introduced multi-level algorithm for the steady-state solution of Markov chains. The method is based on the aggregation principle, which is well established in the literature. Recursive application of the aggregation yields a multi-level method which has been shown experimentally to give results significantly faster than the methods currently in use. The algorithm can be reformulated as an algebraic multigrid scheme of Galerkin-full approximation type. The uniqueness of the scheme stems from its solution-dependent prolongation operator which permits significant computational savings in the evaluation of certain terms. This paper describes the modeling of computer systems to derive information on performance, measured typically as job throughput or component utilization, and availability, defined as the proportion of time a system is able to perform a certain function in the presence of component failures and possibly also repairs.
Projection methods for the numerical solution of Markov chain models
NASA Technical Reports Server (NTRS)
Saad, Youcef
1989-01-01
Projection methods for computing stationary probability distributions for Markov chain models are presented. A general projection method is a method which seeks an approximation from a subspace of small dimension to the original problem. Thus, the original matrix problem of size N is approximated by one of dimension m, typically much smaller than N. A particularly successful class of methods based on this principle is that of Krylov subspace methods which utilize subspaces of the form $\mathrm{span}\{v, Av, \ldots, A^{m-1}v\}$. These methods are effective in solving linear systems and eigenvalue problems (Lanczos, Arnoldi,...) as well as nonlinear equations. They can be combined with more traditional iterative methods such as successive overrelaxation, symmetric successive overrelaxation, or with incomplete factorization methods to enhance convergence.
On the Multilevel Solution Algorithm for Markov Chains
NASA Technical Reports Server (NTRS)
Horton, Graham
1997-01-01
We discuss the recently introduced multilevel algorithm for the steady-state solution of Markov chains. The method is based on an aggregation principle which is well established in the literature and features a multiplicative coarse-level correction. Recursive application of the aggregation principle, which uses an operator-dependent coarsening, yields a multi-level method which has been shown experimentally to give results significantly faster than the typical methods currently in use. When cast as a multigrid-like method, the algorithm is seen to be a Galerkin-Full Approximation Scheme with a solution-dependent prolongation operator. Special properties of this prolongation lead to the cancellation of the computationally intensive terms of the coarse-level equations.
Efficient inference of hidden Markov models from large observation sequences
NASA Astrophysics Data System (ADS)
Priest, Benjamin W.; Cybenko, George
2016-05-01
The hidden Markov model (HMM) is widely used to model time series data. However, the conventional Baum-Welch algorithm is known to perform poorly when applied to long observation sequences. The literature contains several alternatives that seek to improve the memory or time complexity of the algorithm. However, for an HMM with N states and an observation sequence of length T, these alternatives require at best O(N) space and O(N^2 T) time. Given the preponderance of applications that increasingly deal with massive amounts of data, an alternative whose time is O(T)+poly(N) is desired. Recent research presents an alternative to the Baum-Welch algorithm that relies on nonnegative matrix factorization. This document examines the space complexity of this alternative approach and proposes further optimizations using approaches adopted from the matrix sketching literature. The result is a streaming algorithm whose space complexity is constant and time complexity is linear with respect to the size of the observation sequence. The paper also presents a batch algorithm that allows for even further improved space complexity at the expense of an additional pass over the observation sequence.
NASA Astrophysics Data System (ADS)
Zhu, Yanzheng; Zhang, Lixian; Sreeram, Victor; Shammakh, Wafa; Ahmad, Bashir
2016-10-01
In this paper, the resilient model approximation problem for a class of discrete-time Markov jump time-delay systems with input sector-bounded nonlinearities is investigated. A linearised reduced-order model is determined with mode changes subject to domination by a hierarchical Markov chain containing two different nonhomogeneous Markov chains. Hence, the reduced-order model obtained not only reflects the dependence of the original systems but also models external influence that is related to the mode changes of the original system. Sufficient conditions formulated in terms of bilinear matrix inequalities for the existence of such models are established, such that the resulting error system is stochastically stable and has a guaranteed l2-l∞ error performance. A linear matrix inequalities optimisation coupled with line search is exploited to solve for the corresponding reduced-order systems. The potential and effectiveness of the developed theoretical results are demonstrated via a numerical example.
An overview of Markov chain methods for the study of stage-sequential developmental processes.
Kapland, David
2008-03-01
This article presents an overview of quantitative methodologies for the study of stage-sequential development based on extensions of Markov chain modeling. Four methods are presented that exemplify the flexibility of this approach: the manifest Markov model, the latent Markov model, latent transition analysis, and the mixture latent Markov model. A special case of the mixture latent Markov model, the so-called mover-stayer model, is used in this study. Unconditional and conditional models are estimated for the manifest Markov model and the latent Markov model, where the conditional models include a measure of poverty status. Issues of model specification, estimation, and testing using the Mplus software environment are briefly discussed, and the Mplus input syntax is provided. The author applies these 4 methods to a single example of stage-sequential development in reading competency in the early school years, using data from the Early Childhood Longitudinal Study--Kindergarten Cohort.
Harnessing graphical structure in Markov chain Monte Carlo learning
Stolorz, P.E.; Chew P.C.
1996-12-31
The Monte Carlo method is recognized as a useful tool in learning and probabilistic inference methods common to many datamining problems. Generalized Hidden Markov Models and Bayes nets are especially popular applications. However, the presence of multiple modes in many relevant integrands and summands often renders the method slow and cumbersome. Recent mean field alternatives designed to speed things up have been inspired by experience gleaned from physics. The current work adopts an approach very similar to this in spirit, but focusses instead upon dynamic programming notions as a basis for producing systematic Monte Carlo improvements. The idea is to approximate a given model by a dynamic programming-style decomposition, which then forms a scaffold upon which to build successively more accurate Monte Carlo approximations. Dynamic programming ideas alone fail to account for non-local structure, while standard Monte Carlo methods essentially ignore all structure. However, suitably-crafted hybrids can successfully exploit the strengths of each method, resulting in algorithms that combine speed with accuracy. The approach relies on the presence of significant "local" information in the problem at hand. This turns out to be a plausible assumption for many important applications. Example calculations are presented, and the overall strengths and weaknesses of the approach are discussed.
A Markov-Chain Monte-Carlo Based Method for Flaw Detection in Beams
Glaser, R E; Lee, C L; Nitao, J J; Hickling, T L; Hanley, W G
2006-09-28
A Bayesian inference methodology using a Markov Chain Monte Carlo (MCMC) sampling procedure is presented for estimating the parameters of computational structural models. This methodology combines prior information, measured data, and forward models to produce a posterior distribution for the system parameters of structural models that is most consistent with all available data. The MCMC procedure is based upon a Metropolis-Hastings algorithm that is shown to function effectively with noisy data, incomplete data sets, and mismatched computational nodes/measurement points. A series of numerical test cases based upon a cantilever beam is presented. The results demonstrate that the algorithm is able to estimate model parameters utilizing experimental data for the nodal displacements resulting from specified forces.
Of bugs and birds: Markov Chain Monte Carlo for hierarchical modeling in wildlife research
Link, W.A.; Cam, E.; Nichols, J.D.; Cooch, E.G.
2002-01-01
Markov chain Monte Carlo (MCMC) is a statistical innovation that allows researchers to fit far more complex models to data than is feasible using conventional methods. Despite its widespread use in a variety of scientific fields, MCMC appears to be underutilized in wildlife applications. This may be due to a misconception that MCMC requires the adoption of a subjective Bayesian analysis, or perhaps simply to its lack of familiarity among wildlife researchers. We introduce the basic ideas of MCMC and software BUGS (Bayesian inference using Gibbs sampling), stressing that a simple and satisfactory intuition for MCMC does not require extraordinary mathematical sophistication. We illustrate the use of MCMC with an analysis of the association between latent factors governing individual heterogeneity in breeding and survival rates of kittiwakes (Rissa tridactyla). We conclude with a discussion of the importance of individual heterogeneity for understanding population dynamics and designing management plans.
Inferring species interactions from co-occurrence data with Markov networks.
Harris, David J
2016-12-01
Inferring species interactions from co-occurrence data is one of the most controversial tasks in community ecology. One difficulty is that a single pairwise interaction can ripple through an ecological network and produce surprising indirect consequences. For example, the negative correlation between two competing species can be reversed in the presence of a third species that outcompetes both of them. Here, I apply models from statistical physics, called Markov networks or Markov random fields, that can predict the direct and indirect consequences of any possible species interaction matrix. Interactions in these models can be estimated from observed co-occurrence rates via maximum likelihood, controlling for indirect effects. Using simulated landscapes with known interactions, I evaluated Markov networks and six existing approaches. Markov networks consistently outperformed the other methods, correctly isolating direct interactions between species pairs even when indirect interactions or abiotic factors largely overpowered them. Two computationally efficient approximations, which controlled for indirect effects with partial correlations or generalized linear models, also performed well. Null models showed no evidence of being able to control for indirect effects, and reliably yielded incorrect inferences when such effects were present.
Pooley, C. M.; Bishop, S. C.; Marion, G.
2015-01-01
Bayesian statistics provides a framework for the integration of dynamic models with incomplete data to enable inference of model parameters and unobserved aspects of the system under study. An important class of dynamic models is discrete state space, continuous-time Markov processes (DCTMPs). Simulated via the Doob–Gillespie algorithm, these have been used to model systems ranging from chemistry to ecology to epidemiology. A new type of proposal, termed ‘model-based proposal’ (MBP), is developed for the efficient implementation of Bayesian inference in DCTMPs using Markov chain Monte Carlo (MCMC). This new method, which in principle can be applied to any DCTMP, is compared (using simple epidemiological SIS and SIR models as easy to follow exemplars) to a standard MCMC approach and a recently proposed particle MCMC (PMCMC) technique. When measurements are made on a single-state variable (e.g. the number of infected individuals in a population during an epidemic), model-based proposal MCMC (MBP-MCMC) is marginally faster than PMCMC (by a factor of 2–8 for the tests performed), and significantly faster than the standard MCMC scheme (by a factor of 400 at least). However, when model complexity increases and measurements are made on more than one state variable (e.g. simultaneously on the number of infected individuals in spatially separated subpopulations), MBP-MCMC is significantly faster than PMCMC (more than 100-fold for just four subpopulations) and this difference becomes increasingly large. PMID:25994297
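For readers unfamiliar with the Doob-Gillespie algorithm mentioned above, the following is a minimal sketch of simulating one SIS epidemic trajectory. It illustrates only the forward simulation of the DCTMP, not the model-based proposal scheme of the paper. The function name `gillespie_sis`, the frequency-dependent infection rate `beta * S * I / N`, and all parameter values are assumptions made for the example.

```python
import random

def gillespie_sis(n, i0, beta, gamma, t_max, seed=0):
    """Simulate an SIS epidemic in a closed population of size n.

    Events (assumed rates): infection at beta * S * I / n, recovery at gamma * I.
    Returns the event times and the number infected after each event.
    """
    rng = random.Random(seed)
    t, i = 0.0, i0
    times, infected = [t], [i]
    while i > 0:  # with no infected individuals, no further events can occur
        rate_inf = beta * i * (n - i) / n
        rate_rec = gamma * i
        total = rate_inf + rate_rec
        t += rng.expovariate(total)  # exponential waiting time to the next event
        if t > t_max:
            break
        # Choose the event type with probability proportional to its rate.
        if rng.random() * total < rate_inf:
            i += 1
        else:
            i -= 1
        times.append(t)
        infected.append(i)
    return times, infected

times, infected = gillespie_sis(n=100, i0=5, beta=0.5, gamma=0.2, t_max=50.0, seed=1)
print(len(times), "events recorded; final number infected:", infected[-1])
```

Each trajectory is an exact draw from the jump process, which is why inference schemes such as those compared in the abstract can use simulation as a building block.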
Asteroid mass estimation using Markov-Chain Monte Carlo techniques
NASA Astrophysics Data System (ADS)
Siltala, Lauri; Granvik, Mikael
2016-10-01
Estimates for asteroid masses are based on their gravitational perturbations on the orbits of other objects such as Mars, spacecraft, or other asteroids and/or their satellites. In the case of asteroid-asteroid perturbations, this leads to a 13-dimensional inverse problem where the aim is to derive the mass of the perturbing asteroid and six orbital elements for both the perturbing asteroid and the test asteroid using astrometric observations. We have developed and implemented three different mass estimation algorithms utilizing asteroid-asteroid perturbations into the OpenOrb asteroid-orbit-computation software: the very rough 'marching' approximation, in which the asteroid orbits are fixed at a given epoch, reducing the problem to a one-dimensional estimation of the mass, an implementation of the Nelder-Mead simplex method, and most significantly, a Markov-Chain Monte Carlo (MCMC) approach. We will introduce each of these algorithms with particular focus on the MCMC algorithm, and present example results for both synthetic and real data. Our results agree with the published mass estimates, but suggest that the published uncertainties may be misleading as a consequence of using linearized mass-estimation methods. Finally, we discuss remaining challenges with the algorithms as well as future plans, particularly in connection with ESA's Gaia mission.
Compound extremes in a changing climate - a Markov chain approach
NASA Astrophysics Data System (ADS)
Sedlmeier, Katrin; Mieruch, Sebastian; Schädler, Gerd; Kottmeier, Christoph
2016-11-01
Studies using climate models and observed trends indicate that extreme weather has changed and may continue to change in the future. The potential impact of extreme events such as heat waves or droughts depends not only on their number of occurrences but also on "how these extremes occur", i.e., the interplay and succession of the events. These quantities are quite unexplored, for past changes as well as for future changes and call for sophisticated methods of analysis. To address this issue, we use Markov chains for the analysis of the dynamics and succession of multivariate or compound extreme events. We apply the method to observational data (1951-2010) and an ensemble of regional climate simulations for central Europe (1971-2000, 2021-2050) for two types of compound extremes, heavy precipitation and cold in winter and hot and dry days in summer. We identify three regions in Europe, which turned out to be likely susceptible to a future change in the succession of heavy precipitation and cold in winter, including a region in southwestern France, northern Germany and in Russia around Moscow. A change in the succession of hot and dry days in summer can be expected for regions in Spain and Bulgaria. The susceptibility to a dynamic change of hot and dry extremes in the Russian region will probably decrease.
Bayesian adaptive Markov chain Monte Carlo estimation of genetic parameters.
Mathew, B; Bauer, A M; Koistinen, P; Reetz, T C; Léon, J; Sillanpää, M J
2012-10-01
Accurate and fast estimation of genetic parameters that underlie quantitative traits using mixed linear models with additive and dominance effects is of great importance in both natural and breeding populations. Here, we propose a new fast adaptive Markov chain Monte Carlo (MCMC) sampling algorithm for the estimation of genetic parameters in the linear mixed model with several random effects. In the learning phase of our algorithm, we use the hybrid Gibbs sampler to learn the covariance structure of the variance components. In the second phase of the algorithm, we use this covariance structure to formulate an effective proposal distribution for a Metropolis-Hastings algorithm, which uses a likelihood function in which the random effects have been integrated out. Compared with the hybrid Gibbs sampler, the new algorithm had better mixing properties and was approximately twice as fast to run. Our new algorithm was able to detect different modes in the posterior distribution. In addition, the posterior mode estimates from the adaptive MCMC method were close to the REML (residual maximum likelihood) estimates. Moreover, our exponential prior for inverse variance components was vague and enabled the estimated mode of the posterior variance to be practically zero, which was in agreement with the support from the likelihood (in the case of no dominance). The method performance is illustrated using simulated data sets with replicates and field data in barley.
Markov chain Monte Carlo: an introduction for epidemiologists.
Hamra, Ghassan; MacLehose, Richard; Richardson, David
2013-04-01
Markov Chain Monte Carlo (MCMC) methods are increasingly popular among epidemiologists. The reason for this may in part be that MCMC offers an appealing approach to handling some difficult types of analyses. Additionally, MCMC methods are those most commonly used for Bayesian analysis. However, epidemiologists are still largely unfamiliar with MCMC. They may lack familiarity either with the implementation of MCMC or with interpretation of the resultant output. As with tutorials outlining the calculus behind maximum likelihood in previous decades, a simple description of the machinery of MCMC is needed. We provide an introduction to conducting analyses with MCMC, and show that, given the same data and under certain model specifications, the results of an MCMC simulation match those of methods based on standard maximum-likelihood estimation (MLE). In addition, we highlight examples of instances in which MCMC approaches to data analysis provide a clear advantage over MLE. We hope that this brief tutorial will encourage epidemiologists to consider MCMC approaches as part of their analytic tool-kit.
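The claim that MCMC output can match maximum-likelihood results under suitable model specifications can be illustrated with a small sketch. It is not taken from the tutorial itself: it assumes a binomial likelihood with a flat prior on the proportion, hypothetical data of 30 events in 100 subjects, and a random-walk Metropolis sampler with an arbitrarily chosen proposal standard deviation.

```python
import math
import random

def log_posterior(p, k, n):
    # Flat prior on (0, 1), so the log posterior equals the binomial
    # log-likelihood up to an additive constant.
    if p <= 0.0 or p >= 1.0:
        return float("-inf")
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def metropolis(k, n, n_iter=20000, proposal_sd=0.1, seed=0):
    rng = random.Random(seed)
    p = 0.5
    lp = log_posterior(p, k, n)
    chain = []
    for _ in range(n_iter):
        cand = p + rng.gauss(0.0, proposal_sd)  # symmetric random-walk proposal
        lp_cand = log_posterior(cand, k, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            p, lp = cand, lp_cand
        chain.append(p)
    return chain

k, n = 30, 100                       # hypothetical data
chain = metropolis(k, n)[2000:]      # discard an assumed burn-in of 2000 draws
post_mean = sum(chain) / len(chain)
print(f"MLE = {k / n:.3f}, posterior mean = {post_mean:.3f}")
```

With a flat prior the exact posterior is Beta(31, 71), whose mode equals the MLE of 0.30 and whose mean is 31/102, so the sampled mean should land close to the MLE.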
Markov chain analysis of succession in a rocky subtidal community.
Hill, M Forrest; Witman, Jon D; Caswell, Hal
2004-08-01
We present a Markov chain model of succession in a rocky subtidal community based on a long-term (1986-1994) study of subtidal invertebrates (14 species) at Ammen Rock Pinnacle in the Gulf of Maine. The model describes successional processes (disturbance, colonization, species persistence, and replacement), the equilibrium (stationary) community, and the rate of convergence. We described successional dynamics by species turnover rates, recurrence times, and the entropy of the transition matrix. We used perturbation analysis to quantify the response of diversity to successional rates and species removals. The equilibrium community was dominated by an encrusting sponge (Hymedesmia) and a bryozoan (Crisia eburnea). The equilibrium structure explained 98% of the variance in observed species frequencies. Dominant species have low probabilities of disturbance and high rates of colonization and persistence. On average, species turn over every 3.4 years. Recurrence times varied among species (7-268 years); rare species had the longest recurrence times. The community converged to equilibrium quickly (9.5 years), as measured by Dobrushin's coefficient of ergodicity. The largest changes in evenness would result from removal of the dominant sponge Hymedesmia. Subdominant species appear to increase evenness by slowing the dominance of Hymedesmia. Comparison of the subtidal community with intertidal and coral reef communities revealed that disturbance rates are an order of magnitude higher in coral reef than in rocky intertidal and subtidal communities. Colonization rates and turnover times, however, are lowest and longest in coral reefs, highest and shortest in intertidal communities, and intermediate in subtidal communities.
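Several of the quantities named in this abstract can be computed directly from a transition matrix. The sketch below is illustrative only: it uses a made-up two-state, row-stochastic matrix rather than the 14-species data, takes the turnover time of a state to be the expected holding time 1 / (1 - p_ii), and uses the mean return time 1 / pi_i as a stand-in for recurrence time. The paper's exact definitions may differ, and the function names are hypothetical.

```python
import math

def stationary_distribution(p, n_iter=1000):
    """Power iteration pi <- pi P for a row-stochastic matrix p."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi

def turnover_times(p):
    # Expected number of steps spent in a state before leaving it.
    return [1.0 / (1.0 - p[i][i]) for i in range(len(p))]

def entropy_rate(p, pi):
    # Entropy of the chain: -sum_i pi_i sum_j p_ij log p_ij.
    n = len(p)
    return -sum(pi[i] * p[i][j] * math.log(p[i][j])
                for i in range(n) for j in range(n) if p[i][j] > 0.0)

# Hypothetical two-state example (rows sum to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print("stationary distribution:", [round(x, 4) for x in pi])
print("turnover times:", [round(x, 2) for x in turnover_times(P)])
print("mean return times:", [round(1.0 / x, 2) for x in pi])
print("entropy rate:", round(entropy_rate(P, pi), 4))
```

For this matrix the stationary distribution is (5/6, 1/6), so the state that is hard to leave dominates the equilibrium, mirroring the abstract's observation that dominant species have high persistence.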
Ensemble Bayesian model averaging using Markov chain Monte Carlo sampling
Vrugt, Jasper A; Diks, Cees G H; Clark, Martyn P
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA, however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper, Raftery et al. (Mon Weather Rev 133: 1155-1174, 2005) recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
Developing Markov chain models for road surface simulation
NASA Astrophysics Data System (ADS)
Israel, Wescott B.; Ferris, John B.
2007-04-01
Chassis loads and vehicle handling are primarily impacted by the road surface that a vehicle traverses. By accurately measuring the geometries of road surfaces, one can generate computer models of these surfaces that will allow more accurate predictions of the loads introduced to various vehicle components. However, the logistics and computational power necessary to handle such large data files make this problem a difficult one to resolve, especially when vehicle design deadlines are impending. This work aims to improve this process by developing Markov Chain models by which all relevant characteristics of road surface geometries will be represented in the model. This will reduce the logistical difficulties that are presented when attempting to collect data and run a simulation using large data sets of individual roads. Models will be generated primarily from measured road profiles of highways in the United States. Any synthetic road realized from a particular model is representative of all profiles in the set from which the model was derived. Realizations of any length can then be generated, allowing efficient simulation and timely information about chassis loads that can be used to make better informed design decisions more quickly.
Threshold partitioning of sparse matrices and applications to Markov chains
Choi, Hwajeong; Szyld, D.B.
1996-12-31
It is well known that the order of the variables and equations of a large, sparse linear system influences the performance of classical iterative methods. In particular if, after a symmetric permutation, the blocks in the diagonal have more nonzeros, classical block methods have a faster asymptotic rate of convergence. In this paper, different ordering and partitioning algorithms for sparse matrices are presented. They are modifications of PABLO. In the new algorithms, in addition to the location of the nonzeros, the values of the entries are taken into account. The matrix resulting after the symmetric permutation has dense blocks along the diagonal, and small entries in the off-diagonal blocks. Parameters can be easily adjusted to obtain, for example, denser blocks, or blocks with elements of larger magnitude. In particular, when the matrices represent Markov chains, the permuted matrices are well suited for block iterative methods that find the corresponding probability distribution. Applications to three types of methods are explored: (1) Classical block methods, such as Block Gauss Seidel. (2) Preconditioned GMRES, where a block diagonal preconditioner is used. (3) Iterative aggregation method (also called aggregation/disaggregation) where the partition obtained from the ordering algorithm with certain parameters is used as an aggregation scheme. In all three cases, experiments are presented which illustrate the performance of the methods with the new orderings. The complexity of the new algorithms is linear in the number of nonzeros and the order of the matrix, and thus adding little computational effort to the overall solution.
MARKOV CHAIN MONTE CARLO POSTERIOR SAMPLING WITH THE HAMILTONIAN METHOD
K. HANSON
2001-02-01
The Markov Chain Monte Carlo technique provides a means for drawing random samples from a target probability density function (pdf). MCMC allows one to assess the uncertainties in a Bayesian analysis described by a numerically calculated posterior distribution. This paper describes the Hamiltonian MCMC technique in which a momentum variable is introduced for each parameter of the target pdf. In analogy to a physical system, a Hamiltonian H is defined as a kinetic energy involving the momenta plus a potential energy φ, where φ is minus the logarithm of the target pdf. Hamiltonian dynamics allows one to move along trajectories of constant H, taking large jumps in the parameter space with relatively few evaluations of φ and its gradient. The Hamiltonian algorithm alternates between picking a new momentum vector and following such trajectories. The efficiency of the Hamiltonian method for multidimensional isotropic Gaussian pdfs is shown to remain constant at around 7% for up to several hundred dimensions. The Hamiltonian method handles correlations among the variables much better than the standard Metropolis algorithm. A new test, based on the gradient of φ, is proposed to measure the convergence of the MCMC sequence.
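The alternation described above (draw a fresh momentum, then follow a trajectory of nearly constant H) can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's code: the target is an isotropic standard Gaussian, the trajectory is integrated with the leapfrog scheme, and a Metropolis accept/reject step corrects the integration error. The names `phi`, `grad_phi`, `leapfrog`, and `hmc`, and the step size and trajectory length, are choices made for the example.

```python
import math
import random

def phi(x):
    # Potential energy: minus log of the (assumed) standard Gaussian target,
    # up to an additive constant.
    return 0.5 * sum(xi * xi for xi in x)

def grad_phi(x):
    return list(x)

def hamiltonian(x, p):
    return phi(x) + 0.5 * sum(pi * pi for pi in p)

def leapfrog(x, p, step, n_steps):
    """Integrate Hamilton's equations with half, full, ..., half momentum steps."""
    x, p = list(x), list(p)
    p = [pi - 0.5 * step * gi for pi, gi in zip(p, grad_phi(x))]
    for i in range(n_steps):
        x = [xi + step * pi for xi, pi in zip(x, p)]
        scale = step if i < n_steps - 1 else 0.5 * step
        p = [pi - scale * gi for pi, gi in zip(p, grad_phi(x))]
    return x, p

def hmc(x0, n_samples, step=0.2, n_steps=10, seed=0):
    rng = random.Random(seed)
    x = list(x0)
    samples = []
    for _ in range(n_samples):
        p = [rng.gauss(0.0, 1.0) for _ in x]           # pick a new momentum vector
        x_new, p_new = leapfrog(x, p, step, n_steps)   # follow the trajectory
        d_h = hamiltonian(x_new, p_new) - hamiltonian(x, p)
        if d_h <= 0.0 or rng.random() < math.exp(-d_h):
            x = x_new                                   # accept the proposal
        samples.append(list(x))
    return samples

samples = hmc([0.0, 0.0], 2000)
first = [s[0] for s in samples]
print("mean of first coordinate:", round(sum(first) / len(first), 3))
```

Because leapfrog nearly conserves H, almost every proposal is accepted even though each one moves far from the current point, which is the source of the efficiency the abstract refers to.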
Technical manual for basic version of the Markov chain nest productivity model (MCnest)
The Markov Chain Nest Productivity Model (or MCnest) integrates existing toxicity information from three standardized avian toxicity tests with information on species life history and the timing of pesticide applications relative to the timing of avian breeding seasons to quantit...
1986-10-01
these theorems to find steady-state solutions of Markov chains are analysed. The results obtained in this way are then applied to quasi birth-death processes. Keywords: computations; algorithms; equilibrium equations.
A Simple Discrete Model of Brownian Motors: Time-periodic Markov Chains
NASA Astrophysics Data System (ADS)
Ge, Hao; Jiang, Da-Quan; Qian, Min
2006-05-01
In this paper, we consider periodically inhomogeneous Markov chains, which can be regarded as a simple version of physical model—Brownian motors. We introduce for them the concepts of periodical reversibility, detailed balance, entropy production rate and circulation distribution. We prove the equivalence of the following statements: The time-periodic Markov chain is periodically reversible; It is in detailed balance; Kolmogorov's cycle condition is satisfied; Its entropy production rate vanishes; Every circuit and its reversed circuit have the same circulation weight. Hence, in our model of Markov chains, the directed transport phenomenon of Brownian motors, i.e. the existence of net circulation, can occur only in nonequilibrium and irreversible systems. Moreover, we verify the large deviation property and the Gallavotti-Cohen fluctuation theorem of sample entropy production rates of the Markov chain.
User’s manual for basic version of MCnest Markov chain nest productivity model
The Markov Chain Nest Productivity Model (or MCnest) integrates existing toxicity information from three standardized avian toxicity tests with information on species life history and the timing of pesticide applications relative to the timing of avian breeding seasons to quantit...
Cool walking: a new Markov chain Monte Carlo sampling method.
Brown, Scott; Head-Gordon, Teresa
2003-01-15
Effective relaxation processes for difficult systems like proteins or spin glasses require special simulation techniques that permit barrier crossing to ensure ergodic sampling. Numerous adaptations of the venerable Metropolis Monte Carlo (MMC) algorithm have been proposed to improve its sampling efficiency, including various hybrid Monte Carlo (HMC) schemes, and methods designed specifically for overcoming quasi-ergodicity problems such as Jump Walking (J-Walking), Smart Walking (S-Walking), Smart Darting, and Parallel Tempering. We present an alternative to these approaches that we call Cool Walking, or C-Walking. In C-Walking two Markov chains are propagated in tandem, one at a high (ergodic) temperature and the other at a low temperature. Nonlocal trial moves for the low temperature walker are generated by first sampling from the high-temperature distribution, then performing a statistical quenching process on the sampled configuration to generate a C-Walking jump move. C-Walking needs only one high-temperature walker, satisfies detailed balance, and offers the important practical advantage that the high and low-temperature walkers can be run in tandem with minimal degradation of sampling due to the presence of correlations. To make the C-Walking approach more suitable to real problems we decrease the required number of cooling steps by attempting to jump at intermediate temperatures during cooling. We further reduce the number of cooling steps by utilizing "windows" of states when jumping, which improves acceptance ratios and lowers the average number of cooling steps. We present C-Walking results with comparisons to J-Walking, S-Walking, Smart Darting, and Parallel Tempering on a one-dimensional rugged potential energy surface in which the exact normalized probability distribution is known. C-Walking shows superior sampling as judged by two ergodic measures.
Markov chain Monte Carlo sampling of gene genealogies conditional on unphased SNP genotype data.
Burkett, Kelly M; McNeney, Brad; Graham, Jinko
2013-10-01
The gene genealogy is a tree describing the ancestral relationships among genes sampled from unrelated individuals. Knowledge of the tree is useful for inference of population-genetic parameters and has potential application in gene-mapping. Markov chain Monte Carlo approaches that sample genealogies conditional on observed genetic data typically assume that haplotype data are observed even though commonly-used genotyping technologies provide only unphased genotype data. We have extended our haplotype-based genealogy sampler, sampletrees, to handle unphased genotype data. We use the sampled haplotype configurations as a diagnostic for adequate sampling of the tree space based on the reasoning that if haplotype sampling is restricted, sampling from the tree space will also be restricted. We compare the distributions of sampled haplotypes across multiple runs of sampletrees, and to those estimated by the phase inference program, PHASE. Performance was excellent for the majority of individuals as shown by the consistency of results across multiple runs. However, for some individuals in some datasets, sampletrees had problems sampling haplotype configurations; longer run lengths would be required for these datasets. For many datasets though, we expect that sampletrees will be useful for sampling from the posterior distribution of gene genealogies given unphased genotype data.
2011-01-01
Background: Continuous-time Markov chains (CTMCs) are a widely used model for describing the evolution of DNA sequences on the nucleotide, amino acid or codon level. The sufficient statistics for CTMCs are the time spent in a state and the number of changes between any two states. In applications, past evolutionary events (exact times and types of changes) are inaccessible, and the past must be inferred from DNA sequence data observed in the present. Results: We describe and implement three algorithms for computing linear combinations of expected values of the sufficient statistics, conditioned on the end-points of the chain, and compare their performance with respect to accuracy and running time. The first algorithm is based on an eigenvalue decomposition of the rate matrix (EVD), the second on uniformization (UNI), and the third on integrals of matrix exponentials (EXPM). The implementation in R of the algorithms is available at http://www.birc.au.dk/~paula/. Conclusions: We use two different models to analyze the accuracy and eight experiments to investigate the speed of the three algorithms. We find that they have similar accuracy and that EXPM is the slowest method. Furthermore, we find that UNI is usually faster than EVD. PMID:22142146
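As a rough illustration of the uniformization idea named in this abstract, the sketch below approximates the transition matrix P(t) = exp(Qt) of a small CTMC as a Poisson-weighted sum of powers of a discrete-time kernel. It is only the basic building block, not the authors' algorithm for conditional expectations of sufficient statistics; the function name, tolerance and example rate matrix are assumptions made for illustration.

```python
import math


def transition_matrix_uniformization(Q, t, tol=1e-12, max_terms=10_000):
    """Approximate P(t) = exp(Q t) for a CTMC rate matrix Q (list of lists).

    Hypothetical helper for illustration. For very large mu * t the first
    Poisson weight underflows, so this sketch suits small problems only.
    """
    n = len(Q)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Uniformization rate: the fastest exit rate of any state.
    mu = max(-Q[i][i] for i in range(n))
    if mu == 0.0:
        return identity
    # Discrete-time kernel R = I + Q / mu, a proper stochastic matrix.
    R = [[identity[i][j] + Q[i][j] / mu for j in range(n)] for i in range(n)]
    weight = math.exp(-mu * t)  # Poisson(k = 0; mu * t)
    power = identity            # R ** 0
    P = [[weight * power[i][j] for j in range(n)] for i in range(n)]
    mass = weight               # accumulated Poisson probability
    for k in range(1, max_terms):
        if 1.0 - mass <= tol:
            break
        power = [[sum(power[i][m] * R[m][j] for m in range(n))
                  for j in range(n)] for i in range(n)]
        weight *= mu * t / k    # Poisson(k; mu * t) from Poisson(k - 1; mu * t)
        for i in range(n):
            for j in range(n):
                P[i][j] += weight * power[i][j]
        mass += weight
    return P


# Two-state example with rate 2 for 0 -> 1 and rate 1 for 1 -> 0.
Q = [[-2.0, 2.0], [1.0, -1.0]]
P = transition_matrix_uniformization(Q, 0.5)
```

For a two-state chain the result can be checked against the closed form P00(t) = b/(a+b) + a/(a+b) exp(-(a+b)t), where a and b are the two jump rates.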
NASA Astrophysics Data System (ADS)
Jamaluddin, Fadhilah; Rahim, Rahela Abdul
2015-12-01
The Markov chain was introduced in 1913 for studying the flow of data over a number of consecutive years and for forecasting. An important feature of a Markov chain is obtaining an accurate Transition Probability Matrix (TPM). However, obtaining a suitable TPM is hard, especially in long-term modeling, because of the unavailability of data. This paper aims to enhance the classical Markov chain by introducing an exponential smoothing technique for developing the appropriate TPM.
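The abstract does not spell out how exponential smoothing enters the TPM, so the following is only one plausible reading, offered as a sketch: estimate a TPM from the transition counts of each period, then blend successive estimates with a smoothing weight. The function names, the weight `alpha` and the toy sequences are all hypothetical.

```python
def count_transitions(states, n_states):
    """Count observed one-step transitions in a sequence of state indices."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return counts


def row_normalize(counts):
    """Turn counts into row-stochastic probabilities (uniform row if unseen)."""
    tpm = []
    for row in counts:
        total = sum(row)
        if total:
            tpm.append([c / total for c in row])
        else:
            tpm.append([1.0 / len(row)] * len(row))
    return tpm


def smoothed_tpm(period_sequences, n_states, alpha=0.3):
    """Exponentially smooth per-period TPM estimates, oldest period first.

    A convex combination of stochastic matrices is itself stochastic,
    so every row of the result still sums to one.
    """
    smoothed = None
    for sequence in period_sequences:
        tpm = row_normalize(count_transitions(sequence, n_states))
        if smoothed is None:
            smoothed = tpm
        else:
            smoothed = [[alpha * new + (1.0 - alpha) * old
                         for new, old in zip(new_row, old_row)]
                        for new_row, old_row in zip(tpm, smoothed)]
    return smoothed


# Two toy periods of a two-state process (made-up data).
periods = [[0, 0, 1, 1, 0], [0, 1, 0, 1, 1]]
tpm = smoothed_tpm(periods, n_states=2, alpha=0.3)
```

A larger `alpha` makes the estimate follow recent periods more closely, while a smaller one retains more of the history.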
Weighted Markov Chains and Graphic State Nodes for Information Retrieval.
ERIC Educational Resources Information Center
Benoit, G.
2002-01-01
Discusses users' search behavior and decision making in data mining and information retrieval. Describes iterative information seeking as a Markov process during which users advance through states of nodes; and explains how the information system records the decision as weights, allowing the incorporation of users' decisions into the Markov…
Reliability analysis and prediction of mixed mode load using Markov Chain Model
Nikabdullah, N.; Singh, S. S. K.; Alebrahim, R.; Azizi, M. A.; K, Elwaleed A.; Noorani, M. S. M.
2014-06-19
The aim of this paper is to present the reliability analysis and prediction of mixed mode loading by using a simple two-state Markov chain model for an automotive crankshaft. Reliability analysis and prediction for any automotive component or structure is important for analyzing and measuring failure in order to increase the design life and to eliminate or reduce the likelihood of failures and safety risk. The mechanical failures of the crankshaft are due to high bending and torsion stress concentration from high cycle and low rotating bending and torsional stress. The Markov chain was used to model the two states based on the probability of failure due to bending and torsion stress. Most investigations have revealed that bending stress is much more severe than torsional stress; therefore, the probability criterion for the bending state would be higher than that for the torsion state. A statistical comparison between the developed Markov chain model and field data was done to observe the percentage of error. The reliability analysis and prediction derived from the Markov chain model were illustrated with the Weibull probability and cumulative distribution functions, the hazard rate and reliability curves, and the bathtub curve. It can be concluded that the Markov chain model is able to generate nearly similar data with a minimal percentage of error and that, for practical application, the proposed model provides good accuracy in determining the reliability of the crankshaft under mixed mode loading.
Lele, Subhash R; Dennis, Brian; Lutscher, Frithjof
2007-07-01
We introduce a new statistical computing method, called data cloning, to calculate maximum likelihood estimates and their standard errors for complex ecological models. Although the method uses the Bayesian framework and exploits the computational simplicity of the Markov chain Monte Carlo (MCMC) algorithms, it provides valid frequentist inferences such as the maximum likelihood estimates and their standard errors. The inferences are completely invariant to the choice of the prior distributions and therefore avoid the inherent subjectivity of the Bayesian approach. The data cloning method is easily implemented using standard MCMC software. Data cloning is particularly useful for analysing ecological situations in which hierarchical statistical models, such as state-space models and mixed effects models, are appropriate. We illustrate the method by fitting two nonlinear population dynamics models to data in the presence of process and observation noise.
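The core idea of data cloning can be seen in a toy conjugate model, where the posterior is available in closed form and no MCMC is needed. The sketch below assumes a binomial likelihood with a Beta prior, which is not one of the ecological models from the paper: as the number of clones K grows, the posterior mean approaches the maximum likelihood estimate and K times the posterior variance approaches its asymptotic variance, regardless of the prior. All names and numbers are illustrative.

```python
def cloned_beta_posterior(successes, trials, clones, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of a binomial proportion after cloning.

    Cloning the data `clones` times multiplies the log-likelihood by that
    factor, so the Beta posterior parameters grow accordingly.
    """
    a = prior_a + clones * successes
    b = prior_b + clones * (trials - successes)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance


successes, trials = 7, 20
mle = successes / trials                     # 0.35
mle_variance = mle * (1.0 - mle) / trials    # inverse Fisher information

# A deliberately informative prior, to show that its influence washes out.
for clones in (1, 10, 100, 10_000):
    mean, variance = cloned_beta_posterior(successes, trials, clones,
                                           prior_a=5.0, prior_b=1.0)
    print(clones, round(mean, 5), round(clones * variance, 6))
```

In realistic hierarchical models the posterior of the cloned data would be sampled with MCMC, and the same two summaries would be read off the chains.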
Finding noncommunicating sets for Markov chain Monte Carlo estimations on pedigrees
Lin, S.; Thompson, E.; Wijsman, E.
1994-04-01
Markov chain Monte Carlo (MCMC) has recently gained use as a method of estimating required probability and likelihood functions in pedigree analysis, when exact computation is impractical. However, when a multiallelic locus is involved, irreducibility of the constructed Markov chain, an essential requirement of the MCMC method, may fail. Solutions proposed by several researchers, which do not identify all the noncommunicating sets of genotypic configurations, are inefficient with highly polymorphic loci. This is a particularly serious problem in linkage analysis, because highly polymorphic markers are much more informative and thus are preferred. In the present paper, the authors describe an algorithm that finds all the noncommunicating classes of genotypic configurations on any pedigree. This leads to a more efficient method of defining an irreducible Markov chain. Examples, including a pedigree from a genetic study of familial Alzheimer disease, are used to illustrate how the algorithm works and how penetrances are modified for specific individuals to ensure irreducibility. 20 refs., 7 figs., 6 tabs.
Zou, Yonghong; Christensen, Erik R; Zheng, Wei; Wei, Hua; Li, An
2014-11-01
A stochastic process was developed to simulate the stepwise debromination pathways for polybrominated diphenyl ethers (PBDEs). The stochastic process uses an analogue Markov Chain Monte Carlo (AMCMC) algorithm to generate PBDE debromination profiles. The acceptance or rejection of the randomly drawn stepwise debromination reactions was determined by a maximum likelihood function. The experimental observations at certain time points were used as target profiles; therefore, the stochastic processes are capable of presenting the effects of reaction conditions on the selection of debromination pathways. The application of the model is illustrated by adopting the experimental results of decabromodiphenyl ether (BDE209) in hexane exposed to sunlight. Inferences that were not obvious from experimental data were suggested by model simulations. For example, BDE206 has much higher accumulation in the first 30 min of sunlight exposure. By contrast, model simulation suggests that BDE206 and BDE207 had comparable yields from BDE209. The reason for the higher BDE206 level is that BDE207 has the highest depletion in producing octa products. Compared to a previous version of the stochastic model based on stochastic reaction sequences (SRS), the AMCMC approach was determined to be more efficient and robust. Because it requires only experimental observations as input, the AMCMC model is expected to be applicable to a wide range of PBDE debromination processes, e.g. microbial, photolytic, or joint effects in natural environments.
Pavan, Alessandra; Thomaseth, Karl; Valerio, Anna
2003-01-01
The aim of this study is the characterization, by means of mathematical models, of the activity of isolated hepatic rat cells as regards the conversion of free fatty acids (FFA) to ketone bodies (KB). A new physiologically based compartmental model of FFA metabolism is used within a context of population pharmacokinetics. This analysis is based on a hierarchical model, that differs from standard model formulations, to account for the fact that some data sets belong to the same animal but have been collected under different experimental conditions. The statistical inference problem has been addressed within a Bayesian context and solved by using Markov Chain Monte Carlo (MCMC) simulation. The results obtained in this study indicate that, although hormones epinephrine and insulin are important metabolic regulatory factors in vivo, the conversion of FFA to KB by isolated hepatic rat cells is not significantly affected by epinephrine and only little influenced by insulin. So we conclude that in vivo, the interaction of these two hormones with other compounds not considered in this study plays a fundamental role in ketogenesis. From this study it appears that mathematical models of metabolic processes can be successfully employed in population kinetic studies using MCMC methods.
Enhancing gene regulatory network inference through data integration with markov random fields.
Banf, Michael; Rhee, Seung Y
2017-02-01
A gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE's potential to produce high confidence regulatory networks compared to state of the art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation.
Enhancing gene regulatory network inference through data integration with markov random fields
Banf, Michael; Rhee, Seung Y.
2017-01-01
A gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE’s potential to produce high confidence regulatory networks compared to state of the art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation. PMID:28145456
Numerical solutions for patterns statistics on Markov chains.
Nuel, Gregory
2006-01-01
We propose here a review of the methods available to compute pattern statistics on text generated by a Markov source. Theoretical, but also numerical aspects are detailed for a wide range of techniques (exact, Gaussian, large deviations, binomial and compound Poisson). The SPatt package (Statistics for Pattern, free software available at http://stat.genopole.cnrs.fr/spatt) implementing all these methods is then used to compare all these approaches in terms of computational time and reliability in the most complete pattern statistics benchmark available at the present time.
Fast Bayesian Inference of Copy Number Variants using Hidden Markov Models with Wavelet Compression
Wiedenhoeft, John; Brugel, Eric; Schliep, Alexander
2016-01-01
By integrating Haar wavelets with Hidden Markov Models, we achieve drastically reduced running times for Bayesian inference using Forward-Backward Gibbs sampling. We show that this improves detection of genomic copy number variants (CNV) in array CGH experiments compared to the state-of-the-art, including standard Gibbs sampling. The method concentrates computational effort on chromosomal segments which are difficult to call, by dynamically and adaptively recomputing consecutive blocks of observations likely to share a copy number. This makes routine diagnostic use and re-analysis of legacy data collections feasible; to this end, we also propose an effective automatic prior. An open source software implementation of our method is available at http://schlieplab.org/Software/HaMMLET/ (DOI: 10.5281/zenodo.46262). This paper was selected for oral presentation at RECOMB 2016, and an abstract is published in the conference proceedings. PMID:27177143
Transition probabilities matrix of Markov Chain in the fatigue crack growth model
NASA Astrophysics Data System (ADS)
Nopiah, Zulkifli Mohd; Januri, Siti Sarah; Ariffin, Ahmad Kamal; Masseran, Nurulkamal; Abdullah, Shahrum
2016-10-01
The Markov model is a reliable method for describing the growth of a crack from the initial phase until fracture. An important subject in crack growth models is obtaining the transition probability matrix of the fatigue process. Determining the transition probability matrix is important in a Markov chain model for describing the probabilistic behaviour of fatigue life in a structure. In this paper, we obtain the transition probabilities of a Markov chain based on the Paris law equation to describe the physical meaning of the fatigue crack growth problem. The results show that the transition probabilities can be used to calculate the probability of future damage, with the possibility of comparing each stage over time.
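For reference, the Paris law mentioned above is commonly written in the following form. The symbols are the standard textbook ones and are not defined in the abstract itself: a is the crack length, N the number of load cycles, ΔK the stress intensity factor range, C and m material constants, Y a geometry factor and Δσ the applied stress range. How the authors map this growth law onto transition probabilities is not stated in the abstract, so only the law is given.

```latex
\frac{da}{dN} = C \, (\Delta K)^{m},
\qquad
\Delta K = Y \, \Delta\sigma \, \sqrt{\pi a}
```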
The Autonomous Duck: Exploring the Possibilities of a Markov Chain Model in Animation
NASA Astrophysics Data System (ADS)
Villegas, Javier
This document reports the construction of a framework for the generation of animations based on a Markov chain model of the different poses of a drawn character. The model was implemented and is demonstrated with the animation of a virtual duck in a random walk. Some potential uses of this model in interpolation and the generation of in-between frames are also explored.
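A minimal sketch of the kind of pose chain described here might look as follows. The pose names and transition probabilities are invented for illustration and are not taken from the report; each frame simply draws the next pose from the row of the current one.

```python
import random

# Hypothetical poses and transition probabilities (each row sums to 1).
TRANSITIONS = {
    "stand": {"stand": 0.5, "walk": 0.4, "peck": 0.1},
    "walk": {"walk": 0.6, "stand": 0.2, "flap": 0.2},
    "flap": {"flap": 0.3, "walk": 0.7},
    "peck": {"peck": 0.4, "stand": 0.6},
}


def sample_pose_sequence(start, n_frames, rng):
    """Random walk over poses: each pose depends only on the previous one."""
    sequence = [start]
    for _ in range(n_frames - 1):
        row = TRANSITIONS[sequence[-1]]
        next_pose = rng.choices(list(row), weights=list(row.values()))[0]
        sequence.append(next_pose)
    return sequence


rng = random.Random(0)  # seeded so a run can be reproduced
frames = sample_pose_sequence("stand", 50, rng)
```

Transitions that are absent from a row have probability zero, which is one simple way to forbid visually implausible jumps between poses.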
Exponential integrators for a Markov chain model of the fast sodium channel of cardiomyocytes.
Starý, Tomás; Biktashev, Vadim N
2015-04-01
The modern Markov chain models of ionic channels in excitable membranes are numerically stiff. The popular numerical methods for these models require very small time steps to ensure stability. Our objective is to formulate and test two methods addressing this issue, so that the time step can be chosen based on accuracy rather than stability. Both proposed methods extend the Rush-Larsen technique, which was originally developed for Hodgkin-Huxley type gate models. One method, "matrix Rush-Larsen" (MRL), uses a matrix reformulation of the Rush-Larsen scheme, where the matrix exponentials are calculated using precomputed tables of eigenvalues and eigenvectors. The other, the "hybrid operator splitting" (HOS) method, exploits asymptotic properties of a particular Markov chain model, allowing explicit analytical expressions for the substeps. We test both methods on the Clancy and Rudy (2002) I(Na) Markov chain model. With precomputed tables for functions of the transmembrane voltage, both methods are comparable to the forward Euler method in accuracy and computational cost, but allow longer time steps without numerical instability. We conclude that both methods are of practical interest. MRL requires more computations than HOS, but is formulated in general terms which can be readily extended to other Markov chain channel models, whereas the utility of HOS depends on the asymptotic properties of a particular model. The significance of the methods is that they allow a considerable speed-up of large-scale computations of cardiac excitation models by increasing the time step, while maintaining acceptable accuracy and preserving numerical stability.
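The stability argument can be illustrated with the classic scalar Rush-Larsen update for a single gate variable, which the matrix method in this abstract generalizes. The sketch below is not the MRL or HOS method; it assumes a gate obeying dy/dt = (y_inf - y) / tau with the voltage-dependent quantities y_inf and tau held fixed, and the numerical values are arbitrary.

```python
import math


def forward_euler_step(y, y_inf, tau, dt):
    """Explicit Euler update; unstable once dt exceeds 2 * tau."""
    return y + dt * (y_inf - y) / tau


def rush_larsen_step(y, y_inf, tau, dt):
    """Exponential update; exact when y_inf and tau are constant over the step."""
    return y_inf + (y - y_inf) * math.exp(-dt / tau)


def integrate(step, y0, y_inf, tau, dt, n_steps):
    """Apply the same one-step update n_steps times."""
    y = y0
    for _ in range(n_steps):
        y = step(y, y_inf, tau, dt)
    return y


# A stiff gate: the time step is five times the gate's time constant.
y_inf, tau, dt = 0.8, 0.01, 0.05
euler = integrate(forward_euler_step, 0.0, y_inf, tau, dt, 20)
exponential = integrate(rush_larsen_step, 0.0, y_inf, tau, dt, 20)
print(euler, exponential)
```

With these values the Euler error is multiplied by -4 at every step and blows up, while the exponential update relaxes monotonically to y_inf, which is why the time step can then be chosen for accuracy alone.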
Markov Chain Monte Carlo Estimation of Item Parameters for the Generalized Graded Unfolding Model
ERIC Educational Resources Information Center
de la Torre, Jimmy; Stark, Stephen; Chernyshenko, Oleksandr S.
2006-01-01
The authors present a Markov Chain Monte Carlo (MCMC) parameter estimation procedure for the generalized graded unfolding model (GGUM) and compare it to the marginal maximum likelihood (MML) approach implemented in the GGUM2000 computer program, using simulated and real personality data. In the simulation study, test length, number of response…
Teaching Markov Chain Monte Carlo: Revealing the Basic Ideas behind the Algorithm
ERIC Educational Resources Information Center
Stewart, Wayne; Stewart, Sepideh
2014-01-01
For many scientists, researchers and students Markov chain Monte Carlo (MCMC) simulation is an important and necessary tool to perform Bayesian analyses. The simulation is often presented as a mathematical algorithm and then translated into an appropriate computer program. However, this can result in overlooking the fundamental and deeper…
A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis
ERIC Educational Resources Information Center
Edwards, Michael C.
2010-01-01
Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show…
Treatment-based Markov chain models clarify mechanisms of invasion in an invaded grassland community
Nelis, Lisa Castillo; Wootton, J. Timothy
2010-01-01
What are the relative roles of mechanisms underlying plant responses in grassland communities invaded by both plants and mammals? What type of community can we expect in the future given current or novel conditions? We address these questions by comparing Markov chain community models among treatments from a field experiment on invasive species on Robinson Crusoe Island, Chile. Because of seed dispersal, grazing and disturbance, we predicted that the exotic European rabbit (Oryctolagus cuniculus) facilitates epizoochorous exotic plants (plants with seeds that stick to the skin of an animal) at the expense of native plants. To test our hypothesis, we crossed rabbit exclosure treatments with disturbance treatments, and sampled the plant community in permanent plots over 3 years. We then estimated Markov chain model transition probabilities and found significant differences among treatments. As hypothesized, this modelling revealed that exotic plants survive better in disturbed areas, while natives prefer no rabbits or disturbance. Surprisingly, rabbits negatively affect epizoochorous plants. Markov chain dynamics indicate that an overall replacement of native plants by exotic plants is underway. Using a treatment-based approach to multi-species Markov chain models allowed us to examine the changes in the importance of mechanisms in response to experimental impacts on communities. PMID:19864293
Avian life history profiles for use in the Markov chain nest productivity model (MCnest)
The Markov Chain nest productivity model, or MCnest, quantitatively estimates the effects of pesticides or other toxic chemicals on annual reproductive success of avian species (Bennett and Etterson 2013, Etterson and Bennett 2013). The Basic Version of MCnest was developed as a...
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
Research on Air Traffic Control Automatic System Software Reliability Based on Markov Chain
NASA Astrophysics Data System (ADS)
Wang, Xinglong; Liu, Weixiang
Ensuring the spacing of aircraft and the high efficiency of air traffic are the main tasks of the air traffic control automatic system. This paper puts forward a Markov model of an Air Traffic Control Automatic System (ATCAS), using 36 months of collected ATCAS failure data. A method based on a Markov chain predicts the s1, s2, s3 of the ATCAS, and predicts and validates the reliability of the ATCAS according to the underlying reliability theory. The experimental results show that the method is practicable and can be used in future research.
NASA Astrophysics Data System (ADS)
Marshall, L. A.; Nott, D.; Sharma, A.
An important aspect of practical hydrological engineering is modelling the catchment's response to rainfall. An abundance of models exist to do this, including conceptual rainfall-runoff models (CRRMs), which model the catchment as a configuration of interconnected storages aimed at providing a simplified representation of the physical processes responsible for runoff generation. While CRRMs have been a useful and popular tool for catchment modelling applications, as with most modelling approaches the challenge in using them is accurately assessing the best values to be assigned to the model variables. There are many obstacles to accurate parameter inference. Often, a single optimal set of parameter values does not exist. A range of values will often produce a suitable result. The interaction between parameters can also complicate the task of parameter inference, and if the data are limited this interaction may be difficult to characterise. An appealing solution is the use of Bayesian statistical inference, with computations carried out using Markov Chain Monte Carlo (MCMC) methods. This approach allows any pre-existing knowledge about the model parameters to be combined with the available catchment data. The uncertainty about a parameter is characterised in terms of its posterior distribution. This study assessed two MCMC schemes that characterise the parameter uncertainty of a CRRM. The aim of the study was to compare an established, complex MCMC scheme to a proposed, more automated scheme that requires little specification on the part of the user to achieve the desired results. The proposed scheme utilises the posterior covariance between parameters to generate future parameter values. The attributes of the algorithm are ideal for hydrological models, which often exhibit a high degree of correlation between parameters. The Australian Water Balance Model (AWBM), an 8-parameter CRRM that has been tested and used in several
NASA Astrophysics Data System (ADS)
Sivakumar, Krishnamoorthy; Goutsias, John I.
1998-09-01
We study the problem of simulating a class of Gibbs random field models, called morphologically constrained Gibbs random fields, using Markov chain Monte Carlo sampling techniques. Traditional single-site updating Markov chain Monte Carlo sampling algorithms, like the Metropolis algorithm, tend to converge extremely slowly when used to simulate these models, particularly at low temperatures and for constraints involving large geometrical shapes. Moreover, the morphologically constrained Gibbs random fields are not, in general, Markov. Hence, a Markov chain Monte Carlo sampling algorithm based on the Gibbs sampler is not possible. We propose a variant of the Metropolis algorithm that, at each iteration, allows multi-site updating and converges substantially faster than the traditional single-site updating algorithm. The set of sites that are updated at a particular iteration is specified in terms of a shape parameter and a size parameter. Computation of the acceptance probability involves a 'test ratio,' which requires computation of the ratio of the probabilities of the current and new realizations. Because of the special structure of our energy function, this computation can be done by means of a simple, local iterative procedure. Therefore, lack of Markovianity does not impose any additional computational burden for model simulation. The proposed algorithm has been used to simulate a number of image texture models, both synthetic and natural.
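To make the 'test ratio' concrete, the sketch below shows the traditional single-site Metropolis update that this abstract uses as its baseline, not the proposed multi-site variant. It assumes a one-dimensional chain of plus or minus one spins with nearest-neighbour energy U(x) = -sum of x_i x_{i+1}, chosen only because the energy change of a flip can be computed locally; the model and all names are illustrative.

```python
import math
import random


def energy(spins):
    """Total energy of a 1-D chain with free boundaries."""
    return -sum(a * b for a, b in zip(spins, spins[1:]))


def local_energy_change(spins, i):
    """Energy change from flipping spin i, using only its neighbours."""
    neighbours = 0
    if i > 0:
        neighbours += spins[i - 1]
    if i < len(spins) - 1:
        neighbours += spins[i + 1]
    return 2 * spins[i] * neighbours


def acceptance_probability(delta_u, temperature):
    """Metropolis rule: min(1, pi(new) / pi(current)) for pi ~ exp(-U / T)."""
    if delta_u <= 0:
        return 1.0  # downhill moves are always accepted (also avoids overflow)
    return math.exp(-delta_u / temperature)


def metropolis_sweep(spins, temperature, rng):
    """One sweep of single-site updates at randomly chosen sites."""
    for _ in range(len(spins)):
        i = rng.randrange(len(spins))
        delta_u = local_energy_change(spins, i)
        if rng.random() < acceptance_probability(delta_u, temperature):
            spins[i] = -spins[i]
    return spins


rng = random.Random(1)
state = [rng.choice((-1, 1)) for _ in range(32)]
for _ in range(100):
    metropolis_sweep(state, temperature=1.0, rng=rng)
```

Because each flip changes only one site, large correlated regions take many sweeps to reorganize at low temperature, which is the slow convergence that motivates multi-site updates.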
First and second order semi-Markov chains for wind speed modeling
NASA Astrophysics Data System (ADS)
Prattico, F.; Petroni, F.; D'Amico, G.
2012-04-01
The increasing interest in renewable energy leads scientific research to find a better way to recover most of the available energy. In particular, the maximum energy recoverable from wind is equal to 59.3% of that available (Betz law), at a specific pitch angle and when the ratio between the output and input wind speeds is equal to 1/3. The pitch angle is the angle formed between the airfoil of the wind turbine blade and the wind direction. Old turbines, and many of those currently marketed, have a fixed airfoil geometry. As a result, these wind turbines work with an efficiency lower than 59.3%. New-generation wind turbines, instead, have a system to vary the pitch angle by rotating the blades. This system enables the wind turbines to recover the maximum energy at different wind speeds, working at the Betz limit at different speed ratios. A powerful control system for the pitch angle allows the wind turbine to recover energy better in the transient regime. A good stochastic model for wind speed is therefore needed both to help optimize turbine design and to assist the control system in predicting the wind speed so as to position the blades quickly and correctly. The possibility of having synthetic wind speed data is a powerful instrument to help designers verify the structures of wind turbines or estimate the energy recoverable from a specific site. To generate synthetic data, Markov chains of first or higher order are often used [1,2,3]. In particular, [3] presents a comparison between a first-order Markov chain and a second-order Markov chain. A similar work, but only for the first-order Markov chain, is conducted by [2], presenting the probability transition matrix and comparing the energy spectral density and autocorrelation of real and synthetic wind speed data. An attempt to model and join wind speed and direction is presented in [1], using two models, first
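The two figures quoted for the Betz law, 59.3% and a speed ratio of 1/3, follow from the standard actuator-disc result. The short derivation below uses textbook notation that does not appear in the abstract: b is the ratio of output to input wind speed and C_P is the fraction of the available wind power that is extracted.

```latex
C_P(b) = \tfrac{1}{2}\,(1 + b)\,(1 - b^{2}),
\qquad b = \frac{v_{\mathrm{out}}}{v_{\mathrm{in}}}

\frac{dC_P}{db} = \tfrac{1}{2}\,(1 - 2b - 3b^{2})
                = \tfrac{1}{2}\,(1 - 3b)\,(1 + b) = 0
\;\Longrightarrow\; b = \tfrac{1}{3}

C_P\!\left(\tfrac{1}{3}\right)
  = \tfrac{1}{2}\cdot\tfrac{4}{3}\cdot\tfrac{8}{9}
  = \tfrac{16}{27} \approx 0.593
```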
Predictive glycoengineering of biosimilars using a Markov chain glycosylation model.
Spahn, Philipp N; Hansen, Anders H; Kol, Stefan; Voldborg, Bjørn G; Lewis, Nathan E
2017-02-01
Biosimilar drugs must closely resemble the pharmacological attributes of innovator products to ensure safety and efficacy to obtain regulatory approval. Glycosylation is one critical quality attribute that must be matched, but it is inherently difficult to control due to the complexity of its biogenesis. This usually implies that costly and time-consuming experimentation is required for clone identification and optimization of biosimilar glycosylation. Here, a computational method that utilizes a Markov model of glycosylation to predict optimal glycoengineering strategies to obtain a specific glycosylation profile with desired properties is described. The approach uses a genetic algorithm to find the required quantities to perturb glycosylation reaction rates that lead to the best possible match with a given glycosylation profile. Furthermore, the approach can be used to identify cell lines and clones that will require minimal intervention while achieving a glycoprofile that is most similar to the desired profile. Thus, this approach can facilitate biosimilar design by providing computational glycoengineering guidelines that can be generated with a minimal time and cost.
Inverting OII 83.4 nm dayglow profiles using Markov chain radiative transfer
NASA Astrophysics Data System (ADS)
Geddes, George; Douglas, Ewan; Finn, Susanna C.; Cook, Timothy; Chakrabarti, Supriya
2016-11-01
Emission profiles of the resonantly scattered OII 83.4 nm triplet can in principle be used to estimate O+ density profiles in the F2 region of the ionosphere. Given the emission source profile, solution of this inverse problem is possible but requires significant computation. The traditional Feautrier solution to the radiative transfer problem requires many iterations to converge, making it time consuming to compute. A Markov chain approach to the problem produces similar results by directly constructing a matrix that maps the source emission rate to an effective emission rate which includes scattering to all orders. The Markov chain approach presented here yields faster results and therefore can be used to perform the O+ density retrieval with higher resolution than would otherwise be possible.
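The idea of a single matrix that accounts for scattering "to all orders" can be illustrated with a toy calculation. The sketch below is not the authors' code. It assumes a hypothetical matrix `Q`, where `Q[i][j]` is the probability that a photon emitted in layer `j` is next scattered in layer `i`, and it sums the series s + Qs + Q²s + ... term by term. When photons can escape or be absorbed, this sum converges to the solution of e = s + Qe.

```python
def effective_emission(Q, source, tol=1e-12, max_iter=10_000):
    """Sum the scattering series s + Q s + Q^2 s + ... term by term."""
    n = len(source)
    total = list(source)   # running sum over scattering orders
    term = list(source)    # contribution of the current scattering order
    for _ in range(max_iter):
        # One more scattering event: apply Q to the previous order.
        term = [sum(Q[i][j] * term[j] for j in range(n)) for i in range(n)]
        total = [t + x for t, x in zip(total, term)]
        if max(abs(x) for x in term) < tol:
            break
    return total

# Hypothetical two-layer example: half of the photons scatter into the other layer.
Q = [[0.0, 0.5],
     [0.5, 0.0]]
source = [1.0, 0.0]
effective = effective_emission(Q, source)
```

In this toy case the effective emission rates are 4/3 and 2/3, the same values obtained by solving the 2x2 linear system directly.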
Is anoxic depolarisation associated with an ADC threshold? A Markov chain Monte Carlo analysis.
King, Martin D; Crowder, Martin J; Hand, David J; Harris, Neil G; Williams, Stephen R; Obrenovitch, Tihomir P; Gadian, David G
2005-12-01
A Bayesian nonlinear hierarchical random coefficients model was used in a reanalysis of a previously published longitudinal study of the extracellular direct current (DC)-potential and apparent diffusion coefficient (ADC) responses to focal ischaemia. The main purpose was to examine the data for evidence of an ADC threshold for anoxic depolarisation. A Markov chain Monte Carlo simulation approach was adopted. The Metropolis algorithm was used to generate three parallel Markov chains and thus obtain a sampled posterior probability distribution for each of the DC-potential and ADC model parameters, together with a number of derived parameters. The latter were used in a subsequent threshold analysis. The analysis provided no evidence indicating a consistent and reproducible ADC threshold for anoxic depolarisation.
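The sampling scheme described above, a Metropolis algorithm run as three parallel chains, can be sketched as follows. This is a minimal illustration rather than the authors' model: the target is assumed to be a standard normal density standing in for a posterior, and the function names, step size, starting points and burn-in length are all hypothetical choices.

```python
import math
import random

def log_target(x):
    """Log-density (up to a constant) of a standard normal; a stand-in posterior."""
    return -0.5 * x * x

def metropolis(n_samples, start, step=1.0, seed=0):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        chain.append(x)
    return chain

# Three chains started from dispersed points, each with its own seed.
chains = [metropolis(20_000, start=s, seed=i) for i, s in enumerate((-5.0, 0.0, 5.0))]
burn_in = 2_000
means = [sum(c[burn_in:]) / len(c[burn_in:]) for c in chains]
```

Starting the chains far apart and checking that their post-burn-in summaries agree is a simple way to gain confidence that each chain has reached the same target distribution.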
NASA Astrophysics Data System (ADS)
Staňová, Sidónia; Soták, Ján; Hudec, Norbert
2009-08-01
Methods based on Markov chains can be easily applied to evaluate order in sedimentary sequences. In this contribution, Markov chain analysis was applied to the turbiditic formation of the Outer Western Carpathians in NW Slovakia, although it is also broadly applicable to the interpretation of sedimentary sequences from other depositional environments. Non-random facies transitions were determined in the investigated strata and compared to the standard deep-water facies models to provide statistical evidence for the sedimentological interpretation of depositional processes. As a result, six genetic facies types, interpreted in terms of depositional processes, were identified. They comprise deposits of density flows, turbidity flows, suspension fallout as well as units which resulted from syn- or post-depositional deformation.
An 'adding' algorithm for the Markov chain formalism for radiation transfer
NASA Technical Reports Server (NTRS)
Esposito, L. W.
1979-01-01
An adding algorithm is presented, that extends the Markov chain method and considers a preceding calculation as a single state of a new Markov chain. This method takes advantage of the description of the radiation transport as a stochastic process. Successive application of this procedure makes calculation possible for any optical depth without increasing the size of the linear system used. It is determined that the time required for the algorithm is comparable to that for a doubling calculation for homogeneous atmospheres. For an inhomogeneous atmosphere the new method is considerably faster than the standard adding routine. It is concluded that the algorithm is efficient, accurate, and suitable for smaller computers in calculating the diffuse intensity scattered by an inhomogeneous planetary atmosphere.
Application of Markov chain to the pattern of mitochondrial deoxyribonucleic acid mutations
NASA Astrophysics Data System (ADS)
Vantika, Sandy; Pasaribu, Udjianna S.
2014-03-01
This research explains how a Markov chain can be used to model the pattern of deoxyribonucleic acid mutations in mitochondria (mitochondrial DNA). First, a sign test was used to look for a pattern in the nucleotide bases that appear one position after a mutated nucleotide base. Results obtained from the sign test showed that, in most cases, there exists a pattern of mutation, except in the mutation cases of adenine to cytosine, adenine to thymine, and cytosine to guanine. Markov chain analysis of data on mutations that occur in mitochondrial DNA indicates that the positions one and two after a mutated nucleotide base tend to be occupied by particular nucleotide bases. From this analysis, it can be said that adenine, cytosine, guanine and thymine will mutate if the nucleotide base at one and/or two positions after them is cytosine.
Fitting timeseries by continuous-time Markov chains: A quadratic programming approach
Crommelin, D.T.; Vanden-Eijnden, E.
2006-09-20
Construction of stochastic models that describe the effective dynamics of observables of interest is a useful instrument in various fields of application, such as physics, climate science, and finance. We present a new technique for the construction of such models. From the timeseries of an observable, we construct a discrete-in-time Markov chain and calculate the eigenspectrum of its transition probability (or stochastic) matrix. As a next step we aim to find the generator of a continuous-time Markov chain whose eigenspectrum resembles the observed eigenspectrum as closely as possible, using an appropriate norm. The generator is found by solving a minimization problem: the norm is chosen such that the objective function is quadratic and convex, so that the minimization problem can be solved using quadratic programming techniques. The technique is illustrated on various toy problems as well as on datasets stemming from simulations of molecular dynamics and of atmospheric flows.
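The link between the eigenspectrum of a transition matrix and that of a generator can be seen in the simplest case. The sketch below is not the quadratic programming method of the abstract. It covers only a two-state chain, where a generator L with exp(hL) = P can be written in closed form whenever the second eigenvalue of P lies strictly between 0 and 1. The function name and the sampling interval `h` are hypothetical.

```python
import math

def two_state_generator(P, h=1.0):
    """Closed-form generator L with exp(h L) = P for a 2x2 stochastic matrix."""
    lam2 = P[0][0] + P[1][1] - 1.0  # second eigenvalue of P (the first is 1)
    if not 0.0 < lam2 < 1.0:
        # Outside this range no real generator reproduces P exactly, which is
        # the situation where an approximate fit becomes necessary.
        raise ValueError("second eigenvalue must lie strictly between 0 and 1")
    # Eigenvalues of L are log(eigenvalues of P) / h, so L is a scaled (P - I).
    scale = -math.log(lam2) / (h * (1.0 - lam2))
    return [[scale * (P[i][j] - (1.0 if i == j else 0.0)) for j in range(2)]
            for i in range(2)]

P = [[0.9, 0.1],
     [0.2, 0.8]]
L = two_state_generator(P)
```

The rows of `L` sum to zero and its off-diagonal entries are non-negative, as required of a generator. For larger state spaces an exact generator often does not exist, which motivates fitting the eigenspectrum in a least-squares sense instead.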
MC3: Multi-core Markov-chain Monte Carlo code
NASA Astrophysics Data System (ADS)
Cubillos, Patricio; Harrington, Joseph; Lust, Nate; Foster, AJ; Stemm, Madison; Loredo, Tom; Stevenson, Kevin; Campo, Chris; Hardin, Matt; Hardy, Ryan
2016-10-01
MC3 (Multi-core Markov-chain Monte Carlo) is a Bayesian statistics tool that can be executed from the shell prompt or interactively through the Python interpreter with single- or multiple-CPU parallel computing. It offers Markov-chain Monte Carlo (MCMC) posterior-distribution sampling for several algorithms, Levenberg-Marquardt least-squares optimization, and uniform non-informative, Jeffreys non-informative, or Gaussian-informative priors. MC3 can share the same value among multiple parameters and fix the value of parameters to constant values, and offers Gelman-Rubin convergence testing and correlated-noise estimation with time-averaging or wavelet-based likelihood estimation methods.
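The Gelman-Rubin convergence test mentioned above compares the variance between chains with the variance within chains. The sketch below shows the basic statistic in its textbook form. It is not MC3's implementation, and the function name and example chains are hypothetical.

```python
import math
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])         # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])    # B: scaled variance of chain means
    pooled = (n - 1) / n * within + between / n          # pooled variance estimate
    return math.sqrt(pooled / within)

agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagreeing = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
r_agree = gelman_rubin(agreeing)
r_disagree = gelman_rubin(disagreeing)
```

Values close to 1 suggest the chains are sampling the same distribution, while values well above 1, as for the second pair of chains here, indicate that they have not yet mixed.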
Korostil, Igor A; Peters, Gareth W; Cornebise, Julien; Regan, David G
2013-05-20
A Bayesian statistical model and estimation methodology based on forward projection adaptive Markov chain Monte Carlo is developed in order to perform the calibration of a high-dimensional nonlinear system of ordinary differential equations representing an epidemic model for human papillomavirus types 6 and 11 (HPV-6, HPV-11). The model is compartmental and involves stratification by age, gender and sexual-activity group. Developing this model and a means to calibrate it efficiently is relevant because HPV is a very multi-typed and common sexually transmitted infection with more than 100 types currently known. The two types studied in this paper, types 6 and 11, are causing about 90% of anogenital warts. We extend the development of a sexual mixing matrix on the basis of a formulation first suggested by Garnett and Anderson, frequently used to model sexually transmitted infections. In particular, we consider a stochastic mixing matrix framework that allows us to jointly estimate unknown attributes and parameters of the mixing matrix along with the parameters involved in the calibration of the HPV epidemic model. This matrix describes the sexual interactions between members of the population under study and relies on several quantities that are a priori unknown. The Bayesian model developed allows one to estimate jointly the HPV-6 and HPV-11 epidemic model parameters as well as unknown sexual mixing matrix parameters related to assortativity. Finally, we explore the ability of an extension to the class of adaptive Markov chain Monte Carlo algorithms to incorporate a forward projection strategy for the ordinary differential equation state trajectories. Efficient exploration of the Bayesian posterior distribution developed for the ordinary differential equation parameters provides a challenge for any Markov chain sampling methodology, hence the interest in adaptive Markov chain methods. We conclude with simulation studies on synthetic and recent actual data.
A Markov Chain Model for evaluating the effectiveness of randomized surveillance procedures
Edmunds, T.A.
1994-01-01
A Markov Chain Model has been developed to evaluate the effectiveness of randomized surveillance procedures. The model is applicable for surveillance systems that monitor a collection of assets by randomly selecting and inspecting the assets. The model provides an estimate of the detection probability as a function of the amount of time that an adversary would require to steal or sabotage the asset. An interactive computer code has been written to perform the necessary computations.
Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu
2016-01-01
Comparing microbial sequencing data is critical to understanding the dynamics of microbial communities. The alignment-based tools for analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with a Fixed Order Markov Chain (FOMC), yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the order of the Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC that models background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC, originally designed for long sequences, was extended to apply to high-throughput sequencing reads, and strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of a high-order MC under limited sequencing depth. Unlike the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC in modeling the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com. PMID:27876823
Characterization of the rat exploratory behavior in the elevated plus-maze with Markov chains.
Tejada, Julián; Bosco, Geraldine G; Morato, Silvio; Roque, Antonio C
2010-11-30
The elevated plus-maze is an animal model of anxiety used to study the effect of different drugs on the behavior of the animal. It consists of a plus-shaped maze with two open and two closed arms elevated 50cm from the floor. The standard measures used to characterize exploratory behavior in the elevated plus-maze are the time spent and the number of entries in the open arms. In this work, we use Markov chains to characterize the exploratory behavior of the rat in the elevated plus-maze under three different conditions: normal and under the effects of anxiogenic and anxiolytic drugs. The spatial structure of the elevated plus-maze is divided into squares, which are associated with states of a Markov chain. By counting the frequencies of transitions between states during 5-min sessions in the elevated plus-maze, we constructed stochastic matrices for the three conditions studied. The stochastic matrices show specific patterns, which correspond to the observed behaviors of the rat under the three different conditions. For the control group, the stochastic matrix shows a clear preference for places in the closed arms. This preference is enhanced for the anxiogenic group. For the anxiolytic group, the stochastic matrix shows a pattern similar to a random walk. Our results suggest that Markov chains can be used together with the standard measures to characterize the rat behavior in the elevated plus-maze.
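Building a stochastic matrix from counted transitions, as described above, amounts to normalizing each row of a count table. The sketch below assumes a hypothetical coarse labelling of the maze into three zones rather than the squares used in the study, and an invented visit sequence.

```python
from collections import Counter

def transition_matrix(visits, states):
    """Row-normalize counts of consecutive state pairs into a stochastic matrix."""
    counts = Counter(zip(visits, visits[1:]))
    matrix = []
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        if row_total == 0:
            # State never left during the session: leave the row empty.
            matrix.append([0.0] * len(states))
        else:
            matrix.append([counts[(a, b)] / row_total for b in states])
    return matrix

# Hypothetical zones and an invented sequence of visited zones.
states = ["closed", "centre", "open"]
visits = ["closed", "closed", "centre", "open", "centre", "closed", "closed"]
P = transition_matrix(visits, states)
```

Each row of `P` gives the estimated probabilities of the next zone given the current one, so a strong preference for the closed arms would show up as large entries in the "closed" column.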
State space orderings for Gauss-Seidel in Markov chains revisited
Dayar, T.
1996-12-31
Symmetric state space orderings of a Markov chain may be used to reduce the magnitude of the subdominant eigenvalue of the (Gauss-Seidel) iteration matrix. Orderings that maximize the elemental mass or the number of nonzero elements in the dominant term of the Gauss-Seidel splitting (that is, the term approximating the coefficient matrix) do not necessarily converge faster. An ordering of a Markov chain that satisfies Property-R is semi-convergent. On the other hand, there are semi-convergent symmetric state space orderings that do not satisfy Property-R. For a given ordering, a simple approach for checking Property-R is shown. An algorithm that orders the states of a Markov chain so as to increase the likelihood of satisfying Property-R is presented. The computational complexity of the ordering algorithm is less than that of a single Gauss-Seidel iteration (for sparse matrices). In doing all this, the aim is to gain an insight for faster converging orderings. Results from a variety of applications improve the confidence in the algorithm.
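For readers unfamiliar with the iteration whose convergence these orderings affect, the sketch below applies a Gauss-Seidel sweep to the stationary equations of a small chain. It is a minimal illustration under stated assumptions: it takes a dense transition matrix with every diagonal entry below 1, uses the states in the order given, and does not implement Property-R or the ordering algorithm of the abstract.

```python
def stationary_gauss_seidel(P, sweeps=500, tol=1e-12):
    """Approximate the stationary vector pi (pi P = pi) by Gauss-Seidel sweeps."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(sweeps):
        old = list(pi)
        for j in range(n):
            # Balance equation for state j, using the most recent values of pi.
            inflow = sum(pi[i] * P[i][j] for i in range(n) if i != j)
            pi[j] = inflow / (1.0 - P[j][j])
        total = sum(pi)
        pi = [x / total for x in pi]  # renormalize to a probability vector
        if max(abs(a - b) for a, b in zip(pi, old)) < tol:
            break
    return pi

P2 = [[0.9, 0.1],
      [0.2, 0.8]]
pi2 = stationary_gauss_seidel(P2)

P3 = [[0.5, 0.3, 0.2],
      [0.2, 0.6, 0.2],
      [0.1, 0.3, 0.6]]
pi3 = stationary_gauss_seidel(P3)
```

Because each update uses values already refreshed earlier in the same sweep, the order in which states are visited changes the iteration matrix, which is why the choice of ordering matters for the convergence rate.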
Information Security Risk Assessment of Smart Grid Based on Absorbing Markov Chain and SPA
NASA Astrophysics Data System (ADS)
Jianye, Zhang; Qinshun, Zeng; Yiyang, Song; Cunbin, Li
2014-12-01
To assess and prevent smart grid information security risks more effectively, this paper provides a quantitative method for calculating risk indices based on an absorbing Markov chain, overcoming the deficiencies of earlier studies, which did not take links between system components into consideration and were mostly limited to static evaluation. The method avoids the shortcomings of the traditional Expert Score method, which involves significant subjective factors, and also considers the links between information system components, which brings the risk index system closer to reality. Then, a smart grid information security risk assessment model based on set pair analysis (SPA) improved by a Markov chain was established. By using the identity, discrepancy, and contradiction components of the connection degree to dynamically reflect the trend of smart grid information security risk, and by combining them with the Markov chain to calculate the connection degree of the next period, the model assesses smart grid information security risk comprehensively and dynamically. Finally, this paper demonstrates that the established model is scientific, effective, and feasible for dynamically evaluating smart grid information security risks.
Algorithm Optimally Orders Forward-Chaining Inference Rules
NASA Technical Reports Server (NTRS)
James, Mark
2008-01-01
People typically develop knowledge bases in a somewhat ad hoc manner by incrementally adding rules with no specific organization. This often results in very inefficient execution of those rules, since they are so often order sensitive. This is relevant to tasks like Deep Space Network in that it allows the knowledge base to be incrementally developed and automatically ordered for efficiency. Although data flow analysis was first developed for use in compilers for producing optimal code sequences, its usefulness is now recognized in many software systems, including knowledge-based systems. However, this approach for exhaustively computing data-flow information cannot directly be applied to inference systems because of the ubiquitous execution of the rules. An algorithm is presented that efficiently performs a complete producer/consumer analysis for each antecedent and consequence clause in a knowledge base to optimally order the rules to minimize inference cycles. An algorithm was developed that optimally orders a knowledge base composed of forward-chaining inference rules such that independent inference cycle executions are minimized, thus resulting in significantly faster execution. This algorithm was integrated into the JPL tool Spacecraft Health Inference Engine (SHINE) for verification, and it resulted in a significant reduction in inference cycles for what was previously considered an ordered knowledge base. For a knowledge base that is completely unordered, the improvement is much greater.
A Markov Chain Analysis of Fish Movements to Determine Entrainment Zones
Johnson, Gary E.; Hedgepeth, J; Skalski, John R.; Giorgi, Albert E.
2004-10-01
Fish can become entrained at water withdrawal locations such as fish bypasses or cooling water intakes. Accordingly, the size of a fish entrainment zone (FEZ) is often of interest to fisheries managers and facility operators. This study developed a new technique to map the FEZ, defined here as the region immediately upstream of a portal where the probability of fish movement toward the portal is greater than 90%. To map the FEZ, we applied a Markov chain analysis to fish movement data collected with an active tracking sonar. This device locks onto and follows a target, recording positions through a set of volumetric cells comprising the sampled volume. The probability of a fish moving from one cell to another was calculated from fish position data, which was used to populate a Markov transition matrix. We developed and applied the technique using data on salmon smolts migrating near the ice/trash sluiceway at The Dalles Dam on the Columbia River. The FEZ of the sluiceway entrance in 2000 as determined with this procedure was approximately 5 m across and extended 6-8 m out from the face of the dam in the surface layer 2-3 m deep. In conclusion, using a Markov chain analysis of fish track data we were able to describe and quantify the FEZ of the sluiceway at The Dalles Dam. This technique for FEZ mapping is applicable to other bioengineering efforts aimed at protecting fish populations affected by water withdrawals.
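One way to turn a Markov transition matrix into a map of this kind is to compute, for every cell, the probability of eventually reaching the portal. The sketch below is a hypothetical simplification and not the study's procedure: it assumes the 90% criterion refers to eventual arrival at the portal, uses a one-dimensional row of five cells with invented transition probabilities, and treats one end as the portal and the other as an escape cell.

```python
def hitting_probability(P, target, escape, sweeps=10_000, tol=1e-12):
    """Probability of reaching a target cell before an escape cell, per start cell."""
    n = len(P)
    h = [1.0 if i in target else 0.0 for i in range(n)]
    for _ in range(sweeps):
        change = 0.0
        for i in range(n):
            if i in target or i in escape:
                continue  # boundary values stay fixed at 1 and 0
            new = sum(P[i][j] * h[j] for j in range(n))
            change = max(change, abs(new - h[i]))
            h[i] = new
        if change < tol:
            break
    return h

# Hypothetical row of cells: cell 0 is the escape boundary, cell 4 is the portal.
# Interior cells move toward the portal with probability 0.8, away with 0.2.
P = [[1.0, 0.0, 0.0, 0.0, 0.0],
     [0.2, 0.0, 0.8, 0.0, 0.0],
     [0.0, 0.2, 0.0, 0.8, 0.0],
     [0.0, 0.0, 0.2, 0.0, 0.8],
     [0.0, 0.0, 0.0, 0.0, 1.0]]
h = hitting_probability(P, target={4}, escape={0})
zone = [i for i, p in enumerate(h) if p > 0.9]
```

In this toy layout the cells within two steps of the portal exceed the 90% threshold, while the cell next to the escape boundary does not.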
Animal vocal sequences: not the Markov chains we thought they were.
Kershenbaum, Arik; Bowles, Ann E; Freeberg, Todd M; Jin, Dezhe Z; Lameira, Adriano R; Bohn, Kirsten
2014-10-07
Many animals produce vocal sequences that appear complex. Most researchers assume that these sequences are well characterized as Markov chains (i.e. that the probability of a particular vocal element can be calculated from the history of only a finite number of preceding elements). However, this assumption has never been explicitly tested. Furthermore, it is unclear how language could evolve in a single step from a Markovian origin, as is frequently assumed, as no intermediate forms have been found between animal communication and human language. Here, we assess whether animal taxa produce vocal sequences that are better described by Markov chains, or by non-Markovian dynamics such as the 'renewal process' (RP), characterized by a strong tendency to repeat elements. We examined vocal sequences of seven taxa: Bengalese finches Lonchura striata domestica, Carolina chickadees Poecile carolinensis, free-tailed bats Tadarida brasiliensis, rock hyraxes Procavia capensis, pilot whales Globicephala macrorhynchus, killer whales Orcinus orca and orangutans Pongo spp. The vocal systems of most of these species are more consistent with a non-Markovian RP than with the Markovian models traditionally assumed. Our data suggest that non-Markovian vocal sequences may be more common than Markov sequences, which must be taken into account when evaluating alternative hypotheses for the evolution of signalling complexity, and perhaps human language origins.
Animal vocal sequences: not the Markov chains we thought they were
Kershenbaum, Arik; Bowles, Ann E.; Freeberg, Todd M.; Jin, Dezhe Z.; Lameira, Adriano R.; Bohn, Kirsten
2014-01-01
Many animals produce vocal sequences that appear complex. Most researchers assume that these sequences are well characterized as Markov chains (i.e. that the probability of a particular vocal element can be calculated from the history of only a finite number of preceding elements). However, this assumption has never been explicitly tested. Furthermore, it is unclear how language could evolve in a single step from a Markovian origin, as is frequently assumed, as no intermediate forms have been found between animal communication and human language. Here, we assess whether animal taxa produce vocal sequences that are better described by Markov chains, or by non-Markovian dynamics such as the ‘renewal process’ (RP), characterized by a strong tendency to repeat elements. We examined vocal sequences of seven taxa: Bengalese finches Lonchura striata domestica, Carolina chickadees Poecile carolinensis, free-tailed bats Tadarida brasiliensis, rock hyraxes Procavia capensis, pilot whales Globicephala macrorhynchus, killer whales Orcinus orca and orangutans Pongo spp. The vocal systems of most of these species are more consistent with a non-Markovian RP than with the Markovian models traditionally assumed. Our data suggest that non-Markovian vocal sequences may be more common than Markov sequences, which must be taken into account when evaluating alternative hypotheses for the evolution of signalling complexity, and perhaps human language origins. PMID:25143037
A Graph-Algorithmic Approach for the Study of Metastability in Markov Chains
NASA Astrophysics Data System (ADS)
Gan, Tingyue; Cameron, Maria
2017-01-01
Large continuous-time Markov chains with exponentially small transition rates arise in modeling complex systems in physics, chemistry, and biology. We propose a constructive graph-algorithmic approach to determine the sequence of critical timescales at which the qualitative behavior of a given Markov chain changes, and give an effective description of the dynamics on each of them. This approach is valid for both time-reversible and time-irreversible Markov processes, with or without symmetry. Central to this approach are two graph algorithms, Algorithm 1 and Algorithm 2, for obtaining the sequences of the critical timescales and the hierarchies of Typical Transition Graphs or T-graphs indicating the most likely transitions in the system without and with symmetry, respectively. The sequence of critical timescales includes the subsequence of the reciprocals of the real parts of eigenvalues. Under a certain assumption, we prove sharp asymptotic estimates for eigenvalues (including pre-factors) and show how one can extract them from the output of Algorithm 1. We discuss the relationship between Algorithms 1 and 2 and explain how one needs to interpret the output of Algorithm 1 if it is applied in the case with symmetry instead of Algorithm 2. Finally, we analyze an example motivated by R. D. Astumian's model of the dynamics of kinesin, a molecular motor, by means of Algorithm 2.
Modeling anomalous radar propagation using first-order two-state Markov chains
NASA Astrophysics Data System (ADS)
Haddad, B.; Adane, A.; Mesnard, F.; Sauvageot, H.
In this paper, it is shown that radar echoes due to anomalous propagation (AP) can be modeled using Markov chains. For this purpose, images obtained in southwestern France by means of an S-band meteorological radar, recorded every 5 min in 1996, were considered. The daily mean surfaces of AP appearing in these images are sorted into two states and their variations are then represented by a binary random variable. The Markov transition matrix, the 1-day-lag autocorrelation coefficient and the long-term probability of each of the two states are calculated on a monthly basis. The same kind of modeling was also applied to the rainfall observed in the radar dataset under study. The first-order two-state Markov chains are then found to fit the daily variations of either AP or rainfall areas very well. For each month of the year, the surfaces filled by both types of echo follow similar stochastic distributions, but their autocorrelation coefficients differ. Hence, it is suggested that this coefficient is a discriminant factor which could be used, among other criteria, to improve the identification of AP in radar images.
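The three quantities named above follow directly from the transition counts of a binary series. The sketch below uses standard results for a first-order two-state chain: with a = P(0 to 1) and b = P(1 to 0), the long-term probability of state 1 is a/(a+b) and the lag-1 autocorrelation is 1 - a - b. The function name and the example series are hypothetical, and the series is assumed to visit and leave both states at least once.

```python
def fit_two_state_chain(series):
    """Estimate a first-order two-state Markov chain from a 0/1 series."""
    counts = [[0, 0], [0, 0]]
    for current, following in zip(series, series[1:]):
        counts[current][following] += 1
    a = counts[0][1] / (counts[0][0] + counts[0][1])  # P(0 -> 1)
    b = counts[1][0] / (counts[1][0] + counts[1][1])  # P(1 -> 0)
    return {
        "matrix": [[1.0 - a, a], [b, 1.0 - b]],
        "long_term_p1": a / (a + b),     # stationary probability of state 1
        "lag1_autocorr": 1.0 - a - b,    # second eigenvalue of the matrix
    }

# Invented daily series: 1 = day classified in the "high" state, 0 = "low".
series = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
fit = fit_two_state_chain(series)
```

Two months with similar long-term probabilities can still differ in `lag1_autocorr`, which is the property the abstract proposes as a discriminant.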
Williams, Michael S; Ebel, Eric D
2014-11-18
The fitting of statistical distributions to chemical and microbial contamination data is a common application in risk assessment. These distributions are used to make inferences regarding even the most pedestrian of statistics, such as the population mean. The reason for the heavy reliance on a fitted distribution is the presence of left-, right-, and interval-censored observations in the data sets, with censored observations being the result of nondetects in an assay, the use of screening tests, and other practical limitations. Considerable effort has been expended to develop statistical distributions and fitting techniques for a wide variety of applications. Of the various fitting methods, Markov Chain Monte Carlo methods are common. An underlying assumption for many of the proposed Markov Chain Monte Carlo methods is that the data represent independent and identically distributed (iid) observations from an assumed distribution. This condition is satisfied when samples are collected using a simple random sampling design. Unfortunately, samples of food commodities are generally not collected in accordance with a strict probability design. Nevertheless, pseudosystematic sampling efforts (e.g., collection of a sample hourly or weekly) from a single location in the farm-to-table continuum are reasonable approximations of a simple random sample. The assumption that the data represent an iid sample from a single distribution is more difficult to defend if samples are collected at multiple locations in the farm-to-table continuum or risk-based sampling methods are employed to preferentially select samples that are more likely to be contaminated. This paper develops a weighted bootstrap estimation framework that is appropriate for fitting a distribution to microbiological samples that are collected with unequal probabilities of selection. An example based on microbial data, derived by the Most Probable Number technique, demonstrates the method and highlights the
NASA Astrophysics Data System (ADS)
Zhang, Junlong; Li, Yongping; Huang, Guohe; Chen, Xi; Bao, Anming
2016-07-01
Without a realistic assessment of parameter uncertainty, decision makers may encounter difficulties in accurately describing hydrologic processes and assessing relationships between model parameters and watershed characteristics. In this study, a Markov-Chain-Monte-Carlo-based multilevel-factorial-analysis (MCMC-MFA) method is developed, which can not only generate samples of parameters from a well-constructed Markov chain and assess parameter uncertainties with straightforward Bayesian inference, but also investigate the individual and interactive effects of multiple parameters on model output by measuring the specific variations of hydrological responses. A case study is conducted to address parameter uncertainties in the Kaidu watershed of northwest China. Effects of multiple parameters and their interactions are quantitatively investigated using the MCMC-MFA with a three-level factorial experiment (81 runs in total). A variance-based sensitivity analysis method is used to validate the results for the parameters' effects. Results disclose that (i) the soil conservation service runoff curve number for moisture condition II (CN2) and the fraction of snow volume corresponding to 50% snow cover (SNO50COV) are the most significant factors for hydrological responses, implying that infiltration-excess overland flow and snow water equivalent represent important water inputs to the hydrological system of the Kaidu watershed; (ii) saturated hydraulic conductivity (SOL_K) and the soil evaporation compensation factor (ESCO) have obvious effects on hydrological responses; this implies that the processes of percolation and evaporation would impact the hydrological process in this watershed; (iii) the interactions of ESCO and SNO50COV as well as CN2 and SNO50COV have an obvious effect, implying that snow cover can impact the generation of runoff on the land surface and the extraction of soil evaporative demand in lower soil layers. These findings can help enhance the hydrological model
Modeling and computing of stock index forecasting based on neural network and Markov chain.
Dai, Yonghui; Han, Dongmei; Dai, Weihui
2014-01-01
The stock index reflects the fluctuation of the stock market. There has long been a great deal of research on stock index forecasting. However, traditional methods struggle to achieve ideal precision in a dynamic market because of the influence of many factors, such as the economic situation, policy changes, and emergency events. Therefore, approaches based on adaptive modeling and conditional probability transfer have attracted new attention from researchers. This paper presents a new forecasting method that combines an improved back-propagation (BP) neural network with a Markov chain, together with its modeling and computing technology. The method comprises initial forecasting by the improved BP neural network, division of the Markov state regions, computation of the state transition probability matrix, and adjustment of the prediction. Results of the empirical study show that this method can achieve high accuracy in stock index prediction and could provide a good reference for investment in the stock market.
An exact McNemar test for paired binary Markov chains.
Smith, W; Solow, A R
1996-09-01
A straightforward extension of the McNemar test for paired binary data yields an exact test for the equality of the limiting marginal distributions for bivariate binary Markov chains. The exact distribution of the test statistics under the null hypothesis of equal marginals depends on the classical cell occupancy statistics for the Bose-Einstein model. Exact p-values are computed for the one-sided test, and the mean and variance of the test statistic are found. The power of the Markov-McNemar test is found to be close to the power of the classical McNemar test for independent paired observations when the independence assumption holds. The method is applied to the comparison of ribosomal DNA sequences.
A MONTE CARLO MARKOV CHAIN BASED INVESTIGATION OF BLACK HOLE SPIN IN THE ACTIVE GALAXY NGC 3783
Reynolds, Christopher S.; Lohfink, Anne M.; Trippe, Margaret L.; Brenneman, Laura W.; Miller, Jon M.; Fabian, Andrew C.; Nowak, Michael A.
2012-08-20
The analysis of relativistically broadened X-ray spectral features from the inner accretion disk provides a powerful tool for measuring the spin of supermassive black holes in active galactic nuclei (AGNs). However, AGN spectra are often complex and careful analysis employing appropriate and self-consistent models is required if one is to obtain robust results. In this paper, we revisit the deep 2009 July Suzaku observation of the Seyfert galaxy NGC 3783 in order to study in a rigorous manner the robustness of the inferred black hole spin parameter. Using Monte Carlo Markov chain techniques, we identify a (partial) modeling degeneracy between the iron abundance of the disk and the black hole spin parameter. We show that the data for NGC 3783 strongly require both supersolar iron abundance ($Z_{\mathrm{Fe}} = 2\text{-}4\,Z_{\odot}$) and a rapidly spinning black hole ($a > 0.89$). We discuss various astrophysical considerations that can affect the measured abundance. We note that, while the abundance enhancement inferred in NGC 3783 is modest, the X-ray analysis of some other objects has found extreme iron abundances. We introduce the hypothesis that the radiative levitation of iron ions in the innermost regions of radiation-dominated AGN disks can enhance the photospheric abundance of iron. We show that radiative levitation is a plausible mechanism in the very inner regions of high accretion rate AGN disks.
Dodds, Michael G; Vicini, Paolo
2004-09-01
Advances in computer hardware and the associated computer-intensive algorithms made feasible by these advances [like Markov chain Monte Carlo (MCMC) data analysis techniques] have made possible the application of hierarchical full Bayesian methods in analyzing pharmacokinetic and pharmacodynamic (PK-PD) data sets that are multivariate in nature. Pharmacokinetic data analysis in particular has been one area that has seized upon this technology to refine estimates of drug parameters from sparse data gathered in a large, highly variable population of patients. A drawback in this type of analysis is that it is difficult to quantitatively assess convergence of the Markov chains to a target distribution, and thus, it is sometimes difficult to assess the reliability of estimates gained from this procedure. Another complicating factor is that, although the application of MCMC methods to population PK-PD problems has been facilitated by new software designed for the PK-PD domain (specifically PKBUGS), experts in PK-PD may not have the necessary experience with MCMC methods to detect and understand problems with model convergence. The objective of this work is to provide an example of a set of diagnostics useful to investigators, by analyzing in detail three convergence criteria (namely the Raftery and Lewis, Geweke, and Heidelberger and Welch methods) on a simulated problem and with a rule of thumb of 10,000 chain elements in the Markov chain. We used two publicly available software packages to assess convergence of MCMC parameter estimates; the first performs Bayesian parameter estimation (PKBUGS/WinBUGS), and the second is focused on posterior analysis of estimates (BOA). The main message that seems to emerge is that accurately estimating confidence regions for the parameters of interest is more demanding than estimating the parameter means. Together, these tools provide numerical means by which an investigator can establish confidence in convergence and thus in the
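To give a feel for what a Geweke-style convergence check computes, here is a deliberately simplified sketch: it compares the mean of the early part of a chain with the mean of the late part. The real Geweke diagnostic (as implemented in packages such as BOA) estimates the variances from the spectral density at frequency zero to account for autocorrelation; the naive sample variance used below is an assumption made for brevity and will overstate significance for correlated chains.

```python
from math import sqrt
from statistics import mean, variance

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-style z-score: early-window mean versus late-window mean."""
    head = chain[: int(first * len(chain))]
    tail = chain[int((1 - last) * len(chain)):]
    # Naive standard error; ignores autocorrelation (simplifying assumption).
    se2 = variance(head) / len(head) + variance(tail) / len(tail)
    return (mean(head) - mean(tail)) / sqrt(se2)

stable_chain = [0.0, 1.0] * 50                   # no drift between windows
drifting_chain = [float(i) for i in range(100)]  # steadily increasing, clearly not converged
z_stable = geweke_z(stable_chain)
z_drift = geweke_z(drifting_chain)
```

A |z| well above 2 suggests the early and late portions of the chain disagree, which is the signal such diagnostics are meant to expose.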
Khan, Mohammad Ibrahim; Kamal, Md Sarwar
2015-03-01
Markov chains are effective for prediction, particularly on long data sets. In DNA sequencing it is important to infer the presence of particular nucleotides from the preceding history of the data set. We applied the Chapman-Kolmogorov equation to carry out the Markov chain computation. The Chapman-Kolmogorov equation helps to locate the relevant positions in the DNA chain, and it is a powerful tool in mathematics as well as in other prediction-based research. It incorporates the scores of DNA sequences calculated by various techniques. Our research uses the fundamentals of the Warshall Algorithm (WA) and Dynamic Programming (DP) to measure the scores of DNA segments. The experiments show that the Warshall Algorithm is suited to short DNA sequences, whereas Dynamic Programming is suited to long DNA sequences. In addition to these findings, it is important to measure the risk factors of local sequencing when matching local sequence alignments, whatever their length.
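For a time-homogeneous chain, the Chapman-Kolmogorov equation states that the (m+n)-step transition matrix is the product of the m-step and n-step matrices. The sketch below checks this numerically on a made-up nucleotide transition matrix; the matrix entries and helper names are illustrative assumptions, not values from the paper.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matpow(p, n):
    """n-step transition matrix P^n by repeated multiplication."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, p)
    return result

# Hypothetical one-step transition probabilities between nucleotides A, C, G, T
P = [
    [0.40, 0.20, 0.30, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.30, 0.40, 0.20],
    [0.20, 0.20, 0.20, 0.40],
]

five_step = matpow(P, 5)
two_then_three = matmul(matpow(P, 2), matpow(P, 3))  # Chapman-Kolmogorov: P^5 = P^2 P^3
```

Entry `five_step[i][j]` is the probability of observing nucleotide `j` five positions after nucleotide `i` under this first-order model.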
D. L. Kelly
2007-06-01
Markov chain Monte Carlo (MCMC) techniques represent an extremely flexible and powerful approach to Bayesian modeling. This work illustrates the application of such techniques to time-dependent reliability of components with repair. The WinBUGS package is used to illustrate, via examples, how Bayesian techniques can be used for parametric statistical modeling of time-dependent component reliability. Additionally, the crucial, but often overlooked subject of model validation is discussed, and summary statistics for judging the model’s ability to replicate the observed data are developed, based on the posterior predictive distribution for the parameters of interest.
Markov chain Monte Carlo linkage analysis of a complex qualitative phenotype.
Hinrichs, A; Lin, J H; Reich, T; Bierut, L; Suarez, B K
1999-01-01
We tested a new computer program, LOKI, that implements a reversible jump Markov chain Monte Carlo (MCMC) technique for segregation and linkage analysis. Our objective was to determine whether this software, designed for use with continuously distributed phenotypes, has any efficacy when applied to the discrete disease states of the simulated data from the Mordor data from GAW Problem 1. Although we were able to identify the genomic location for two of the three quantitative trait loci by repeated application of the software, the MCMC sampler experienced significant mixing problems indicating that the method, as currently formulated in LOKI, was not suitable for the discrete phenotypes in this data set.
Deremigio, Hilary; Kemper, Peter; Lamar, M Drew; Smith, Gregory D
2008-01-01
Mathematical models of calcium release sites derived from Markov chain models of intracellular calcium channels exhibit collective gating reminiscent of the experimentally observed phenomenon of stochastic calcium excitability (i.e., calcium puffs and sparks). We present a Kronecker structured representation for calcium release site models and perform benchmark stationary distribution calculations using numerical iterative solution techniques that leverage this structure. In this context we find multi-level methods and certain preconditioned projection methods superior to simple Gauss-Seidel type iterations. Response measures such as the number of channels in a particular state converge more quickly using these numerical iterative methods than occupation measures calculated via Monte Carlo simulation.
Under-reported data analysis with INAR-hidden Markov chains.
Fernández-Fontelo, Amanda; Cabaña, Alejandra; Puig, Pedro; Moriña, David
2016-11-20
In this work, we deal with correlated under-reported data through INAR(1)-hidden Markov chain models. These models are very flexible and can be identified through their autocorrelation function, which has a very simple form. A naïve method of parameter estimation is proposed, jointly with the maximum likelihood method based on a revised version of the forward algorithm. The most-probable unobserved time series is reconstructed by means of the Viterbi algorithm. Several examples of application in the field of public health are discussed illustrating the utility of the models. Copyright © 2016 John Wiley & Sons, Ltd.
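The Viterbi reconstruction step can be sketched for a generic finite-state hidden Markov model. The example below is not the INAR(1) model of the paper: the two reporting regimes, the three observation categories, and all probabilities are hypothetical values chosen only to make the recursion concrete.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable hidden state path for a finite-state HMM."""
    # best[t][s] = (probability of the best path ending in s at time t, predecessor of s)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        row = {}
        for s in states:
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            row[s] = (p * emit_p[s][o], arg)
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]

# Hypothetical two-regime example: full reporting versus under-reporting
states = ("full", "under")
start_p = {"full": 0.6, "under": 0.4}
trans_p = {"full": {"full": 0.7, "under": 0.3}, "under": {"full": 0.4, "under": 0.6}}
emit_p = {
    "full": {"high": 0.5, "mid": 0.4, "low": 0.1},
    "under": {"high": 0.1, "mid": 0.3, "low": 0.6},
}
path = viterbi(["high", "mid", "low"], states, start_p, trans_p, emit_p)
```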
A Linear Regression and Markov Chain Model for the Arabian Horse Registry
1993-04-01
as a tax deduction? Yes No ... 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one... the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses. A linear regression model was utilized to
Green, P. L.; Worden, K.
2015-01-01
In this paper, the authors outline the general principles behind an approach to Bayesian system identification and highlight the benefits of adopting a Bayesian framework when attempting to identify models of nonlinear dynamical systems in the presence of uncertainty. It is then described how, through a summary of some key algorithms, many of the potential difficulties associated with a Bayesian approach can be overcome through the use of Markov chain Monte Carlo (MCMC) methods. The paper concludes with a case study, where an MCMC algorithm is used to facilitate the Bayesian system identification of a nonlinear dynamical system from experimentally observed acceleration time histories. PMID:26303916
Gu, M G; Kong, F H
1998-06-23
We propose a general procedure for solving incomplete data estimation problems. The procedure can be used to find the maximum likelihood estimate or to solve estimating equations in difficult cases such as estimation with the censored or truncated regression model, the nonlinear structural measurement error model, and the random effects model. The procedure is based on the general principle of stochastic approximation and the Markov chain Monte-Carlo method. Applying the theory on adaptive algorithms, we derive conditions under which the proposed procedure converges. Simulation studies also indicate that the proposed procedure consistently converges to the maximum likelihood estimate for the structural measurement error logistic regression model.
A multi-level solution algorithm for steady-state Markov chains
NASA Technical Reports Server (NTRS)
Horton, Graham; Leutenegger, Scott T.
1993-01-01
A new iterative algorithm, the multi-level algorithm, for the numerical solution of steady state Markov chains is presented. The method utilizes a set of recursively coarsened representations of the original system to achieve accelerated convergence. It is motivated by multigrid methods, which are widely used for fast solution of partial differential equations. Initial results of numerical experiments are reported, showing significant reductions in computation time, often an order of magnitude or more, relative to the Gauss-Seidel and optimal SOR algorithms for a variety of test problems. The multi-level method is compared and contrasted with the iterative aggregation-disaggregation algorithm of Takahashi.
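As context for the baseline mentioned above, a Gauss-Seidel iteration for the stationary vector of a small chain can be sketched as follows. This is only the simple reference method, not the multi-level algorithm; the 3-state matrix is an illustrative assumption, and the sketch assumes an irreducible chain with no absorbing state (so that 1 - P[j][j] is never zero).

```python
def stationary_gauss_seidel(P, sweeps=200):
    """Approximate the stationary distribution pi (pi P = pi) by Gauss-Seidel sweeps."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(sweeps):
        for j in range(n):
            # Solve row j of (I - P^T) x = 0 for x[j], reusing entries already updated
            # in this sweep (this reuse is what distinguishes Gauss-Seidel from Jacobi).
            x[j] = sum(x[i] * P[i][j] for i in range(n) if i != j) / (1.0 - P[j][j])
        total = sum(x)
        x = [v / total for v in x]  # renormalise so the entries sum to one
    return x

# Hypothetical 3-state transition matrix
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
pi = stationary_gauss_seidel(P)
residual = max(abs(sum(pi[i] * P[i][j] for i in range(3)) - pi[j]) for j in range(3))
```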
NASA Astrophysics Data System (ADS)
Chen, X.; Rubin, Y.; Baldocchi, D. D.
2005-12-01
Understanding the interactions between soil, plant, and the atmosphere under water-stressed conditions is important for ecosystems where water availability is limited. In such ecosystems, the amount of water transferred from the soil to the atmosphere is controlled not only by weather conditions and vegetation type but also by soil water availability. Although researchers have proposed different approaches to model the impact of soil moisture on plant activities, the parameters involved are difficult to measure. However, using measurements of observed latent heat and carbon fluxes, as well as soil moisture data, Bayesian inversion methods can be employed to estimate the various model parameters. In our study, actual Evapotranspiration (ET) of an ecosystem is approximated by the Priestley-Taylor relationship, with the Priestley-Taylor coefficient modeled as a function of soil moisture content. Soil moisture limitation on root uptake is characterized in a similar manner as the Feddes' model. The inference of Bayesian inversion is processed within the framework of graphical theories. Due to the difficulty of obtaining exact inference, the Markov chain Monte Carlo (MCMC) method is implemented using a free software package, BUGS (Bayesian inference Using Gibbs Sampling). The proposed methodology is applied to a Mediterranean Oak-Savanna FLUXNET site in California, where continuous measurements of actual ET are obtained from eddy-covariance technique and soil moisture contents are monitored by several time domain reflectometry probes located within the footprint of the flux tower. After the implementation of Bayesian inversion, the posterior distributions of all the parameters exhibit enhancement in information compared to the prior distributions. The generated samples based on data in year 2003 are used to predict the actual ET in year 2004 and the prediction uncertainties are assessed in terms of confidence intervals. Our tests also reveal the usefulness of various
NASA Astrophysics Data System (ADS)
Durán, E.
2012-04-01
The interbedded sandstones, siltstones and shale layers within the stratigraphic units of the Oficina Formation were stochastically characterized. The units within the Oritupano field are modeled using the information from 12 wells and a post-stack 3-D seismic cube. The Markov Chain algorithm was successful at maintaining the proportion of lithotypes of the columns in the study area. Different transition probability matrices are evaluated by changing the length of the sequences represented in the transition matrix and how this choice of length affects cyclicity and the genetic relations between lithotypes. The Gibbs algorithm, using small sequences as building blocks for modeling, kept the main stratigraphic succession according to the geology. Although the modeled stratigraphy depends strongly on initial conditions, the use of longer sequences in the substitution helps not to overweight the transition counts from one lithotype to the same in the main diagonal of the probability matrix of the Markov Chain in the Gibbs algorithm. A methodology based on the phase spectrum of the seismic trace for tying the modeled sequences with the seismic data is evaluated and discussed. The results point to the phase spectrum as an alternate way to cross-correlate synthetic seismograms with the seismic trace in favor of the well known amplitude correlation. Finally, a map of net sand over the study area is generated from the modeled columns and compared with previous stratigraphic and facies analysis at the levels of interest.
NASA Astrophysics Data System (ADS)
Numazawa, Satoshi; Smith, Roger
2011-10-01
Classical harmonic transition state theory is considered and applied in discrete lattice cells with hierarchical transition levels. The scheme is then used to determine transitions that can be applied in a lattice-based kinetic Monte Carlo (KMC) atomistic simulation model. The model results in an effective reduction of KMC simulation steps by utilizing a classification scheme of transition levels for thermally activated atomistic diffusion processes. Thermally activated atomistic movements are considered as local transition events constrained in potential energy wells over certain local time periods. These processes are represented by Markov chains of multidimensional Boolean valued functions in three-dimensional lattice space. The events inhibited by the barriers under a certain level are regarded as thermal fluctuations of the canonical ensemble and accepted freely. Consequently, the fluctuating system evolution process is implemented as a Markov chain of equivalence class objects. It is shown that the process can be characterized by the acceptance of metastable local transitions. The method is applied to a problem of Au and Ag cluster growth on a rippled surface. The simulation predicts the existence of a morphology-dependent transition time limit from a local metastable to stable state for subsequent cluster growth by accretion. Excellent agreement with observed experimental results is obtained.
LD-SPatt: large deviations statistics for patterns on Markov chains.
Nuel, G
2004-01-01
Statistics on Markov chains are widely used for the study of patterns in biological sequences. Statistics on these models can be done through several approaches. Central limit theorem (CLT) producing Gaussian approximations are one of the most popular ones. Unfortunately, in order to find a pattern of interest, these methods have to deal with tail distribution events where CLT is especially bad. In this paper, we propose a new approach based on the large deviations theory to assess pattern statistics. We first recall theoretical results for empiric mean (level 1) as well as empiric distribution (level 2) large deviations on Markov chains. Then, we present the applications of these results focusing on numerical issues. LD-SPatt is the name of GPL software implementing these algorithms. We compare this approach to several existing ones in terms of complexity and reliability and show that the large deviations are more reliable than the Gaussian approximations in absolute values as well as in terms of ranking and are at least as reliable as compound Poisson approximations. We then finally discuss some further possible improvements and applications of this new method.
NASA Astrophysics Data System (ADS)
Maginnis, P. A.; West, M.; Dullerud, G. E.
2016-10-01
We propose an algorithm to accelerate Monte Carlo simulation for a broad class of stochastic processes. Specifically, the class of countable-state, discrete-time Markov chains driven by additive Poisson noise, or lattice discrete-time Markov chains. In particular, this class includes simulation of reaction networks via the tau-leaping algorithm. To produce the speedup, we simulate pairs of fair-draw trajectories that are negatively correlated. Thus, when averaged, these paths produce an unbiased Monte Carlo estimator that has reduced variance and, therefore, reduced error. Numerical results for three example systems included in this work demonstrate two to four orders of magnitude reduction of mean-square error. The numerical examples were chosen to illustrate different application areas and levels of system complexity. The areas are: gene expression (affine state-dependent rates), aerosol particle coagulation with emission and human immunodeficiency virus infection (both with nonlinear state-dependent rates). Our algorithm views the system dynamics as a "black-box", i.e., we only require control of pseudorandom number generator inputs. As a result, typical codes can be retrofitted with our algorithm using only minor changes. We prove several analytical results. Among these, we characterize the relationship of covariances between paths in the general nonlinear state-dependent intensity rates case, and we prove variance reduction of mean estimators in the special case of affine intensity rates.
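The general idea of pairing fair-draw trajectories can be illustrated with classical antithetic variates: each Poisson increment of one path is generated from a uniform u by inverse transform, and its partner path uses 1 - u. This is a generic sketch, not the authors' algorithm; the birth-death model, its rate values, and all function names are assumptions chosen for illustration.

```python
import math
import random

def poisson_quantile(u, lam, k_max=10_000):
    """Inverse CDF of Poisson(lam) at u, found by accumulating the mass function."""
    u = min(u, 1.0 - 1e-12)  # guard against u == 1.0, where the search would not stop
    k, p = 0, math.exp(-lam)
    cdf = p
    while u > cdf and k < k_max:
        k += 1
        p *= lam / k
        cdf += p
    return k

def tau_leap_pair(x0, birth, death, dt, steps, rng):
    """Two tau-leaping paths of a birth-death chain driven by antithetic uniforms."""
    xa = xb = x0
    for _ in range(steps):
        u_birth, u_death = rng.random(), rng.random()
        # Path A uses (u_birth, u_death); path B uses the mirrored uniforms.
        xa = max(xa + poisson_quantile(u_birth, birth * dt)
                 - poisson_quantile(u_death, death * xa * dt), 0)
        xb = max(xb + poisson_quantile(1.0 - u_birth, birth * dt)
                 - poisson_quantile(1.0 - u_death, death * xb * dt), 0)
    return xa, xb

rng = random.Random(0)
pairs = [tau_leap_pair(10, birth=5.0, death=0.5, dt=0.1, steps=50, rng=rng)
         for _ in range(200)]
# Each path is individually a fair draw, so the pair average remains unbiased.
estimate = sum((a + b) / 2 for a, b in pairs) / len(pairs)
```

Because only the uniforms fed to the sampler are changed, the simulation code itself is left untouched, which is in the spirit of the "black-box" retrofit described in the abstract.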
Controlling influenza disease: Comparison between discrete time Markov chain and deterministic model
NASA Astrophysics Data System (ADS)
Novkaniza, F.; Ivana, Aldila, D.
2016-04-01
Mathematical models of the spread of respiratory disease, using a Discrete Time Markov Chain (DTMC) approach and a deterministic approach for a constant total population size, are analyzed and compared in this article. Medical treatment and the use of medical masks are included in the model as constant parameters for controlling the spread of influenza. Equilibrium points and the basic reproductive ratio as the endemic criterion, together with its level sets as a function of some variables, are obtained analytically and numerically from the analysis of the deterministic model. Assuming, as in the deterministic model, that the total human population is constant, the number of infected people is also analyzed with the DTMC model. Since Δt → 0, we can assume that the total number of infected people changes only from i to i + 1, i - 1, or i. An approximation of the probability of an outbreak based on the gambler's ruin problem is presented. We find that, whatever the value of the basic reproductive ratio ℛ0, whether larger or smaller than one, the number of infections always tends to 0 as t → ∞. Numerical simulations comparing the deterministic and DTMC approaches are given to aid interpretation and understanding of the models' results.
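The gambler's ruin approximation referred to above can be written out for a generic random walk. Assume that, while the number of infected individuals is small, each change is an increase with probability p and a decrease with probability q (p + q = 1, both treated as constant); these symbols are introduced here for illustration and do not appear in the abstract. The classical ruin probability then gives:

```latex
% p, q: probabilities of i -> i+1 and i -> i-1 per change, assumed constant for small i
% i_0: initial number of infected individuals
P(\text{extinction} \mid I_0 = i_0) =
\begin{cases}
  (q/p)^{i_0}, & p > q,\\[2pt]
  1,           & p \le q,
\end{cases}
\qquad
P(\text{outbreak}) \approx 1 - (q/p)^{i_0} \quad (p > q).
```

In simple SIS- and SIR-type models the ratio q/p equals the reciprocal of the basic reproductive ratio; whether that identification carries over unchanged to the treatment-and-mask model of this paper is not stated in the abstract.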
NASA Astrophysics Data System (ADS)
Faggionato, Alessandra; di Pietro, Daniele
2011-04-01
We slightly extend the fluctuation theorem obtained in (Lebowitz and Spohn in J. Stat. Phys. 95:333-365, 1999) for sums of generators, considering continuous-time Markov chains on a finite state space whose underlying graph has multiple edges and no loop. This extended frame is suited when analyzing chemical systems. As simple corollary we derive by a different method the fluctuation theorem of D. Andrieux and P. Gaspard for the fluxes along the chords associated to a fundamental set of oriented cycles (Andrieux and Gaspard in J. Stat. Phys. 127:107-131, 2007). We associate to each random trajectory an oriented cycle on the graph and we decompose it in terms of a basis of oriented cycles. We prove a fluctuation theorem for the coefficients in this decomposition. The resulting fluctuation theorem involves the cycle affinities, which in many real systems correspond to the macroscopic forces. In addition, the above decomposition is useful when analyzing the large deviations of additive functionals of the Markov chain. As example of application, in a very general context we derive a fluctuation relation for the mechanical and chemical currents of a molecular motor moving along a periodic filament.
Mitavskiy, Boris; Cannings, Chris
2009-01-01
The evolutionary algorithm stochastic process is well-known to be Markovian. These have been under investigation in much of the theoretical evolutionary computing research. When the mutation rate is positive, the Markov chain modeling of an evolutionary algorithm is irreducible and, therefore, has a unique stationary distribution. Rather little is known about the stationary distribution. In fact, the only quantitative facts established so far tell us that the stationary distributions of Markov chains modeling evolutionary algorithms concentrate on uniform populations (i.e., those populations consisting of a repeated copy of the same individual). At the same time, knowing the stationary distribution may provide some information about the expected time it takes for the algorithm to reach a certain solution, assessment of the biases due to recombination and selection, and is of importance in population genetics to assess what is called a "genetic load" (see the introduction for more details). In the recent joint works of the first author, some bounds have been established on the rates at which the stationary distribution concentrates on the uniform populations. The primary tool used in these papers is the "quotient construction" method. It turns out that the quotient construction method can be exploited to derive much more informative bounds on ratios of the stationary distribution values of various subsets of the state space. In fact, some of the bounds obtained in the current work are expressed in terms of the parameters involved in all the three main stages of an evolutionary algorithm: namely, selection, recombination, and mutation.
Sampling graphs with a prescribed joint degree distribution using Markov Chains.
Pinar, Ali; Stanton, Isabelle
2010-10-01
One of the most influential results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real life networks due to their differences on other important metrics like conductance. We believe this is, in part, because many of these real-world networks have very different joint degree distributions, i.e. the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that the joint degree distribution of graphs is an interesting avenue of study for further research into network structure. We provide a simple greedy algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via endpoint switches. We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge.
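A simpler relative of the sampler described above is the degree-preserving double edge swap, sketched below. It keeps every node's degree fixed and rejects moves that would create self-loops or multi-edges, but it does not preserve the joint degree distribution; the additional constraint needed for that is not reproduced here. The example graph and function names are illustrative assumptions.

```python
import random

def norm(u, v):
    """Canonical (sorted) representation of an undirected edge."""
    return (u, v) if u < v else (v, u)

def double_edge_swap(edges, rng):
    """Attempt one swap (a-b, c-d) -> (a-d, c-b); return True if it was applied."""
    (a, b), (c, d) = rng.sample(sorted(edges), 2)
    if rng.random() < 0.5:
        c, d = d, c  # random orientation so that both possible rewirings are reachable
    if len({a, b, c, d}) < 4:
        return False  # shared endpoint: the swap would create a self-loop or change nothing
    e1, e2 = norm(a, d), norm(c, b)
    if e1 in edges or e2 in edges:
        return False  # would create a multi-edge; stay at the current graph
    edges.difference_update({norm(a, b), norm(c, d)})
    edges.update({e1, e2})
    return True

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

# Hypothetical starting graph: an 8-cycle with two chords
edges = {norm(i, (i + 1) % 8) for i in range(8)} | {norm(0, 4), norm(2, 6)}
before = degrees(edges)
rng = random.Random(1)
accepted = sum(double_edge_swap(edges, rng) for _ in range(500))
after = degrees(edges)
```

Rejected proposals count as steps in which the chain stays put, which is the usual way such a walk is kept symmetric.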
Comparing variational Bayes with Markov chain Monte Carlo for Bayesian computation in neuroimaging.
Nathoo, F S; Lesperance, M L; Lawson, A B; Dean, C B
2013-08-01
In this article, we consider methods for Bayesian computation within the context of brain imaging studies. In such studies, the complexity of the resulting data often necessitates the use of sophisticated statistical models; however, the large size of these data can pose significant challenges for model fitting. We focus specifically on the neuroelectromagnetic inverse problem in electroencephalography, which involves estimating the neural activity within the brain from electrode-level data measured across the scalp. The relationship between the observed scalp-level data and the unobserved neural activity can be represented through an underdetermined dynamic linear model, and we discuss Bayesian computation for such models, where parameters represent the unknown neural sources of interest. We review the inverse problem and discuss variational approximations for fitting hierarchical models in this context. While variational methods have been widely adopted for model fitting in neuroimaging, they have received very little attention in the statistical literature, where Markov chain Monte Carlo is often used. We derive variational approximations for fitting two models: a simple distributed source model and a more complex spatiotemporal mixture model. We compare the approximations to Markov chain Monte Carlo using both synthetic data as well as through the analysis of a real electroencephalography dataset examining the evoked response related to face perception. The computational advantages of the variational method are demonstrated and the accuracy associated with the resulting approximations are clarified.
NASA Astrophysics Data System (ADS)
Gonthier, Peter L.; Koh, Yew-Meng; Kust Harding, Alice
2016-04-01
We present preliminary results of a new population synthesis of millisecond pulsars (MSPs) from the Galactic disk using Markov Chain Monte Carlo techniques to better understand the model parameter space. We include empirical radio and gamma-ray luminosity models that are dependent on the pulsar period and period derivative with freely varying exponents. The magnitudes of the model luminosities are adjusted to reproduce the number of MSPs detected by a group of thirteen radio surveys as well as the MSP birth rate in the Galaxy and the number of MSPs detected by Fermi. We explore various high-energy emission geometries like the slot gap, outer gap, two pole caustic and pair starved polar cap models. The parameters associated with the birth distributions for the mass accretion rate, magnetic field, and period distributions are well constrained. With the set of four free parameters, we employ Markov Chain Monte Carlo simulations to explore the model parameter space. We present preliminary comparisons of the simulated and detected distributions of radio and gamma-ray pulsar characteristics. We estimate the contribution of MSPs to the diffuse gamma-ray background with a special focus on the Galactic Center. We express our gratitude for the generous support of the National Science Foundation (RUI: AST-1009731), Fermi Guest Investigator Program and the NASA Astrophysics Theory and Fundamental Program (NNX09AQ71G).
Xie, Jun; Kim, Nak-Kyeong
2005-09-01
Statistical methods have been developed for finding local patterns, also called motifs, in multiple protein sequences. The aligned segments may imply functional or structural core regions. However, the existing methods often have difficulties in aligning multiple proteins when sequence residue identities are low (e.g., less than 25%). In this article, we develop a Bayesian model and Markov chain Monte Carlo (MCMC) methods for identifying subtle motifs in protein sequences. Specifically, a motif is defined not only in terms of specific sites characterized by amino acid frequency vectors, but also as a combination of secondary characteristics such as hydrophobicity, polarity, etc. Markov chain Monte Carlo methods are proposed to search for a motif pattern with high posterior probability under the new model. A special MCMC algorithm is developed, involving transitions between state spaces of different dimensions. The proposed methods were supported by a simulation study. They were then tested on two real datasets: a group of helix-turn-helix proteins and one set from the CATH Protein Structure Classification Database. Statistical comparisons showed that the new approach worked better than a typical Gibbs sampling approach based only on an amino acid model.
Markov Chain Monte Carlo simulation for projection of end stage renal disease patients in Greece.
Rodina-Theocharaki, A; Bliznakova, K; Pallikarakis, N
2012-07-01
End stage renal disease (ESRD) treatment methods are considered to be among the most expensive procedures for chronic conditions worldwide which also have severe impact on patients' quality of life. During the last decade, Greece has been among the countries with the highest incidence and prevalence, while at the same time with the lowest kidney transplantation rates. Predicting future patients' number on Renal Replacement Therapy (RRT) is essential for health care providers in order to achieve more effective resource management. In this study a Markov Chain Monte Carlo (MCMC) simulation is presented for predicting the future number of ESRD patients for the period 2009-2020 in Greece. The MCMC model comprises Monte Carlo sampling techniques applied on probability distributions of the constructed Markov Chain. The model predicts that there will be 15,147 prevalent patients on RRT in Greece by 2020. Additionally, a cost-effectiveness analysis was performed on a scenario of gradually reducing the hemodialysis patients in favor of increasing the transplantation number by 2020. The proposed scenario showed net savings of 86.54 million Euros for the period 2009-2020 compared to the base-case prediction.
Short-term droughts forecast using Markov chain model in Victoria, Australia
NASA Astrophysics Data System (ADS)
Rahmat, Siti Nazahiyah; Jayasuriya, Niranjali; Bhuiyan, Muhammed A.
2016-04-01
A comprehensive risk management strategy for dealing with drought should include both short-term and long-term planning. The objective of this paper is to present an early warning method to forecast drought using the Standardised Precipitation Index (SPI) and a non-homogeneous Markov chain model. A model such as this is useful for short-term planning. The developed method has been used to forecast droughts at a number of meteorological monitoring stations that have been regionalised into six (6) homogenous clusters with similar drought characteristics based on SPI. The non-homogeneous Markov chain model was used to estimate drought probabilities and drought predictions up to 3 months ahead. The drought severity classes defined using the SPI were computed at a 12-month time scale. The drought probabilities and the predictions were computed for six clusters that depict similar drought characteristics in Victoria, Australia. Overall, the drought severity class predicted was quite similar for all the clusters, with the non-drought class probabilities ranging from 49 to 57 %. For all clusters, the near normal class had a probability of occurrence varying from 27 to 38 %. For the more moderate and severe classes, the probabilities ranged from 2 to 13 % and 3 to 1 %, respectively. The developed model predicted drought situations 1 month ahead reasonably well. However, 2 and 3 months ahead predictions should be used with caution until the models are developed further.
Farr, W. M.; Mandel, I.; Stevens, D.
2015-01-01
Selection among alternative theoretical models given an observed dataset is an important challenge in many areas of physics and astronomy. Reversible-jump Markov chain Monte Carlo (RJMCMC) is an extremely powerful technique for performing Bayesian model selection, but it suffers from a fundamental difficulty: it requires jumps between model parameter spaces, but cannot efficiently explore both parameter spaces at once. Thus, a naive jump between parameter spaces is unlikely to be accepted in the Markov chain Monte Carlo (MCMC) algorithm and convergence is correspondingly slow. Here, we demonstrate an interpolation technique that uses samples from single-model MCMCs to propose intermodel jumps from an approximation to the single-model posterior of the target parameter space. The interpolation technique, based on a kD-tree data structure, is adaptive and efficient in modest dimensionality. We show that our technique leads to improved convergence over naive jumps in an RJMCMC, and compare it to other proposals in the literature to improve the convergence of RJMCMCs. We also demonstrate the use of the same interpolation technique as a way to construct efficient ‘global’ proposal distributions for single-model MCMCs without prior knowledge of the structure of the posterior distribution, and discuss improvements that permit the method to be used in higher dimensional spaces efficiently. PMID:26543580
Adaptive relaxation for the steady-state analysis of Markov chains
NASA Technical Reports Server (NTRS)
Horton, Graham
1994-01-01
We consider a variant of the well-known Gauss-Seidel method for the solution of Markov chains in steady state. Whereas the standard algorithm visits each state exactly once per iteration in a predetermined order, the alternative approach uses a dynamic strategy. A set of states to be visited is maintained which can grow and shrink as the computation progresses. In this manner, we hope to concentrate the computational work in those areas of the chain in which maximum improvement in the solution can be achieved. We consider the adaptive approach both as a solver in its own right and as a relaxation method within the multi-level algorithm. Experimental results show significant computational savings in both cases.
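As a point of reference for the abstract above, the following is a minimal sketch of the standard (fixed-order) Gauss-Seidel sweep for the steady state of a small Markov chain. It does not implement the adaptive visiting strategy the paper proposes; the function name, the example matrix, and the stopping rule are illustrative assumptions.

```python
import numpy as np

def gauss_seidel_steady_state(P, sweeps=500, tol=1e-12):
    """Approximate pi with pi P = pi for a row-stochastic matrix P.

    Standard Gauss-Seidel: every state is visited once per sweep in a
    fixed order, and each update uses the newest values already computed.
    Assumes an irreducible chain with P[i, i] < 1 for every state.
    """
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(sweeps):
        old = pi.copy()
        for i in range(n):
            # Probability flowing into state i from all other states.
            inflow = pi @ P[:, i] - pi[i] * P[i, i]
            pi[i] = inflow / (1.0 - P[i, i])
        pi /= pi.sum()  # renormalise so the entries sum to one
        if np.abs(pi - old).max() < tol:
            break
    return pi

# Hypothetical three-state chain, used only for illustration.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
pi = gauss_seidel_steady_state(P)
```

An adaptive variant would replace the inner `for i in range(n)` loop with a dynamically maintained set of states, which is the idea the abstract describes.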
Markov chains at the interface of combinatorics, computing, and statistical physics
NASA Astrophysics Data System (ADS)
Streib, Amanda Pascoe
The fields of statistical physics, discrete probability, combinatorics, and theoretical computer science have converged around efforts to understand random structures and algorithms. Recent activity in the interface of these fields has enabled tremendous breakthroughs in each domain and has supplied a new set of techniques for researchers approaching related problems. This thesis makes progress on several problems in this interface whose solutions all build on insights from multiple disciplinary perspectives. First, we consider a dynamic growth process arising in the context of DNA-based self-assembly. The assembly process can be modeled as a simple Markov chain. We prove that the chain is rapidly mixing for large enough bias in regions of Zd. The proof uses a geometric distance function and a variant of path coupling in order to handle distances that can be exponentially large. We also provide the first results in the case of fluctuating bias, where the bias can vary depending on the location of the tile, which arises in the nanotechnology application. Moreover, we use intuition from statistical physics to construct a choice of the biases for which the Markov chain Mmon requires exponential time to converge. Second, we consider a related problem regarding the convergence rate of biased permutations that arises in the context of self-organizing lists. The Markov chain Mnn in this case is a nearest-neighbor chain that allows adjacent transpositions, and the rate of these exchanges is governed by various input parameters. It was conjectured that the chain is always rapidly mixing when the inversion probabilities are positively biased, i.e., we put nearest neighbor pair x < y in order with bias 1/2 ≤ pxy ≤ 1 and out of order with bias 1 - pxy. The Markov chain Mmon was known to have connections to a simplified version of this biased card-shuffling. We provide new connections between Mnn and Mmon by using simple combinatorial bijections, and we prove that Mnn is
Entropy and long-range memory in random symbolic additive Markov chains
NASA Astrophysics Data System (ADS)
Melnik, S. S.; Usatenko, O. V.
2016-06-01
The goal of this paper is to develop an estimate for the entropy of random symbolic sequences with elements belonging to a finite alphabet. As a plausible model, we use the high-order additive stationary ergodic Markov chain with long-range memory. Supposing that the correlations between random elements of the chain are weak, we express the conditional entropy of the sequence by means of the symbolic pair correlation function. We also examine an algorithm for estimating the conditional entropy of finite symbolic sequences. We show that the entropy contains two contributions, i.e., the correlation and the fluctuation. The obtained analytical results are used for numerical evaluation of the entropy of written English texts and DNA nucleotide sequences. The developed theory opens the way for constructing a more consistent and sophisticated approach to describe the systems with strong short-range and weak long-range memory.
NASA Technical Reports Server (NTRS)
Leutenegger, Scott T.; Horton, Graham
1994-01-01
Recently the Multi-Level algorithm was introduced as a general purpose solver for the solution of steady state Markov chains. In this paper, we consider the performance of the Multi-Level algorithm for solving Nearly Completely Decomposable (NCD) Markov chains, for which special-purpose iterative aggregation/disaggregation algorithms such as the Koury-McAllister-Stewart (KMS) method have been developed that can exploit the decomposability of the Markov chain. We present experimental results indicating that the general-purpose Multi-Level algorithm is competitive, and can be significantly faster than the special-purpose KMS algorithm when Gauss-Seidel and Gaussian Elimination are used for solving the individual blocks.
Reversible jump Markov chain Monte Carlo for Bayesian deconvolution of point sources
NASA Astrophysics Data System (ADS)
Stawinski, Guillaume; Doucet, Arnaud; Duvaut, Patrick
1998-09-01
In this article, we address the problem of Bayesian deconvolution of point sources in nuclear imaging under the assumption of Poissonian statistics. The observed image is the result of the convolution by a known point spread function of an unknown number of point sources with unknown parameters. To detect the number of sources and estimate their parameters we follow a Bayesian approach. However, instead of using a classical low-level prior model based on Markov random fields, we propose a high-level model which describes the picture as a list of its constituent objects, rather than as a list of pixels on which the data are recorded. More precisely, each source is assumed to have a circular Gaussian shape and we set a prior distribution on the number of sources, on their locations and on the amplitude and width deviation of the Gaussian shape. This high-level model has far fewer parameters than a Markov random field model, as only a small number of sources are usually present. The Bayesian model being defined, all inference is based on the resulting posterior distribution. This distribution does not admit any closed-form analytical expression. We present here a Reversible Jump MCMC algorithm for its estimation. This algorithm is tested on both synthetic and real data.
Optimal clinical trial design based on a dichotomous Markov-chain mixed-effect sleep model.
Steven Ernest, C; Nyberg, Joakim; Karlsson, Mats O; Hooker, Andrew C
2014-12-01
D-optimal designs for discrete-type responses have been derived using generalized linear mixed models, simulation-based methods and analytical approximations for computing the Fisher information matrix (FIM) of non-linear mixed effect models with homogeneous probabilities over time. In this work, D-optimal designs using an analytical approximation of the FIM for a dichotomous, non-homogeneous, Markov-chain phase advanced sleep non-linear mixed effect model were investigated. The non-linear mixed effect model consisted of transition probabilities of dichotomous sleep data estimated as logistic functions using piecewise linear functions. Theoretical linear and nonlinear dose effects were added to the transition probabilities to modify the probability of being in either sleep stage. D-optimal designs were computed by determining an analytical approximation of the FIM for each Markov component (one where the previous state was awake and another where the previous state was asleep). Each Markov component FIM was weighted either equally or by the average probability of response being awake or asleep over the night and summed to derive the total FIM (FIM(total)). The reference designs were placebo, 0.1-, 1-, 6-, 10- and 20-mg dosing for a 2- to 6-way crossover study in six dosing groups. Optimized design variables were dose and number of subjects in each dose group. The designs were validated using stochastic simulation/re-estimation (SSE). Contrary to expectations, the predicted parameter uncertainty obtained via FIM(total) was larger than the uncertainty in parameter estimates computed by SSE. Nevertheless, the D-optimal designs decreased the uncertainty of parameter estimates relative to the reference designs. Additionally, the improvement for the D-optimal designs was more pronounced using SSE than predicted via FIM(total). Through the use of an approximate analytic solution and weighting schemes, the FIM(total) for a non-homogeneous, dichotomous Markov-chain phase
Minsley, Burke J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, ‘best’ model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favorably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment.
Minsley, B.J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, 'best' model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favourably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment. © 2011. Geophysical Journal International © 2011 RAS.
NASA Astrophysics Data System (ADS)
Vrugt, J. A.
2007-12-01
Markov chain Monte Carlo (MCMC) methods are widely used in fields ranging from physics and chemistry to finance, economics and statistical inference for estimating the average properties of complex systems. The convergence rate of MCMC schemes is, however, often observed to be disturbingly low, limiting their practical use in many applications. This is frequently caused by an inappropriate selection of the proposal distribution used to generate trial moves. Here we show that significant improvements to the efficiency of MCMC algorithms can be made by using a self-adaptive Differential Evolution search strategy within a population-based evolutionary framework. This scheme differs fundamentally from existing MCMC algorithms, in that trial jumps are simply a fixed multiple of the difference of randomly chosen members of the population using various genetic operators that are adaptively updated during the search. In addition, the algorithm includes randomized subspace sampling to further improve convergence and acceptance rate. Detailed balance and ergodicity of the algorithm are proved, and hydrologic examples show that the proposed method significantly enhances the efficiency and applicability of MCMC simulations to complex, multi-modal search problems.
Combined survival analysis of cardiac patients by a Cox PH model and a Markov chain.
Shauly, Michal; Rabinowitz, Gad; Gilutz, Harel; Parmet, Yisrael
2011-10-01
The control and treatment of dyslipidemia is a major public health challenge, particularly for patients with coronary heart diseases. In this paper we propose a framework for survival analysis of patients who had a major cardiac event, focusing on assessment of the effect of changing LDL-cholesterol level and statins consumption on survival. This framework includes a Cox PH model and a Markov chain, and combines their results into reinforced conclusions regarding the factors that affect survival time. We prospectively studied 2,277 cardiac patients, and the results show high congruence between the Markov model and the PH model; both evidence that diabetes, history of stroke, peripheral vascular disease and smoking significantly increase hazard rate and reduce survival time. On the other hand, statin consumption is correlated with a lower hazard rate and longer survival time in both models. The role of such a framework in understanding the therapeutic behavior of patients and implementing effective secondary and primary prevention of heart diseases is discussed here.
A Markov chain analysis of fish movements to determine entrainment zones
Johnson, Gary E.; Hedgepeth, J.; Skalski, John R.; Giorgi, Albert E.
2004-06-01
The extent of the biological zone of influence (BZI) of a water withdrawal port, such as a cooling water intake or a smolt bypass, directly reflects its local effect on fish. This study produced a new technique to determine the BZI, defined as the region immediately upstream of a portal where the probability of fish movement toward the portal is greater than 90%. We developed and applied the technique at The Dalles Dam on the Columbia River, where the ice/trash sluiceway functions as a surface flow smolt bypass. To map the BZI, we applied a Markov-Chain analysis to smolt movement data collected with an active fish tracking sonar system. Probabilities of fish movement from cell to cell in the sample volume, calculated from tracked fish data, formed a Markov transition matrix. Multiplying this matrix by itself many times with absorption at the boundaries produced estimates of probability of passage out each side of the sample volume from the cells within. The BZI of a sluiceway entrance at The Dalles Dam was approximately 5 m across and extended 6-8 m out from the face of the dam in the surface layer 2-3 m deep. BZI mapping is applicable to many bioengineering efforts to protect fish populations.
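The matrix-power step described in this abstract can be illustrated with a minimal sketch. The one-dimensional row of cells, the movement probability of 0.6 and all names below are hypothetical and are not taken from the study, which used a transition matrix estimated from tracked fish in a sample volume.

```python
import numpy as np

def build_line_chain(n_cells, p_toward_portal):
    """Transition matrix for a row of cells with absorbing end cells.

    Cell n_cells - 1 plays the role of the portal side and cell 0 the
    opposite side; both are absorbing. Interior cells step one cell
    toward the portal with probability p_toward_portal, else away.
    """
    P = np.zeros((n_cells, n_cells))
    P[0, 0] = 1.0      # absorbed on the far side
    P[-1, -1] = 1.0    # absorbed at the portal side
    for i in range(1, n_cells - 1):
        P[i, i + 1] = p_toward_portal
        P[i, i - 1] = 1.0 - p_toward_portal
    return P

P = build_line_chain(n_cells=5, p_toward_portal=0.6)

# Multiplying the matrix by itself many times: after enough steps all
# probability mass sits in the absorbing boundary cells.
P_many = np.linalg.matrix_power(P, 1024)

# Probability of eventually leaving through the portal side, per start cell.
passage_to_portal = P_many[:, -1]
```

Under this toy setup, cells whose entry in `passage_to_portal` exceeds 0.9 would correspond to the zone-of-influence criterion quoted in the abstract.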
Modeling and Computing of Stock Index Forecasting Based on Neural Network and Markov Chain
Dai, Yonghui; Han, Dongmei; Dai, Weihui
2014-01-01
The stock index reflects the fluctuation of the stock market. There has long been a great deal of research on stock index forecasting. However, traditional methods struggle to achieve ideal precision in a dynamic market because of the influence of many factors such as the economic situation, policy changes, and emergency events. Therefore, approaches based on adaptive modeling and conditional probability transfer have attracted new attention from researchers. This paper presents a new forecast method that combines an improved back-propagation (BP) neural network with a Markov chain, together with its modeling and computing technology. This method includes initial forecasting by the improved BP neural network, division of the Markov state regions, computation of the state transition probability matrix, and prediction adjustment. Results of the empirical study show that this method can achieve high accuracy in stock index prediction and could provide a good reference for investment in the stock market. PMID:24782659
Simplification of reversible Markov chains by removal of states with low equilibrium occupancy.
Ullah, Ghanim; Bruno, William J; Pearson, John E
2012-10-21
We present a practical method for simplifying Markov chains on a potentially large state space when detailed balance holds. A simple and transparent technique is introduced to remove states with low equilibrium occupancy. The resulting system has fewer parameters. The resulting effective rates between the remaining nodes give dynamics identical to the original system's except on very fast timescales. This procedure amounts to using separation of timescales to neglect small capacitance nodes in a network of resistors and capacitors. We illustrate the technique by simplifying various reaction networks, including transforming an acyclic four-node network to a three-node cyclic network. For a reaction step in which a ligand binds, the law of mass action implies a forward rate proportional to ligand concentration. The effective rates in the simplified network are found to be rational functions of ligand concentration.
Application of Markov chain model to daily maximum temperature for thermal comfort in Malaysia
NASA Astrophysics Data System (ADS)
Nordin, Muhamad Asyraf bin Che; Hassan, Husna
2015-10-01
The first-order Markov chain principle has been widely used to model various meteorological fields for prediction purposes. In this study, 14 years (2000-2013) of daily maximum temperature data from Bayan Lepas were used. Earlier studies showed that the outdoor thermal comfort range (TCR) based on the physiologically equivalent temperature (PET) index in Malaysia is less than 34°C; thus the data were classified into two states: a normal state (within the thermal comfort range) and a hot state (above the thermal comfort range). The long-run results show that the probability of the daily temperature exceeding the TCR will be only 2.2%. On the other hand, the probability of the daily temperature being within the TCR will be 97.8%.
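A two-state analysis of this kind can be sketched as follows: estimate the first-order transition matrix from a sequence of daily states, then solve for the long-run distribution. The short state sequence and the function names are made up for illustration; they are not the Bayan Lepas data and do not reproduce the percentages reported above.

```python
import numpy as np

def estimate_transition_matrix(states, n_states=2):
    """Row-normalised counts of consecutive-day transitions.

    Assumes every state occurs at least once as a 'from' state,
    otherwise a row of zeros would be divided by zero.
    """
    counts = np.zeros((n_states, n_states))
    for today, tomorrow in zip(states[:-1], states[1:]):
        counts[today, tomorrow] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def long_run_distribution(P):
    """Solve pi P = pi together with sum(pi) = 1 by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical daily classification: 0 = normal, 1 = hot.
days = [0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0]
P = estimate_transition_matrix(days)
pi = long_run_distribution(P)  # pi[1] is the long-run share of hot days
```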
On the reliability of NMR relaxation data analyses: a Markov Chain Monte Carlo approach.
Abergel, Daniel; Volpato, Andrea; Coutant, Eloi P; Polimeno, Antonino
2014-09-01
The analysis of NMR relaxation data is revisited along the lines of a Bayesian approach. Using a Markov Chain Monte Carlo strategy of data fitting, we investigate conditions under which relaxation data can be effectively interpreted in terms of internal dynamics. The limitations to the extraction of kinetic parameters that characterize internal dynamics are analyzed, and we show that extracting characteristic time scales shorter than a few tens of ps is very unlikely. However, using MCMC methods, reliable estimates of the marginal probability distributions and estimators (average, standard deviations, etc.) can still be obtained for subsets of the model parameters. Thus, unlike more conventional strategies of data analysis, the method avoids a model selection process. In addition, it indicates what information may be extracted from the data, but also what cannot.
Irreversible Markov chain Monte Carlo algorithm for self-avoiding walk
NASA Astrophysics Data System (ADS)
Hu, Hao; Chen, Xiaosong; Deng, Youjin
2017-02-01
We formulate an irreversible Markov chain Monte Carlo algorithm for the self-avoiding walk (SAW), which violates the detailed balance condition and satisfies the balance condition. Its performance improves significantly compared to that of the Berretti-Sokal algorithm, which is a variant of the Metropolis-Hastings method. The gained efficiency increases with spatial dimension (D), from approximately 10 times in 2D to approximately 40 times in 5D. We simulate the SAW on a 5D hypercubic lattice with periodic boundary conditions, for a linear system with a size up to L = 128, and confirm that as for the 5D Ising model, the finite-size scaling of the SAW is governed by renormalized exponents, v* = 2/ d and γ/ v* = d/2. The critical point is determined, which is approximately 8 times more precise than the best available estimate.
Study of behavior and determination of customer lifetime value(CLV) using Markov chain model
Permana, Dony; Indratno, Sapto Wahyu; Pasaribu, Udjianna S.
2014-03-24
Customer Lifetime Value (CLV) is a constraint in interactive marketing that helps a company arrange its finances for marketing aimed at new customer acquisition and customer retention. Additionally, CLV can be used to segment customers for financial arrangements. A fairly new class of stochastic models for CLV uses a Markov chain. In this model, customer retention probability and new customer acquisition probability play an important role. The model was originally introduced by Pfeifer and Carraway in 2000 [1]. They introduced several CLV models, one of which involves only customers and former customers. In this paper we expand the model by adding the assumption of a transition from former customer to customer. In the proposed model, the CLV is higher than the CLV obtained with the Pfeifer and Carraway model. However, our model requires a longer convergence time.
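To make the comparison in this abstract concrete, here is a minimal sketch of a two-state (customer, former customer) Markov chain CLV under a discounted-sum formulation, V = sum over t of (P / (1 + d))^t R. Whether the paper uses exactly this formulation is an assumption, and the retention probability, return probability, reward vector and discount rate below are hypothetical values chosen only for illustration.

```python
import numpy as np

def clv_infinite_horizon(P, reward, discount_rate):
    """Closed form of the discounted sum: (I - P / (1 + d))^{-1} R."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - P / (1.0 + discount_rate), reward)

def clv_finite_horizon(P, reward, discount_rate, horizon):
    """Sum of (P / (1 + d))^t R for t = 0 .. horizon."""
    value = np.zeros(len(reward))
    term = np.asarray(reward, dtype=float)
    for _ in range(horizon + 1):
        value += term
        term = (P @ term) / (1.0 + discount_rate)
    return value

reward = np.array([100.0, 0.0])  # hypothetical per-period margin by state
d = 0.1                          # hypothetical discount rate

# Baseline: a former customer never returns (absorbing second state).
P_base = np.array([[0.7, 0.3],
                   [0.0, 1.0]])
# Extension: a former customer may become a customer again.
P_ext = np.array([[0.7, 0.3],
                  [0.2, 0.8]])

v_base = clv_infinite_horizon(P_base, reward, d)
v_ext = clv_infinite_horizon(P_ext, reward, d)
```

With a non-negative reward vector, allowing the return transition cannot lower the value of the customer state in this sketch, which is consistent with the direction of the result the abstract reports.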
Use of Bayesian Markov Chain Monte Carlo methods to model cost-of-illness data.
Cooper, Nicola J; Sutton, Alex J; Mugford, Miranda; Abrams, Keith R
2003-01-01
It is well known that the modeling of cost data is often problematic due to the distribution of such data. Commonly observed problems include 1) a strongly right-skewed data distribution and 2) a significant percentage of zero-cost observations. This article demonstrates how a hurdle model can be implemented from a Bayesian perspective by means of Markov Chain Monte Carlo simulation methods using the freely available software WinBUGS. Assessment of model fit is addressed through the implementation of two cross-validation methods. The relative merits of this Bayesian approach compared to the classical equivalent are discussed in detail. To illustrate the methods described, patient-specific non-health-care resource-use data from a prospective longitudinal study and the Norfolk Arthritis Register (NOAR) are utilized for 218 individuals with early inflammatory polyarthritis (IP). The NOAR database also includes information on various patient-level covariates.
Ideal-observer computation in medical imaging with use of Markov-chain Monte Carlo techniques.
Kupinski, Matthew A; Hoppin, John W; Clarkson, Eric; Barrett, Harrison H
2003-03-01
The ideal observer sets an upper limit on the performance of an observer on a detection or classification task. The performance of the ideal observer can be used to optimize hardware components of imaging systems and also to determine another observer's relative performance in comparison with the best possible observer. The ideal observer employs complete knowledge of the statistics of the imaging system, including the noise and object variability. Thus computing the ideal observer for images (large-dimensional vectors) is burdensome without severely restricting the randomness in the imaging system, e.g., assuming a flat object. We present a method for computing the ideal-observer test statistic and performance by using Markov-chain Monte Carlo techniques when we have a well-characterized imaging system, knowledge of the noise statistics, and a stochastic object model. We demonstrate the method by comparing three different parallel-hole collimator imaging systems in simulation.
Thomas, Stuart C; Hill, William G
2002-06-01
Markov chain Monte Carlo procedures allow the reconstruction of full-sibships using data from genetic marker loci only. In this study, these techniques are extended to allow the reconstruction of nested full- within half-sib families, and to present an efficient method for calculating the likelihood of the observed marker data in a nested family. Simulation is used to examine the properties of the reconstructed sibships, and of estimates of heritability and common environmental variance of quantitative traits obtained from those populations. Accuracy of reconstruction increases with increasing marker information and with increasing size of the nested full-sibships, but decreases with increasing population size. Estimates of variance component are biased, with the direction and magnitude of bias being dependent upon the underlying errors made during pedigree reconstruction.
A methodology for stochastic analysis of share prices as Markov chains with finite states.
Mettle, Felix Okoe; Quaye, Enoch Nii Boi; Laryea, Ravenhill Adjetey
2014-01-01
Price volatility makes stock investments risky, leaving investors in a critical position when decisions are made under uncertainty. To improve investor evaluation confidence on exchange markets without using time series methodology, we specify equity price change as a stochastic process assumed to possess Markov dependency, with respective state transition probability matrices following the identified state space (i.e. decrease, stable or increase). We established that the identified states communicate, and that the chains are aperiodic and ergodic, thus possessing limiting distributions. We developed a methodology for determining the expected mean return time for stock price increases and also established criteria for improving investment decisions based on highest transition probabilities, lowest mean return time and highest limiting distributions. We further developed an R algorithm for running the methodology introduced. The established methodology is applied to selected equities from Ghana Stock Exchange weekly trading data.
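For an ergodic chain, the mean return time to a state is the reciprocal of its limiting probability (Kac's formula), which is one way the quantities named in this abstract can be related. The sketch below is written in Python rather than the authors' R, and the three-state transition matrix is a hypothetical example, not data from the Ghana Stock Exchange.

```python
import numpy as np

def limiting_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalised."""
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigenvalues - 1.0))
    pi = np.real(eigenvectors[:, k])
    return pi / pi.sum()

def mean_return_times(P):
    """Mean recurrence time of each state: 1 / pi_i for an ergodic chain."""
    return 1.0 / limiting_distribution(P)

# Hypothetical weekly transition matrix over (decrease, stable, increase).
P = np.array([[0.2, 0.3, 0.5],
              [0.3, 0.4, 0.3],
              [0.5, 0.3, 0.2]])

pi = limiting_distribution(P)
return_times = mean_return_times(P)  # return_times[2]: weeks between increases
```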
Study of behavior and determination of customer lifetime value(CLV) using Markov chain model
NASA Astrophysics Data System (ADS)
Permana, Dony; Indratno, Sapto Wahyu; Pasaribu, Udjianna S.
2014-03-01
Customer Lifetime Value (CLV) is a constraint in interactive marketing that helps a company arrange its finances for marketing aimed at new customer acquisition and customer retention. Additionally, CLV can be used to segment customers for financial arrangements. A fairly new class of stochastic models for CLV uses a Markov chain. In this model, customer retention probability and new customer acquisition probability play an important role. The model was originally introduced by Pfeifer and Carraway in 2000 [1]. They introduced several CLV models, one of which involves only customers and former customers. In this paper we expand the model by adding the assumption of a transition from former customer to customer. In the proposed model, the CLV is higher than the CLV obtained with the Pfeifer and Carraway model. However, our model requires a longer convergence time.
A Markov Chain Model for Changes in Users’ Assessment of Search Results
Zhitomirsky-Geffet, Maayan; Bar-Ilan, Judit; Levene, Mark
2016-01-01
Previous research shows that users tend to change their assessment of search results over time. This is a first study that investigates the factors and reasons for these changes, and describes a stochastic model of user behaviour that may explain these changes. In particular, we hypothesise that most of the changes are local, i.e. between results with similar or close relevance to the query, and thus belong to the same “coarse” relevance category. According to the theory of coarse beliefs and categorical thinking, humans tend to divide the range of values under consideration into coarse categories, and are thus able to distinguish only between cross-category values but not within them. To test this hypothesis we conducted five experiments with about 120 subjects divided into 3 groups. Each student in every group was asked to rank and assign relevance scores to the same set of search results over two or three rounds, with a period of three to nine weeks between each round. The subjects of the last three-round experiment were then exposed to the differences in their judgements and were asked to explain them. We make use of a Markov chain model to measure change in users’ judgments between the different rounds. The Markov chain demonstrates that the changes converge, and that a majority of the changes are local to a neighbouring relevance category. We found that most of the subjects were satisfied with their changes, and did not perceive them as mistakes but rather as a legitimate phenomenon, since they believe that time has influenced their relevance assessment. Both our quantitative analysis and user comments support the hypothesis of the existence of coarse relevance categories resulting from categorical thinking in the context of user evaluation of search results. PMID:27171426
Fuzzy hidden Markov chains segmentation for volume determination and quantitation in PET.
Hatt, M; Lamare, F; Boussion, N; Turzo, A; Collet, C; Salzenstein, F; Roux, C; Jarritt, P; Carson, K; Cheze-Le Rest, C; Visvikis, D
2007-06-21
Accurate volume of interest (VOI) estimation in PET is crucial in different oncology applications such as response to therapy evaluation and radiotherapy treatment planning. The objective of our study was to compare the performance of the proposed algorithm for automatic lesion volume delineation, namely the fuzzy hidden Markov chains (FHMC), with that of the threshold-based techniques that are the current state of the art in clinical practice. Like the classical hidden Markov chain (HMC) algorithm, FHMC takes into account noise, voxel intensity and spatial correlation in order to classify a voxel as background or functional VOI. However, the novelty of the fuzzy model consists of the inclusion of an estimation of imprecision, which should subsequently lead to a better modelling of the 'fuzzy' nature of the object of interest boundaries in emission tomography data. The performance of the algorithms has been assessed on both simulated and acquired datasets of the IEC phantom, covering a large range of spherical lesion sizes (from 10 to 37 mm), contrast ratios (4:1 and 8:1) and image noise levels. Both lesion activity recovery and VOI determination tasks were assessed in reconstructed images using two different voxel sizes (8 mm³ and 64 mm³). In order to account for both the functional volume location and its size, the concept of % classification errors was introduced in the evaluation of volume segmentation using the simulated datasets. Results reveal that FHMC performs substantially better than the threshold-based methodology for functional volume determination or activity concentration recovery considering a contrast ratio of 4:1 and lesion sizes of <28 mm. Furthermore, differences between classification and volume estimation errors evaluated were smaller for the segmented volumes provided by the FHMC algorithm. Finally, the performance of the automatic algorithms was less susceptible to image noise levels in comparison to the threshold-based techniques. The analysis of both
Markov Chain-Like Quantum Biological Modeling of Mutations, Aging, and Evolution
Djordjevic, Ivan B.
2015-01-01
Recent evidence suggests that quantum mechanics is relevant in photosynthesis, magnetoreception, enzymatic catalytic reactions, olfactory reception, photoreception, genetics, electron-transfer in proteins, and evolution, to mention a few. In our recent paper published in Life, we have derived the operator-sum representation of a biological channel based on codon basekets, and determined the quantum channel model suitable for study of the quantum biological channel capacity. However, this model is essentially memoryless and it is not able to properly model the propagation of mutation errors in time, the process of aging, and evolution of genetic information through generations. To solve these problems, we propose novel quantum mechanical models to accurately describe the creation of spontaneous, induced, and adaptive mutations and their propagation in time. Different biological channel models with memory, proposed in this paper, include: (i) Markovian classical model, (ii) Markovian-like quantum model, and (iii) hybrid quantum-classical model. We then apply these models in a study of aging and evolution of quantum biological channel capacity through generations. We also discuss key differences of these models with respect to a multilevel symmetric channel-based Markovian model and a Kimura model-based Markovian process. These models are quite general and applicable to many open problems in biology, not only biological channel capacity, which is the main focus of the paper. We will show that the famous quantum Master equation approach, commonly used to describe different biological processes, is just the first-order approximation of the proposed quantum Markov chain-like model, when the observation interval tends to zero. One of the important implications of this model is that the aging phenotype becomes determined by different underlying transition probabilities in both programmed and random (damage) Markov chain-like models of aging, which are mutually
Effects of tour boats on dolphin activity examined with sensitivity analysis of Markov chains.
Dans, Silvana Laura; Degrati, Mariana; Pedraza, Susana Noemí; Crespo, Enrique Alberto
2012-08-01
In Patagonia, Argentina, watching dolphins, especially dusky dolphins (Lagenorhynchus obscurus), is a new tourist activity. In the presence of boats, feeding time decreases, while the time dolphins take to return to feeding after abandoning it and the time a group of dolphins takes to feed both increase. Such effects on feeding behavior may exert energetic costs on dolphins and thus reduce an individual's survival and reproductive capacity, or may be associated with shifts in distribution. We sought to predict which behavioral changes modify the activity pattern of dolphins the most. We modeled behavioral sequences of dusky dolphins with Markov chains. We calculated transition probabilities from one activity to another and arranged them in a stochastic matrix model. The proportion of time dolphins dedicated to a given activity (activity budget) and the time it took a dolphin to resume that activity after it had been abandoned (recurrence time) were calculated. We used a sensitivity analysis of Markov chains to calculate the sensitivity of the time budget and the activity-resumption time to changes in behavioral transition probabilities. Feeding-time budget was most sensitive to changes in the probability of dolphins switching from traveling to feeding behavior and of maintaining feeding behavior. Thus, an increase in these probabilities would be associated with the largest reduction in the time dedicated to feeding. A reduction in the probability of changing from traveling to feeding would also be associated with the largest increases in the time it takes dolphins to resume feeding. Approaching dolphins while they are traveling would not affect behavior less, because the presence of the boat may keep dolphins from returning to feeding. Our results may help operators of dolphin-watching vessels minimize negative effects on dolphins.
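The quantities named in this abstract (activity budget, recurrence time, and their sensitivity to a transition probability) can be sketched for a toy chain as follows. This is a minimal illustration under assumptions: the two-state matrix is invented, the activity budget is taken to be the stationary distribution, the recurrence time is taken to be its reciprocal (valid for an irreducible chain), and sensitivity is approximated by a finite difference rather than by the analytical method the authors used.

```python
def stationary_distribution(P, iterations=10_000):
    """Stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def mean_recurrence_times(pi):
    """Mean number of steps to return to each state, 1 / pi_i."""
    return [1.0 / p for p in pi]


def budget_sensitivity(P, state, i, j, k, eps=1e-6):
    """Finite-difference change in pi[state] when mass eps moves from P[i][k] to P[i][j]."""
    bumped = [row[:] for row in P]
    bumped[i][j] += eps
    bumped[i][k] -= eps  # keep row i summing to one
    base = stationary_distribution(P)[state]
    return (stationary_distribution(bumped)[state] - base) / eps


# Hypothetical chain: state 0 = traveling, state 1 = feeding.
P = [[0.9, 0.1],
     [0.5, 0.5]]

budget = stationary_distribution(P)
recurrence = mean_recurrence_times(budget)
# Sensitivity of the feeding budget to the traveling -> feeding probability.
sens = budget_sensitivity(P, state=1, i=0, j=1, k=0)
print(budget, recurrence, sens)
```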
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Fiske, Ian J.; Royle, J. Andrew; Gross, Kevin
2014-01-01
Ecologists and wildlife biologists increasingly use latent variable models to study patterns of species occurrence when detection is imperfect. These models have recently been generalized to accommodate both a more expansive description of state than simple presence or absence, and Markovian dynamics in the latent state over successive sampling seasons. In this paper, we write these multi-season, multi-state models as hidden Markov models to find both maximum likelihood estimates of model parameters and finite-sample estimators of the trajectory of the latent state over time. These estimators are especially useful for characterizing population trends in species of conservation concern. We also develop parametric bootstrap procedures that allow formal inference about latent trend. We examine model behavior through simulation, and we apply the model to data from the North American Amphibian Monitoring Program.
Bayesian inference of local trees along chromosomes by the sequential Markov coalescent.
Zheng, Chaozhi; Kuhner, Mary K; Thompson, Elizabeth A
2014-05-01
We propose a genealogy-sampling algorithm, Sequential Markov Ancestral Recombination Tree (SMARTree), that provides an approach to estimation from SNP haplotype data of the patterns of coancestry across a genome segment among a set of homologous chromosomes. To enable analysis across longer segments of genome, the sequence of coalescent trees is modeled via the modified sequential Markov coalescent (Marjoram and Wall, Genetics 7:16, 2006). To assess performance in estimating these local trees, our SMARTree implementation is tested on simulated data. Our base data set is of the SNPs in 10 DNA sequences over 50 kb. We examine the effects of longer sequences and of more sequences, and of a recombination and/or mutational hotspot. The model underlying SMARTree is an approximation to the full recombinant-coalescent distribution. However, in a small trial on simulated data, recovery of local trees was similar to that of LAMARC (Kuhner et al. Genetics 156:1393-1401, 2000a), a sampler which uses the full model.
Multi-Physics Markov Chain Monte Carlo Methods for Subsurface Flows
NASA Astrophysics Data System (ADS)
Rigelo, J.; Ginting, V.; Rahunanthan, A.; Pereira, F.
2014-12-01
For CO2 sequestration in deep saline aquifers, contaminant transport in the subsurface, and oil or gas recovery, we often need to forecast flow patterns. Subsurface characterization is a critical and challenging step in flow forecasting. To characterize subsurface properties we establish a statistical description of the subsurface properties that is conditioned on existing dynamic and static data. A Markov Chain Monte Carlo (MCMC) algorithm is used in a Bayesian statistical description to reconstruct the spatial distribution of rock permeability and porosity. The MCMC algorithm requires repeatedly solving a set of nonlinear partial differential equations describing displacement of fluids in porous media for different values of permeability and porosity. The time needed for the generation of a reliable MCMC chain using the algorithm can be too long to be practical for flow forecasting. In this work we develop fast and effective computational methods for generating MCMC chains in the Bayesian framework for the subsurface characterization. Our strategy consists of constructing a family of computationally inexpensive preconditioners based on simpler physics as well as on surrogate models such that the number of fine-grid simulations is drastically reduced in the generated MCMC chains. In particular, we introduce a huff-puff technique as a screening step in a three-stage multi-physics MCMC algorithm to reduce the number of expensive final-stage simulations. The huff-puff technique in the algorithm enables a better characterization of the subsurface near wells. We assess the quality of the proposed multi-physics MCMC methods by considering Monte Carlo simulations for forecasting oil production in an oil reservoir.
Improving Bayesian analysis for LISA Pathfinder using an efficient Markov Chain Monte Carlo method
NASA Astrophysics Data System (ADS)
Ferraioli, Luigi; Porter, Edward K.; Armano, Michele; Audley, Heather; Congedo, Giuseppe; Diepholz, Ingo; Gibert, Ferran; Hewitson, Martin; Hueller, Mauro; Karnesis, Nikolaos; Korsakova, Natalia; Nofrarias, Miquel; Plagnol, Eric; Vitale, Stefano
2014-02-01
We present a parameter estimation procedure based on a Bayesian framework by applying a Markov Chain Monte Carlo algorithm to the calibration of the dynamical parameters of the LISA Pathfinder satellite. The method is based on the Metropolis-Hastings algorithm and a two-stage annealing treatment in order to ensure an effective exploration of the parameter space at the beginning of the chain. We compare two versions of the algorithm with an application to a LISA Pathfinder data analysis problem. The two algorithms share the same heating strategy, but one moves in coordinate directions using proposals from a multivariate Gaussian distribution, while the other uses the natural logarithm of some parameters and proposes jumps in the eigen-space of the Fisher Information matrix. The algorithm proposing jumps in the eigen-space of the Fisher Information matrix demonstrates a higher acceptance rate and a slightly better convergence towards the equilibrium parameter distributions in the application to LISA Pathfinder data. For this experiment, we return parameter values that are all within ~1σ of the injected values. When we analyse the accuracy of our parameter estimates in terms of the effect they have on the force-per-unit-mass noise, we find that the induced errors are three orders of magnitude less than the expected experimental uncertainty in the power spectral density.
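For readers unfamiliar with the Metropolis-Hastings step this abstract builds on, a minimal random-walk version is sketched below. It is not the authors' sampler: it omits the annealing stages and the Fisher-matrix proposals, uses a one-dimensional standard normal target chosen purely for illustration, and all names and settings are assumptions.

```python
import math
import random


def metropolis_hastings(log_target, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples, accepted = [], 0
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)
        log_p_candidate = log_target(candidate)
        # Accept with probability min(1, target(candidate) / target(x)).
        if rng.random() < math.exp(min(0.0, log_p_candidate - log_p)):
            x, log_p = candidate, log_p_candidate
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


def log_standard_normal(x):
    """Unnormalised log density of the illustrative target."""
    return -0.5 * x * x


samples, acceptance_rate = metropolis_hastings(
    log_standard_normal, start=3.0, step=1.0, n_samples=20000
)
kept = samples[1000:]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
variance = sum((s - mean) ** 2 for s in kept) / len(kept)
print(mean, variance, acceptance_rate)
```

Because the proposal is symmetric, the proposal densities cancel in the acceptance ratio; a non-symmetric proposal, such as one defined in a transformed parameter space, would need the full Hastings correction.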
Statistical Inference in Hidden Markov Models Using k-Segment Constraints.
Titsias, Michalis K; Holmes, Christopher C; Yau, Christopher
2016-01-02
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward-backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online.
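The unconstrained MAP sequence that this abstract takes as its starting point is the one returned by the standard Viterbi recursion, sketched below in log space. The sketch does not implement the k-segment constraint introduced in the article, and the two-state model and its probabilities are a textbook-style toy example, not data from the paper.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable hidden state sequence and its joint probability."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    score = {s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointer = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            best_prev = max(states, key=lambda r: score[r] + log(trans_p[r][s]))
            new_score[s] = score[best_prev] + log(trans_p[best_prev][s]) + log(emit_p[s][obs])
            pointer[s] = best_prev
        score = new_score
        backpointers.append(pointer)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1], math.exp(score[last])


# Hypothetical toy model.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}

path, prob = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
print(path, prob)
```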
Statistical Inference in Hidden Markov Models Using k-Segment Constraints
Titsias, Michalis K.; Holmes, Christopher C.; Yau, Christopher
2016-01-01
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward–backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online. PMID:27226674
Fuzzy hidden Markov chains segmentation for volume determination and quantitation in PET
Hatt, Mathieu; Lamare, Frédéric; Boussion, Nicolas; Roux, Christian; Turzo, Alexandre; Cheze-Lerest, Catherine; Jarritt, Peter; Carson, Kathryn; Salzenstein, Fabien; Collet, Christophe; Visvikis, Dimitris
2007-01-01
Accurate volume of interest (VOI) estimation in PET is crucial in different oncology applications such as response to therapy evaluation and radiotherapy treatment planning. The objective of our study was to compare the performance of the proposed algorithm for automatic lesion volume delineation, namely the Fuzzy Hidden Markov Chains (FHMC), with that of the threshold-based techniques that are the current state of the art in clinical practice. Like the classical Hidden Markov Chain (HMC) algorithm, FHMC takes into account noise, voxel intensity and spatial correlation in order to classify a voxel as background or functional VOI. However, the novelty of the fuzzy model consists of the inclusion of an estimation of imprecision, which should subsequently lead to a better modelling of the “fuzzy” nature of the object of interest boundaries in emission tomography data. The performance of the algorithms has been assessed on both simulated and acquired datasets of the IEC phantom, covering a large range of spherical lesion sizes (from 10 to 37 mm), contrast ratios (4:1 and 8:1) and image noise levels. Both lesion activity recovery and VOI determination tasks were assessed in reconstructed images using two different voxel sizes (8 mm³ and 64 mm³). In order to account for both the functional volume location and its size, the concept of % classification errors was introduced in the evaluation of volume segmentation using the simulated datasets. Results reveal that FHMC performs substantially better than the threshold-based methodology for functional volume determination or activity concentration recovery considering a contrast ratio of 4:1 and lesion sizes of <28 mm. Furthermore, differences between classification and volume estimation errors evaluated were smaller for the segmented volumes provided by the FHMC algorithm. Finally, the performance of the automatic algorithms was less susceptible to image noise levels in comparison to the threshold-based techniques. The analysis of both
Ensemble Smoothing and Markov Chain Monte Carlo for Data Assimilation in Highly Nonlinear Systems
NASA Astrophysics Data System (ADS)
Turmon, M.; Chin, T. M.; Jewell, J. B.; Ghil, M.
2005-12-01
Current methods for atmosphere and ocean data assimilation propagate Gaussian distributions for gridded state variables forward in time. Powerful as these methods are, they do not handle outliers well and cannot simultaneously entertain multiple hypotheses about system state. The alternative of propagating the system's full probability distribution is burdensome, and ensemble methods have been introduced into data assimilation for nonlinear systems to get around this problem. By propagating an ensemble of representative states, algorithms like the Ensemble Kalman Filter (EnKF) and the Resampled Particle Filter (RPF) rely on existing modeling infrastructure and capture the weights to be assigned to the data based on the evolution of this ensemble. We present an ensemble-based smoother that is applicable to Monte Carlo filtering schemes like the EnKF and the RPF. At the minor cost of retrospectively updating a set of weights for ensemble members, this smoother provides superior state tracking for two simple nonlinear problems, the double-well potential and the trivariate Lorenz system. The algorithm does not require retrospective adaptation of the ensemble members themselves, and is thus suited to a streaming operational mode. The accuracy of the proposed backward-update scheme in estimating non-Gaussian distributions is evaluated by comparison of its posterior distributions with ground truth provided by a Markov chain Monte Carlo algorithm.
Quantum-correlation breaking channels, broadcasting scenarios, and finite Markov chains
NASA Astrophysics Data System (ADS)
Korbicz, J. K.; Horodecki, P.; Horodecki, R.
2012-10-01
One of the classical results concerning quantum channels is the characterization of entanglement-breaking channels [M. Horodecki, P. W. Shor, and M. B. Ruskai, Rev. Math. Phys. 15, 629 (2003)]. We address the question whether there exists a similar characterization on the level of quantum correlations which may go beyond entanglement. The answer is fully affirmative in the case of breaking quantum correlations down to the, so-called, QC (quantum-classical) type, while it is no longer true in the CC (classical-classical) case. The corresponding channels turn out to be measurement maps. Our study also reveals an unexpected link between quantum state and local correlation broadcasting and finite Markov chains. We present a possibility of broadcasting via non von Neumann measurements, which relies on the Perron-Frobenius theorem. Surprisingly, this is not the typical generalized controlled-not (c-not) gate scenario appearing naturally in this context.
Study on the Calculation Models of Bus Delay at Bays Using Queueing Theory and Markov Chain
Sun, Li; Sun, Shao-wei; Wang, Dian-hai
2015-01-01
Traffic congestion at bus bays has seriously reduced the service efficiency of public transit in China, so it is crucial to study its theory and methods systematically. However, existing studies lack a theoretical model for computing efficiency. Therefore, calculation models of bus delay at bays are studied. First, the process by which buses are delayed at bays is analyzed, and the delay is found to consist of entering delay and exiting delay. Second, queueing models of bus bays are formed, and the equilibrium distribution functions are obtained by applying the embedded Markov chain to the traditional queueing model in the steady state; the calculation models of entering delay at bays are then derived. Third, the exiting delay is studied using queueing theory and gap acceptance theory. Finally, the proposed models are validated using field-measured data, and the influencing factors are discussed. With these models, the delay can be assessed easily given the characteristics of the dwell time distribution and the traffic volume in the curb lane at different locations and in different periods. The models can provide a basis for the efficiency evaluation of bus bays. PMID:25759720
Uimari, P.; Hoeschele, I.
1997-01-01
A Bayesian method for mapping linked quantitative trait loci (QTL) using multiple linked genetic markers is presented. Parameter estimation and hypothesis testing were implemented via Markov chain Monte Carlo (MCMC) algorithms. Parameters included were allele frequencies and substitution effects for two biallelic QTL, map positions of the QTL and markers, allele frequencies of the markers, and polygenic and residual variances. Missing data were polygenic effects and multi-locus marker-QTL genotypes. Three different MCMC schemes for testing the presence of a single or two linked QTL on the chromosome were compared. The first approach includes a model indicator variable representing two unlinked QTL affecting the trait, one linked and one unlinked QTL, or both QTL linked with the markers. The second approach incorporates an indicator variable for each QTL into the model for phenotype, allowing or not allowing for a substitution effect of a QTL on phenotype, and the third approach is based on model determination by reversible jump MCMC. Methods were evaluated empirically by analyzing simulated granddaughter designs. All methods correctly identified a second, linked QTL and did not reject the one-QTL model when there was only a single QTL and no additional or an unlinked QTL. PMID:9178021
NASA Astrophysics Data System (ADS)
Al-Ma'shumah, Fathimah; Permana, Dony; Sidarto, Kuntjoro Adji
2015-12-01
Customer Lifetime Value (CLV) is an important and useful concept in marketing. One of its benefits is to help a company budget its marketing expenditure for customer acquisition and customer retention. Many mathematical models have been introduced to calculate CLV under the customer retention/migration classification scheme. A fairly new class of these models, described in this paper, uses Markov chain models (MCM). The major advantage of this class of models is its flexibility: it can be adapted to several different cases and classification schemes. In this model, the probabilities of customer retention and acquisition play an important role. Following Pfeifer and Carraway (2000), the final formula for CLV obtained from an MCM usually contains a nonlinear function of the transition probability matrix. This nonlinearity makes the inverse problem of CLV difficult to solve. This paper aims to solve this inverse problem, yielding approximate transition probabilities for the customers, by applying a metaheuristic optimization algorithm, the Flower Pollination Algorithm developed by Yang (2013). The main use of the resulting transition probabilities is to set goals for marketing teams for maintaining the relative frequencies of customer acquisition and customer retention.
Sanov and central limit theorems for output statistics of quantum Markov chains
Horssen, Merlijn van; Guţă, Mădălin
2015-02-15
In this paper, we consider the statistics of repeated measurements on the output of a quantum Markov chain. We establish a large deviations result analogous to Sanov’s theorem for the multi-site empirical measure associated to finite sequences of consecutive outcomes of a classical stochastic process. Our result relies on the construction of an extended quantum transition operator (which keeps track of previous outcomes) in terms of which we compute moment generating functions, and whose spectral radius is related to the large deviations rate function. As a corollary to this, we obtain a central limit theorem for the empirical measure. Such higher level statistics may be used to uncover critical behaviour such as dynamical phase transitions, which are not captured by lower level statistics such as the sample mean. As a step in this direction, we give an example of a finite system whose level-1 (empirical mean) rate function is independent of a model parameter while the level-2 (empirical measure) rate is not.
Empirical Markov Chain Monte Carlo Bayesian analysis of fMRI data.
de Pasquale, F; Del Gratta, C; Romani, G L
2008-08-01
In this work an Empirical Markov Chain Monte Carlo Bayesian approach to analyse fMRI data is proposed. The Bayesian framework is appealing since complex models can be adopted in the analysis both for the image and noise model. Here, the noise autocorrelation is taken into account by adopting an AutoRegressive model of order one and a versatile non-linear model is assumed for the task-related activation. Model parameters include the noise variance and autocorrelation, activation amplitudes and the hemodynamic response function parameters. These are estimated at each voxel from samples of the Posterior Distribution. Prior information is included by means of a 4D spatio-temporal model for the interaction between neighbouring voxels in space and time. The results show that this model can provide smooth estimates from low SNR data while important spatial structures in the data can be preserved. A simulation study is presented in which the accuracy and bias of the estimates are addressed. Furthermore, some results on convergence diagnostics for the adopted algorithm are presented. To validate the proposed approach, a comparison of the results with those from a standard GLM analysis, spatial filtering techniques and a Variational Bayes approach is provided. This comparison shows that our approach outperforms the classical analysis and is consistent with other Bayesian techniques. This is investigated further by means of the Bayes Factors and the analysis of the residuals. The proposed approach applied to Blocked Design and Event Related datasets produced reliable maps of activation.
Fitting complex population models by combining particle filters with Markov chain Monte Carlo.
Knape, Jonas; de Valpine, Perry
2012-02-01
We show how a recent framework combining Markov chain Monte Carlo (MCMC) with particle filters (PFMCMC) may be used to estimate population state-space models. With the purpose of utilizing the strengths of each method, PFMCMC explores hidden states by particle filters, while process and observation parameters are estimated using an MCMC algorithm. PFMCMC is exemplified by analyzing time series data on a red kangaroo (Macropus rufus) population in New South Wales, Australia, using MCMC over model parameters based on an adaptive Metropolis-Hastings algorithm. We fit three population models to these data: a density-dependent logistic diffusion model with environmental variance, an unregulated stochastic exponential growth model, and a random-walk model. Bayes factors and posterior model probabilities show that there is little support for density dependence and that the random-walk model is the most parsimonious model. The particle filter Metropolis-Hastings algorithm is a brute-force method that may be used to fit a range of complex population models. Implementation is straightforward and less involved than standard MCMC for many models, and marginal densities for model selection can be obtained with little additional effort. The cost is mainly computational, resulting in long running times that may be improved by parallelizing the algorithm.
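The particle-filter half of the approach in this abstract can be sketched as a bootstrap filter that returns a log-likelihood estimate, which a Metropolis-Hastings sampler over the parameters would then use in place of the exact likelihood. The sketch assumes a Gaussian random-walk state with Gaussian observation error and a known initial state of zero; the model, the simulated data, and all names are illustrative assumptions, not the authors' kangaroo models.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def particle_loglik(observations, process_sd, obs_sd, n_particles=500, seed=0):
    """Bootstrap particle filter estimate of the log-likelihood.

    State: x_t = x_{t-1} + N(0, process_sd^2); observation: y_t = x_t + N(0, obs_sd^2).
    """
    rng = random.Random(seed)
    particles = [0.0] * n_particles  # assumed known initial state x_0 = 0
    loglik = 0.0
    for y in observations:
        # Propagate each particle through the process model.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # Weight by the observation density, shifted for numerical stability.
        log_w = [normal_logpdf(y, x, obs_sd) for x in particles]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        loglik += top + math.log(sum(weights) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik


def simulate(n_steps, process_sd, obs_sd, seed=1):
    """Simulate observations from the assumed random-walk model."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n_steps):
        x += rng.gauss(0.0, process_sd)
        ys.append(x + rng.gauss(0.0, obs_sd))
    return ys


data = simulate(50, process_sd=0.1, obs_sd=0.1)
ll_true = particle_loglik(data, process_sd=0.1, obs_sd=0.1)
ll_wrong = particle_loglik(data, process_sd=0.1, obs_sd=10.0)
print(ll_true, ll_wrong)
```

The estimate is noisy, so in practice the number of particles trades running time against the variance of the likelihood estimate, which matches the abstract's remark that the cost of the method is mainly computational.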
Ades, A E; Lu, G
2003-12-01
Monte Carlo simulation has become the accepted method for propagating parameter uncertainty through risk models. It is widely appreciated, however, that correlations between input variables must be taken into account if models are to deliver correct assessments of uncertainty in risk. Various two-stage methods have been proposed that first estimate a correlation structure and then generate Monte Carlo simulations, which incorporate this structure while leaving marginal distributions of parameters unchanged. Here we propose a one-stage alternative, in which the correlation structure is estimated from the data directly by Bayesian Markov Chain Monte Carlo methods. Samples from the posterior distribution of the outputs then correctly reflect the correlation between parameters, given the data and the model. Besides its computational simplicity, this approach utilizes the available evidence from a wide variety of structures, including incomplete data and correlated and uncorrelated repeat observations. The major advantage of a Bayesian approach is that, rather than assuming the correlation structure is fixed and known, it captures the joint uncertainty induced by the data in all parameters, including variances and covariances, and correctly propagates this through the decision or risk model. These features are illustrated with examples on emissions of dioxin congeners from solid waste incinerators.
Edla, Shwetha; Kovvali, Narayan; Papandreou-Suppappola, Antonia
2012-01-01
Constructing statistical models of electrocardiogram (ECG) signals, whose parameters can be used for automated disease classification, is of great importance in precluding manual annotation and providing prompt diagnosis of cardiac diseases. ECG signals consist of several segments with different morphologies (namely the P wave, QRS complex and the T wave) in a single heart beat, which can vary across individuals and diseases. Also, existing statistical ECG models exhibit a reliance upon obtaining a priori information from the ECG data by using preprocessing algorithms to initialize the filter parameters, or to define the user-specified model parameters. In this paper, we propose an ECG modeling technique using the sequential Markov chain Monte Carlo (SMCMC) filter that can perform simultaneous model selection, by adaptively choosing from different representations depending upon the nature of the data. Our results demonstrate the ability of the algorithm to track various types of ECG morphologies, including intermittently occurring ECG beats. In addition, we use the estimated model parameters as the feature set to classify between ECG signals with normal sinus rhythm and four different types of arrhythmia.
Identifying influential observations in Bayesian models by using Markov chain Monte Carlo.
Jackson, Dan; White, Ian R; Carpenter, James
2012-05-20
In statistical modelling, it is often important to know how much parameter estimates are influenced by particular observations. An attractive approach is to re-estimate the parameters with each observation deleted in turn, but this is computationally demanding when fitting models by using Markov chain Monte Carlo (MCMC), as obtaining complete sample estimates is often in itself a very time-consuming task. Here we propose two efficient ways to approximate the case-deleted estimates by using output from MCMC estimation. Our first proposal, which directly approximates the usual influence statistics in maximum likelihood analyses of generalised linear models (GLMs), is easy to implement and avoids any further evaluation of the likelihood. Hence, unlike the existing alternatives, it does not become more computationally intensive as the model complexity increases. Our second proposal, which utilises model perturbations, also has this advantage and does not require the form of the GLM to be specified. We show how our two proposed methods are related and evaluate them against the existing method of importance sampling and case deletion in a logistic regression analysis with missing covariates. We also provide practical advice for those implementing our procedures, so that they may be used in many situations where MCMC is used to fit statistical models.
Ma, Jianzhong; Amos, Christopher I; Warwick Daw, E
2007-09-01
Although extended pedigrees are often sampled through probands with extreme levels of a quantitative trait, Markov chain Monte Carlo (MCMC) methods for segregation and linkage analysis have not been able to perform ascertainment corrections. Further, the extent to which ascertainment of pedigrees leads to biases in the estimation of segregation and linkage parameters has not been previously studied for MCMC procedures. In this paper, we studied these issues with a Bayesian MCMC approach for joint segregation and linkage analysis, as implemented in the package Loki. We first simulated pedigrees ascertained through individuals with extreme values of a quantitative trait, in the spirit of the sequential sampling theory of Cannings and Thompson [(1977) Clin. Genet. 12:208-212]. Using our simulated data, we detected no bias in estimates of the trait locus location. However, bias was found in the estimates of allele frequencies and, when the ascertainment threshold was higher than or close to the true value of the highest genotypic mean, also in the estimate of that mean. When there were multiple trait loci, this bias destroyed the additivity of the effects of the trait loci, and caused biases in the estimation of all genotypic means when a purely additive model was used for analyzing the data. To account for pedigree ascertainment with sequential sampling, we developed a Bayesian ascertainment approach and implemented Metropolis-Hastings updates in the MCMC samplers used in Loki. Ascertainment correction greatly reduced biases in parameter estimates. Our method is designed for multiple, but a fixed number of, trait loci.
Markov chain Monte Carlo linkage analysis: effect of bin width on the probability of linkage.
Slager, S L; Juo, S H; Durner, M; Hodge, S E
2001-01-01
We analyzed part of the Genetic Analysis Workshop (GAW) 12 simulated data using Monte Carlo Markov chain (MCMC) methods that are implemented in the computer program Loki. The MCMC method reports the "probability of linkage" (PL) across the chromosomal regions of interest. The point of maximum PL can then be taken as a "location estimate" for the location of the quantitative trait locus (QTL). However, Loki does not provide a formal statistical test of linkage. In this paper, we explore how the bin width used in the calculations affects the max PL and the location estimate. We analyzed age at onset (AO) and quantitative trait number 5, Q5, from 26 replicates of the general simulated data in one region where we knew a major gene, MG5, is located. For each trait, we found the max PL and the corresponding location estimate, using four different bin widths. We found that bin width, as expected, does affect the max PL and the location estimate, and we recommend that users of Loki explore how their results vary with different bin widths.
Study on the calculation models of bus delay at bays using queueing theory and Markov chain.
Sun, Feng; Sun, Li; Sun, Shao-Wei; Wang, Dian-Hai
2015-01-01
Traffic congestion at bus bays has seriously decreased the service efficiency of public transit in China, so it is crucial to study its theory and methods systematically. However, existing studies lack a theoretical model for computing efficiency. Therefore, calculation models of bus delay at bays are studied. First, the process by which buses are delayed at bays is analyzed, and the delay is found to divide into entering delay and exiting delay. Second, queueing models of bus bays are formed, and the equilibrium distribution functions are obtained by applying the embedded Markov chain to the traditional queueing-theory model in the steady state; the calculation models of entering delay at bays are then derived. Third, the exiting delay is studied using queueing theory and gap acceptance theory. Finally, the proposed models are validated using field-measured data, and the influencing factors are discussed. With these models, the delay is easily assessed given the characteristics of the dwell time distribution and the traffic volume at the curb lane in different locations and periods. The models can provide a basis for the efficiency evaluation of bus bays.
Variational method for estimating the rate of convergence of Markov-chain Monte Carlo algorithms.
Casey, Fergal P; Waterfall, Joshua J; Gutenkunst, Ryan N; Myers, Christopher R; Sethna, James P
2008-10-01
We demonstrate the use of a variational method to determine a quantitative lower bound on the rate of convergence of Markov chain Monte Carlo (MCMC) algorithms as a function of the target density and proposal density. The bound relies on approximating the second largest eigenvalue in the spectrum of the MCMC operator using a variational principle and the approach is applicable to problems with continuous state spaces. We apply the method to one dimensional examples with Gaussian and quartic target densities, and we contrast the performance of the random walk Metropolis-Hastings algorithm with a "smart" variant that incorporates gradient information into the trial moves, a generalization of the Metropolis adjusted Langevin algorithm. We find that the variational method agrees quite closely with numerical simulations. We also see that the smart MCMC algorithm often fails to converge geometrically in the tails of the target density except in the simplest case we examine, and even then care must be taken to choose the appropriate scaling of the deterministic and random parts of the proposed moves. Again, this calls into question the utility of smart MCMC in more complex problems. Finally, we apply the same method to approximate the rate of convergence in multidimensional Gaussian problems with and without importance sampling. There we demonstrate the necessity of importance sampling for target densities which depend on variables with a wide range of scales.
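To make the two samplers being contrasted concrete, here is a minimal sketch of random-walk Metropolis-Hastings next to a gradient-informed Metropolis adjusted Langevin (MALA) step on a one-dimensional standard Gaussian target. The abstract's "smart" variant is described as a generalization of MALA, so plain MALA stands in for it here. The step sizes and function names are illustrative assumptions, and the sketch does not implement the variational bound itself.

```python
import math
import random


def log_target(x):
    """Log density of a standard Gaussian target, up to an additive constant."""
    return -0.5 * x * x


def grad_log_target(x):
    """Gradient of log_target."""
    return -x


def random_walk_mh(n_steps, step, x0=0.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, accepted, samples = x0, 0, []
    for _ in range(n_steps):
        y = x + step * rng.gauss(0.0, 1.0)
        # Symmetric proposal: only the target ratio enters the acceptance test.
        if math.log(1.0 - rng.random()) < log_target(y) - log_target(x):
            x, accepted = y, accepted + 1
        samples.append(x)
    return samples, accepted / n_steps


def mala(n_steps, step, x0=0.0, seed=0):
    """Metropolis adjusted Langevin algorithm: gradient drift plus random noise."""
    rng = random.Random(seed)

    def drift(z):
        # Deterministic part of the move: half a squared step along the gradient.
        return z + 0.5 * step * step * grad_log_target(z)

    def log_q(to, frm):
        # Log proposal density of moving frm -> to, up to a constant.
        return -((to - drift(frm)) ** 2) / (2.0 * step * step)

    x, accepted, samples = x0, 0, []
    for _ in range(n_steps):
        y = drift(x) + step * rng.gauss(0.0, 1.0)
        # Asymmetric proposal: the Hastings correction includes both directions.
        log_alpha = (log_target(y) + log_q(x, y)) - (log_target(x) + log_q(y, x))
        if math.log(1.0 - rng.random()) < log_alpha:
            x, accepted = y, accepted + 1
        samples.append(x)
    return samples, accepted / n_steps
```

The `step` argument scales both the deterministic drift and the random part of the MALA move. The abstract notes that this scaling has to be chosen with care even in the simplest case.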
NASA Astrophysics Data System (ADS)
Xu, Feng; Davis, Anthony B.; Diner, David J.
2016-11-01
A Markov chain formalism is developed for computing the transport of polarized radiation according to Generalized Radiative Transfer (GRT) theory, which was developed recently to account for unresolved random fluctuations of scattering particle density and can also be applied to unresolved spectral variability of gaseous absorption as an improvement over the standard correlated-k method. Using Gamma distribution to describe the probability density function of the extinction or absorption coefficient, a shape parameter a that quantifies the variability is introduced, defined as the mean extinction or absorption coefficient squared divided by its variance. It controls the decay rate of a power-law transmission that replaces the usual exponential Beer-Lambert-Bouguer law. Exponential transmission, hence classic RT, is recovered when a→∞. The new approach is verified to high accuracy against numerical benchmark results obtained with a custom Monte Carlo method. For a<∞, angular reciprocity is violated to a degree that increases with the spatial variability, as observed for finite portions of real-world cloudy scenes. While the degree of linear polarization in liquid water cloudbows, supernumerary bows, and glories is affected by spatial heterogeneity, the positions in scattering angle of these features are relatively unchanged. As a result, a single-scattering model based on the assumption of subpixel homogeneity can still be used to derive droplet size distributions from polarimetric measurements of extended stratocumulus clouds.
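As a worked illustration of the power-law transmission mentioned above, consider direct transmission along a fixed path of length $s$, with the extinction coefficient $\sigma$ drawn from a Gamma density $p_a(\sigma)$ of mean $\bar{\sigma}$ and shape $a$. The symbols $\sigma$, $\bar{\sigma}$, $s$ and $\tau$ are notation assumed here, not taken from the abstract. Averaging the exponential law over $p_a$ gives the Gamma moment generating function evaluated at $-s$:

```latex
\[
  T_a(\tau) \;=\; \int_0^\infty e^{-\sigma s}\, p_a(\sigma)\, d\sigma
  \;=\; \left(1 + \frac{\tau}{a}\right)^{-a},
  \qquad \tau = \bar{\sigma}\, s, \quad a = \frac{\bar{\sigma}^{2}}{\operatorname{Var}(\sigma)},
\]
\[
  \lim_{a \to \infty} T_a(\tau) \;=\; e^{-\tau}.
\]
```

For finite $a$ the transmission decays as a power law in $\tau$. The limit $a \to \infty$ (vanishing variance) recovers the exponential Beer-Lambert-Bouguer law, matching the statement that classic RT is recovered in that limit.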
Smart pilot points using reversible-jump Markov-chain Monte Carlo
NASA Astrophysics Data System (ADS)
Jiménez, S.; Mariethoz, G.; Brauchler, R.; Bayer, P.
2016-05-01
Pilot points are typical means for calibration of highly parameterized numerical models. We propose a novel procedure based on estimating not only the pilot point values, but also their number and suitable locations. This is accomplished by a trans-dimensional Bayesian inversion procedure that also allows for uncertainty quantification. The utilized algorithm, reversible-jump Markov-Chain Monte Carlo (RJ-MCMC), is computationally demanding and this challenges the application for model calibration. We present a solution for fast, approximate simulation through the application of a Bayesian inversion. A fast pathfinding algorithm is used to estimate tracer travel times instead of doing a full transport simulation. This approach extracts the information from measured breakthrough curves, which is crucial for the reconstruction of aquifer heterogeneity. As a result, the "smart pilot points" can be tuned during thousands of rapid model evaluations. This is demonstrated for both a synthetic and a field application. For the selected synthetic layered aquifer, two different hydrofacies are reconstructed. For the field investigation, multiple fluorescent tracers were injected in different well screens in a shallow alluvial aquifer and monitored in a tomographic source-receiver configuration. With the new inversion procedure, a sand layer was identified and reconstructed with a high spatial resolution in 3-D. The sand layer was successfully validated through additional slug tests at the site. The promising results encourage further applications in hydrogeological model calibration, especially for cases with simulation of transport.
Mathematical modeling, analysis and Markov Chain Monte Carlo simulation of Ebola epidemics
NASA Astrophysics Data System (ADS)
Tulu, Thomas Wetere; Tian, Boping; Wu, Zunyou
Ebola virus infection is a severe infectious disease with the highest case fatality rate, and it has become a global public health threat. What makes the disease the worst of all is that no specific effective treatment is available and its dynamics have not been much researched or understood. In this article, a new mathematical model incorporating both vaccination and quarantine to study the dynamics of the Ebola epidemic has been developed and comprehensively analyzed. The existence and uniqueness of the solution to the model are verified and the basic reproduction number is calculated. In addition, stability conditions are checked, and simulation is done using both the Euler method and the Markov Chain Monte Carlo (MCMC) method, one of the top ten most influential algorithms. Different rates of vaccination, to predict the effect of vaccination on infected individuals over time, and of quarantine are discussed. The results show that quarantine and vaccination are very effective ways to control an Ebola epidemic. Our study also indicates that an individual who survived a first infection is less likely to contract Ebola virus a second time. Last but not least, real data have been fitted to the model, showing that it can be used to predict the dynamics of an Ebola epidemic.
Geometrically Constructed Markov Chain Monte Carlo Study of Quantum Spin-phonon Complex Systems
NASA Astrophysics Data System (ADS)
Suwa, Hidemaro
2013-03-01
We have developed novel Monte Carlo methods for precisely calculating quantum spin-boson models and investigated the critical phenomena of the spin-Peierls systems. Three significant methods are presented. The first is a new optimization algorithm of the Markov chain transition kernel based on the geometric weight allocation. This algorithm, for the first time, satisfies the total balance generally without imposing the detailed balance and always minimizes the average rejection rate, being better than the Metropolis algorithm. The second is the extension of the worm (directed-loop) algorithm to non-conserved particles, which cannot be treated efficiently by the conventional methods. The third is the combination with the level spectroscopy. Proposing a new gap estimator, we are successful in eliminating the systematic error of the conventional moment method. Then we have elucidated the phase diagram and the universality class of the one-dimensional XXZ spin-Peierls system. The criticality is totally consistent with the J1 -J2 model, an effective model in the antiadiabatic limit. Through this research, we have succeeded in investigating the critical phenomena of the effectively frustrated quantum spin system by the quantum Monte Carlo method without the negative sign. JSPS Postdoctoral Fellow for Research Abroad
Improving Hydrologic Data Assimilation by a Multivariate Particle Filter-Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Yan, H.; DeChant, C. M.; Moradkhani, H.
2014-12-01
Data assimilation (DA) is a popular method for merging information from multiple sources (i.e. models and remote sensing), leading to improved hydrologic prediction. With the increasing availability of satellite observations (such as soil moisture) in recent years, DA is emerging in operational forecast systems. Although these techniques have seen widespread application, developmental research has continued to further refine their effectiveness. This presentation will examine potential improvements to the Particle Filter (PF) through the inclusion of multivariate correlation structures. Applications of the PF typically rely on univariate DA schemes (such as assimilating the observed discharge at the outlet), and multivariate schemes generally ignore the spatial correlation of the observations. In this study, a multivariate DA scheme is proposed by introducing geostatistics into the newly developed particle filter with Markov chain Monte Carlo (PF-MCMC) method. This new method is assessed by a case study over one of the basins with natural hydrologic processes in the Model Parameter Estimation Experiment (MOPEX), located in Arizona. The multivariate PF-MCMC method is used to assimilate the Advanced Scatterometer (ASCAT) gridded (12.5 km) soil moisture retrievals and the observed streamflow at five gages (four inlet and one outlet) into the Sacramento Soil Moisture Accounting (SAC-SMA) model at the same scale (12.5 km), leading to greater skill in hydrologic predictions.
Yoo, Chulsang; Lee, Jinwook; Ro, Yonghun
2016-01-01
This paper evaluates the effect of climate change on daily rainfall, especially on the mean number of wet days and the mean rainfall intensity. Assuming that the mechanism of daily rainfall occurrences follows the first-order Markov chain model, the possible changes in the transition probabilities are estimated by considering the climate change scenarios. Also, the change of the stationary probabilities of wet and dry day occurrences and finally the change in the number of wet days are derived for the comparison of current (1x CO2) and 2x CO2 conditions. As a result of this study, the increase or decrease in the mean number of wet days was found to be not enough to explain all of the change in monthly rainfall amounts, so rainfall intensity should also be modified. The application to the Seoul weather station in Korea shows that about 30% of the total change in monthly rainfall amount can be explained by the change in the number of wet days and the remaining 70% by the change in the rainfall intensity. That is, as an effect of climate change, the increase in the rainfall intensity could be more significant than the increase in the wet days and, thus, the risk of flood would increase substantially.
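The step from transition probabilities to the mean number of wet days can be sketched for a two-state (dry/wet) first-order Markov chain. The closed form below is the standard stationary distribution of such a chain. The function names and the example probabilities are hypothetical and are not values from the study.

```python
def stationary_wet_probability(p_dry_to_wet, p_wet_to_wet):
    """Long-run probability of a wet day for a two-state first-order chain.

    p_dry_to_wet = P(wet today | dry yesterday)
    p_wet_to_wet = P(wet today | wet yesterday)
    """
    return p_dry_to_wet / (1.0 + p_dry_to_wet - p_wet_to_wet)


def mean_wet_days(p_dry_to_wet, p_wet_to_wet, days_in_month=30):
    """Expected number of wet days in a month under the stationary distribution."""
    return days_in_month * stationary_wet_probability(p_dry_to_wet, p_wet_to_wet)


def wet_probability_after(n_days, p_dry_to_wet, p_wet_to_wet, start_wet=0.0):
    """Iterate the chain day by day; the result approaches the stationary value."""
    p_wet = start_wet
    for _ in range(n_days):
        p_wet = p_wet * p_wet_to_wet + (1.0 - p_wet) * p_dry_to_wet
    return p_wet
```

With illustrative values `p_dry_to_wet = 0.2` and `p_wet_to_wet = 0.6`, the stationary wet-day probability is 0.2 / 0.6 = 1/3, or 10 wet days in a 30-day month. Re-evaluating the same functions with transition probabilities estimated for a changed climate gives the change in wet days. The remaining change in monthly rainfall is then attributed to intensity.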
Markov chain Monte Carlo analysis to constrain dark matter properties with directional detection
Billard, J.; Mayet, F.; Santos, D.
2011-04-01
Directional detection is a promising dark matter search strategy. Indeed, weakly interacting massive particle (WIMP)-induced recoils would present a direction dependence toward the Cygnus constellation, while background-induced recoils exhibit an isotropic distribution in the Galactic rest frame. Taking advantage of these characteristic features, and even in the presence of a sizeable background, it has recently been shown that data from forthcoming directional detectors could lead either to a competitive exclusion or to a conclusive discovery, depending on the value of the WIMP-nucleon cross section. However, it is possible to further exploit these upcoming data by using the strong dependence of the WIMP signal on the WIMP mass and the local WIMP velocity distribution. Using a Markov chain Monte Carlo analysis of recoil events, we show for the first time the possibility to constrain the unknown WIMP parameters, both from particle physics (mass and cross section) and from the Galactic halo (velocity dispersion along the three axes), leading to an identification of non-baryonic dark matter.
Mapping systematic errors in helium abundance determinations using Markov Chain Monte Carlo
Aver, Erik; Olive, Keith A.; Skillman, Evan D.
2011-03-01
Monte Carlo techniques have been used to evaluate the statistical and systematic uncertainties in the helium abundances derived from extragalactic H II regions. The helium abundance is sensitive to several physical parameters associated with the H II region. In this work, we introduce Markov Chain Monte Carlo (MCMC) methods to efficiently explore the parameter space and determine the helium abundance, the physical parameters, and the uncertainties derived from observations of metal poor nebulae. Experiments with synthetic data show that the MCMC method is superior to previous implementations (based on flux perturbation) in that it is not affected by biases due to non-physical parameter space. The MCMC analysis allows a detailed exploration of degeneracies, and, in particular, a false minimum that occurs at large values of optical depth in the He I emission lines. We demonstrate that introducing the electron temperature derived from the [O III] emission lines as a prior, in a very conservative manner, produces negligible bias and effectively eliminates the false minima occurring at large optical depth. We perform a frequentist analysis on data from several "high quality" systems. Likelihood plots illustrate degeneracies, asymmetries, and limits of the determination. In agreement with previous work, we find relatively large systematic errors, limiting the precision of the primordial helium abundance for currently available spectra.
A Markov chain model for image ranking system in social networks
NASA Astrophysics Data System (ADS)
Zin, Thi Thi; Tin, Pyke; Toriu, Takashi; Hama, Hiromitsu
2014-03-01
In today's world, different kinds of networks exist, such as social, technological, and business networks. All of these networks are similar in their distributions and are continuously growing and expanding on a large scale. Among them, social networks such as Facebook, Twitter, Flickr and many others provide a powerful abstraction of the structure and dynamics of diverse kinds of interpersonal connection and interaction. Generally, social network content is created and consumed under the influence of all the different social navigation paths that lead to it. Therefore, identifying important and user-relevant refined structures such as visual information or communities has become a major factor in modern decision making. Moreover, traditional information ranking systems cannot be successful because they do not take into account the properties of navigation paths driven by social connections. In this paper, we propose a novel image ranking system for social networks that uses the social data relational graphs from a social media platform jointly with visual data to improve the relevance between returned images and user intentions (i.e., social relevance). Specifically, we propose a Markov chain based Social-Visual Ranking algorithm that takes social relevance into account. Through extensive experiments, we demonstrate the significance and effectiveness of the proposed social-visual ranking method.
Hössjer, Ola; Tyvand, Peder A; Miloh, Touvia
2016-02-01
The classical Kimura solution of the diffusion equation is investigated for a haploid random mating (Wright-Fisher) model, with one-way mutations and initial-value specified by the founder population. The validity of the transient diffusion solution is checked by exact Markov chain computations, using a Jordan decomposition of the transition matrix. The conclusion is that the one-way diffusion model mostly works well, although the rate of convergence depends on the initial allele frequency and the mutation rate. The diffusion approximation is poor for mutation rates so low that the non-fixation boundary is regular. When this happens we perturb the diffusion solution around the non-fixation boundary and obtain a more accurate approximation that takes quasi-fixation of the mutant allele into account. The main application is to quantify how fast a specific genetic variant of the infinite alleles model is lost. We also discuss extensions of the quasi-fixation approach to other models with small mutation rates.
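The exact Markov chain computation that the diffusion solution is checked against can be sketched by building the haploid Wright-Fisher transition matrix and propagating the founder distribution. This sketch uses plain repeated vector-matrix multiplication; it is not the Jordan decomposition used in the paper. The direction of the one-way mutation (wild type to mutant at rate `u`) and the function names are assumptions made for illustration.

```python
import math


def wright_fisher_matrix(n, u):
    """Transition matrix P[i][j]: i -> j mutant copies in a haploid population of size n.

    Assumed mutation direction: each wild-type copy becomes mutant with probability u,
    and mutants never revert (one-way mutation).
    """
    matrix = []
    for i in range(n + 1):
        x = i / n
        p = x + u * (1.0 - x)  # mutant frequency after mutation, before sampling
        # Random mating: the next generation is a Binomial(n, p) sample.
        matrix.append([math.comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(n + 1)])
    return matrix


def evolve(dist, matrix, generations):
    """Propagate a distribution over mutant counts forward in time."""
    size = len(dist)
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]
    return dist
```

Under this assumption the all-mutant state is absorbing, while the zero-mutant state can still be left whenever `u > 0`. A founder population with `k` mutant copies corresponds to an initial distribution with all mass on state `k`.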
Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences.
Schbath, S; Prum, B; de Turckheim, E
1995-01-01
Identifying exceptional motifs is often used for extracting information from long DNA sequences. The two difficulties of the method are the choice of the model that defines the expected frequencies of words and the approximation of the variance of the difference T(W) between the number of occurrences of a word W and its estimation. We consider here different Markov chain models, either with stationary or periodic transition probabilities. We estimate the variance of the difference T(W) by the conditional variance of the number of occurrences of W given the oligonucleotides counts that define the model. Two applications show how to use asymptotically standard normal statistics associated with the counts to describe a given sequence in terms of its outlying words. Sequences of Escherichia coli and of Bacillus subtilis are compared with respect to their exceptional tri- and tetranucleotides. For both bacteria, exceptional 3-words are mainly found in the coding frame. E. coli palindrome counts are analyzed in different models, showing that many overabundant words are one-letter mutations of avoided palindromes.
Markov-chain approach to the distribution of ancestors in species of biparental reproduction
NASA Astrophysics Data System (ADS)
Caruso, M.; Jarne, C.
2014-08-01
We studied how to obtain a distribution for the number of ancestors in species of sexual reproduction. Present models concentrate on the estimation of distributions of repetitions of ancestors in genealogical trees. It has been shown that it is not possible to reconstruct the genealogical history of each species along all its generations by means of a geometric progression. This analysis demonstrates that it is possible to rebuild the tree of progenitors by modeling the problem with a Markov chain. For each generation, the maximum number of possible ancestors is different. This presents huge problems for the resolution. We found a solution through a dilation of the sample space, although the distribution defined there takes smaller values with respect to the initial problem. In order to correct the distribution for each generation, we introduced the invariance under a gauge (local) group of dilations. These ideas can be used to study the interaction of several processes and provide a new approach to the problem of the common ancestor. In the same direction, this model also provides some elements that can be used to improve models of animal reproduction.
Two-state Markov-chain Poisson nature of individual cellphone call statistics
NASA Astrophysics Data System (ADS)
Jiang, Zhi-Qiang; Xie, Wen-Jie; Li, Ming-Xia; Zhou, Wei-Xing; Sornette, Didier
2016-07-01
Unfolding the burst patterns in human activities and social interactions is a very important issue, especially for understanding the spreading of disease and information and the formation of groups and organizations. Here, we conduct an in-depth study of the temporal patterns of cellphone conversation activities of 73 339 anonymous cellphone users, whose inter-call durations are Weibull distributed. We find that the individual call events exhibit a pattern of bursts, in which high-activity periods alternate with low-activity periods. In both periods, the number of calls is exponentially distributed for individuals, but power-law distributed for the population. Together with the exponential distributions of inter-call durations within bursts and of the intervals between consecutive bursts, we demonstrate that the individual call activities are driven by two independent Poisson processes, which can be combined within a minimal model in terms of a two-state first-order Markov chain, giving significant fits for nearly half of the individuals. By measuring directly the distributions of call rates across the population, which exhibit power-law tails, we purport the existence of power-law distributions, via the ‘superposition of distributions’ mechanism. Our findings shed light on the origins of bursty patterns in other human activities.
NASA Astrophysics Data System (ADS)
Shargel, Benjamin Hertz; Chou, Tom
2009-10-01
Asymptotic fluctuation theorems are statements of a Gallavotti-Cohen symmetry in the rate function of either the time-averaged entropy production or heat dissipation of a process. Such theorems have been proved for various general classes of continuous-time deterministic and stochastic processes, but always under the assumption that the forces driving the system are time independent, and often relying on the existence of a limiting ergodic distribution. In this paper we extend the asymptotic fluctuation theorem for the first time to inhomogeneous continuous-time processes without a stationary distribution, considering specifically a finite state Markov chain driven by periodic transition rates. We find that for both entropy production and heat dissipation, the usual Gallavotti-Cohen symmetry of the rate function is generalized to an analogous relation between the rate functions of the original process and its corresponding backward process, in which the trajectory and the driving protocol have been time-reversed. The effect is that spontaneous positive fluctuations in the long time average of each quantity in the forward process are exponentially more likely than spontaneous negative fluctuations in the backward process, and vice-versa, revealing that the distributions of fluctuations in universes in which time moves forward and backward are related. As an additional result, the asymptotic time-averaged entropy production is obtained as the integral of a periodic entropy production rate that generalizes the constant rate pertaining to homogeneous dynamics.
Yang, J; Wongsa, S; Kadirkamanathan, V; Billings, S A; Wright, P C
2005-12-01
Metabolic flux analysis using 13C-tracer experiments is an important tool in metabolic engineering since intracellular fluxes are non-measurable quantities in vivo. Current metabolic flux analysis approaches are fully based on stoichiometric constraints and carbon atom balances, where the over-determined system is iteratively solved by a parameter estimation approach. However, the unavoidable measurement noises involved in the fractional enrichment data obtained by 13C-enrichment experiment and the possible existence of unknown pathways prevent a simple parameter estimation method for intracellular flux quantification. The MCMC (Markov chain-Monte Carlo) method, which obtains intracellular flux distributions through delicately constructed Markov chains, is shown to be an effective approach for deep understanding of the intracellular metabolic network. Its application is illustrated through the simulation of an example metabolic network.
NASA Astrophysics Data System (ADS)
Yoo, Jiyoung; Kwon, Hyun-Han; So, Byung-Jin; Rajagopalan, Balaji; Kim, Tae-Woong
2015-04-01
This study proposed a hidden Markov chain model-based drought analysis (HMM-DA) tool to understand the beginning and ending of meteorological drought and to further characterize typhoon-induced drought busters (TDB) by exploring spatiotemporal drought patterns in South Korea. It was found that typhoons have played a dominant role in ending drought events (EDE) during the typhoon season (July-September) over the last four decades (1974-2013). The percentage of EDEs terminated by TDBs was about 43-90% mainly along coastal regions in South Korea. Furthermore, the TDBs, mainly during summer, have a positive role in managing extreme droughts during the subsequent autumn and spring seasons. The HMM-DA models the temporal dependencies between drought states using Markov chain, consequently capturing the dependencies between droughts and typhoons well, thus, enabling a better performance in modeling spatiotemporal drought attributes compared to traditional methods.
NASA Astrophysics Data System (ADS)
Esquível, Manuel L.; Fernandes, José Moniz; Guerreiro, Gracinda R.
2016-06-01
We introduce a schematic formalism for the time evolution of a random population entering some set of classes and such that each member of the population evolves among these classes according to a scheme based on a Markov chain model. We consider that the flow of incoming members is modeled by a time series and we detail the time series structure of the elements in each of the classes. We present a practical application to data from a credit portfolio of a Cape Verdian bank; after modeling the entering population in two different ways - namely as an ARIMA process and as a deterministic sigmoid type trend plus a SARMA process for the residues - we simulate the behavior of the population and compare the results. We get that the second method is more accurate in describing the behavior of the populations when compared to the observed values in a direct simulation of the Markov chain.
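The bookkeeping behind such a population model can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' formalism: the function name `expected_class_counts`, the entry distribution `entry_dist`, and the convention that existing members move first and new members are added afterwards are all hypothetical choices. It tracks expected class counts when each member moves according to a transition matrix `P` and a given series of arrivals feeds the classes.

```python
def expected_class_counts(P, entry_dist, inflow, initial=None):
    """Expected number of members in each class over time.

    P          -- row-stochastic transition matrix (list of lists)
    entry_dist -- assumed distribution of new members over classes
    inflow     -- number of new members arriving at each time step
    initial    -- optional starting counts (defaults to an empty population)
    """
    n_classes = len(P)
    counts = list(initial) if initial is not None else [0.0] * n_classes
    history = []
    for arrivals in inflow:
        # Assumed convention: existing members make one Markov transition ...
        moved = [
            sum(counts[i] * P[i][j] for i in range(n_classes))
            for j in range(n_classes)
        ]
        # ... and then the new arrivals are added to their entry classes.
        counts = [moved[j] + arrivals * entry_dist[j] for j in range(n_classes)]
        history.append(counts)
    return history


if __name__ == "__main__":
    # Two classes; class 1 is absorbing, and all entrants start in class 0.
    P = [[0.5, 0.5], [0.0, 1.0]]
    for step, counts in enumerate(expected_class_counts(P, [1.0, 0.0], [10, 10, 10]), 1):
        print(step, counts)
```

In practice the `inflow` list would be replaced by simulated values from whatever time series model is fitted to the entering population.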
Xu, Feng; Davis, Anthony B; West, Robert A; Esposito, Larry W
2011-01-17
Building on the Markov chain formalism for scalar (intensity only) radiative transfer, this paper formulates the solution to polarized diffuse reflection from and transmission through a vertically inhomogeneous atmosphere. For verification, numerical results are compared to those obtained by the Monte Carlo method, showing deviations less than 1% when 90 streams are used to compute the radiation from two types of atmospheres, pure Rayleigh and Rayleigh plus aerosol, when they are divided into sublayers of optical thicknesses of less than 0.03.
Lin, Yen-Jen; Chen, Yu-Tin; Hsu, Shu-Ni; Peng, Chien-Hua; Tang, Chuan-Yi; Yen, Tzu-Chen; Hsieh, Wen-Ping
2014-01-01
Copy number variation (CNV) has been reported to be associated with disease and various cancers. Hence, identifying the accurate position and the type of CNV is currently a critical issue. There are many tools for detecting CNV regions, constructing haplotype phases on CNV regions, or estimating the numerical copy numbers. However, none of them can do all three tasks at the same time. This paper presents a method based on a Hidden Markov Model to detect parent-specific copy number change on both chromosomes with signals from SNP arrays. A haplotype tree is constructed with dynamic branch merging to model the transition of the copy number status of the two alleles assessed at each SNP locus. The emission models are constructed for the genotypes formed with the two haplotypes. The proposed method can provide the segmentation points of the CNV regions as well as the haplotype phasing for the allelic status on each chromosome. The estimated copy numbers are provided as fractional numbers, which can accommodate the somatic mutation in cancer specimens that usually consist of heterogeneous cell populations. The algorithm is evaluated on simulated data and the previously published regions of CNV of the 270 HapMap individuals. The results were compared with five popular methods: PennCNV, genoCN, COKGEN, QuantiSNP and cnvHap. The application on oral cancer samples demonstrates how the proposed method can facilitate clinical association studies. The proposed algorithm exhibits comparable sensitivity of the CNV regions to the best algorithm in our genome-wide study and demonstrates the highest detection rate in SNP dense regions. In addition, we provide better haplotype phasing accuracy than similar approaches. The clinical association carried out with our fractional estimate of copy numbers in the cancer samples provides better detection power than that with integer copy number states.
Markov Chain Model-Based Optimal Cluster Heads Selection for Wireless Sensor Networks
Ahmed, Gulnaz; Zou, Jianhua; Zhao, Xi; Sadiq Fareed, Mian Muhammad
2017-01-01
The longer network lifetime of Wireless Sensor Networks (WSNs) is a goal which is directly related to energy consumption. This energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area. The hierarchical clustering architecture is the best choice for this kind of issue. In this paper, we introduce a novel clustering protocol called Markov chain model-based optimal cluster heads (MOCHs) selection for WSNs. In our proposed model, we introduce a simple strategy for the optimal number of cluster heads selection to overcome the problem of uneven energy distribution in the network. The attractiveness of our model is that the BS controls the number of cluster heads while the cluster heads control the cluster members in each cluster in such a restricted manner that a uniform and even load is ensured in each cluster. We perform an extensive range of simulations using five quality measures, namely: the lifetime of the network, stable and unstable region in the lifetime of the network, throughput of the network, the number of cluster heads in the network, and the transmission time of the network to analyze the proposed model. We compare MOCHs against Sleep-awake Energy Efficient Distributed (SEED) clustering, Artificial Bee Colony (ABC), Zone Based Routing (ZBR), and Centralized Energy Efficient Clustering (CEEC) using the above-discussed quality metrics and found that the lifetime of the proposed model is almost 1095, 2630, 3599, and 2045 rounds (time steps) greater than SEED, ABC, ZBR, and CEEC, respectively. The obtained results demonstrate that MOCHs is better than SEED, ABC, ZBR, and CEEC in terms of energy efficiency and the network throughput. PMID:28241492
Dynamical Models for NGC 6503 Using a Markov Chain Monte Carlo Technique
NASA Astrophysics Data System (ADS)
Puglielli, David; Widrow, Lawrence M.; Courteau, Stéphane
2010-06-01
We use Bayesian statistics and Markov chain Monte Carlo (MCMC) techniques to construct dynamical models for the spiral galaxy NGC 6503. The constraints include surface brightness (SB) profiles which display a Freeman Type II structure; H I and ionized gas rotation curves; the stellar rotation, which is nearly coincident with the ionized gas curve; and the line of sight stellar dispersion, which displays a σ-drop at the center. The galaxy models consist of a Sérsic bulge, an exponential disk with an optional inner truncation and a cosmologically motivated dark halo. The Bayesian/MCMC technique yields the joint posterior probability distribution function for the input parameters, allowing constraints on model parameters such as the halo cusp strength, structural parameters for the disk and bulge, and mass-to-light ratios. We examine several interpretations of the data: the Type II SB profile may be due to dust extinction, to an inner truncated disk, or to a ring of bright stars, and we test separate fits to the gas and stellar rotation curves to determine if the gas traces the gravitational potential. We test each of these scenarios for bar stability, ruling out dust extinction. We also find that the gas likely does not trace the gravitational potential, since the predicted stellar rotation curve, which includes asymmetric drift, is then inconsistent with the observed stellar rotation curve. The disk is well fit by an inner-truncated profile, but the possibility of ring formation by a bar to reproduce the Type II profile is also a realistic model. We further find that the halo must have a cuspy profile with γ ≳ 1; the bulge has a lower M/L than the disk, suggesting a star-forming component in the center of the galaxy; and the bulge, as expected for this late-type galaxy, has a low Sérsic index with n_b ~ 1-2, suggesting a formation history dominated by secular evolution.
Discovering beaten paths in collaborative ontology-engineering projects using Markov chains.
Walk, Simon; Singer, Philipp; Strohmaier, Markus; Tudorache, Tania; Musen, Mark A; Noy, Natalya F
2014-10-01
Biomedical taxonomies, thesauri and ontologies in the form of the International Classification of Diseases as a taxonomy or the National Cancer Institute Thesaurus as an OWL-based ontology, play a critical role in acquiring, representing and processing information about human health. With increasing adoption and relevance, biomedical ontologies have also significantly increased in size. For example, the 11th revision of the International Classification of Diseases, which is currently under active development by the World Health Organization contains nearly 50,000 classes representing a vast variety of different diseases and causes of death. This evolution in terms of size was accompanied by an evolution in the way ontologies are engineered. Because no single individual has the expertise to develop such large-scale ontologies, ontology-engineering projects have evolved from small-scale efforts involving just a few domain experts to large-scale projects that require effective collaboration between dozens or even hundreds of experts, practitioners and other stakeholders. Understanding the way these different stakeholders collaborate will enable us to improve editing environments that support such collaborations. In this paper, we uncover how large ontology-engineering projects, such as the International Classification of Diseases in its 11th revision, unfold by analyzing usage logs of five different biomedical ontology-engineering projects of varying sizes and scopes using Markov chains. We discover intriguing interaction patterns (e.g., which properties users frequently change after specific given ones) that suggest that large collaborative ontology-engineering projects are governed by a few general principles that determine and drive development. From our analysis, we identify commonalities and differences between different projects that have implications for project managers, ontology editors, developers and contributors working on collaborative ontology
BENCHMARK TESTS FOR MARKOV CHAIN MONTE CARLO FITTING OF EXOPLANET ECLIPSE OBSERVATIONS
Rogers, Justin; Lopez-Morales, Mercedes; Apai, Daniel; Adams, Elisabeth
2013-04-10
Ground-based observations of exoplanet eclipses provide important clues to the planets' atmospheric physics, yet systematics in light curve analyses are not fully understood. It is unknown if measurements suggesting near-infrared flux densities brighter than models predict are real, or artifacts of the analysis processes. We created a large suite of model light curves, using both synthetic and real noise, and tested the common process of light curve modeling and parameter optimization with a Markov Chain Monte Carlo algorithm. With synthetic white noise models, we find that input eclipse signals are generally recovered within 10% accuracy for eclipse depths greater than the noise amplitude, and to smaller depths for higher sampling rates and longer baselines. Red noise models see greater discrepancies between input and measured eclipse signals, often biased in one direction. Finally, we find that in real data, systematic biases result even with a complex model to account for trends, and significant false eclipse signals may appear in a non-Gaussian distribution. To quantify the bias and validate an eclipse measurement, we compare both the planet-hosting star and several of its neighbors to a separately chosen control sample of field stars. Re-examining the Rogers et al. Ks-band measurement of CoRoT-1b finds an eclipse $3190^{+370}_{-440}$ ppm deep centered at $\phi_{me} = 0.50418^{+0.00197}_{-0.00203}$. Finally, we provide and recommend the use of selected data sets we generated as a benchmark test for eclipse modeling and analysis routines, and propose criteria to verify eclipse detections.
Discovering Beaten Paths in Collaborative Ontology-Engineering Projects using Markov Chains
Walk, Simon; Singer, Philipp; Strohmaier, Markus; Tudorache, Tania; Musen, Mark A.; Noy, Natalya F.
2014-01-01
Biomedical taxonomies, thesauri and ontologies in the form of the International Classification of Diseases as a taxonomy or the National Cancer Institute Thesaurus as an OWL-based ontology, play a critical role in acquiring, representing and processing information about human health. With increasing adoption and relevance, biomedical ontologies have also significantly increased in size. For example, the 11th revision of the International Classification of Diseases, which is currently under active development by the World Health Organization contains nearly 50,000 classes representing a vast variety of different diseases and causes of death. This evolution in terms of size was accompanied by an evolution in the way ontologies are engineered. Because no single individual has the expertise to develop such large-scale ontologies, ontology-engineering projects have evolved from small-scale efforts involving just a few domain experts to large-scale projects that require effective collaboration between dozens or even hundreds of experts, practitioners and other stakeholders. Understanding the way these different stakeholders collaborate will enable us to improve editing environments that support such collaborations. In this paper, we uncover how large ontology-engineering projects, such as the International Classification of Diseases in its 11th revision, unfold by analyzing usage logs of five different biomedical ontology-engineering projects of varying sizes and scopes using Markov chains. We discover intriguing interaction patterns (e.g., which properties users frequently change after specific given ones) that suggest that large collaborative ontology-engineering projects are governed by a few general principles that determine and drive development. From our analysis, we identify commonalities and differences between different projects that have implications for project managers, ontology editors, developers and contributors working on collaborative ontology
A Markov Chain Monte Carlo Approach to Estimate AIDS after HIV Infection.
Apenteng, Ofosuhene O; Ismail, Noor Azina
2015-01-01
The spread of human immunodeficiency virus (HIV) infection and the resulting acquired immune deficiency syndrome (AIDS) is a major health concern in many parts of the world, and mathematical models are commonly applied to understand the spread of the HIV epidemic. To understand the spread of HIV and AIDS cases and their parameters in a given population, it is necessary to develop a theoretical framework that takes into account realistic factors. The current study used this framework to assess the interaction between individuals who developed AIDS after HIV infection and individuals who did not develop AIDS after HIV infection (pre-AIDS). We first investigated how probabilistic parameters affect the model in terms of the HIV and AIDS population over a period of time. We observed that there is a critical threshold parameter, R0, which determines the behavior of the model. If R0 ≤ 1, there is a unique disease-free equilibrium; if R0 < 1, the disease dies out; and if R0 > 1, the disease-free equilibrium is unstable. We also show how a Markov chain Monte Carlo (MCMC) approach could be used as a supplement to forecast the numbers of reported HIV and AIDS cases. An approach using a Monte Carlo analysis is illustrated to understand the impact of model-based predictions in light of uncertain parameters on the spread of HIV. Finally, to examine this framework and demonstrate how it works, a case study was performed of reported HIV and AIDS cases from an annual data set in Malaysia, and then we compared how these approaches complement each other. We conclude that HIV disease in Malaysia shows epidemic behavior, especially in the context of understanding and predicting emerging cases of HIV and AIDS.
NASA Astrophysics Data System (ADS)
Wirth, Erin A.; Long, Maureen D.; Moriarty, John C.
2016-10-01
Teleseismic receiver functions contain information regarding Earth structure beneath a seismic station. P-to-SV converted phases are often used to characterize crustal and upper mantle discontinuities and isotropic velocity structures. More recently, P-to-SH converted energy has been used to interrogate the orientation of anisotropy at depth, as well as the geometry of dipping interfaces. Many studies use a trial-and-error forward modeling approach to the interpretation of receiver functions, generating synthetic receiver functions from a user-defined input model of Earth structure and amending this model until it matches major features in the actual data. While often successful, such an approach makes it impossible to explore model space in a systematic and robust manner, which is especially important given that solutions are likely non-unique. Here, we present a Markov chain Monte Carlo algorithm with Gibbs sampling for the interpretation of anisotropic receiver functions. Synthetic examples are used to test the viability of the algorithm, suggesting that it works well for models with a reasonable number of free parameters (< ~20). Additionally, the synthetic tests illustrate that certain parameters are well constrained by receiver function data, while others are subject to severe tradeoffs - an important implication for studies that attempt to interpret Earth structure based on receiver function data. Finally, we apply our algorithm to receiver function data from station WCI in the central United States. We find evidence for a change in anisotropic structure at mid-lithospheric depths, consistent with previous work that used a grid search approach to model receiver function data at this station. Forward modeling of receiver functions using model space search algorithms, such as the one presented here, provides a meaningful framework for interrogating Earth structure from receiver function data.
Input estimation for drug discovery using optimal control and Markov chain Monte Carlo approaches.
Trägårdh, Magnus; Chappell, Michael J; Ahnmark, Andrea; Lindén, Daniel; Evans, Neil D; Gennemark, Peter
2016-04-01
Input estimation is employed in cases where it is desirable to recover the form of an input function which cannot be directly observed and for which there is no model for the generating process. In pharmacokinetic and pharmacodynamic modelling, input estimation in linear systems (deconvolution) is well established, while the nonlinear case is largely unexplored. In this paper, a rigorous definition of the input-estimation problem is given, and the choices involved in terms of modelling assumptions and estimation algorithms are discussed. In particular, the paper covers Maximum a Posteriori estimates using techniques from optimal control theory, and full Bayesian estimation using Markov Chain Monte Carlo (MCMC) approaches. These techniques are implemented using the optimisation software CasADi, and applied to two example problems: one where the oral absorption rate and bioavailability of the drug eflornithine are estimated using pharmacokinetic data from rats, and one where energy intake is estimated from body-mass measurements of mice exposed to monoclonal antibodies targeting the fibroblast growth factor receptor (FGFR) 1c. The results from the analysis are used to highlight the strengths and weaknesses of the methods used when applied to sparsely sampled data. The presented methods for optimal control are fast and robust, and can be recommended for use in drug discovery. The MCMC-based methods can have long running times and require more expertise from the user. The rigorous definition together with the illustrative examples and suggestions for software serve as a highly promising starting point for application of input-estimation methods to problems in drug discovery.
NASA Astrophysics Data System (ADS)
Wirth, Erin A.; Long, Maureen D.; Moriarty, John C.
2017-01-01
Teleseismic receiver functions contain information regarding Earth structure beneath a seismic station. P-to-SV converted phases are often used to characterize crustal and upper-mantle discontinuities and isotropic velocity structures. More recently, P-to-SH converted energy has been used to interrogate the orientation of anisotropy at depth, as well as the geometry of dipping interfaces. Many studies use a trial-and-error forward modeling approach for the interpretation of receiver functions, generating synthetic receiver functions from a user-defined input model of Earth structure and amending this model until it matches major features in the actual data. While often successful, such an approach makes it impossible to explore model space in a systematic and robust manner, which is especially important given that solutions are likely non-unique. Here, we present a Markov chain Monte Carlo algorithm with Gibbs sampling for the interpretation of anisotropic receiver functions. Synthetic examples are used to test the viability of the algorithm, suggesting that it works well for models with a reasonable number of free parameters (< ~20). Additionally, the synthetic tests illustrate that certain parameters are well constrained by receiver function data, while others are subject to severe trade-offs, an important implication for studies that attempt to interpret Earth structure based on receiver function data. Finally, we apply our algorithm to receiver function data from station WCI in the central United States. We find evidence for a change in anisotropic structure at mid-lithospheric depths, consistent with previous work that used a grid search approach to model receiver function data at this station. Forward modeling of receiver functions using model space search algorithms, such as the one presented here, provides a meaningful framework for interrogating Earth structure from receiver function data.
Markov Chain Model-Based Optimal Cluster Heads Selection for Wireless Sensor Networks.
Ahmed, Gulnaz; Zou, Jianhua; Zhao, Xi; Sadiq Fareed, Mian Muhammad
2017-02-23
The longer network lifetime of Wireless Sensor Networks (WSNs) is a goal which is directly related to energy consumption. This energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area. The hierarchical clustering architecture is the best choice for this kind of issue. In this paper, we introduce a novel clustering protocol called Markov chain model-based optimal cluster heads (MOCHs) selection for WSNs. In our proposed model, we introduce a simple strategy for the optimal number of cluster heads selection to overcome the problem of uneven energy distribution in the network. The attractiveness of our model is that the BS controls the number of cluster heads while the cluster heads control the cluster members in each cluster in such a restricted manner that a uniform and even load is ensured in each cluster. We perform an extensive range of simulations using five quality measures, namely: the lifetime of the network, stable and unstable region in the lifetime of the network, throughput of the network, the number of cluster heads in the network, and the transmission time of the network to analyze the proposed model. We compare MOCHs against Sleep-awake Energy Efficient Distributed (SEED) clustering, Artificial Bee Colony (ABC), Zone Based Routing (ZBR), and Centralized Energy Efficient Clustering (CEEC) using the above-discussed quality metrics and found that the lifetime of the proposed model is almost 1095, 2630, 3599, and 2045 rounds (time steps) greater than SEED, ABC, ZBR, and CEEC, respectively. The obtained results demonstrate that MOCHs is better than SEED, ABC, ZBR, and CEEC in terms of energy efficiency and the network throughput.
NASA Astrophysics Data System (ADS)
Lopes, Artur O.; Neumann, Adriana
2015-05-01
In the present paper, we consider a family of continuous time symmetric random walks indexed by , . For each the matching random walk takes values in the finite set of states ; notice that is a subset of , where is the unitary circle. The infinitesimal generator of such chain is denoted by . The stationary probability for such process converges to the uniform distribution on the circle, when . Here we want to study other natural measures, obtained via a limit on , that are concentrated on some points of . We will disturb this process by a potential and study for each the perturbed stationary measures of this new process when . We disturb the system considering a fixed potential and we will denote by the restriction of to . Then, we define a non-stochastic semigroup generated by the matrix , where is the infinitesimal generator of . From the continuous time Perron's Theorem one can normalize such semigroup, and then we get another stochastic semigroup which generates a continuous time Markov Chain taking values on . This new chain is called the continuous time Gibbs state associated to the potential , see (Lopes et al. in J Stat Phys 152:894-933, 2013). The stationary probability vector for such Markov Chain is denoted by . We assume that the maximum of is attained in a unique point of , and from this will follow that . Thus, here, our main goal is to analyze the large deviation principle for the family , when . The deviation function , which is defined on , will be obtained from a procedure based on fixed points of the Lax-Oleinik operator and Aubry-Mather theory. In order to obtain the associated Lax-Oleinik operator we use Varadhan's Lemma for the process . For a careful analysis of the problem we present full details of the proof of the Large Deviation Principle, in the Skorohod space, for such family of Markov Chains, when . Finally, we compute the entropy of the invariant probabilities on the Skorohod space associated to the Markov Chains we analyze.
Han, Chao; Chen, Jian; Wu, Qingyao; Mu, Shuai; Min, Huaqing
2015-10-01
Automated assignment of protein function has received considerable attention in recent years for genome-wide study. With the rapid accumulation of genome sequencing data produced by high-throughput experimental techniques, the process of manually predicting functional properties of proteins has become increasingly cumbersome. Such large genomics data sets can only be annotated computationally. However, automated assignment of functions to unknown protein is challenging due to its inherent difficulty and complexity. Previous studies have revealed that solving problems involving complicated objects with multiple semantic meanings using the multi-instance multi-label (MIML) framework is effective. For the protein function prediction problems, each protein object in nature may associate with distinct structural units (instances) and multiple functional properties (class labels) where each unit is described by an instance and each functional property is considered as a class label. Thus, it is convenient and natural to tackle the protein function prediction problem by using the MIML framework. In this paper, we propose a sparse Markov chain-based semi-supervised MIML method, called Sparse-Markov. A sparse transductive probability graph is constructed to encode the affinity information of the data based on ensemble of Hausdorff distance metrics. Our goal is to exploit the affinity between protein objects in the sparse transductive probability graph to seek a sparse steady state probability of the Markov chain model to do protein function prediction, such that two proteins are given similar functional labels if they are close to each other in terms of an ensemble Hausdorff distance in the graph. Experimental results on seven real-world organism data sets covering three biological domains show that our proposed Sparse-Markov method is able to achieve better performance than four state-of-the-art MIML learning algorithms.
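The notion of a steady-state probability of a Markov chain, on which the abstract above relies, can be made concrete with a small sketch. This is not the Sparse-Markov method itself (no sparsity constraint, no Hausdorff-distance graph); it is only plain power iteration on a hypothetical row-stochastic matrix `P`, with the function name and tolerances chosen for illustration.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P by repeated multiplication (power iteration).

    P is assumed to be a row-stochastic matrix (each row sums to 1) describing
    an irreducible, aperiodic chain, so that the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    # Toy two-state chain; solving pi = pi P by hand gives (5/6, 1/6).
    P = [[0.9, 0.1], [0.5, 0.5]]
    print(stationary_distribution(P))
```

In a graph-based setting, `P` would typically be derived from normalized affinities between objects, so that nearby objects receive similar steady-state mass.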
NASA Astrophysics Data System (ADS)
Ramirez, A. L.; Foxall, W.
2011-12-01
Surface displacements caused by reservoir pressure perturbations resulting from CO2 injection can often be measured by geodetic methods such as InSAR, tilt and GPS. We have developed a Markov Chain Monte Carlo (MCMC) approach to invert surface displacements measured by InSAR to map the pressure distribution associated with CO2 injection at the In Salah Krechba field, Algeria. The MCMC inversion entails sampling the solution space by proposing a series of trial 3D pressure-plume models. In the case of In Salah, the range of allowable models is constrained by prior information provided by well and geophysical data for the reservoir and possible fluid pathways in the overburden, and injection pressures and volumes. Each trial pressure distribution source is run through a (mathematical) forward model to calculate a set of synthetic surface deformation data. The likelihood that a particular proposal represents the true source is determined from the fit of the calculated data to the InSAR measurements, and those having higher likelihoods are passed to the posterior distribution. This procedure is repeated over typically ~10^4 to 10^5 trials until the posterior distribution converges to a stable solution. The solution to each stochastic inversion is in the form of a Bayesian posterior probability density function (pdf) over the range of the alternative models that are consistent with the measured data and prior information. Therefore, the solution provides not only the highest likelihood model but also a realistic estimate of the solution uncertainty. Our In Salah work considered three flow model alternatives: 1) The first model assumed that the CO2 saturation and fluid pressure changes were confined to the reservoir; 2) the second model allowed the perturbations to occur also in a damage zone inferred in the lower caprock from 3D seismic surveys; and 3) the third model allowed fluid pressure changes anywhere within the reservoir and overburden. Alternative (2) yielded optimal
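The propose / forward-model / compare-to-data loop described in this abstract can be sketched with a toy problem. Everything below is an assumption made for illustration: a single scalar "pressure" parameter, a hypothetical linear forward model, Gaussian data errors, a uniform prior between two bounds, and a standard random-walk Metropolis acceptance rule (the abstract does not state which acceptance rule the authors used).

```python
import math
import random


def forward_model(pressure, sensitivities):
    """Hypothetical linear forward model: displacement_i = g_i * pressure."""
    return [g * pressure for g in sensitivities]


def log_likelihood(pressure, observed, sensitivities, sigma):
    """Gaussian log-likelihood (up to a constant) of the data given a trial model."""
    predicted = forward_model(pressure, sensitivities)
    return -0.5 * sum(((d - p) / sigma) ** 2 for d, p in zip(observed, predicted))


def metropolis(observed, sensitivities, sigma, prior_bounds, n_trials, step, seed=0):
    """Random-walk Metropolis sampler with a uniform prior on [lo, hi]."""
    rng = random.Random(seed)
    lo, hi = prior_bounds
    current = 0.5 * (lo + hi)
    current_ll = log_likelihood(current, observed, sensitivities, sigma)
    samples = []
    for _ in range(n_trials):
        proposal = current + rng.gauss(0.0, step)
        if lo <= proposal <= hi:  # proposals outside the prior range are rejected
            proposal_ll = log_likelihood(proposal, observed, sensitivities, sigma)
            # Accept with probability min(1, likelihood ratio).
            if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
                current, current_ll = proposal, proposal_ll
        samples.append(current)  # a rejected proposal repeats the current model
    return samples


if __name__ == "__main__":
    g = [0.5, 1.0, 1.5, 2.0]
    data = forward_model(2.0, g)  # noise-free synthetic data, true value 2.0
    chain = metropolis(data, g, sigma=0.1, prior_bounds=(0.0, 10.0),
                       n_trials=5000, step=0.1)
    kept = chain[1000:]  # discard burn-in
    print(sum(kept) / len(kept))
```

The retained samples approximate the posterior, so their spread gives an uncertainty estimate alongside the best-fitting value. A real inversion would replace the scalar parameter with a full 3D pressure-plume model and the linear map with a geomechanical forward model.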
Sargeant, Glen A.; Sovada, Marsha A.; Slivinski, Christiane C.; Johnson, Douglas H.
2005-01-01
Accurate maps of species distributions are essential tools for wildlife research and conservation. Unfortunately, biologists often are forced to rely on maps derived from observed occurrences recorded opportunistically during observation periods of variable length. Spurious inferences are likely to result because such maps are profoundly affected by the duration and intensity of observation and by methods used to delineate distributions, especially when detection is uncertain. We conducted a systematic survey of swift fox (Vulpes velox) distribution in western Kansas, USA, and used Markov chain Monte Carlo (MCMC) image restoration to rectify these problems. During 1997–1999, we searched 355 townships (ca. 93 km²) 1–3 times each for an average cost of $7,315 per year and achieved a detection rate (probability of detecting swift foxes, if present, during a single search) of 0.69 (95% Bayesian confidence interval [BCI] = [0.60, 0.77]). Our analysis produced an estimate of the underlying distribution, rather than a map of observed occurrences, that reflected the uncertainty associated with estimates of model parameters. To evaluate our results, we analyzed simulated data with similar properties. Results of our simulations suggest negligible bias and good precision when probabilities of detection on ≥1 survey occasions (cumulative probabilities of detection) exceed 0.65. Although the use of MCMC image restoration has been limited by theoretical and computational complexities, alternatives do not possess the same advantages. Image models accommodate uncertain detection, do not require spatially independent data or a census of map units, and can be used to estimate species distributions directly from observations without relying on habitat covariates or parameters that must be estimated subjectively. These features facilitate economical surveys of large regions, the detection of temporal trends in distribution, and assessments of landscape-level relations between
Sargeant, G.A.; Sovada, M.A.; Slivinski, C.C.; Johnson, D.H.
2005-01-01
Accurate maps of species distributions are essential tools for wildlife research and conservation. Unfortunately, biologists often are forced to rely on maps derived from observed occurrences recorded opportunistically during observation periods of variable length. Spurious inferences are likely to result because such maps are profoundly affected by the duration and intensity of observation and by methods used to delineate distributions, especially when detection is uncertain. We conducted a systematic survey of swift fox (Vulpes velox) distribution in western Kansas, USA, and used Markov chain Monte Carlo (MCMC) image restoration to rectify these problems. During 1997-1999, we searched 355 townships (ca. 93 km²) 1-3 times each for an average cost of $7,315 per year and achieved a detection rate (probability of detecting swift foxes, if present, during a single search) of 0.69 (95% Bayesian confidence interval [BCI] = [0.60, 0.77]). Our analysis produced an estimate of the underlying distribution, rather than a map of observed occurrences, that reflected the uncertainty associated with estimates of model parameters. To evaluate our results, we analyzed simulated data with similar properties. Results of our simulations suggest negligible bias and good precision when probabilities of detection on ≥1 survey occasions (cumulative probabilities of detection) exceed 0.65. Although the use of MCMC image restoration has been limited by theoretical and computational complexities, alternatives do not possess the same advantages. Image models accommodate uncertain detection, do not require spatially independent data or a census of map units, and can be used to estimate species distributions directly from observations without relying on habitat covariates or parameters that must be estimated subjectively. These features facilitate economical surveys of large regions, the detection of temporal trends in distribution, and assessments of landscape-level relations between
Multiple-Event Seismic Location Using the Markov-Chain Monte Carlo Technique
NASA Astrophysics Data System (ADS)
Myers, S. C.; Johannesson, G.; Hanley, W.
2005-12-01
We develop a new multiple-event location algorithm (MCMCloc) that utilizes the Markov-Chain Monte Carlo (MCMC) method. Unlike most inverse methods, the MCMC approach produces a suite of solutions, each of which is consistent with observations and prior estimates of data and model uncertainties. Model parameters in MCMCloc consist of event hypocenters, and travel-time predictions. Data are arrival time measurements and phase assignments. Posteriori estimates of event locations, path corrections, pick errors, and phase assignments are made through analysis of the posteriori suite of acceptable solutions. Prior uncertainty estimates include correlations between travel-time predictions, correlations between measurement errors, the probability of misidentifying one phase for another, and the probability of spurious data. Inclusion of prior constraints on location accuracy allows direct utilization of ground-truth locations or well-constrained location parameters (e.g. from InSAR) that aid in the accuracy of the solution. Implementation of a correlation structure for travel-time predictions allows MCMCloc to operate over arbitrarily large geographic areas. Transition in behavior between a multiple-event locator for tightly clustered events and a single-event locator for solitary events is controlled by the spatial correlation of travel-time predictions. We test the MCMC locator on a regional data set of Nevada Test Site nuclear explosions. Event locations and origin times are known for these events, allowing us to test the features of MCMCloc using a high-quality ground truth data set. Preliminary tests suggest that MCMCloc provides excellent relative locations, often outperforming traditional multiple-event location algorithms, and excellent absolute locations are attained when constraints from one or more ground truth event are included. When phase assignments are switched, we find that MCMCloc properly corrects the error when predicted arrival times are separated by
NASA Astrophysics Data System (ADS)
Yang, P.; Ng, T. L.; Yang, W.
2015-12-01
Effective water resources management depends on the reliable estimation of the uncertainty of drought events. Confidence intervals (CIs) are commonly applied to quantify this uncertainty. A CI seeks to be at the minimal length necessary to cover the true value of the estimated variable with the desired probability. In drought analysis where two or more variables (e.g., duration and severity) are often used to describe a drought, copulas have been found suitable for representing the joint probability behavior of these variables. However, the comprehensive assessment of the parameter uncertainties of copulas of droughts has been largely ignored, and the few studies that have recognized this issue have not explicitly compared the various methods to produce the best CIs. Thus, the objective of this study is to compare the CIs generated using two widely applied uncertainty estimation methods, bootstrapping and Markov Chain Monte Carlo (MCMC). To achieve this objective, (1) the marginal distributions lognormal, Gamma, and Generalized Extreme Value, and the copula functions Clayton, Frank, and Plackett are selected to construct joint probability functions of two drought related variables; (2) the resulting joint functions are then fitted to 200 sets of simulated realizations of drought events with known distribution and extreme parameters; and (3) from there, using bootstrapping and MCMC, CIs of the parameters are generated and compared. The effect of an informative prior on the CIs generated by MCMC is also evaluated. CIs are produced for different sample sizes (50, 100, and 200) of the simulated drought events for fitting the joint probability functions. Preliminary results assuming lognormal marginal distributions and the Clayton copula function suggest that for cases with small or medium sample sizes (~50-100), MCMC is the superior method if an informative prior exists. Where an informative prior is unavailable, for small sample sizes (~50), both bootstrapping and MCMC
NASA Astrophysics Data System (ADS)
Sedlmeier, Katrin; Mieruch, Sebastian; Schädler, Gerd
2014-05-01
Compound extremes are receiving more and more attention in the scientific world because of their great impact on society. It is therefore of great interest how well state-of-the-art regional climate models can represent the dynamics of multivariate extremes. Furthermore, the near future climate change signal of compound extremes is interesting especially on the regional scale because high resolution information is needed for impact studies and mitigation and adaptation strategies. We use a method based on Markov Chains to assess these two questions. It is based on the representation of multivariate climate anomalies by first order Markov Chains. We partition our dataset into extreme and non-extreme regimes and reduce the multivariate dataset to a univariate time series which can then be described as a discrete stochastic process, a Markov Chain. From the transition matrix several descriptors such as persistence, recurrence time and entropy are derived which characterize the dynamic properties of the multivariate system. By comparing these descriptors for model and observation data, the representation of the dynamics of the climate system by different models is evaluated. Near future shifts or changes of the dynamics of compound extremes are detected by using regional climate projections and comparing the descriptors for different time periods. In order to obtain reliable estimates of a climate change signal, we use an ensemble of simulations to assess the uncertainty which arises in climate projections. Our work is based on an ensemble of high resolution (7 km) regional climate simulations for Central Europe with the COSMO-CLM regional climate model using different global driving data. The time periods considered are a control period (1971-2000) and the near future (2021-2050) and running windows within these time periods. For comparison, E-Obs and HYRAS gridded observational datasets are used. The presentation will mainly focus on bivariate temperature and
Sieh, Weiva; Basu, Saonli; Fu, Audrey Q; Rothstein, Joseph H; Scheet, Paul A; Stewart, William C L; Sung, Yun J; Thompson, Elizabeth A; Wijsman, Ellen M
2005-12-30
We performed multipoint linkage analysis of the electrophysiological trait ECB21 on chromosome 4 in the full pedigrees provided by the Collaborative Study on the Genetics of Alcoholism (COGA). Three Markov chain Monte Carlo (MCMC)-based approaches were applied to the provided and re-estimated genetic maps and to five different marker panels consisting of microsatellite (STRP) and/or SNP markers at various densities. We found evidence of linkage near the GABRB1 STRP using all methods, maps, and marker panels. Difficulties encountered with SNP panels included convergence problems and demanding computations.
NASA Astrophysics Data System (ADS)
Lalande, Jean-Marie; Waxler, Roger; Velea, Doru
2016-04-01
As infrasonic waves propagate at long ranges through atmospheric ducts, it has been suggested that observations of such waves can be used as a remote sensing technique to update properties such as temperature and wind speed. In this study we investigate a new inverse approach based on Markov Chain Monte Carlo methods. This approach has the advantage of searching for the full probability density function in the parameter space at a lower computational cost than the extensive parameter search performed by the standard Monte Carlo approach. We apply this inverse method to observations from the Humming Roadrunner experiment (New Mexico) and discuss implications for atmospheric updates, explosion characterization, localization and yield estimation.
Obesity status transitions across the elementary years: Use of Markov chain modeling
Technology Transfer Automated Retrieval System (TEKTRAN)
Overweight and obesity status transition probabilities using first-order Markov transition models applied to elementary school children were assessed. Complete longitudinal data across eleven assessments were available from 1,494 elementary school children (from 7,599 students in 41 out of 45 school...
Schofield, Jeremy; Bayat, Hanif
2014-09-07
A Markov state model of the dynamics of a protein-like chain immersed in an implicit hard sphere solvent is derived from first principles for a system of monomers that interact via discontinuous potentials designed to account for local structure and bonding in a coarse-grained sense. The model is based on the assumption that the implicit solvent interacts on a fast time scale with the monomers of the chain compared to the time scale for structural rearrangements of the chain and provides sufficient friction so that the motion of monomers is governed by the Smoluchowski equation. A microscopic theory for the dynamics of the system is developed that reduces to a Markovian model of the kinetics under well-defined conditions. Microscopic expressions for the rate constants that appear in the Markov state model are analyzed and expressed in terms of a temperature-dependent linear combination of escape rates that themselves are independent of temperature. Excellent agreement is demonstrated between the theoretical predictions of the escape rates and those obtained through simulation of a stochastic model of the dynamics of bond formation. Finally, the Markov model is studied by analyzing the eigenvalues and eigenvectors of the matrix of transition rates, and the equilibration process for a simple helix-forming system from an ensemble of initially extended configurations to mainly folded configurations is investigated as a function of temperature for a number of different chain lengths. For short chains, the relaxation is primarily single-exponential and becomes independent of temperature in the low-temperature regime. The profile is more complicated for longer chains, where multi-exponential relaxation behavior is seen at intermediate temperatures followed by a low temperature regime in which the folding becomes rapid and single exponential. It is demonstrated that the behavior of the equilibration profile as the temperature is lowered can be understood in terms of the
Nortey, Ezekiel N N; Ansah-Narh, Theophilus; Asah-Asante, Richard; Minkah, Richard
2015-01-01
Although there is a large literature on procedures for forecasting election results, only opinion polls have been used in Ghana. To fill this gap, the paper develops Markov chain models for forecasting the 2016 presidential election results at the Regional, Zonal (i.e. Savannah, Coastal and Forest) and National levels using past presidential election results of Ghana. The methodology uses Markov chain Monte Carlo (MCMC) with bootstrap estimates. The results suggest that the ruling NDC may marginally win the 2016 presidential election but would not obtain the more than 50% of votes required to be declared the outright winner. This implies a run-off election between the two major political parties: the ruling NDC and the main opposition party, the NPP. The prediction for the 2016 presidential run-off between the NDC and the NPP favoured the NPP, with a little over 50% of the votes.
spMC: an R-package for 3D lithological reconstructions based on spatial Markov chains
NASA Astrophysics Data System (ADS)
Sartore, Luca; Fabbri, Paolo; Gaetan, Carlo
2016-09-01
The paper presents the spatial Markov Chains (spMC) R-package and a case study of subsoil simulation/prediction located in a plain site of Northeastern Italy. spMC is a fairly complete collection of advanced methods for data inspection. In addition, spMC implements Markov Chain models to estimate experimental transition probabilities of categorical lithological data. Simulation methods based on the best-known prediction methods (such as indicator Kriging and CoKriging) are also implemented in the spMC package. More advanced methods are available for simulations as well, e.g. path methods and Bayesian procedures that exploit the maximum entropy. Since the spMC package was developed for intensive geostatistical computations, part of the code is implemented for parallel computations via the OpenMP constructs. A final analysis of this computational efficiency compares the simulation/prediction algorithms by using different numbers of CPU cores, and considering the example data set of the case study included in the package.
Geochemical Characterization Using Geophysical Data and Markov Chain Monte Carlo Methods
NASA Astrophysics Data System (ADS)
Chen, J.; Hubbard, S.; Rubin, Y.; Murray, C.; Roden, E.; Majer, E.
2002-12-01
if they were available from direct measurements or as variables otherwise. To estimate the geochemical parameters, we first assigned a prior model for each variable and a likelihood model for each type of data, which together define posterior probability distributions for each variable on the domain. Since the posterior probability distribution may involve hundreds of variables, we used a Markov Chain Monte Carlo (MCMC) method to explore each variable by generating and subsequently evaluating hundreds of realizations. Results from this case study showed that although geophysical attributes are not necessarily directly related to geochemical parameters, geophysical data could be very useful for providing accurate and high-resolution information about geochemical parameter distribution through their joint and indirect connections with hydrogeological properties such as lithofacies. This case study also demonstrated that MCMC methods were particularly useful for geochemical parameter estimation using geophysical data because they allow incorporation into the procedure of spatial correlation information, measurement errors, and cross correlations among different types of parameters.
Starfish: Robust spectroscopic inference tools
NASA Astrophysics Data System (ADS)
Czekala, Ian; Andrews, Sean M.; Mandel, Kaisey S.; Hogg, David W.; Green, Gregory M.
2015-05-01
Starfish is a set of tools used for spectroscopic inference. It robustly determines stellar parameters using high resolution spectral models and uses Markov Chain Monte Carlo (MCMC) to explore the full posterior probability distribution of the stellar parameters. Additional potential applications include other types of spectra, such as unresolved stellar clusters or supernovae spectra.
Inference of R(0) and transmission heterogeneity from the size distribution of stuttering chains.
Blumberg, Seth; Lloyd-Smith, James O
2013-01-01
For many infectious disease processes such as emerging zoonoses and vaccine-preventable diseases, [Formula: see text] and infections occur as self-limited stuttering transmission chains. A mechanistic understanding of transmission is essential for characterizing the risk of emerging diseases and monitoring spatio-temporal dynamics. Thus methods for inferring [Formula: see text] and the degree of heterogeneity in transmission from stuttering chain data have important applications in disease surveillance and management. Previous researchers have used chain size distributions to infer [Formula: see text], but estimation of the degree of individual-level variation in infectiousness (as quantified by the dispersion parameter, [Formula: see text]) has typically required contact tracing data. Utilizing branching process theory along with a negative binomial offspring distribution, we demonstrate how maximum likelihood estimation can be applied to chain size data to infer both [Formula: see text] and the dispersion parameter that characterizes heterogeneity. While the maximum likelihood value for [Formula: see text] is a simple function of the average chain size, the associated confidence intervals are dependent on the inferred degree of transmission heterogeneity. As demonstrated for monkeypox data from the Democratic Republic of Congo, this impacts when a statistically significant change in [Formula: see text] is detectable. In addition, by allowing for superspreading events, inference of [Formula: see text] shifts the threshold above which a transmission chain should be considered anomalously large for a given value of [Formula: see text] (thus reducing the probability of false alarms about pathogen adaptation). Our analysis of monkeypox also clarifies the various ways that imperfect observation can impact inference of transmission parameters, and highlights the need to quantitatively evaluate whether observation is likely to significantly bias results.
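The relation between the reproduction number and the average chain size mentioned in this abstract can be sketched in a few lines. This is an illustration, not the authors' code: it assumes the standard subcritical branching-process result that a chain started by one primary case has mean size 1/(1 - R0) when R0 < 1, so the estimate is 1 - 1/(mean chain size). The function name and the chain sizes below are hypothetical.

```python
def estimate_r0(chain_sizes):
    """Estimate R0 from observed transmission chain sizes.

    Assumes (not stated in the abstract) that every chain starts from a
    single primary case and that transmission is subcritical (R0 < 1),
    so that mean chain size = 1 / (1 - R0).
    """
    if not chain_sizes:
        raise ValueError("need at least one chain")
    if any(size < 1 for size in chain_sizes):
        raise ValueError("each chain contains at least its primary case")
    mean_size = sum(chain_sizes) / len(chain_sizes)
    return 1.0 - 1.0 / mean_size


# Hypothetical chain sizes, for illustration only (not the monkeypox data).
sizes = [1, 1, 2, 1, 3, 1, 1, 5, 1, 4]
r0_hat = estimate_r0(sizes)
```

The point estimate depends only on the mean chain size; as the abstract notes, the width of the confidence interval additionally depends on the inferred dispersion parameter, which this sketch does not estimate.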
NASA Astrophysics Data System (ADS)
Julie, Hongki; Pasaribu, Udjianna S.; Pancoro, Adi
2015-12-01
This paper applies Markov chains to the portions of the genome shared identical by descent (IBD) by two individuals under the full sibs model. The full sibs model is a continuous-time Markov chain with three states. Under this model, we seek the cumulative distribution function of the number of sub-segments that have 2 IBD haplotypes within a chromosome segment of length t Morgan, and the cumulative distribution function of the number of sub-segments that have at least 1 IBD haplotype within a chromosome segment of length t Morgan. These cumulative distribution functions are derived from the moment generating function.
NASA Astrophysics Data System (ADS)
Deo, C. S.; Srolovitz, D. J.
2002-09-01
We describe a first passage time Markov chain analysis of rare events in kinetic Monte Carlo (kMC) simulations and demonstrate how this analysis may be used to enhance kMC simulations of dislocation glide. Dislocation glide is described by the kink mechanism, which involves double kink nucleation, kink migration and kink-kink annihilation. Double kinks that nucleate on straight dislocations are unstable at small kink separations and tend to recombine immediately following nucleation. A very small fraction (<0.001) of nucleating double kinks survive to grow to a stable kink separation. The present approach replaces all of the events that lead up to the formation of a stable kink with a simple numerical calculation of the time required for stable kink formation. In this paper, we treat the double kink nucleation process as a temporally homogeneous birth-death Markov process and present a first passage time analysis of the Markov process in order to calculate the nucleation rate of a double kink with a stable kink separation. We discuss two methods to calculate the first passage time; one computes the distribution and the average of the first passage time, while the other uses a recursive relation to calculate the average first passage time. The average first passage times calculated by both approaches are shown to be in excellent agreement with direct Monte Carlo simulations for four idealized cases of double kink nucleation. Finally, we apply this approach to double kink nucleation on a screw dislocation in molybdenum and obtain the rates for formation of stable double kinks as a function of applied stress and temperature. Equivalent kMC simulations are too inefficient to be performed using commonly available computational resources.
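A recursive calculation of the average first passage time for a birth-death process, of the general kind this abstract refers to, might look like the sketch below. It is a generic textbook recursion and not the authors' implementation: it assumes states 0..N with a reflecting boundary at state 0 and uses the identity tau_i = 1/b_i + (d_i/b_i) tau_(i-1), where tau_i is the mean time to step from state i to state i+1. The rates and names are hypothetical.

```python
def mean_first_passage_time(birth, death):
    """Mean time for a birth-death process to first reach state N from state 0.

    birth[i] is the rate i -> i+1 and death[i] the rate i -> i-1, for
    i = 0 .. N-1. State 0 is treated as reflecting (an assumption of this
    sketch), so death[0] is ignored.
    """
    if not birth or len(birth) != len(death):
        raise ValueError("birth and death must be non-empty and equal length")
    total = 0.0
    tau_prev = 0.0  # mean time to step up from the previous state
    for i, (b, d) in enumerate(zip(birth, death)):
        if b <= 0:
            raise ValueError("birth rates must be positive")
        d_eff = d if i > 0 else 0.0
        # Mean time to step from i to i+1, allowing excursions back down.
        tau = 1.0 / b + (d_eff / b) * tau_prev
        total += tau
        tau_prev = tau
    return total


# Hypothetical rates: backward steps much faster than forward ones, so that
# reaching the top state is a rare event.
forward = [1.0, 1.0, 1.0]
backward = [0.0, 10.0, 10.0]
t_mean = mean_first_passage_time(forward, backward)
```

When backward rates dominate, the result grows geometrically with the number of states, which is why replacing the many failed attempts by a single computed waiting time can save simulation effort.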
Markov Chains for Random Urinalysis III: Daily Model and Drug Kinetics
1994-01-01
NPRDC-TN-94-12, January 1994. Navy Personnel Research and Development Center, San Diego, CA 92152-7250. Approved for public release; distribution is unlimited.
NASA Technical Reports Server (NTRS)
Graves, M. E.; Perlmutter, M.
1974-01-01
To aid the planning of the Apollo Soyuz Test Program (ASTP), certain natural environment statistical relationships are presented, based on Markov theory and empirical counts. The practical results are in terms of conditional probability of favorable and unfavorable launch conditions at Kennedy Space Center (KSC). They are based upon 15 years of recorded weather data which are analyzed under a set of natural environmental launch constraints. Three specific forecasting problems were treated: (1) the length of record of past weather which is useful to a prediction; (2) the effect of persistence in runs of favorable and unfavorable conditions; and (3) the forecasting of future weather in probabilistic terms.
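Conditional probabilities of this kind can be estimated from empirical counts of day-to-day transitions. The sketch below assumes a first-order chain over two labels, "F" (favorable) and "U" (unfavorable); the record string is hypothetical and is not KSC weather data, and the function name is an illustration.

```python
from collections import Counter


def transition_probabilities(states):
    """Estimate P(next state | current state) from a sequence of labels."""
    pair_counts = Counter(zip(states, states[1:]))  # consecutive-day pairs
    from_counts = Counter(states[:-1])              # days that have a successor
    return {(a, b): n / from_counts[a] for (a, b), n in pair_counts.items()}


# Hypothetical daily record of favorable (F) and unfavorable (U) conditions.
record = "FFFUUFFFFUFF"
probs = transition_probabilities(record)
p_favorable_persists = probs[("F", "F")]
```

A higher-order version would condition on the previous two or more days, which is one way to examine how much past weather is useful to a prediction.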
Shaw, Milton Sam; Coe, Joshua D; Sewell, Thomas D
2009-01-01
An optimized version of the Nested Markov Chain Monte Carlo sampling method is applied to the calculation of the Hugoniot for liquid nitrogen. The 'full' system of interest is calculated using density functional theory (DFT) with a 6-31 G* basis set for the configurational energies. The 'reference' system is given by a model potential fit to the anisotropic pair interaction of two nitrogen molecules from DFT calculations. The EOS is sampled in the isobaric-isothermal (NPT) ensemble with a trial move constructed from many Monte Carlo steps in the reference system. The trial move is then accepted with a probability chosen to give the full system distribution. The P's and T's of the reference and full systems are chosen separately to optimize the computational time required to produce the full system EOS. The method is numerically very efficient and predicts a Hugoniot in excellent agreement with experimental data.
Hoti, Fabian J; Sillanpää, Mikko J; Holmström, Lasse
2002-04-01
We provide an overview of the use of kernel smoothing to summarize the quantitative trait locus posterior distribution from a Markov chain Monte Carlo sample. More traditional distributional summary statistics based on the histogram depend both on the bin width and on the sideway shift of the bin grid used. These factors influence both the overall mapping accuracy and the estimated location of the mode of the distribution. Replacing the histogram by kernel smoothing helps to alleviate these problems. Using simulated data, we performed numerical comparisons between the two approaches. The results clearly illustrate the superiority of the kernel method. The kernel approach is particularly efficient when one needs to point out the best putative quantitative trait locus position on the marker map. In such situations, the smoothness of the posterior estimate is especially important because rough posterior estimates easily produce biased mode estimates. Different kernel implementations are available from Rolf Nevanlinna Institute's web page (http://www.rni.helsinki.fi/;fjh).
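The idea of replacing a histogram by a kernel estimate when locating the posterior mode can be sketched as follows. This is a minimal illustration with a fixed-bandwidth Gaussian kernel evaluated on a grid; the bandwidth, the grid size and the sample values are assumptions of the sketch, not choices taken from the paper.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kde_mode(samples, bandwidth, grid_points=501):
    """Approximate the mode as the grid point with the highest density."""
    lo, hi = min(samples), max(samples)
    density = gaussian_kde(samples, bandwidth)
    grid = [lo + (hi - lo) * i / (grid_points - 1) for i in range(grid_points)]
    return max(grid, key=density)


# Hypothetical MCMC draws of a locus position (arbitrary map units).
draws = [48, 49, 50, 50, 51, 52]
mode = kde_mode(draws, bandwidth=1.0)
```

Unlike a histogram, the estimate does not depend on where the bin grid starts; the bandwidth plays the role of the bin width and still has to be chosen.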
Pranevicius, Henrikas; Pranevicius, Mindaugas; Pranevicius, Osvaldas; Bukauskas, Feliksas F.
2015-01-01
The primary goal of this work was to study the advantages of numerical methods used to create continuous time Markov chain (CTMC) models of voltage gating of gap junction (GJ) channels composed of connexin protein. This task was accomplished by describing gating of GJs using the formalism of stochastic automata networks (SANs), which allowed very efficient building and storing of the infinitesimal generator of the CTMC and produced model matrices with a distinct block structure. This in turn allowed us to develop efficient numerical methods for a steady-state solution of CTMC models, reducing the CPU time needed to solve them ∼20-fold. PMID:25705700
NASA Astrophysics Data System (ADS)
Lunt, Mark F.; Rigby, Matt; Ganesan, Anita L.; Manning, Alistair J.
2016-09-01
Atmospheric trace gas inversions often attempt to attribute fluxes to a high-dimensional grid using observations. To make this problem computationally feasible, and to reduce the degree of under-determination, some form of dimension reduction is usually performed. Here, we present an objective method for reducing the spatial dimension of the parameter space in atmospheric trace gas inversions. In addition to solving for a set of unknowns that govern emissions of a trace gas, we set out a framework that considers the number of unknowns to itself be an unknown. We rely on the well-established reversible-jump Markov chain Monte Carlo algorithm to use the data to determine the dimension of the parameter space. This framework provides a single-step process that solves for both the resolution of the inversion grid, as well as the magnitude of fluxes from this grid. Therefore, the uncertainty that surrounds the choice of aggregation is accounted for in the posterior parameter distribution. The posterior distribution of this transdimensional Markov chain provides a naturally smoothed solution, formed from an ensemble of coarser partitions of the spatial domain. We describe the form of the reversible-jump algorithm and how it may be applied to trace gas inversions. We build the system into a hierarchical Bayesian framework in which other unknown factors, such as the magnitude of the model uncertainty, can also be explored. A pseudo-data example is used to show the usefulness of this approach when compared to a subjectively chosen partitioning of a spatial domain. An inversion using real data is also shown to illustrate the scales at which the data allow for methane emissions over north-west Europe to be resolved.
Favorov, Alexander V; Andreewski, Timophey V; Sudomoina, Marina A; Favorova, Olga O; Parmigiani, Giovanni; Ochs, Michael F
2005-12-01
In recent years, the number of studies focusing on the genetic basis of common disorders with a complex mode of inheritance, in which multiple genes of small effect are involved, has been steadily increasing. An improved methodology to identify the cumulative contribution of several polymorphous genes would accelerate our understanding of their importance in disease susceptibility and our ability to develop new treatments. A critical bottleneck is the inability of standard statistical approaches, developed for relatively modest predictor sets, to achieve power in the face of the enormous growth in our knowledge of genomics. The inability is due to the combinatorial complexity arising in searches for multiple interacting genes. Similar "curse of dimensionality" problems have arisen in other fields, and Bayesian statistical approaches coupled to Markov chain Monte Carlo (MCMC) techniques have led to significant improvements in understanding. We present here an algorithm, APSampler, for the exploration of potential combinations of allelic variations positively or negatively associated with a disease or with a phenotype. The algorithm relies on the rank comparison of phenotype for individuals with and without specific patterns (i.e., combinations of allelic variants) isolated in genetic backgrounds matched for the remaining significant patterns. It constructs a Markov chain to sample only potentially significant variants, minimizing the potential of large data sets to overwhelm the search. We tested APSampler on a simulated data set and on a case-control MS (multiple sclerosis) study for ethnic Russians. For the simulated data, the algorithm identified all the phenotype-associated allele combinations coded into the data and, for the MS data, it replicated the previously known findings.
Ma, Junsheng; Chan, Wenyaw; Tilley, Barbara C
2016-04-04
Continuous time Markov chain models are frequently employed in medical research to study the disease progression but are rarely applied to the transtheoretical model, a psychosocial model widely used in the studies of health-related outcomes. The transtheoretical model often includes more than three states and conceptually allows for all possible instantaneous transitions (referred to as general continuous time Markov chain). This complicates the likelihood function because it involves calculating a matrix exponential that may not be simplified for general continuous time Markov chain models. We undertook a Bayesian approach wherein we numerically evaluated the likelihood using ordinary differential equation solvers available from the GNU Scientific Library. We compared our Bayesian approach with the maximum likelihood method implemented with the R package MSM. Our simulation study showed that the Bayesian approach provided more accurate point and interval estimates than the maximum likelihood method, especially in complex continuous time Markov chain models with five states. When applied to data from a four-state transtheoretical model collected from a nutrition intervention study in the next step trial, we observed results consistent with the results of the simulation study. Specifically, the two approaches provided comparable point estimates and standard errors for most parameters, but the maximum likelihood offered substantially smaller standard errors for some parameters. Comparable estimates of the standard errors are obtainable from package MSM, which works only when the model estimation algorithm converges.
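Evaluating a continuous time Markov chain likelihood by solving a differential equation instead of forming a matrix exponential directly can be sketched as below. The sketch integrates the Kolmogorov forward equation dP/dt = P Q with P(0) = I using a hand-written fourth-order Runge-Kutta step in plain Python. This is a stand-in for illustration, not the GNU Scientific Library solvers the abstract mentions; the generator matrix, step count and function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add_scaled(a, b, scale):
    """Return a + scale * b elementwise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transition_matrix(q, t, steps=1000):
    """Approximate P(t) by integrating dP/dt = P Q from P(0) = I with RK4."""
    n = len(q)
    p = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = t / steps
    for _ in range(steps):
        k1 = mat_mul(p, q)
        k2 = mat_mul(add_scaled(p, k1, h / 2), q)
        k3 = mat_mul(add_scaled(p, k2, h / 2), q)
        k4 = mat_mul(add_scaled(p, k3, h), q)
        for i in range(n):
            for j in range(n):
                p[i][j] += h / 6 * (
                    k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j]
                )
    return p


def log_likelihood(q, observations):
    """Sum of log P_ij(dt) over (from_state, to_state, elapsed_time) triples."""
    return sum(
        math.log(transition_matrix(q, dt)[i][j]) for i, j, dt in observations
    )


# Hypothetical two-state generator; each row sums to zero.
q = [[-1.0, 1.0], [2.0, -2.0]]
p_half = transition_matrix(q, 0.5)
```

A two-state chain is used here only because its transition probabilities have a closed form to check against; the same routine applies unchanged to larger generators where no simplification is available.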
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
Application of Markov chain model to daily maximum temperature for thermal comfort in Malaysia
Nordin, Muhamad Asyraf bin Che; Hassan, Husna
2015-10-22
The first-order Markov chain principle has been widely used to model various meteorological fields for prediction purposes. In this study, 14 years (2000-2013) of daily maximum temperature data from Bayan Lepas were used. Earlier studies showed that the outdoor thermal comfort range (TCR) based on the physiologically equivalent temperature (PET) index in Malaysia is less than 34°C; thus the data were classified into two states: a normal state (within the thermal comfort range) and a hot state (above the thermal comfort range). The long-run results show that the probability of the daily temperature exceeding the TCR is only 2.2%, while the probability of the daily temperature remaining within the TCR is 97.8%.
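Long-run probabilities of a two-state chain such as this one follow directly from the two switching probabilities. The sketch below computes them in closed form and cross-checks the result by iterating the chain. The transition probabilities are hypothetical placeholders; the abstract does not report the estimated values, and these are not meant to reproduce the 2.2% figure.

```python
def stationary_two_state(p_normal_to_hot, p_hot_to_normal):
    """Long-run probabilities (normal, hot) of a two-state Markov chain."""
    total = p_normal_to_hot + p_hot_to_normal
    if total <= 0:
        raise ValueError("at least one switching probability must be positive")
    return (p_hot_to_normal / total, p_normal_to_hot / total)


def iterate_chain(p_normal_to_hot, p_hot_to_normal, steps=200):
    """Propagate the state distribution day by day, starting in 'normal'."""
    normal, hot = 1.0, 0.0
    for _ in range(steps):
        normal, hot = (
            normal * (1 - p_normal_to_hot) + hot * p_hot_to_normal,
            normal * p_normal_to_hot + hot * (1 - p_hot_to_normal),
        )
    return normal, hot


# Hypothetical daily switching probabilities, for illustration only.
long_run = stationary_two_state(0.02, 0.5)
simulated = iterate_chain(0.02, 0.5)
```

The closed form shows that the long-run share of hot days is small whenever a hot day is much more likely to revert to normal than a normal day is to turn hot.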
Monaco, James Peter; Madabhushi, Anant
2011-07-01
The ability of classification systems to adjust their performance (sensitivity/specificity) is essential for tasks in which certain errors are more significant than others. For example, mislabeling cancerous lesions as benign is typically more detrimental than mislabeling benign lesions as cancerous. Unfortunately, methods for modifying the performance of Markov random field (MRF) based classifiers are noticeably absent from the literature, and thus most such systems restrict their performance to a single, static operating point (a paired sensitivity/specificity). To address this deficiency we present weighted maximum posterior marginals (WMPM) estimation, an extension of maximum posterior marginals (MPM) estimation. Whereas the MPM cost function penalizes each error equally, the WMPM cost function allows misclassifications associated with certain classes to be weighted more heavily than others. This creates a preference for specific classes, and consequently a means for adjusting classifier performance. Realizing WMPM estimation (like MPM estimation) requires estimates of the posterior marginal distributions. The most prevalent means for estimating these--proposed by Marroquin--utilizes a Markov chain Monte Carlo (MCMC) method. Though Marroquin's method (M-MCMC) yields estimates that are sufficiently accurate for MPM estimation, they are inadequate for WMPM. To more accurately estimate the posterior marginals we present an equally simple, but more effective extension of the MCMC method (E-MCMC). Assuming an identical number of iterations, E-MCMC as compared to M-MCMC yields estimates with higher fidelity, thereby 1) allowing a far greater number and diversity of operating points and 2) improving overall classifier performance. To illustrate the utility of WMPM and compare the efficacies of M-MCMC and E-MCMC, we integrate them into our MRF-based classification system for detecting cancerous glands in (whole-mount or quarter) histological sections of the prostate.
Vedadi, Farhang; Shirani, Shahram
2014-01-01
A new method of image resolution up-conversion (image interpolation) based on maximum a posteriori sequence estimation is proposed. Instead of making a hard decision about the value of each missing pixel, we estimate the missing pixels in groups. At each missing pixel of the high resolution (HR) image, we consider an ensemble of candidate interpolation methods (interpolation functions). The interpolation functions are interpreted as states of a Markov model. In other words, the proposed method undergoes state transitions from one missing pixel position to the next. Accordingly, the interpolation problem is translated to the problem of estimating the optimal sequence of interpolation functions corresponding to the sequence of missing HR pixel positions. We derive a parameter-free probabilistic model for this to-be-estimated sequence of interpolation functions. Then, we solve the estimation problem using a trellis representation and the Viterbi algorithm. Using directional interpolation functions and sequence estimation techniques, we classify the new algorithm as an adaptive directional interpolation using soft-decision estimation techniques. Experimental results show that the proposed algorithm yields images with higher or comparable peak signal-to-noise ratios compared with some benchmark interpolation methods in the literature while being efficient in terms of implementation and complexity considerations.
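The sequence-estimation step described above can be sketched with a generic log-domain Viterbi decoder. Here each state stands for one candidate interpolation function; the initial, transition, and per-position scores are invented for illustration, and the paper's own parameter-free model is not reproduced.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return the highest-scoring state sequence.

    log_init[s]     : log-score of starting in state s
    log_trans[r][s] : log-score of moving from state r to state s
    log_emit[t][s]  : log-score of state s at position t
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for t in range(1, len(log_emit)):
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def logs(row):
    return [math.log(v) for v in row]


# Two hypothetical interpolation functions with "sticky" transitions.
log_init = logs([0.6, 0.4])
log_trans = [logs([0.9, 0.1]), logs([0.1, 0.9])]
log_emit = [logs([0.9, 0.1]), logs([0.4, 0.6]), logs([0.9, 0.1])]
path = viterbi(log_init, log_trans, log_emit)
print(path)
```

A per-pixel hard decision would switch to state 1 at the middle position; decoding the whole sequence keeps state 0 throughout because the transition scores discourage switching.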
NASA Astrophysics Data System (ADS)
Pan, J.; Durand, M. T.; Vanderjagt, B. J.
2014-12-01
The Markov chain Monte Carlo (MCMC) method has proved successful in snow water equivalent retrieval based on synthetic point-scale passive microwave brightness temperature (TB) observations. This method needs only general prior information about the distribution of snow parameters, and can estimate layered snow properties, including the thickness, temperature, density and snow grain size (or exponential correlation length) of each layer. In this study, the multi-layer HUT (Helsinki University of Technology) model and the MEMLS (Microwave Emission Model of Layered Snowpacks) will be used as observation models to assimilate the observed TB into snow parameter prediction. Previous studies have shown that the multi-layer HUT model tends to underestimate TB at 37 GHz for deep snow, while the MEMLS does not show sensitivity of model bias to snow depth. Therefore, results using the HUT model and MEMLS will be compared to see how the observation model influences the retrieval of snow parameters. The radiometric measurements at 10.65, 18.7, 36.5 and 90 GHz at Sodankyla, Finland will be used as MCMC input, and the statistics of all snow property measurements will be used to calculate the prior information. 43 dry snowpits with complete measurements of all snow parameters will be used for validation. The entire dataset is from the NorSREx (Nordic Snow Radar Experiment) experiments carried out by Juha Lemmetyinen, Anna Kontu and Jouni Pulliainen at FMI in the 2009-2011 winters, and continued for two more winters from 2011 to spring 2013. Besides the snow thickness and snow density that are directly related to snow water equivalent, other parameters will be compared with observations, too. For thin snow, previous studies showed that the influence of the underlying soil is considerable, especially when the soil is half frozen, containing both unfrozen liquid water and ice. Therefore, this study will also try to employ a simple frozen soil permittivity model to improve the
Paul, Sudeshna; Friedman, Alan M.; Bailey-Kellogg, Chris; Craig, Bruce A.
2013-01-01
The interatomic distance distribution, P(r), is a valuable tool for evaluating the structure of a molecule in solution and represents the maximum structural information that can be derived from solution scattering data without further assumptions. Most current instrumentation for scattering experiments (typically CCD detectors) generates a finely pixelated two-dimensional image. In continuation of the standard practice with earlier one-dimensional detectors, these images are typically reduced to a one-dimensional profile of scattering intensities, I(q), by circular averaging of the two-dimensional image. Indirect Fourier transformation methods are then used to reconstruct P(r) from I(q). Substantial advantages in data analysis, however, could be achieved by directly estimating the P(r) curve from the two-dimensional images. This article describes a Bayesian framework, using a Markov chain Monte Carlo method, for estimating the parameters of the indirect transform, and thus P(r), directly from the two-dimensional images. Using simulated detector images, it is demonstrated that this method yields P(r) curves nearly identical to the reference P(r). Furthermore, an approach for evaluating spatially correlated errors (such as those that arise from a detector point spread function) is evaluated. Accounting for these errors further improves the precision of the P(r) estimation. Experimental scattering data, where no ground truth reference P(r) is available, are used to demonstrate that this method yields a scattering and detector model that more closely reflects the two-dimensional data, as judged by smaller residuals in cross-validation, than P(r) obtained by indirect transformation of a one-dimensional profile. Finally, the method allows concurrent estimation of the beam center and Dmax, the longest interatomic distance in P(r), as part of the Bayesian Markov chain Monte Carlo method, reducing experimental effort and providing a well defined protocol for these
Paul, Sudeshna; Friedman, Alan M; Bailey-Kellogg, Chris; Craig, Bruce A
2013-04-01
The interatomic distance distribution, P(r), is a valuable tool for evaluating the structure of a molecule in solution and represents the maximum structural information that can be derived from solution scattering data without further assumptions. Most current instrumentation for scattering experiments (typically CCD detectors) generates a finely pixelated two-dimensional image. In continuation of the standard practice with earlier one-dimensional detectors, these images are typically reduced to a one-dimensional profile of scattering intensities, I(q), by circular averaging of the two-dimensional image. Indirect Fourier transformation methods are then used to reconstruct P(r) from I(q). Substantial advantages in data analysis, however, could be achieved by directly estimating the P(r) curve from the two-dimensional images. This article describes a Bayesian framework, using a Markov chain Monte Carlo method, for estimating the parameters of the indirect transform, and thus P(r), directly from the two-dimensional images. Using simulated detector images, it is demonstrated that this method yields P(r) curves nearly identical to the reference P(r). Furthermore, an approach for evaluating spatially correlated errors (such as those that arise from a detector point spread function) is evaluated. Accounting for these errors further improves the precision of the P(r) estimation. Experimental scattering data, where no ground truth reference P(r) is available, are used to demonstrate that this method yields a scattering and detector model that more closely reflects the two-dimensional data, as judged by smaller residuals in cross-validation, than P(r) obtained by indirect transformation of a one-dimensional profile. Finally, the method allows concurrent estimation of the beam center and Dmax, the longest interatomic distance in P(r), as part of the Bayesian Markov chain Monte Carlo method, reducing experimental effort and providing a well defined protocol for these
A Full Bayesian Approach for Boolean Genetic Network Inference
Han, Shengtong; Wong, Raymond K. W.; Lee, Thomas C. M.; Shen, Linghao; Li, Shuo-Yen R.; Fan, Xiaodan
2014-01-01
Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data. PMID:25551820
Martín, Fernando; Moreno, Luis; Garrido, Santiago; Blanco, Dolores
2015-09-16
One of the most important skills desired for a mobile robot is the ability to obtain its own location even in challenging environments. The information provided by the sensing system is used here to solve the global localization problem. In our previous work, we designed different algorithms founded on evolutionary strategies in order to solve the aforementioned task. The latest developments are presented in this paper. The engine of the localization module is a combination of the Markov chain Monte Carlo sampling technique and the Differential Evolution method, which results in a particle filter based on the minimization of a fitness function. The robot's pose is estimated from a set of possible locations weighted by a cost value. The measurements of the perceptive sensors are used together with the predicted ones in a known map to define a cost function to optimize. Although most localization methods rely on quadratic fitness functions, the sensed information is processed asymmetrically in this filter. The Kullback-Leibler divergence is the basis of a cost function that makes it possible to deal with different types of occlusions. The algorithm performance has been checked in a real map. The results are excellent in environments with dynamic and unmodeled obstacles, a fact that causes occlusions in the sensing area.
NASA Astrophysics Data System (ADS)
Rocha, G.; Pagano, L.; Górski, K. M.; Huffenberger, K. M.; Lawrence, C. R.; Lange, A. E.
2010-04-01
We introduce a new method to propagate uncertainties in the beam shapes used to measure the cosmic microwave background to cosmological parameters determined from those measurements. The method, called Markov chain beam randomization (MCBR), randomly samples from a set of templates or functions that describe the beam uncertainties. The method is much faster than direct numerical integration over systematic “nuisance” parameters, and is not restricted to simple, idealized cases as is analytic marginalization. It does not assume the data are normally distributed, and does not require Gaussian priors on the specific systematic uncertainties. We show that MCBR properly accounts for and provides the marginalized errors of the parameters. The method can be generalized and used to propagate any systematic uncertainties for which a set of templates is available. We apply the method to the Planck satellite, and consider future experiments. Beam measurement errors should have a small effect on cosmological parameters as long as the beam fitting is performed after removal of 1/f noise.
Saloranta, Tuomo M; Armitage, James M; Haario, Heikki; Naes, Kristoffer; Cousins, Ian T; Barton, David N
2008-01-01
Multimedia environmental fate models are useful tools to investigate the long-term impacts of remediation measures designed to alleviate potential ecological and human health concerns in contaminated areas. Estimating and communicating the uncertainties associated with the model simulations is a critical task for demonstrating the transparency and reliability of the results. The Extended Fourier Amplitude Sensitivity Test (Extended FAST) method for sensitivity analysis and the Bayesian Markov chain Monte Carlo (MCMC) method for uncertainty analysis and model calibration have several advantages over methods typically applied for multimedia environmental fate models. Most importantly, the simulation results and their uncertainties can be anchored to the available observations and their uncertainties. We apply these techniques for simulating the historical fate of polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs) in the Grenland fjords, Norway, and for predicting the effects of different contaminated sediment remediation (capping) scenarios on the future levels of PCDD/Fs in cod and crab therein. The remediation scenario simulations show that a significant remediation effect can first be seen when significant portions of the contaminated sediment areas are cleaned up, and that an increase in capping area leads to both earlier achievement of good fjord status and narrower uncertainty in the predicted timing for this.
The Fate of Priority Areas for Conservation in Protected Areas: A Fine-Scale Markov Chain Approach
NASA Astrophysics Data System (ADS)
Tattoni, Clara; Ciolli, Marco; Ferretti, Fabrizio
2011-02-01
Park managers in alpine areas must deal with the increase in forest coverage that has been observed in most European mountain areas, where traditional farming and agricultural practices have been abandoned. The aim of this study is to develop a fine-scale model of a broad area to support the managers of Paneveggio Nature Park (Italy) in conservation planning by focusing on the fate of priority areas for conservation in the next 50-100 years. GIS analyses were performed to assess the afforestation dynamic over time using two historical maps (from 1859 and 1936) and a series of aerial photographs and ortho-photos (taken from 1954 to 2006) covering a time span of 150 years. The results show an increase in the forest surface area of about 35%. Additionally, the forest became progressively more compact and less fragmented, with a consequent loss of ecotones and open habitats that are important for biodiversity. Markov chain-cellular automata models were used to project future changes, evaluating the effects on a habitat scale. Simulations show that some habitats defined as priority by the EU Habitat Directive will be compromised by the forest expansion by 2050 and suffer a consistent loss by 2100. This protocol, applied to other areas, can be used for designing long-term management measures with a focus on habitats where conservation status is at risk.
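The Markov chain part of such a land-cover projection can be sketched as repeated multiplication of class shares by a transition matrix. The two classes, the transition probabilities, and the step length below are made-up placeholders rather than values from the study, and the cellular-automata component that allocates change spatially is not shown.

```python
def project_shares(shares, transition, n_steps):
    """Propagate land-cover shares through n_steps of a Markov chain.

    shares[i]        : fraction of the area currently in class i
    transition[i][j] : probability that a cell in class i is in class j
                       one time step later (each row sums to 1)
    """
    n = len(shares)
    for _ in range(n_steps):
        shares = [sum(shares[i] * transition[i][j] for i in range(n))
                  for j in range(n)]
    return shares


# Hypothetical classes: 0 = open habitat, 1 = forest.
# Assumed per-step probabilities: 10% of open habitat becomes forest,
# and forest never reverts.
transition = [[0.9, 0.1],
              [0.0, 1.0]]
shares_now = [0.5, 0.5]
shares_later = project_shares(shares_now, transition, n_steps=2)
print(shares_later)
```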
Meng, Tianhui; Li, Xiaofan; Zhang, Sha; Zhao, Yubin
2016-01-01
Wireless sensor networks (WSNs) have recently gained popularity for a wide spectrum of applications. Monitoring tasks can be performed in various environments. This may be beneficial in many scenarios, but it certainly exhibits new challenges in terms of security due to increased data transmission over the wireless channel with potentially unknown threats. Among possible security issues are timing attacks, which are not prevented by traditional cryptographic security. Moreover, the limited energy and memory resources prohibit the use of complex security mechanisms in such systems. Therefore, balancing between security and the associated energy consumption becomes a crucial challenge. This paper proposes a secure scheme for WSNs while maintaining the requirement of the security-performance tradeoff. In order to proceed to a quantitative treatment of this problem, a hybrid continuous-time Markov chain (CTMC) and queueing model are put forward, and the tradeoff analysis of the security and performance attributes is carried out. By extending and transforming this model, the mean time to security attributes failure is evaluated. Through tradeoff analysis, we show that our scheme can enhance the security of WSNs, and the optimal rekeying rate of the performance and security tradeoff can be obtained. PMID:27690042
Cooper, Nicola J; Lambert, Paul C; Abrams, Keith R; Sutton, Alexander J
2007-01-01
This article focuses on the modelling and prediction of costs due to disease accrued over time, to inform the planning of future services and budgets. It is well documented that the modelling of cost data is often problematic due to the distribution of such data; for example, strongly right skewed with a significant percentage of zero-cost observations. An additional problem associated with modelling costs over time is that cost observations measured on the same individual at different time points will usually be correlated. In this study we compare the performance of four different multilevel/hierarchical models (which allow for both the within-subject and between-subject variability) for analysing healthcare costs in a cohort of individuals with early inflammatory polyarthritis (IP) who were followed-up annually over a 5-year time period from 1990/1991. The hierarchical models fitted included linear regression models and two-part models with log-transformed costs, and two-part model with gamma regression and a log link. The cohort was split into a learning sample, to fit the different models, and a test sample to assess the predictive ability of these models. To obtain predicted costs on the original cost scale (rather than the log-cost scale) two different retransformation factors were applied. All analyses were carried out using Bayesian Markov chain Monte Carlo (MCMC) simulation methods.
Ades, A E; Cliffe, S
2002-01-01
Decision models are usually populated 1 parameter at a time, with 1 item of information informing each parameter. Often, however, data may not be available on the parameters themselves but on several functions of parameters, and there may be more items of information than there are parameters to be estimated. The authors show how in these circumstances all the model parameters can be estimated simultaneously using Bayesian Markov chain Monte Carlo methods. Consistency of the information and/or the adequacy of the model can also be assessed within this framework. Statistical evidence synthesis using all available data should result in more precise estimates of parameters and functions of parameters, and is compatible with the emphasis currently placed on systematic use of evidence. To illustrate this, WinBUGS software is used to estimate a simple 9-parameter model of the epidemiology of HIV in women attending prenatal clinics, using information on 12 functions of parameters, and to thereby compute the expected net benefit of 2 alternative prenatal testing strategies, universal testing and targeted testing of high-risk groups. The authors demonstrate improved precision of estimates, and lower estimates of the expected value of perfect information, resulting from the use of all available data.
NASA Astrophysics Data System (ADS)
Rey, Sergio J.; Kang, Wei; Wolf, Levi
2016-10-01
Discrete Markov chain models (DMCs) have been widely applied to the study of regional income distribution dynamics and convergence. This popularity reflects the rich body of DMC theory on the one hand and the ability of this framework to provide insights on the internal and external properties of regional income distribution dynamics on the other. In this paper we examine the properties of tests for spatial effects in DMC models of regional distribution dynamics. We do so through a series of Monte Carlo simulations designed to examine the size, power and robustness of tests for spatial heterogeneity and spatial dependence in transitional dynamics. This requires that we specify a data generating process for not only the null, but also alternatives when spatial heterogeneity or spatial dependence is present in the transitional dynamics. We are not aware of any work which has examined these types of data generating processes in the spatial distribution dynamics literature. Results indicate that tests for spatial heterogeneity and spatial dependence display good power for the presence of spatial effects. However, tests for spatial heterogeneity are not robust to the presence of strong spatial dependence, while tests for spatial dependence are sensitive to the spatial configuration of heterogeneity. When the spatial configuration can be considered random, dependence tests are robust to the dynamic spatial heterogeneity, but not so to the process mean heterogeneity when the difference in process means is large relative to the variance of the time series.
Geng, Bo; Zhou, Xiaobo; Zhu, Jinmin; Hung, Y S; Wong, Stephen T C
2008-04-01
Computational identification of missing enzymes plays a significant role in accurate and complete reconstruction of metabolic networks for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified according to certain biological evidence, a powerful mathematical model is required to predict the actual enzyme(s) catalyzing the reaction. In this study, several plausible predictive methods are considered for the classification problem in missing enzyme identification, and comparisons are performed with the aim of identifying a method with better performance than the Bayesian model used in previous work. In particular, a regression model consisting of a linear term and a nonlinear term is proposed for the problem, in which the reversible jump Markov-chain-Monte-Carlo (MCMC) learning technique (developed in [Andrieu C, Freitas Nando de, Doucet A. Robust full Bayesian learning for radial basis networks 2001;13:2359-407.]) is adopted to estimate the model order and the parameters. We evaluated the models using known reactions in Escherichia coli, Mycobacterium tuberculosis, Vibrio cholerae and Caulobacter crescentus bacteria, as well as one eukaryotic organism, Saccharomyces cerevisiae. Although support vector regression also exhibits comparable performance in this application, it was demonstrated that the proposed model achieves favorable prediction performance, particularly sensitivity, compared with the Bayesian method.
Coe, Joshua D; Sewell, Thomas D; Shaw, M Sam
2009-08-21
An optimized variant of the nested Markov chain Monte Carlo [n(MC)²] method [J. Chem. Phys. 130, 164104 (2009)] is applied to fluid N₂. In this implementation of n(MC)², isothermal-isobaric (NPT) ensemble sampling on the basis of a pair potential (the "reference" system) is used to enhance the efficiency of sampling based on Perdew-Burke-Ernzerhof density functional theory with a 6-31G* basis set (PBE6-31G*, the "full" system). A long sequence of Monte Carlo steps taken in the reference system is converted into a trial step taken in the full system; for a good choice of reference potential, these trial steps have a high probability of acceptance. Using decorrelated samples drawn from the reference distribution, the pressure and temperature of the full system are varied such that its distribution overlaps maximally with that of the reference system. Optimized pressures and temperatures then serve as input parameters for n(MC)² sampling of dense fluid N₂ over a wide range of thermodynamic conditions. The simulation results are combined to construct the Hugoniot of nitrogen fluid, yielding predictions in excellent agreement with experiment.
Nguyen, David P; Frank, Loren M; Brown, Emery N
2003-02-01
Multi-electrode recordings in neural tissue contain the action potential waveforms of many closely spaced neurons. While we can observe the action potential waveforms, we cannot observe which neuron is the source for which waveform nor how many source neurons are being recorded. Current spike-sorting algorithms solve this problem by assuming a fixed number of source neurons and assigning the action potentials given this fixed number. We model the spike waveforms as an anisotropic Gaussian mixture model and present, as an alternative, a reversible-jump Markov chain Monte Carlo (MCMC) algorithm to simultaneously estimate the number of source neurons and to assign each action potential to a source. We derive this MCMC algorithm and illustrate its application using simulated three-dimensional data and real four-dimensional feature vectors extracted from tetrode recordings of rat entorhinal cortex neurons. In the analysis of the simulated data our algorithm finds the correct number of mixture components (sources) and classifies the action potential waveforms with minimal error. In the analysis of real data, our algorithm identifies clusters closely resembling those previously identified by a user-dependent graphical clustering procedure. Our findings suggest that a reversible-jump MCMC algorithm could offer a new strategy for designing automated spike-sorting algorithms.
Sweeney, Lisa M; Parker, Ann; Haber, Lynne T; Tran, C Lang; Kuempel, Eileen D
2013-06-01
A biomathematical model was previously developed to describe the long-term clearance and retention of particles in the lungs of coal miners. The model structure was evaluated and parameters were estimated in two data sets, one from the United States and one from the United Kingdom. The three-compartment model structure consists of deposition of inhaled particles in the alveolar region, competing processes of either clearance from the alveolar region or translocation to the lung interstitial region, and very slow, irreversible sequestration of interstitialized material in the lung-associated lymph nodes. Point estimates of model parameter values were estimated separately for the two data sets. In the current effort, Bayesian population analysis using Markov chain Monte Carlo simulation was used to recalibrate the model while improving assessments of parameter variability and uncertainty. When model parameters were calibrated simultaneously to the two data sets, agreement between the derived parameters for the two groups was very good, and the central tendency values were similar to those derived from the deterministic approach. These findings are relevant to the proposed update of the ICRP human respiratory tract model with revisions to the alveolar-interstitial region based on this long-term particle clearance and retention model.
Liu, Ruimin; Men, Cong; Wang, Xiujuan; Xu, Fei; Yu, Wenwen
Soil and water conservation in the Three Gorges Reservoir Area of China is important, and soil erosion is a significant issue. In the present study, spatial Markov chains were applied to explore the impacts of the regional context on soil erosion in the Xiangxi River watershed, and Thematic Mapper remote sensing data from 1999 and 2007 were employed. The results indicated that the observed changes in soil erosion were closely related to the soil erosion levels of the surrounding areas. When neighboring regions were not considered, the probability that moderate erosion transformed into slight and severe erosion was 0.8330 and 0.0049, respectively. However, when neighboring regions that displayed intensive erosion were considered, the probabilities were 0.2454 and 0.7513, respectively. Moreover, the different levels of soil erosion in neighboring regions played different roles in soil erosion. If the erosion levels in the neighboring region were lower, the probability of a high erosion class transferring to a lower level was relatively high. In contrast, if erosion levels in the neighboring region were higher, the probability was lower. The results of the present study provide important information for the planning and implementation of soil conservation measures in the study area.
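The comparison between unconditional and neighbor-conditioned transition probabilities can be sketched as a simple counting exercise over cells observed at two dates. The cell records below are invented for illustration and bear no relation to the Xiangxi River data; the class names and the two-level neighbor context are also assumptions.

```python
from collections import defaultdict


def transition_probs(cells, condition_on_neighbor=False):
    """Estimate transition probabilities from (start, end, neighbor) records.

    start, end : erosion class of a cell at the first and second date
    neighbor   : erosion level of the surrounding area at the first date
    With condition_on_neighbor=True, rows are keyed by (neighbor, start).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for start, end, neighbor in cells:
        key = (neighbor, start) if condition_on_neighbor else start
        counts[key][end] += 1
    return {key: {end: n / sum(row.values()) for end, n in row.items()}
            for key, row in counts.items()}


# Made-up records: all cells start as "moderate" erosion.
cells = [
    ("moderate", "slight", "low"),
    ("moderate", "slight", "low"),
    ("moderate", "moderate", "low"),
    ("moderate", "severe", "high"),
    ("moderate", "severe", "high"),
    ("moderate", "slight", "high"),
]
plain = transition_probs(cells)
spatial = transition_probs(cells, condition_on_neighbor=True)
print(plain["moderate"])
print(spatial[("high", "moderate")])
```

In this toy data the chance of moving from moderate to severe erosion is 1/3 overall but 2/3 when the neighborhood is highly eroded, mirroring the direction of the effect reported in the abstract.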
Lee, J K; Thomas, D C
2000-11-01
Markov chain Monte Carlo (MCMC) techniques for multipoint mapping of quantitative trait loci have been developed on nuclear-family and extended-pedigree data. These methods are based on repeated sampling-peeling and gene dropping of genotype vectors and random sampling of each of the model parameters from their full conditional distributions, given phenotypes, markers, and other model parameters. We further refine such approaches by improving the efficiency of the marker haplotype-updating algorithm and by adopting a new proposal for adding loci. Incorporating these refinements, we have performed an extensive simulation study on simulated nuclear-family data, varying the number of trait loci, family size, displacement, and other segregation parameters. Our simulation studies show that our MCMC algorithm identifies the locations of the true trait loci and estimates their segregation parameters well, provided that the total number of sibship pairs in the pedigree data is reasonably large, heritability of each individual trait locus is not too low, and the loci are not too close together. Our MCMC algorithm was shown to be significantly more efficient than LOKI (Heath 1997) in our simulation study using nuclear-family data.
Meng, Tianhui; Li, Xiaofan; Zhang, Sha; Zhao, Yubin
2016-09-28
Wireless sensor networks (WSNs) have recently gained popularity for a wide spectrum of applications. Monitoring tasks can be performed in various environments. This may be beneficial in many scenarios, but it certainly exhibits new challenges in terms of security due to increased data transmission over the wireless channel with potentially unknown threats. Among possible security issues are timing attacks, which are not prevented by traditional cryptographic security. Moreover, the limited energy and memory resources prohibit the use of complex security mechanisms in such systems. Therefore, balancing between security and the associated energy consumption becomes a crucial challenge. This paper proposes a secure scheme for WSNs while maintaining the requirement of the security-performance tradeoff. In order to proceed to a quantitative treatment of this problem, a hybrid continuous-time Markov chain (CTMC) and queueing model are put forward, and the tradeoff analysis of the security and performance attributes is carried out. By extending and transforming this model, the mean time to security attributes failure is evaluated. Through tradeoff analysis, we show that our scheme can enhance the security of WSNs, and the optimal rekeying rate of the performance and security tradeoff can be obtained.
Martín, Fernando; Moreno, Luis; Garrido, Santiago; Blanco, Dolores
2015-01-01
One of the most important skills desired for a mobile robot is the ability to obtain its own location even in challenging environments. The information provided by the sensing system is used here to solve the global localization problem. In our previous work, we designed different algorithms founded on evolutionary strategies in order to solve the aforementioned task. The latest developments are presented in this paper. The engine of the localization module is a combination of the Markov chain Monte Carlo sampling technique and the Differential Evolution method, which results in a particle filter based on the minimization of a fitness function. The robot’s pose is estimated from a set of possible locations weighted by a cost value. The measurements of the perceptive sensors are used together with the predicted ones in a known map to define a cost function to optimize. Although most localization methods rely on quadratic fitness functions, the sensed information is processed asymmetrically in this filter. The Kullback-Leibler divergence is the basis of a cost function that makes it possible to deal with different types of occlusions. The algorithm performance has been checked in a real map. The results are excellent in environments with dynamic and unmodeled obstacles, a fact that causes occlusions in the sensing area. PMID:26389914
Johannesson, G; Glaser, R E; Lee, C L; Nitao, J J; Hanley, W G
2005-02-07
Estimating unknown system configurations/parameters by combining system knowledge gained from a computer simulation model on one hand and from observed data on the other hand is challenging. An example of such an inverse problem is detecting and localizing potential flaws or changes in a structure by using a finite-element model and measured vibration/displacement data. We propose a probabilistic approach based on Bayesian methodology. This approach yields not only a single best-guess solution, but also a posterior probability distribution over the parameter space. In addition, the Bayesian approach provides a natural framework to accommodate prior knowledge. A Markov chain Monte Carlo (MCMC) procedure is proposed to generate samples from the posterior distribution (an ensemble of likely system configurations given the data). The MCMC procedure proposed explores the parameter space at different resolutions (scales), resulting in a more robust and efficient procedure. The large-scale exploration steps are carried out using coarser-resolution finite-element models, yielding a considerable decrease in computational time, which can be crucial for large finite-element models. An application is given using synthetic displacement data from a simple cantilever beam with MCMC exploration carried out at three different resolutions.
NASA Astrophysics Data System (ADS)
Pan, J.; Durand, M. T.; Vanderjagt, B. J.
2015-12-01
The Markov chain Monte Carlo (MCMC) method is a retrieval algorithm based on Bayes' rule. It starts from an initial state of snow/soil parameters and updates it to a series of new states by comparing the posterior probability of the simulated snow microwave signals before and after each random-walk step. It is a realization of Bayes' rule that approximates the probability of the snow/soil parameters conditional on the measured microwave brightness temperature (TB) signals at different bands. Although this method can solve for all snow parameters, including depth, density, snow grain size and temperature, at the same time, it still needs prior information on these parameters for the posterior probability calculation. How the priors will influence the snow water equivalent (SWE) retrieval is a major concern. Therefore, in this paper, a sensitivity test will first be carried out to study how accurate the snow emission models and how explicit the snow priors need to be to keep the SWE error within a given bound. Synthetic TB simulated from the measured snow properties plus a 2-K observation error will be used for this purpose. The test aims to provide guidance on applying MCMC under different circumstances. Later, the method will be applied to snowpits at different sites, including Sodankyla, Finland; Churchill, Canada; and Colorado, USA, using TB measured by ground-based radiometers at different bands. Building on the previous work, the error in these practical cases will be studied, and the error sources will be separated and quantified.
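The random-walk update described in this abstract can be sketched as a generic Metropolis sampler. This is a minimal illustration and not the authors' retrieval code: the one-parameter Gaussian model, the prior, and the names `log_posterior` and `metropolis` are all assumptions standing in for the snow emission model and the snow/soil parameters.

```python
import math
import random


def log_posterior(theta, data, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Unnormalised log posterior: Gaussian likelihood times Gaussian prior.

    A toy stand-in for 'emission model plus prior'; every value is hypothetical.
    """
    log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - theta) / sigma) ** 2 for y in data)
    return log_prior + log_lik


def metropolis(data, n_steps=5000, step_sd=0.5, theta0=0.0, seed=0):
    """Random-walk Metropolis: propose a step, compare posteriors, accept or stay."""
    rng = random.Random(seed)
    theta = theta0
    current = log_posterior(theta, data)
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        chain.append(theta)
    return chain
```

After discarding an initial burn-in segment, the retained states approximate draws from the posterior, so their spread gives the parameter uncertainty that a single best-fit retrieval would not.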
NASA Astrophysics Data System (ADS)
Moradkhani, Hamid; Yan, Hongxiang
2016-04-01
Soil moisture simulation and prediction are increasingly used to characterize agricultural droughts, but the process suffers from data scarcity and poor data quality. Satellite soil moisture observations could be used to improve model predictions through data assimilation (DA). Remote sensing products, however, are typically discontinuous in spatial and temporal coverage, while simulated soil moisture products are potentially biased due to errors in forcing data and parameters and to deficiencies in model physics. This study attempts to provide a detailed analysis of the joint and separate assimilation of streamflow and Advanced Scatterometer (ASCAT) surface soil moisture into a fully distributed hydrologic model, using the recently developed particle filter-Markov chain Monte Carlo (PF-MCMC) method. A geostatistical model is introduced to overcome the satellite soil moisture discontinuity issue where satellite data does not cover the whole study region or is significantly biased, and the dominant land cover is dense vegetation. The results indicate that joint assimilation of soil moisture and streamflow has minimal effect in improving the streamflow prediction; however, the surface soil moisture field is significantly improved. The combination of DA and the geostatistical approach can further improve the surface soil moisture prediction.
NASA Astrophysics Data System (ADS)
He, Xin; Caffo, Brian S.; Frey, Eric C.
2007-03-01
The ideal observer (IO) employs complete knowledge of the available data statistics and sets an upper limit on the observer performance on a binary classification task. Kupinski proposed an IO estimation method using Markov chain Monte Carlo (MCMC) techniques. In principle, this method can be generalized to any parameterized phantoms and simulated imaging systems. In practice, however, it can be computationally burdensome, because it requires sampling the object distribution and simulating the imaging process a large number of times during the MCMC estimation process. In this work we propose methods that allow application of MCMC techniques to cardiac SPECT imaging IO estimation using a parameterized torso phantom and an accurate analytical projection algorithm that models the SPECT image formation process. To accelerate the imaging simulation process and thus enable the MCMC IO estimation, we used a phantom model with discretized anatomical parameters and continuous uptake parameters. The imaging process simulation was modeled by pre-computing projections for each organ in the finite number of discretely-parameterized anatomic models and taking linear combinations of the organ projections based on sampling of the continuous organ uptake parameters. The proposed method greatly reduces the computational burden and makes MCMC IO estimation for cardiac SPECT imaging possible.
Sweeney, Lisa M.; Parker, Ann; Haber, Lynne T.; Tran, C. Lang; Kuempel, Eileen D.
2015-01-01
A biomathematical model was previously developed to describe the long-term clearance and retention of particles in the lungs of coal miners. The model structure was evaluated and parameters were estimated in two data sets, one from the United States and one from the United Kingdom. The three-compartment model structure consists of deposition of inhaled particles in the alveolar region, competing processes of either clearance from the alveolar region or translocation to the lung interstitial region, and very slow, irreversible sequestration of interstitialized material in the lung-associated lymph nodes. Point estimates of model parameter values were estimated separately for the two data sets. In the current effort, Bayesian population analysis using Markov chain Monte Carlo simulation was used to recalibrate the model while improving assessments of parameter variability and uncertainty. When model parameters were calibrated simultaneously to the two data sets, agreement between the derived parameters for the two groups was very good, and the central tendency values were similar to those derived from the deterministic approach. These findings are relevant to the proposed update of the ICRP human respiratory tract model with revisions to the alveolar-interstitial region based on this long-term particle clearance and retention model. PMID:23454101
Hobolth, Asger; Stone, Eric A
2009-09-01
Analyses of serially-sampled data often begin with the assumption that the observations represent discrete samples from a latent continuous-time stochastic process. The continuous-time Markov chain (CTMC) is one such generative model whose popularity extends to a variety of disciplines ranging from computational finance to human genetics and genomics. A common theme among these diverse applications is the need to simulate sample paths of a CTMC conditional on realized data that is discretely observed. Here we present a general solution to this sampling problem when the CTMC is defined on a discrete and finite state space. Specifically, we consider the generation of sample paths, including intermediate states and times of transition, from a CTMC whose beginning and ending states are known across a time interval of length T. We first unify the literature through a discussion of the three predominant approaches: (1) modified rejection sampling, (2) direct sampling, and (3) uniformization. We then give analytical results for the complexity and efficiency of each method in terms of the instantaneous transition rate matrix Q of the CTMC, its beginning and ending states, and the length of sampling time T. In doing so, we show that no method dominates the others across all model specifications, and we give explicit proof of which method prevails for any given Q, T, and endpoints. Finally, we introduce and compare three applications of CTMCs to demonstrate the pitfalls of choosing an inefficient sampler.
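The endpoint-conditioned sampling problem in this abstract can be sketched with the simplest rejection scheme: simulate the chain forward from the starting state and keep the path only if it ends in the required state. This is the naive variant, not the modified rejection sampler the paper analyses, and the function names and the `max_tries` cap are assumptions for illustration.

```python
import random


def simulate_ctmc(Q, start, T, rng):
    """Forward-simulate a CTMC with rate matrix Q on [0, T].

    Returns the path as a list of (jump_time, new_state), starting with (0.0, start).
    """
    t, state = 0.0, start
    path = [(0.0, start)]
    while True:
        exit_rate = -Q[state][state]
        if exit_rate <= 0.0:
            return path  # absorbing state: no further jumps
        t += rng.expovariate(exit_rate)
        if t >= T:
            return path
        # Pick the next state with probability proportional to its rate.
        targets = [j for j in range(len(Q)) if j != state]
        weights = [Q[state][j] for j in targets]
        state = rng.choices(targets, weights=weights)[0]
        path.append((t, state))


def sample_bridge(Q, start, end, T, rng, max_tries=100000):
    """Naive rejection sampling of a path that starts in `start` and ends in `end`."""
    for _ in range(max_tries):
        path = simulate_ctmc(Q, start, T, rng)
        if path[-1][1] == end:
            return path
    raise RuntimeError("no accepted path; the endpoint may be too unlikely")
```

The acceptance rate equals the transition probability from the starting state to the ending state over time T, so this scheme becomes slow when that probability is small, which is one reason the choice of sampler depends on Q, T, and the endpoints.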
Skolpap, Wanwisa; Scharer, J M; Douglas, P L; Moo-Young, M
2004-06-20
A stoichiometry-based model for the fed-batch culture of the recombinant bacterium Bacillus subtilis ATCC 6051a, producing extracellular alpha-amylase as a desirable product and proteases as undesirable products, was developed and verified. The model was then used for optimizing the feeding schedule in fed-batch culture. To handle higher-order model equations (14 state variables), an optimization methodology for the dual-enzyme system is proposed by integrating Pontryagin's optimum principle with fermentation measurements. Markov chain Monte Carlo (MCMC) procedures were appropriate for model parameter and decision variable estimation by using a priori parameter distributions reflecting the experimental results. Using a simplified Metropolis-Hastings algorithm, the specific productivity of alpha-amylase was maximized and the optimum path was confirmed by experimentation. The optimization process predicted a further 14% improvement of alpha-amylase productivity that could not be realized because of the onset of sporulation. Among the decision variables, the switching time from batch to fed-batch operation (t(s)) was the most sensitive decision variable.
Pirzkal, N.; Rothberg, B.; Koekemoer, Anton; Nilsson, Kim K.; Finkelstein, S.; Malhotra, Sangeeta; Rhoads, James
2012-04-01
We have developed a new method for fitting spectral energy distributions (SEDs) to identify and constrain the physical properties of high-redshift (4 < z < 8) galaxies. Our approach uses an implementation of Bayesian based Markov Chain Monte Carlo that we have dubbed 'πMC²'. It allows us to compare observations to arbitrarily complex models and to compute 95% credible intervals that provide robust constraints for the model parameters. The work is presented in two sections. In the first, we test πMC² using simulated SEDs to not only confirm the recovery of the known inputs but to assess the limitations of the method and identify potential hazards of SED fitting when applied specifically to high-redshift (z > 4) galaxies. In the second part of the paper we apply πMC² to thirty-three 4 < z < 8 objects, including the spectroscopically confirmed Grism ACS Program for Extragalactic Science Lyα sample (4 < z < 6), supplemented by newly obtained Hubble Space Telescope/WFC3 near-IR observations, and several recently reported broadband selected z > 6 galaxies. Using πMC², we are able to constrain the stellar mass of these objects and in some cases their stellar age and find no evidence that any of these sources formed at a redshift larger than z = 8, a time when the universe was ≈ 0.6 Gyr old.
Jewell, J. B.; O'Dwyer, I. J.; Huey, Greg; Gorski, K. M.; Eriksen, H. K.; Wandelt, B. D.
2009-05-20
We present a new Markov Chain Monte Carlo (MCMC) algorithm for cosmic microwave background (CMB) analysis in the low signal-to-noise regime. This method builds on and complements the previously described CMB Gibbs sampler, and effectively solves the low signal-to-noise inefficiency problem of the direct Gibbs sampler. The new algorithm is a simple Metropolis-Hastings sampler with a general proposal rule for the power spectrum, C_ℓ, followed by a particular deterministic rescaling operation of the sky signal, s. The acceptance probability for this joint move depends on the sky map only through the difference of χ² between the original and proposed sky sample, which is close to unity in the low signal-to-noise regime. The algorithm is completed by alternating this move with a standard Gibbs move. Together, these two proposals constitute a computationally efficient algorithm for mapping out the full joint CMB posterior, both in the high and low signal-to-noise regimes.
NASA Astrophysics Data System (ADS)
LIU, B.; Liang, Y.
2015-12-01
Markov chain Monte Carlo (MCMC) simulation is a powerful statistical method for solving inverse problems that arise in a wide range of applications, such as nuclear physics, computational biology, and financial engineering, among others. In the Earth sciences, applications of MCMC are primarily in the field of geophysics [1]. The purpose of this study is to introduce MCMC to geochemical inverse problems related to trace element fractionation during concurrent melting, melt transport and melt-rock reaction in the mantle. The MCMC method has several advantages over linearized least squares methods in inverting trace element patterns in basalts and mantle rocks. First, MCMC can handle equations that have no explicit analytical solutions, which are required by linearized least squares methods for gradient calculation. Second, MCMC converges to the global minimum, while linearized least squares methods may become stuck at a local minimum or converge slowly due to nonlinearity. Furthermore, MCMC can provide insight into uncertainties of model parameters with non-normal trade-off. We use MCMC to invert for extent of melting, amount of trapped melt, and extent of chemical disequilibrium between the melt and residual solid from REE data in abyssal peridotites from the Central Indian Ridge and the Mid-Atlantic Ridge. In the first step, we conduct forward calculation of REE evolution with melting models in a reasonable model space. We then build up a chain of melting models according to the Metropolis-Hastings algorithm to represent the probability of a specific model. We show that chemical disequilibrium is likely to play an important role in fractionating LREE in residual peridotites. In the future, MCMC will be applied to more realistic but also more complicated melting models in which partition coefficients, diffusion coefficients, as well as melting and melt suction rates vary as functions of temperature, pressure and mineral compositions. [1]. Sambridge & Mosegaard [2002] Rev. Geophys.
Pouzat, Christophe; Delescluse, Matthieu; Viot, Pascal; Diebolt, Jean
2004-06-01
Spike-sorting techniques attempt to classify a series of noisy electrical waveforms according to the identity of the neurons that generated them. Existing techniques perform this classification ignoring several properties of actual neurons that can ultimately improve classification performance. In this study, we propose a more realistic spike train generation model. It incorporates both a description of "nontrivial" (i.e., non-Poisson) neuronal discharge statistics and a description of spike waveform dynamics (e.g., the events amplitude decays for short interspike intervals). We show that this spike train generation model is analogous to a one-dimensional Potts spin-glass model. We can therefore tailor to our particular case the computational methods that have been developed in fields where Potts models are extensively used, including statistical physics and image restoration. These methods are based on the construction of a Markov chain in the space of model parameters and spike train configurations, where a configuration is defined by specifying a neuron of origin for each spike. This Markov chain is built such that its unique stationary density is the posterior density of model parameters and configurations given the observed data. A Monte Carlo simulation of the Markov chain is then used to estimate the posterior density. We illustrate the way to build the transition matrix of the Markov chain with a simple, but realistic, model for data generation. We use simulated data to illustrate the performance of the method and to show that this approach can easily cope with neurons firing doublets of spikes and/or generating spikes with highly dynamic waveforms. The method cannot automatically find the "correct" number of neurons in the data. User input is required for this important problem and we illustrate how this can be done. We finally discuss further developments of the method.
Comen, E; Mason, J; Kuhn, P; Nieva, J; Newton, P; Norton, L; Venkatappa, N; Jochelson, M
2014-06-01
Purpose: Traditionally, breast cancer metastasis is described as a process wherein cancer cells spread from the breast to multiple organ systems via hematogenous and lymphatic routes. Mapping organ specific patterns of cancer spread over time is essential to understanding metastatic progression. In order to better predict sites of metastases, here we demonstrate modeling of the patterned migration of metastasis. Methods: We reviewed the clinical history of 453 breast cancer patients from Memorial Sloan Kettering Cancer Center who were non-metastatic at diagnosis but developed metastasis over time. We used the variables of organ site of metastases as well as time to create a Markov chain model of metastasis. We illustrate the probabilities of metastasis occurring at a given anatomic site together with the probability of spread to additional sites. Results: Based on the clinical histories of 453 breast cancer patients who developed metastasis, we have learned (i) how to create the Markov transition matrix governing the probabilities of cancer progression from site to site; (ii) how to create a systemic network diagram governing disease progression modeled as a random walk on a directed graph; (iii) how to classify metastatic sites as ‘sponges’ that tend to only receive cancer cells or ‘spreaders’ that receive and release them; (iv) how to model the time-scales of disease progression as a Weibull probability distribution function; (v) how to perform Monte Carlo simulations of disease progression; and (vi) how to interpret disease progression as an entropy-increasing stochastic process. Conclusion: Based on our modeling, metastatic spread may follow predictable pathways. Mapping metastasis not simply by organ site, but by function as either a ‘spreader’ or ‘sponge’ fundamentally reframes our understanding of metastatic processes. This model serves as a novel platform from which we may integrate the evolving genomic landscape that drives cancer
NASA Astrophysics Data System (ADS)
Jadoon, K. Z.; Altaf, M. U.; McCabe, M. F.; Hoteit, I.; Moghadas, D.
2014-12-01
In arid and semi-arid regions, soil salinity has a major impact on agro-ecosystems, agricultural productivity, environment and sustainability. High levels of soil salinity adversely affect plant growth and productivity, soil and water quality, and may eventually result in soil erosion and land degradation. Being essentially a hazard, it's important to monitor and map soil salinity at an early stage to effectively use soil resources and maintain soil salinity level below the salt tolerance of crops. In this respect, low frequency electromagnetic induction (EMI) systems can be used as a noninvasive method to map the distribution of soil salinity at the field scale and at a high spatial resolution. In this contribution, an EMI system (the CMD Mini-Explorer) is used to estimate soil salinity using a Bayesian approach implemented via a Markov chain Monte Carlo (MCMC) sampling for inversion of multi-configuration EMI measurements. In-situ and EMI measurements were conducted across a farm where Acacia trees are irrigated with brackish water using a drip irrigation system. The electromagnetic forward model is based on the full solution of Maxwell's equation, and the subsurface is considered as a three-layer problem. In total, five parameters (electrical conductivity of three layers and thickness of top two layers) were inverted and modeled electrical conductivities were converted into the universal standard of soil salinity measurement (i.e. using the method of electrical conductivity of a saturated soil paste extract). Simulation results demonstrate that the proposed scheme successfully recovers soil salinity and reduces the uncertainties in the prior estimate. Analysis of the resulting posterior distribution of parameters indicates that electrical conductivity of the top two layers and the thickness of the first layer are well constrained by the EMI measurements. The proposed approach allows for quantitative mapping and monitoring of the spatial electrical conductivity
NASA Astrophysics Data System (ADS)
WöHling, Thomas; Vrugt, Jasper A.
2011-04-01
In the past two decades significant progress has been made toward the application of inverse modeling to estimate the water retention and hydraulic conductivity functions of the vadose zone at different spatial scales. Many of these contributions have focused on estimating only a few soil hydraulic parameters, without recourse to appropriately capturing and addressing spatial variability. The assumption of a homogeneous medium significantly simplifies the complexity of the resulting inverse problem, allowing the use of classical parameter estimation algorithms. Here we present an inverse modeling study with a high degree of vertical complexity that involves calibration of a 25 parameter Richards'-based HYDRUS-1D model using in situ measurements of volumetric water content and pressure head from multiple depths in a heterogeneous vadose zone in New Zealand. We first determine the trade-off in the fitting of both data types using the AMALGAM multiple objective evolutionary search algorithm. Then we adopt a Bayesian framework and derive posterior probability density functions of parameter and model predictive uncertainty using the recently developed differential evolution adaptive metropolis, DREAMZS adaptive Markov chain Monte Carlo scheme. We use four different formulations of the likelihood function each differing in their underlying assumption about the statistical properties of the error residual and data used for calibration. We show that AMALGAM and DREAMZS can solve for the 25 hydraulic parameters describing the water retention and hydraulic conductivity functions of the multilayer heterogeneous vadose zone. Our study clearly highlights that multiple data types are simultaneously required in the likelihood function to result in an accurate soil hydraulic characterization of the vadose zone of interest. Remaining error residuals are most likely caused by model deficiencies that are not encapsulated by the multilayer model and can not be accessed by the
NASA Astrophysics Data System (ADS)
Yang, Y.; Zhou, X.; Weng, E.; Luo, Y.
2010-12-01
The Markov chain Monte Carlo (MCMC) method has been widely used to estimate terrestrial ecosystem model parameters. However, inverse analysis is now mainly applied to estimate parameters in terrestrial ecosystem carbon models and has not yet been used to invert terrestrial nitrogen model parameters. In this study, Bayesian probability inversion and the MCMC technique were applied to invert model parameters in a coupled carbon-nitrogen model, and the trained ecosystem model was then used to predict nitrogen pool sizes at the Duke Forest FACE site. We used datasets of soil respiration, nitrogen mineralization, nitrogen uptake, and carbon and nitrogen pools in wood, foliage, litterfall, microbial biomass, forest floor, and mineral soil under ambient and elevated CO2 plots from 1996-2005. Our results showed that the initial values of C pools in leaf, wood, root, litter, microbial biomass and forest floor were well constrained. The transfer coefficients from the pools of leaf biomass, woody biomass, root biomass, litter, and forest floor were also well constrained by the actual measurements. The observed datasets gave moderate information on the transfer coefficient from the slow soil carbon pool. The parameters in the nitrogen components, such as C:N in plant, litter, and soil, were also well constrained. In addition, parameters describing nitrogen dynamics (i.e. nitrogen uptake, nitrogen loss, and nitrogen input through biological fixation and deposition) were also well constrained. The carbon and nitrogen pool sizes predicted using the constrained ecosystem models were consistent with the observed values. Overall, these results suggest that the MCMC inversion technique is an effective method to synthesize information from various sources for predicting the responses of ecosystem carbon and nitrogen cycling to elevated CO2.
Li, Jun; Calo, Victor M.
2013-09-15
We present a single-particle Lennard–Jones (L-J) model for CO2 and N2. Simplified L-J models for other small polyatomic molecules can be obtained following the methodology described herein. The phase-coexistence diagrams of single-component systems computed using the proposed single-particle models for CO2 and N2 agree well with experimental data over a wide range of temperatures. These diagrams are computed using the Markov Chain Monte Carlo method based on the Gibbs-NVT ensemble. This good agreement validates the proposed simplified models. That is, with properly selected parameters, the single-particle models have similar accuracy in predicting gas-phase properties as more complex, state-of-the-art molecular models. To further test these single-particle models, three binary mixtures of CH4, CO2 and N2 are studied using a Gibbs-NPT ensemble. These results are compared against experimental data over a wide range of pressures. The single-particle model has similar accuracy in the gas phase as traditional models although its deviation in the liquid phase is greater. Since the single-particle model reduces the particle number and avoids the time-consuming Ewald summation used to evaluate Coulomb interactions, the proposed model improves the computational efficiency significantly, particularly in the case of high liquid density where the acceptance rate of the particle-swap trial move increases. We compare, at constant temperature and pressure, the Gibbs-NPT and Gibbs-NVT ensembles to analyze their performance differences and results consistency. As theoretically predicted, the agreement between the simulations implies that Gibbs-NVT can be used to validate Gibbs-NPT predictions when experimental data is not available.
Modeling kinetics of a large-scale fed-batch CHO cell culture by Markov chain Monte Carlo method.
Xing, Zizhuo; Bishop, Nikki; Leister, Kirk; Li, Zheng Jian
2010-01-01
A Markov chain Monte Carlo (MCMC) method was applied to model kinetics of a fed-batch Chinese hamster ovary cell culture process in 5,000-L bioreactors. The kinetic model consists of six differential equations, which describe dynamics of viable cell density and concentrations of glucose, glutamine, ammonia, lactate, and the antibody fusion protein B1 (B1). The kinetic model has 18 parameters, six of which were calculated from the cell culture data, whereas the other 12 were estimated using an MCMC method from a training data set that comprised seven cell culture runs. The model was confirmed in two validation data sets that represented a perturbation of the cell culture condition. The agreement between the predicted and measured values of both validation data sets may indicate high reliability of the model estimates. The kinetic model uniquely incorporated the ammonia removal and the exponential function of B1 protein concentration. The model indicated that ammonia and lactate play critical roles in cell growth and that low concentrations of glucose (0.17 mM) and glutamine (0.09 mM) in the cell culture medium may help reduce ammonia and lactate production. The model demonstrated that 83% of the glucose consumed was used for cell maintenance during the late phase of the cell cultures, whereas the maintenance coefficient for glutamine was negligible. Finally, the kinetic model suggests that it is critical for B1 production to sustain a high number of viable cells. The MCMC methodology may be a useful tool for modeling kinetics of a fed-batch mammalian cell culture process.
Mukhopadhyay, Anirban; Mondal, Parimal; Barik, Jyotiskona; Chowdhury, S M; Ghosh, Tuhin; Hazra, Sugata
2015-06-01
The composition and assemblage of mangroves in the Bangladesh Sundarbans are changing systematically in response to several environmental factors. In order to understand the impact of the changing environmental conditions on the mangrove forest, species composition maps for the years 1985, 1995 and 2005 were studied. In the present study, 1985 and 1995 species zonation maps were considered as base data and the cellular automata-Markov chain model was run to predict the species zonation for the year 2005. The model output was validated against the actual dataset for 2005 and calibrated. Finally, using the model, mangrove species zonation maps for the years 2025, 2055 and 2105 have been prepared. The model was run with the assumption that the continuation of the current tempo and mode of drivers of environmental factors (temperature, rainfall, salinity change) of the last two decades will remain the same in the next few decades. Present findings show that the area distribution of the following species assemblages like Goran (Ceriops), Sundari (Heritiera), Passur (Xylocarpus), and Baen (Avicennia) would decrease in the descending order, whereas the area distribution of Gewa (Excoecaria), Keora (Sonneratia) and Kankra (Bruguiera) dominated assemblages would increase. The spatial distribution of projected mangrove species assemblages shows that more salt tolerant species will dominate in the future; which may be used as a proxy to predict the increase of salinity and its spatial variation in Sundarbans. Considering the present rate of loss of forest land, 17% of the total mangrove cover is predicted to be lost by the year 2105 with a significant loss of fresh water loving mangroves and related ecosystem services. This paper describes a unique approach to assess future changes in species composition and future forest zonation in mangroves under the 'business as usual' scenario of climate change.
NASA Astrophysics Data System (ADS)
Mandal, K. G.; Padhi, J.; Kumar, A.; Ghosh, S.; Panda, D. K.; Mohanty, R. K.; Raychaudhuri, M.
2015-08-01
Rainfed agriculture plays and will continue to play a dominant role in providing food and livelihoods for an increasing world population. Rainfall analyses are helpful for proper crop planning under changing environment in any region. Therefore, in this paper, an attempt has been made to analyse 16 years of rainfall (1995-2010) at the Daspalla region in Odisha, eastern India for prediction using six probability distribution functions, forecasting the probable date of onset and withdrawal of monsoon, occurrence of dry spells by using Markov chain model and finally crop planning for the region. For prediction of monsoon and post-monsoon rainfall, log Pearson type III and Gumbel distribution were the best-fit probability distribution functions. The earliest and most delayed week of the onset of rainy season was the 20th standard meteorological week (SMW) (14th-20th May) and 25th SMW (18th-24th June), respectively. Similarly, the earliest and most delayed week of withdrawal of rainfall was the 39th SMW (24th-30th September) and 47th SMW (19th-25th November), respectively. The longest and shortest length of rainy season was 26 and 17 weeks, respectively. The chances of occurrence of dry spells are high from the 1st-22nd SMW and again the 42nd SMW to the end of the year. The probability of weeks (23rd-40th SMW) remaining wet varies between 62 and 100 % for the region. Results obtained through this analysis would be utilised for agricultural planning and mitigation of dry spells at the Daspalla region in Odisha, India.
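The dry-spell analysis in this abstract rests on a Markov chain over weekly wet/dry states. A minimal sketch of estimating first-order transition probabilities from such a sequence is given below; the function name `transition_probabilities`, the `W`/`D` coding, and the ten-week toy record are assumptions, not the study's data or thresholds.

```python
from collections import Counter


def transition_probabilities(states):
    """Estimate first-order transition probabilities from a state sequence.

    Returns a dict mapping (previous_state, next_state) to the fraction of
    times previous_state was followed by next_state.
    """
    pair_counts = Counter(zip(states, states[1:]))
    labels = sorted(set(states))
    probs = {}
    for prev in sorted(set(states[:-1])):
        total = sum(pair_counts[(prev, nxt)] for nxt in labels)
        for nxt in labels:
            probs[(prev, nxt)] = pair_counts[(prev, nxt)] / total
    return probs


# Hypothetical record: one letter per week, W = wet, D = dry.
weeks = "DDWWWDWWDD"
probs = transition_probabilities(weeks)
```

In this toy record `probs[("W", "W")]` is the estimated chance that a wet week is followed by another wet week, the kind of conditional probability used to judge how likely a spell is to persist.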
SED Fitting with Markov Chain Monte Carlo: The Case of z=2.1 Lyman Alpha Emitters
NASA Astrophysics Data System (ADS)
Acquaviva, Viviana; Guaita, L.; Gawiser, E.; Padilla, N.
2011-01-01
The analysis of Spectral Energy Distributions (SEDs) of faraway galaxies provides us with valuable information on how the structures in the Universe evolved into what we see today. This requires a correct interpretation of data which are constantly improving in volume and precision, which can only be done by developing adequately sophisticated instruments of statistical analysis. We present our Markov Chain Monte Carlo (MCMC) algorithm, which is able to sample large parameter spaces and complicated star formation histories efficiently and can handle multiple stellar populations. This instrument is key for obtaining reliable estimates of SED parameters (e.g. age, stellar mass, dust content) and their uncertainties. It also reveals degeneracies between parameters and illustrates which physical quantities are best suited to describe certain samples of galaxies. We apply this method to the sample of 250 z = 2.1 Lyman Alpha Emitters (LAEs) from Guaita et al (2010a). High-redshift LAEs are of great interest because they probe the faint end of the galaxy luminosity function, where the bulk of galaxies reside, and have been shown to be building blocks of Milky-Way type galaxies today. This analysis complements the ones presented for z = 3.1 LAEs in Gawiser et al (2007) and for a number of subsamples of the same z = 2.1 LAE sample in Guaita et al (2010b), which were carried out using a grid-based maximum likelihood method. Our results confirm and strengthen the findings that LAEs at z = 2.1 have similar stellar masses to, but are dustier than, z = 3.1 LAEs; typical values are respectively M* ~ 5 × 10^8 M_Sun and A_V ~ 0.9. The current data don't allow us to discriminate among different star formation histories. We gratefully acknowledge support from NSF, DOE and NASA.
NASA Astrophysics Data System (ADS)
Xu, F.; Diner, D. J.; Seidel, F. C.; Dubovik, O.; Zhai, P.
2014-12-01
A vector Markov chain radiative transfer method was developed for forward modeling of radiance and polarization fields in a coupled atmosphere-ocean system. The method was benchmarked against an independent Successive Orders of Scattering code and linearized through the use of Jacobians. Incorporated with the multi-patch optimization algorithm and look-up-table method, simultaneous aerosol and ocean color retrievals were performed using imagery acquired by the Airborne Multiangle SpectroPolarimetric Imager (AirMSPI) when it was operated in step-and-stare mode with 9 viewing angles ranging between ±67°. Data from channels near 355, 380, 445, 470*, 555, 660*, and 865* nm were used in the retrievals, where the asterisk denotes the polarimetric bands. Retrievals were run for AirMSPI overflights over Southern California and Monterey Bay, CA. For the relatively high aerosol optical depth (AOD) case (~0.28 at 550 nm), the retrieved aerosol concentration, size distribution, water-leaving radiance, and chlorophyll concentration were compared to those reported by the USC SeaPRISM AERONET-OC site off the coast of Southern California on 6 February 2013. For the relatively low AOD case (~0.08 at 550 nm), the retrieved aerosol concentration and size distribution were compared to those reported by the Monterey Bay AERONET site on 28 April 2014. Further, we evaluate the benefits of multi-angle and polarimetric observations by performing the retrievals using (a) all view angles and channels; (b) all view angles but radiances only (no polarization); (c) the nadir view angle only with both radiance and polarization; and (d) the nadir view angle without polarization. Optimized retrievals using different initial guesses were performed to provide a measure of retrieval uncertainty. Removal of multi-angular or polarimetric information resulted in increases in both parameter uncertainty and systematic bias. Potential accuracy improvements afforded by applying constraints on the surface
Lin, Chin; Chu, Chi-Ming; Su, Sui-Lung
2016-01-01
Conventional genome-wide association studies (GWAS) have been proven to be a successful strategy for identifying genetic variants associated with complex human traits. However, there is still a large heritability gap between GWAS and traditional family studies. The “missing heritability” has been suggested to be due to lack of studies focused on epistasis, also called gene–gene interactions, because individual trials have often had insufficient sample size. Meta-analysis is a common method for increasing statistical power. However, sufficient detailed information is difficult to obtain. A previous study employed a meta-regression-based method to detect epistasis, but it faced the challenge of inconsistent estimates. Here, we describe a Markov chain Monte Carlo-based method, called “Epistasis Test in Meta-Analysis” (ETMA), which uses genotype summary data to obtain consistent estimates of epistasis effects in meta-analysis. We defined a series of conditions to generate simulation data and tested the power and type I error rates in ETMA, individual data analysis and a conventional meta-regression-based method. ETMA not only successfully facilitated consistency of evidence but also yielded acceptable type I error and higher power than conventional meta-regression. We applied ETMA to three real meta-analysis data sets. We found significant gene–gene interactions in the renin–angiotensin system and the polycyclic aromatic hydrocarbon metabolism pathway, with strong supporting evidence. In addition, glutathione S-transferase (GST) mu 1 and theta 1 were confirmed to exert independent effects on cancer. We concluded that the application of ETMA to real meta-analysis data was successful. Finally, we developed an R package, etma, for the detection of epistasis in meta-analysis [etma is available via the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/etma/index.html]. PMID:27045371
Tracking Human Pose Using Max-Margin Markov Models.
Zhao, Lin; Gao, Xinbo; Tao, Dacheng; Li, Xuelong
2015-12-01
We present a new method for tracking human pose by employing max-margin Markov models. Representing a human body by part-based models, such as pictorial structure, the problem of pose tracking can be modeled by a discrete Markov random field. Because max-margin Markov networks provide an efficient way to deal with structured data along with strong generalization guarantees, it is natural to learn the model parameters using the max-margin technique. Since tracking human pose requires coupling limbs in adjacent frames, the model introduces loops and becomes intractable for learning and inference. Previous work has resorted to pose estimation methods, which discard temporal information by parsing frames individually. Alternatively, approximate inference strategies have been used, which can overfit to statistics of a particular data set. Thus, the performance and generalization of these methods are limited. In this paper, we approximate the full model by introducing an ensemble of two tree-structured sub-models, Markov networks for spatial parsing and Markov chains for temporal parsing. Both models can be trained jointly using the max-margin technique, and an iterative parsing process is proposed to achieve the ensemble inference. We apply our model to three challenging data sets, which contain highly varied and articulated poses. Comprehensive experimental results demonstrate the superior performance of our method over the state-of-the-art approaches.
Bayesian restoration of ion channel records using hidden Markov models.
Rosales, R; Stark, J A; Fitzgerald, W J; Hladky, S B
2001-03-01
Hidden Markov models have been used to restore recorded signals of single ion channels buried in background noise. Parameter estimation and signal restoration are usually carried out through likelihood maximization by using variants of the Baum-Welch forward-backward procedures. This paper presents an alternative approach for dealing with this inferential task. The inferences are made by using a combination of the framework provided by Bayesian statistics and numerical methods based on Markov chain Monte Carlo stochastic simulation. The reliability of this approach is tested by using synthetic signals of known characteristics. The expectations of the model parameters estimated here are close to those calculated using the Baum-Welch algorithm, but the present methods also yield estimates of their errors. Comparisons of the results of the Bayesian Markov Chain Monte Carlo approach with those obtained by filtering and thresholding demonstrate clearly the superiority of the new methods.
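Both the Baum-Welch procedure and a Bayesian treatment of a hidden Markov model rest on the same likelihood, which the forward recursion evaluates. The sketch below is not the authors' code: it assumes a two-state channel (closed/open) with Gaussian noise of known standard deviation, and all numerical values are hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def forward_filter(obs, init, trans, means, sd):
    """Scaled forward pass of a Gaussian-emission hidden Markov model.

    Returns (log_likelihood, filtered), where filtered[t][k] is
    P(state_t = k | obs[0..t]).
    """
    n_states = len(init)
    log_lik = 0.0
    filtered = []
    prior = list(init)
    for y in obs:
        # Combine the predicted state distribution with the emission density.
        joint = [prior[k] * normal_pdf(y, means[k], sd) for k in range(n_states)]
        norm = sum(joint)
        log_lik += math.log(norm)
        post = [j / norm for j in joint]
        filtered.append(post)
        # One-step prediction: push the filtered distribution through the chain.
        prior = [
            sum(post[i] * trans[i][k] for i in range(n_states))
            for k in range(n_states)
        ]
    return log_lik, filtered


# Hypothetical settings: state 0 = closed (level 0.0), state 1 = open (level 1.0).
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
means = [0.0, 1.0]
sd = 0.5
obs = [0.1, -0.2, 0.9, 1.1, 0.8]  # made-up noisy current samples

log_lik, filtered = forward_filter(obs, init, trans, means, sd)
print(log_lik)
print([round(p[1], 3) for p in filtered])  # filtered probability of "open"
```

An MCMC scheme of the kind the abstract describes would evaluate such a likelihood, or sample the hidden state sequence, repeatedly while proposing new parameter values; that outer loop is omitted here.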
Bayesian Inference in Probabilistic Risk Assessment -- The Current State of the Art
Dana L. Kelly; Curtis L. Smith
2009-02-01
Markov chain Monte Carlo approaches to sampling directly from the joint posterior distribution of aleatory model parameters have led to tremendous advances in Bayesian inference capability in a wide variety of fields, including probabilistic risk analysis. The advent of freely available software coupled with inexpensive computing power has catalyzed this advance. This paper examines where the risk assessment community is with respect to implementing modern computational-based Bayesian approaches to inference. Through a series of examples in different topical areas, it introduces salient concepts and illustrates the practical application of Bayesian inference via Markov chain Monte Carlo sampling to a variety of important problems.
NASA Astrophysics Data System (ADS)
Olivares, G.; Teferle, F. N.
2013-12-01
Geodetic time series provide information which helps to constrain theoretical models of geophysical processes. It is well established that such time series, for example from GPS, superconducting gravity or mean sea level (MSL), contain time-correlated noise which is usually assumed to be a combination of a long-term stochastic process (characterized by a power-law spectrum) and random noise. Therefore, when fitting a model to geodetic time series it is essential to also estimate the stochastic parameters besides the deterministic ones. Often the stochastic parameters include the power amplitudes of both time-correlated and random noise, as well as the spectral index of the power-law process. To date, the most widely used method for obtaining these parameter estimates is based on maximum likelihood estimation (MLE). We present an integration method, the Bayesian Monte Carlo Markov Chain (MCMC) method, which, by using Markov chains, provides a sample of the posterior distribution of all parameters and, thereby, using Monte Carlo integration, all parameters and their uncertainties are estimated simultaneously. This algorithm automatically optimizes the Markov chain step size and estimates the convergence state by spectral analysis of the chain. We assess the MCMC method through comparison with MLE, using the recently released GPS position time series from JPL, and apply it also to the MSL time series from the Revised Local Reference data base of the PSMSL. Although the parameter estimates for both methods are fairly equivalent, they suggest that the MCMC method has some advantages over MLE: for example, without further computations it provides the spectral index uncertainty, is computationally stable and detects multimodality.
Tani, Yuji
2016-01-01
Background: Consistent with the “attention, interest, desire, memory, action” (AIDMA) model of consumer behavior, patients collect information about available medical institutions using the Internet to select information for their particular needs. Studies of consumer behavior may be found in areas other than medical institution websites. Such research uses Web access logs for visitor search behavior. At this time, research applying the patient searching behavior model to medical institution website visitors is lacking. Objective: We have developed a hospital website search behavior model using a Bayesian approach to clarify the behavior of medical institution website visitors and determine the probability of their visits, classified by search keyword. Methods: We used the website data access log of a clinic of internal medicine and gastroenterology in the Sapporo suburbs, collecting data from January 1 through June 31, 2011. The contents of the 6 website pages included the following: home, news, content introduction for medical examinations, mammography screening, holiday person-on-duty information, and other. The search keywords we identified as best expressing website visitor needs were listed as the top 4 headings from the access log: clinic name, clinic name + regional name, clinic name + medical examination, and mammography screening. Using the search keywords as the explanatory variable, we built a binomial probit model that allows inspection of the contents of each response variable. Using this model, we determined a beta value and generated a posterior distribution. We performed the simulation using Markov Chain Monte Carlo methods with a noninformative prior distribution for this model and determined the visit probability classified by keyword for each category. Results: In the case of the keyword “clinic name,” the visit probability to the website, repeated visit to the website, and contents page for medical examination was positive. In the case of the
Audren, Benjamin; Lesgourgues, Julien; Bird, Simeon; Haehnelt, Martin G.; Viel, Matteo
2013-01-01
We present forecasts for the accuracy of determining the parameters of a minimal cosmological model and the total neutrino mass based on combined mock data for a future Euclid-like galaxy survey and Planck. We consider two different galaxy surveys: a spectroscopic redshift survey and a cosmic shear survey. We make use of the Monte Carlo Markov Chains (MCMC) technique and assume two sets of theoretical errors. The first error is meant to account for uncertainties in the modelling of the effect of neutrinos on the non-linear galaxy power spectrum and we assume this error to be fully correlated in Fourier space. The second error is meant to parametrize the overall residual uncertainties in modelling the non-linear galaxy power spectrum at small scales, and is conservatively assumed to be uncorrelated and to increase with the ratio of a given scale to the scale of non-linearity. It hence increases with wavenumber and decreases with redshift. With these two assumptions for the errors and assuming further conservatively that the uncorrelated error rises above 2% at k = 0.4 h/Mpc and z = 0.5, we find that a future Euclid-like cosmic shear/galaxy survey achieves a 1-σ error on M_ν close to 32 meV/25 meV, sufficient for detecting the total neutrino mass with good significance. If the residual uncorrelated errors indeed rise rapidly towards smaller scales in the non-linear regime, as we have assumed here, then the data on non-linear scales do not increase the sensitivity to the total neutrino mass. Assuming instead a ten times smaller theoretical error with the same scale dependence, the error on the total neutrino mass decreases moderately from σ(M_ν) = 18 meV to 14 meV when mildly non-linear scales with 0.1 h/Mpc < k < 0.6 h/Mpc are included in the analysis of the galaxy survey data.
NASA Astrophysics Data System (ADS)
Shishmarev, Dmitry; Chapman, Bogdan E.; Naumann, Christoph; Mamone, Salvatore; Kuchel, Philip W.
2015-01-01
The 1H NMR signal of the methyl group of sodium acetate is shown to be a triplet in the anisotropic environment of stretched gelatin gel. The multiplet structure of the signal is due to the intra-methyl residual dipolar couplings. The relaxation properties of the spin system were probed by recording steady-state irradiation envelopes ('z-spectra'). A quantum-mechanical model based on irreducible spherical tensors formed by the three magnetically equivalent spins of the methyl group was used to simulate and fit experimental z-spectra. The multiple parameter values of the relaxation model were estimated by using a Bayesian-based Markov chain Monte Carlo algorithm.
Inference and Hierarchical Modeling in the Social Sciences.
ERIC Educational Resources Information Center
Draper, David
1995-01-01
The use of hierarchical models in social science research is discussed, with emphasis on causal inference and consideration of the limitations of hierarchical models. The increased use of Gibbs sampling and other Markov-chain Monte Carlo methods in the application of hierarchical models is recommended. (SLD)
King, Martin D; Crowder, Martin J; Hand, David J; Harris, Neil G; Williams, Stephen R; Obrenovitch, Tihomir P; Gadian, David G
2003-06-01
Markov chain Monte Carlo simulation was used in a reanalysis of the longitudinal data obtained by Harris et al. (J Cereb Blood Flow Metab 20:28-36) in a study of the direct current (DC) potential and apparent diffusion coefficient (ADC) responses to focal ischemia. The main purpose was to provide a formal analysis of the temporal relationship between the ADC and DC responses, to explore the possible involvement of a common latent (driving) process. A Bayesian nonlinear hierarchical random coefficients model was adopted. DC and ADC transition parameter posterior probability distributions were generated using three parallel Markov chains created using the Metropolis algorithm. Particular attention was paid to the within-subject differences between the DC and ADC time course characteristics. The results show that the DC response is biphasic, whereas the ADC exhibits monophasic behavior, and that the two DC components are each distinguishable from the ADC response in their time dependencies. The DC and ADC changes are not, therefore, driven by a common latent process. This work demonstrates a general analytical approach to the multivariate, longitudinal data-processing problem that commonly arises in stroke and other biomedical research.
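Running several parallel Metropolis chains, as in the abstract above, is usually paired with a between-chain convergence check. The following is a minimal sketch under stated assumptions rather than the authors' analysis: a one-dimensional standard normal stands in for the posterior, and the Gelman-Rubin statistic is used as the check (the abstract does not say which diagnostic was applied). All names and settings are hypothetical.

```python
import math
import random
import statistics


def metropolis(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def log_target(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


# Three chains with dispersed starting points and independent random streams.
burn_in = 1000
chains = []
for seed, start in zip((1, 2, 3), (-5.0, 0.0, 5.0)):
    rng = random.Random(seed)
    draws = metropolis(log_target, start, n_steps=5000, step_size=1.0, rng=rng)
    chains.append(draws[burn_in:])

r_hat = gelman_rubin(chains)
pooled_mean = statistics.fmean([x for c in chains for x in c])
print(r_hat, pooled_mean)
```

A value of `r_hat` close to 1 suggests that the chains have mixed; values well above 1 indicate that the chains still disagree and more iterations, or a better proposal scale, are needed.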
NASA Astrophysics Data System (ADS)
Marko, K.; Zulkarnain, F.; Kusratmoko, E.
2016-11-01
Land cover changes, particularly in urban catchment areas, have been occurring rapidly. They occur as a result of increasing demand for built-up area. Various environmental and hydrological problems, e.g. floods and urban heat islands, can arise if the changes are uncontrolled. This study aims to predict land cover changes by coupling Markov chains and cellular automata. One of the most rapid land cover changes is occurring in the upper Ci Leungsi catchment area, located near Bekasi City and the Jakarta Metropolitan Area. Markov chains have a good ability to predict the probability of change statistically, while cellular automata are regarded as a powerful method for reading the spatial patterns of change. Temporal land cover data were obtained from remote sensing satellite imagery. In addition, this study used multi-criteria analysis to determine which driving factors, such as proximity, elevation, and slope, could stimulate the changes. Coupling these two methods could give a better prediction model than using either separately. The prediction model was validated using existing 2015 land cover data and showed a satisfactory kappa coefficient. The most significant land cover increase is in built-up area, from 24% to 53%.
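The Markov chain half of such a coupled model can be illustrated with a short sketch: class shares are propagated through a row-stochastic transition matrix, one step per time interval. The classes, shares, and matrix below are hypothetical and are not taken from the study; the cellular automata step, which decides where the change happens, is not shown.

```python
def project_shares(shares, transition, steps=1):
    """Propagate land-cover shares through a row-stochastic transition matrix.

    transition[i][j] is the probability that a cell of class i at one date
    belongs to class j at the next date.
    """
    n = len(shares)
    for _ in range(steps):
        shares = [
            sum(shares[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return shares


# Hypothetical classes: built-up, vegetation, water.
shares = [0.2, 0.7, 0.1]
transition = [
    [1.0, 0.0, 0.0],  # built-up stays built-up
    [0.2, 0.8, 0.0],  # 20% of vegetation converts to built-up per step
    [0.0, 0.0, 1.0],  # water is unchanged
]

projected = project_shares(shares, transition)
print(projected)
```

In a real application the matrix would be estimated by cross-tabulating two classified images from different dates, and the projected shares would serve as the quantity constraint for the spatial allocation step.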
NASA Astrophysics Data System (ADS)
Raymond, V.; van der Sluys, M. V.; Mandel, I.; Kalogera, V.; Röver, C.; Christensen, N.
2010-06-01
Gravitational-wave signals from inspirals of binary compact objects (black holes and neutron stars) are primary targets of the ongoing searches by ground-based gravitational-wave (GW) interferometers (LIGO, Virgo and GEO-600). We present parameter estimation results from our Markov-chain Monte Carlo code SPINspiral on signals from binaries with precessing spins. Two data sets are created by injecting simulated GW signals either into synthetic Gaussian noise or into LIGO detector data. We compute the 15-dimensional probability-density functions (PDFs) for both data sets, as well as for a data set containing LIGO data with a known, loud artefact ('glitch'). We show that the analysis of the signal in detector noise yields accuracies similar to those obtained using simulated Gaussian noise. We also find that while the Markov chains from the glitch do not converge, the PDFs would look consistent with a GW signal present in the data. While our parameter estimation results are encouraging, further investigations into how to differentiate an actual GW signal from noise are necessary.
NASA Astrophysics Data System (ADS)
Määttänen, Anni; Douspis, Marian
2015-04-01
In recent years several datasets on deposition mode ice nucleation in Martian conditions have shown that the effectiveness of mineral dust as a condensation nucleus decreases with temperature (Iraci et al., 2010; Phebus et al., 2011; Trainer et al., 2009). Previously, nucleation modelling in Martian conditions used only constant values of this so-called contact parameter, provided by the few studies previously published on the topic. The new studies paved the way for a possibly more realistic way of predicting ice crystal formation in the Martian environment. However, the caveat of these studies (Iraci et al., 2010; Phebus et al., 2011) was the limited temperature range, which prevents the provided (linear) equations for the contact parameter temperature dependence from being used in all conditions of cloud formation on Mars. One wide temperature range deposition mode nucleation dataset exists (Trainer et al., 2009), but the substrate used was silicon, which cannot realistically imitate the most abundant ice nucleus on Mars, mineral dust. Nevertheless, this dataset revealed, thanks to measurements spanning from 150 to 240 K, that the behaviour of the contact parameter as a function of temperature was exponential rather than linear as suggested by previous work. We have tried to combine the previous findings to provide realistic and practical formulae for application in nucleation and atmospheric models. We have analysed the three cited datasets using a Monte Carlo Markov Chain (MCMC) method. The method allows us to test and evaluate different functional forms for the temperature dependence of the contact parameter. We perform a data inversion by finding the best fit to the measured data simultaneously at all points for different functional forms of the temperature dependence of the contact angle m(T). The method uses a full nucleation model (Määttänen et al., 2005; Vehkamäki et al., 2007) to calculate the observables at each data point. We suggest one new and test
NASA Astrophysics Data System (ADS)
Jalayer, Fatemeh; Ebrahimian, Hossein
2014-05-01
Introduction: The first few days after the occurrence of a strong earthquake, in the presence of an ongoing aftershock sequence, are quite critical for emergency decision-making purposes. Epidemic Type Aftershock Sequence (ETAS) models are used frequently for forecasting the spatio-temporal evolution of seismicity in the short term (Ogata, 1988). The ETAS models are epidemic stochastic point process models in which every earthquake is a potential triggering event for subsequent earthquakes. The ETAS model parameters are usually calibrated a priori, based on a set of events that do not belong to the on-going seismic sequence (Marzocchi and Lombardi 2009). However, adaptive model parameter estimation, based on the events in the on-going sequence, may have several advantages, such as tuning the model to the specific sequence characteristics and capturing possible variations in time of the model parameters. Simulation-based methods can be employed in order to provide a robust estimate for the spatio-temporal seismicity forecasts in a prescribed forecasting time interval (i.e., a day) within a post-main shock environment. This robust estimate takes into account the uncertainty in the model parameters, expressed as the posterior joint probability distribution for the model parameters conditioned on the events that have already occurred (i.e., before the beginning of the forecasting interval) in the on-going seismic sequence. The Markov Chain Monte Carlo simulation scheme is used herein in order to sample directly from the posterior probability distribution for ETAS model parameters. Moreover, the sequence of events that is going to occur during the forecasting interval (and hence affect the seismicity in an epidemic type model like ETAS) is also generated through a stochastic procedure. The procedure leads to two spatio-temporal outcomes: (1) the probability distribution for the forecasted number of events, and (2) the uncertainty in estimating the
Adams, Noah S.; Hatton, Tyson W.
2012-01-01
Passage and survival data for yearling and subyearling Chinook salmon and juvenile steelhead were collected at McNary Dam between 2006 and 2009. These data have provided critical information for resource managers to implement structural and operational changes designed to improve the survival of juvenile salmonids as they migrate past the dam. Much of the information collected at McNary Dam was in the form of three-dimensional tracks of fish movements in the forebay. These data depicted the behavior of multiple species (in three dimensions) during different diel periods, spill conditions, powerhouse operations, and test configurations of the surface bypass structures (temporary spillway weirs; TSWs). One of the challenges in reporting three-dimensional results is presenting the information in a manner that allows interested parties to summarize the behavior of many fish over many different conditions across multiple years. To accomplish this, we investigated the feasibility of using a Markov chain analysis to characterize fish movement patterns in the forebay of McNary Dam. Markov chain analysis is one way to summarize numerically the behavior of fish in the forebay. Numerically summarizing the behavior of juvenile salmonids in the forebay of McNary Dam using the Markov chain analysis allowed us to confirm what had been previously summarized using visualization software. For example, proportions of yearling and subyearling Chinook salmon passing the three powerhouse areas were often greater in the southern and middle areas, compared to the northern area. The opposite generally was observed for steelhead. Results of this analysis also allowed us to confirm and quantify the extent of milling behavior that had been observed for steelhead. For fish that were first detected in the powerhouse region, less than 0.10 of the steelhead, on average, passed within each of the powerhouse areas. Instead, steelhead transitioned to adjoining areas in the
Metrics for Labeled Markov Systems
NASA Technical Reports Server (NTRS)
Desharnais, Josee; Jagadeesan, Radha; Gupta, Vineet; Panangaden, Prakash
1999-01-01
Partial Labeled Markov Chains are simultaneously generalizations of process algebra and of traditional Markov chains. They provide a foundation for interacting discrete probabilistic systems, the interaction being synchronization on labels as in process algebra. Existing notions of process equivalence are too sensitive to the exact probabilities of various transitions. This paper addresses contextual reasoning principles for reasoning about more robust notions of "approximate" equivalence between concurrent interacting probabilistic systems. The present results are as follows. We develop a family of metrics between partial labeled Markov chains to formalize the notion of distance between processes. We show that processes at distance zero are bisimilar. We describe a decision procedure to compute the distance between two processes. We show that reasoning about approximate equivalence can be done compositionally by showing that process combinators do not increase distance. We introduce an asymptotic metric to capture asymptotic properties of Markov chains, and show that parallel composition does not increase asymptotic distance.
Tatarinova, Tatiana; Bouck, John; Schumitzky, Alan
2009-01-01
In this paper, we study Bayesian analysis of nonlinear hierarchical mixture models with a finite but unknown number of components. Our approach is based on Markov chain Monte Carlo (MCMC) methods. One of the applications of our method is directed to the clustering problem in gene expression analysis. From a mathematical and statistical point of view, we discuss the following topics: theoretical and practical convergence problems of the MCMC method; determination of the number of components in the mixture; and computational problems associated with likelihood calculations. In the existing literature, these problems have mainly been addressed in the linear case. One of the main contributions of this paper is developing a method for the nonlinear case. Our approach is based on a combination of methods including Gibbs sampling, random permutation sampling, birth-death MCMC, and Kullback-Leibler distance. PMID:18763739
Eaves, Lindon; Erkanli, Alaattin
2003-05-01
The linear structural model has provided the statistical backbone of the analysis of twin and family data for 25 years. A new generation of questions cannot easily be forced into the framework of current approaches to modeling and data analysis because they involve nonlinear processes. Maximizing the likelihood with respect to parameters of such nonlinear models is often cumbersome and does not yield easily to current numerical methods. The application of Markov Chain Monte Carlo (MCMC) methods to modeling the nonlinear effects of genes and environment in MZ and DZ twins is outlined. Nonlinear developmental change and genotype × environment interaction in the presence of genotype-environment correlation are explored in simulated twin data. The MCMC method recovers the simulated parameters and provides estimates of error and latent (missing) trait values. Possible limitations of MCMC methods are discussed. Further studies are necessary to explore the value of an approach that could extend the horizons of research in developmental genetic epidemiology.
Kim, Jaiseung
2011-04-01
We have made a Markov Chain Monte Carlo (MCMC) analysis of primordial non-Gaussianity (f_NL) using the WMAP bispectrum and power spectrum. In our analysis, we have simultaneously constrained f_NL and cosmological parameters so that the uncertainties of cosmological parameters can properly propagate into the f_NL estimation. Investigating the parameter likelihoods deduced from MCMC samples, we find a slight deviation from Gaussian shape, which makes a Fisher matrix estimation less accurate. Therefore, we have estimated the confidence interval of f_NL by exploring the parameter likelihood without using the Fisher matrix. We find that the best-fit values of our analysis are in good agreement with other results, but the confidence interval is slightly different.
Li, Ruochen; Englehardt, James D; Li, Xiaoguang
2012-02-01
Multivariate probability distributions, such as may be used for mixture dose-response assessment, are typically highly parameterized and difficult to fit to available data. However, such distributions may be useful in analyzing the large electronic data sets becoming available, such as dose-response biomarker and genetic information. In this article, a new two-stage computational approach is introduced for estimating multivariate distributions and addressing parameter uncertainty. The proposed first stage comprises a gradient Markov chain Monte Carlo (GMCMC) technique to find Bayesian posterior mode estimates (PMEs) of parameters, equivalent to maximum likelihood estimates (MLEs) in the absence of subjective information. In the second stage, these estimates are used to initialize a Markov chain Monte Carlo (MCMC) simulation, replacing the conventional burn-in period to allow convergent simulation of the full joint Bayesian posterior distribution and the corresponding unconditional multivariate distribution (not conditional on uncertain parameter values). When the distribution of parameter uncertainty is such a Bayesian posterior, the unconditional distribution is termed predictive. The method is demonstrated by finding conditional and unconditional versions of the recently proposed emergent dose-response function (DRF). Results are shown for the five-parameter common-mode and seven-parameter dissimilar-mode models, based on published data for eight benzene-toluene dose pairs. The common mode conditional DRF is obtained with a 21-fold reduction in data requirement versus MCMC. Example common-mode unconditional DRFs are then found using synthetic data, showing a 71% reduction in required data. The approach is further demonstrated for a PCB 126-PCB 153 mixture. Applicability is analyzed and discussed. Matlab® computer programs are provided.
Alfaro, Michael E; Zoller, Stefan; Lutzoni, François
2003-02-01
Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method both for estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP), maximum likelihood bootstrap proportion (ML-BP), and maximum parsimony bootstrap proportion (MP-BP). We simulated the evolution of DNA sequence on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct monophyletic and incorrect monophyletic groups, and we examined the effects of increasing character number on support value. BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMC-PP correlated poorly with MP-BP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMC-PP than by either ML-BP or MP-BP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationships as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMC-PP and ML-BP performed better than MP-BP. BMCMC-PP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMC-PP provided high support values for correct topological bipartitions with fewer characters than were needed for nonparametric bootstrap.
Twelve years of succession on sandy substrates in a post-mining landscape: a Markov chain analysis.
Baasch, Annett; Tischew, Sabine; Bruelheide, Helge
2010-06-01
Knowledge of succession rates and pathways is crucial for devising restoration strategies for highly disturbed ecosystems such as surface-mined land. As these processes have often only been described in qualitative terms, we used Markov models to quantify transitions between successional stages. However, Markov models are often considered unattractive for reasons such as restrictive model assumptions (e.g., stationarity in space and time) or the high expenditure of time required to estimate successional transitions in the field. Here we present a solution for converting multivariate ecological time series into transition matrices and demonstrate the applicability of this approach for a data set that resulted from monitoring the succession of sandy dry grassland in a post-mining landscape. We analyzed five transition matrices, four one-step matrices referring to specific periods of transition (1995-1998, 1998-2001, 2001-2004, 2004-2007), and one matrix for the whole study period (stationary model, 1995-2007). Finally, the stationary model was enhanced to a partly time-variable model. Applying the stationary and the time-variable models, we started a prediction well outside our calibration period, beginning with 100% bare soil in 1974 as the known start of the succession, and generated the coverage of 12 predefined vegetation types in three-year intervals. Transitions among vegetation types changed significantly in space and over time. While the probability of colonization was almost constant over time, the replacement rate tended to increase, indicating that the speed of succession accelerated with time or fluctuations became stronger. The predictions of both models agreed surprisingly well with the vegetation data observed more than two decades later. This shows that our dry grassland succession in a post-mining landscape can be adequately described by comparably simple types of Markov models, although some model assumptions have not been fulfilled and within
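The conversion of repeated plot observations into a transition matrix, followed by a stationary projection from bare soil, can be sketched roughly as follows. This is a minimal illustration only: the three vegetation states, the per-plot sequences, and the function names are hypothetical and are not taken from the study, which used 12 vegetation types and its own estimation procedure. The 11 projection steps correspond to three-year intervals from 1974 to 2007.

```python
from collections import Counter

def estimate_transition_matrix(sequences, states):
    """Row-stochastic matrix of observed one-step transition frequencies."""
    counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    matrix = []
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        if row_total == 0:
            # Assumption: a state never observed as an origin simply persists.
            row = [1.0 if b == a else 0.0 for b in states]
        else:
            row = [counts[(a, b)] / row_total for b in states]
        matrix.append(row)
    return matrix

def project(cover, matrix, steps):
    """Apply the same (stationary) matrix `steps` times to a cover vector."""
    n = len(cover)
    for _ in range(steps):
        cover = [sum(cover[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return cover

# Hypothetical states and plot histories (one observation per survey).
states = ["bare", "pioneer", "grassland"]
plots = [
    ["bare", "bare", "pioneer", "grassland"],
    ["bare", "pioneer", "pioneer", "grassland"],
    ["bare", "pioneer", "grassland", "grassland"],
]
P = estimate_transition_matrix(plots, states)

# Start from 100% bare soil and project 11 three-year steps.
forecast = project([1.0, 0.0, 0.0], P, steps=11)
```

A time-variable variant would use a different matrix per period instead of reusing `P` at every step.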
Rigaux, Clémence; Ancelet, Sophie; Carlin, Frédéric; Nguyen-thé, Christophe; Albert, Isabelle
2013-05-01
The Monte Carlo (MC) simulation approach is traditionally used in food safety risk assessment to study quantitative microbial risk assessment (QMRA) models. When experimental data are available, performing Bayesian inference is a good alternative approach that allows backward calculation in a stochastic QMRA model to update the experts' knowledge about the microbial dynamics of a given food-borne pathogen. In this article, we propose a complex example where Bayesian inference is applied to a high-dimensional second-order QMRA model. The case study is a farm-to-fork QMRA model considering genetic diversity of Bacillus cereus in a cooked, pasteurized, and chilled courgette purée. Experimental data are Bacillus cereus concentrations measured in packages of courgette purées stored at different time-temperature profiles after pasteurization. To perform a Bayesian inference, we first built an augmented Bayesian network by linking a second-order QMRA model to the available contamination data. We then ran a Markov chain Monte Carlo (MCMC) algorithm to update all the unknown concentrations and unknown quantities of the augmented model. About 25% of the prior beliefs are strongly updated, leading to a reduction in uncertainty. Some updates interestingly question the QMRA model.
Raberto, Marco; Rapallo, Fabio; Scalas, Enrico
2011-01-01
In this paper, we outline a model of graph (or network) dynamics based on two ingredients. The first ingredient is a Markov chain on the space of possible graphs. The second ingredient is a semi-Markov counting process of renewal type. The model consists in subordinating the Markov chain to the semi-Markov counting process. In simple words, this means that the chain transitions occur at random time instants called epochs. The model is quite rich and its possible connections with algebraic geometry are briefly discussed. Moreover, for the sake of simplicity, we focus on the space of undirected graphs with a fixed number of nodes. However, in an example, we present an interbank market model where it is meaningful to use directed graphs or even weighted graphs. PMID:21887245
Adams, Noah S.; Hatton, Tyson W.
2012-01-01
Passage and survival data were collected at McNary Dam between 2006 and 2009. These data have provided critical information for resource managers to implement structural and operational changes designed to improve the survival of juvenile salmonids as they migrate past the dam. Much of the valuable information collected at McNary Dam was in the form of three-dimensional (hereafter referred to as 3-D) tracks of fish movements in the forebay. These data depicted the behavior of multiple species (in three dimensions) during different diel periods, spill conditions, powerhouse operations, and testing of the surface bypass structures (temporary spillway weirs; TSWs). One of the challenges in reporting 3-D results is presenting the information in a manner that allows interested parties to summarize the behavior of many fish over many different conditions across multiple years. To accomplish this, we used a Markov chain analysis to characterize fish movement patterns in the forebay of McNary Dam. The Markov chain analysis allowed us to numerically summarize the behavior of fish in the forebay. This report is the second report published in 2012 that uses this analytical method. The first report included only fish released as part of the annual studies conducted at McNary Dam. This second report includes sockeye salmon that were released as part of studies conducted by the Chelan and Grant County Public Utility Districts at mid-Columbia River dams. The studies conducted in the mid-Columbia used the same transmitters as were used for McNary Dam studies, but transmitter pulse width was different between studies. Additionally, no passive integrated transponder tags were implanted in sockeye salmon. Differences in transmitter pulse width resulted in lower detection probabilities for sockeye salmon at McNary Dam. The absence of passive integrated transponder tags prevented us from determining if fish passed the powerhouse through the juvenile bypass system (JBS) or turbines. To
Hiebeler, David E; Millett, Nicholas E
2011-06-21
We investigate a spatial lattice model of a population employing dispersal to nearest and second-nearest neighbors, as well as long-distance dispersal across the landscape. The model is studied via stochastic spatial simulations, ordinary pair approximation, and triplet approximation. The latter method, which uses the probabilities of state configurations of contiguous blocks of three sites as its state variables, is demonstrated to be greatly superior to pair approximations for estimating spatial correlation information at various scales. Correlations between pairs of sites separated by arbitrary distances are estimated by constructing spatial Markov processes using the information from both approximations. These correlations demonstrate why pair approximation misses basic qualitative features of the model, such as decreasing population density as a large proportion of offspring are dropped on second-nearest neighbors, and why triplet approximation is able to include them. Analytical and numerical results show that, excluding long-distance dispersal, the initial growth rate of an invading population is maximized and the equilibrium population density is also roughly maximized when the population spreads its offspring evenly over nearest and second-nearest neighboring sites.
Sacks-Davis, Rachel; McBryde, Emma; Grebely, Jason; Hellard, Margaret; Vickerman, Peter
2015-03-06
Hepatitis C virus (HCV) reinfection rates are probably underestimated due to reinfection episodes occurring between study visits. A Markov model of HCV reinfection and spontaneous clearance was fitted to empirical data. Bayesian post-estimation was used to project reinfection rates, reinfection spontaneous clearance probability and duration of reinfection. Uniform prior probability distributions were assumed for reinfection rate (more than 0), spontaneous clearance probability (0-1) and duration (0.25-6.00 months). Model estimates were 104 per 100 person-years (95% CrI: 21-344), 0.84 (95% CrI: 0.59-0.98) and 1.3 months (95% CrI: 0.3-4.1) for reinfection rate, spontaneous clearance probability and duration, respectively. Simulation studies were used to assess model validity, demonstrating that the Bayesian model estimates provided useful information about the possible sources and magnitude of bias in epidemiological estimates of reinfection rates, probability of reinfection clearance and duration of reinfection. The quality of the Bayesian estimates improved for larger samples and shorter test intervals. Uncertainty in model estimates notwithstanding, findings suggest that HCV reinfections frequently and quickly result in spontaneous clearance, with many reinfection events going unobserved.
NASA Astrophysics Data System (ADS)
Hasimoto Fengler, Felipe; Leite de Moraes, Jener Fernando; Irio Ribeiro, Admilson; Peche Filho, Afonso; Araujo de Medeiros, Gerson; Baldin Damame, Desirée; Márcia Longo, Regina
2015-04-01
In Brazil, it is common practice for large urban centers to compete for water catchment at distant sites. There is no policy to preserve strategic springs within urban territory. Thus, rural areas surrounding municipalities usually provide water and other environmental services to the population residing in cities. The Jundiaí-Mirim river basin, located in São Paulo, the most urbanized state in Brazil, is an interesting example of this situation. It lies in a rural area near large urban centers with large industrial parks, close to the state capital. As a result of the expansion of the surrounding cities, its land has historically appreciated in value, making the territory attractive to the housing market. Consequently, the region has undergone an intense process of urbanization that has increasingly disturbed the areas of natural vegetation. On the other hand, the watershed is the principal water supplier of the city of Jundiaí and houses forest remnants of an important Brazilian biome, the Atlantic Rain Forest. Given the need to preserve its water production capacity and the forest remnants there, this study modeled the environmental quality of forest fragments through indicators of disturbance and evaluated the changes that occurred between 1972 and 2013 using a Markov chain model. Environmental quality was determined by nine indicators of environmental disturbance (distance to urban areas, roads, edge land use, size, distance to other forest fragments, land use capacity, watershed forest cover, number of forest fragments in the watersheds, and shape of the forest fragment), obtained by geoprocessing techniques and integrated by multicriteria analysis. The Markov chain model showed a constant tendency toward deterioration in the environmental quality of natural vegetation, attributed to the intense process of occupation of the river basin. The results showed a historical trend of transformation in forest fragments with
NASA Technical Reports Server (NTRS)
Panday, Prajjwal K.; Williams, Christopher A.; Frey, Karen E.; Brown, Molly E.
2013-01-01
Previous studies have drawn attention to substantial hydrological changes taking place in mountainous watersheds where hydrology is dominated by cryospheric processes. Modelling is an important tool for understanding these changes but is particularly challenging in mountainous terrain owing to scarcity of ground observations and uncertainty of model parameters across space and time. This study utilizes a Markov Chain Monte Carlo data assimilation approach to examine and evaluate the performance of a conceptual, degree-day snowmelt runoff model applied in the Tamor River basin in the eastern Nepalese Himalaya. The snowmelt runoff model is calibrated using daily streamflow from 2002 to 2006 with fairly high accuracy (average Nash-Sutcliffe metric approx. 0.84, annual volume bias <3%). The Markov Chain Monte Carlo approach constrains the parameters to which the model is most sensitive (e.g. lapse rate and recession coefficient) and maximizes model fit and performance. Model simulated streamflow using an interpolated precipitation data set decreases the fractional contribution from rainfall compared with simulations using observed station precipitation. The average snowmelt contribution to total runoff in the Tamor River basin for the 2002-2006 period is estimated to be 29.7 ± 2.9% (which includes 4.2 ± 0.9% from snowfall that promptly melts), whereas 70.3 ± 2.6% is attributed to contributions from rainfall. On average, the elevation zone in the 4000-5500 m range contributes the most to basin runoff, averaging 56.9 ± 3.6% of all snowmelt input and 28.9 ± 1.1% of all rainfall input to runoff. Model simulated streamflow using an interpolated precipitation data set decreases the fractional contribution from rainfall versus snowmelt compared with simulations using observed station precipitation. Model experiments indicate that the hydrograph itself does not constrain estimates of snowmelt versus rainfall contributions to total outflow but that this derives from the degree
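For readers unfamiliar with degree-day melt models, the core calculation can be sketched as below. This is the generic textbook formulation (melt proportional to positive degree-days, with station temperature shifted to an elevation zone by a constant lapse rate), not the specific model or calibrated values of the study; the melt factor, lapse rate, elevations, and temperatures are all hypothetical placeholders.

```python
def lapse_adjust(temp_c, station_elev_m, zone_elev_m, lapse_rate_c_per_km=6.5):
    """Shift a station temperature to a zone elevation with a constant lapse rate."""
    return temp_c - lapse_rate_c_per_km * (zone_elev_m - station_elev_m) / 1000.0

def degree_day_melt(temps_c, melt_factor_mm_per_c=4.0, base_temp_c=0.0):
    """Daily melt depth in mm: melt factor times degrees above the base temperature."""
    return [melt_factor_mm_per_c * max(t - base_temp_c, 0.0) for t in temps_c]

# Hypothetical daily mean temperatures at a station at 2000 m,
# transferred to an elevation zone centred at 4000 m.
station_temps = [12.0, 15.0, 18.0]
zone_temps = [lapse_adjust(t, 2000.0, 4000.0) for t in station_temps]
melt = degree_day_melt(zone_temps)
```

In an MCMC calibration of the kind described, parameters such as the lapse rate would be sampled rather than fixed as they are here.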
Chen, Jinsong; Hubbard, Susan; Rubin, Yoram; Murray, Chris; Roden, Eric; Majer, Ernest
2003-11-18
The spatial distribution of field-scale geochemical parameters, such as extractable Fe(II) and Fe(III), influences microbial processes and thus the efficacy of bioremediation. Because traditional characterization of those parameters is invasive and laborious, it is rarely performed sufficiently at the field-scale. Since both geochemical and geophysical parameters often correlate to some common physical properties (such as lithofacies), we investigated the utility of tomographic radar attenuation data for improving estimation of geochemical parameters using a Markov Chain Monte Carlo (MCMC) approach. The data used in this study included physical, geophysical, and geochemical measurements collected in and between several boreholes at the DOE South Oyster Bacterial Transport Site in Virginia. Results show that geophysical data, constrained by physical data, provided field-scale information about extractable Fe(II) and Fe(III) in a minimally invasive manner and with a resolution unparalleled by other geochemical characterization methods. This study presents our estimation framework for estimating Fe(II) and Fe(III), and its application to a specific site. Our hypothesis--that geochemical parameters and geophysical attributes can be linked through their mutual dependence on physical properties--should be applicable for estimating other geochemical parameters at other sites.
Chen, Jinsong; Hubbard, Susan; Rubin, Yoram; Murray, Christopher J.; Roden, Eric E.; Majer, Ernest L.
2004-12-22
The paper demonstrates the use of ground-penetrating radar (GPR) tomographic data for estimating extractable Fe(II) and Fe(III) concentrations using a Markov chain Monte Carlo (MCMC) approach, based on data collected at the DOE South Oyster Bacterial Transport Site in Virginia. Analysis of multidimensional data including physical, geophysical, geochemical, and hydrogeological measurements collected at the site shows that GPR attenuation and lithofacies are most informative for the estimation. A statistical model is developed for integrating the GPR attenuation and lithofacies data. In the model, lithofacies is considered as a spatially correlated random variable and petrophysical models for linking GPR attenuation to geochemical parameters were derived from data at and near boreholes. Extractable Fe(II) and Fe(III) concentrations at each pixel between boreholes are estimated by conditioning to the co-located GPR data and the lithofacies measurements along boreholes through spatial correlation. Cross-validation results show that geophysical data, constrained by lithofacies, provided information about extractable Fe(II) and Fe(III) concentration in a minimally invasive manner and with a resolution unparalleled by other geochemical characterization methods. The developed model is effective and flexible, and should be applicable for estimating other geochemical parameters at other sites.
Seichter, Felicia; Vogt, Josef; Radermacher, Peter; Mizaikoff, Boris
2017-01-25
The calibration of analytical systems is time-consuming and the effort for daily calibration routines should therefore be minimized, while maintaining the analytical accuracy and precision. The 'calibration transfer' approach proposes to combine calibration data already recorded with actual calibration measurements. However, this strategy was developed for the multivariate, linear analysis of spectroscopic data, and thus cannot be applied to sensors with a single response channel and/or a non-linear relationship between signal and desired analytical concentration. To fill this gap for a non-linear calibration equation, we assume that the coefficients for the equation, collected over several calibration runs, are normally distributed. Considering that coefficients of an actual calibration are a sample of this distribution, only a few standards are needed for a complete calibration data set. The resulting calibration transfer approach is demonstrated for a fluorescence oxygen sensor and implemented as a hierarchical Bayesian model, combined with a Lagrange Multipliers technique and Monte-Carlo Markov-Chain sampling. The latter provides realistic estimates for coefficients and prediction together with accurate error bounds by simulating known measurement errors and system fluctuations. Performance criteria for validation and optimal selection of a reduced set of calibration samples were developed and led to a setup which maintains the analytical performance of a full calibration. Strategies for a rapid determination of problems occurring in a daily calibration routine are proposed, thereby opening the possibility of correcting the problem just in time.
Spahn, Philipp N.; Hansen, Anders H.; Hansen, Henning G.; Arnsdorf, Johnny; Kildegaard, Helene F.; Lewis, Nathan E.
2016-01-01
Glycosylation is a critical quality attribute of most recombinant biotherapeutics. Consequently, drug development requires careful control of glycoforms to meet bioactivity and biosafety requirements. However, glycoengineering can be extraordinarily difficult given the complex reaction networks underlying glycosylation and the vast number of different glycans that can be synthesized in a host cell. Computational modeling offers an intriguing option to rationally guide glycoengineering, but the high parametric demands of current modeling approaches pose challenges to their application. Here we present a novel low-parameter approach to describe glycosylation using flux-balance and Markov chain modeling. The model recapitulates the biological complexity of glycosylation, but does not require user-provided kinetic information. We use this method to predict and experimentally validate glycoprofiles on EPO, IgG as well as the endogenous secretome following glycosyltransferase knock-out in different Chinese hamster ovary (CHO) cell lines. Our approach offers a flexible and user-friendly platform that can serve as a basis for powerful computational engineering efforts in mammalian cell factories for biopharmaceutical production. PMID:26537759
Kadoura, Ahmad; Sun, Shuyu; Salama, Amgad
2014-08-01
Accurate determination of thermodynamic properties of petroleum reservoir fluids is of great interest to many applications, especially in petroleum engineering and chemical engineering. Molecular simulation has many appealing features, especially its requirement of fewer tuned parameters yet better predictive capability; however, it is well known that molecular simulation is very CPU expensive compared to equation of state approaches. We have recently introduced an efficient thermodynamically consistent technique to rapidly regenerate Monte Carlo Markov Chains (MCMCs) at different thermodynamic conditions from the existing data points that have been pre-computed with expensive classical simulation. This technique can speed up the simulation more than a million times, making the regenerated molecular simulation almost as fast as equation of state approaches. In this paper, this technique is first briefly reviewed and then numerically investigated in its capability of predicting ensemble averages of primary quantities at thermodynamic conditions neighboring those of the original simulated MCMCs. Moreover, this extrapolation technique is extended to predict second derivative properties (e.g. heat capacity and fluid compressibility). The method works by reweighting and reconstructing generated MCMCs in the canonical ensemble for Lennard-Jones particles. In this paper, the system's potential energy, pressure, isochoric heat capacity and isothermal compressibility along isochors, isotherms and paths of changing temperature and density from the original simulated points were extrapolated. Finally, an optimized set of Lennard-Jones parameters (ε, σ) for single site models was proposed for methane, nitrogen and carbon monoxide.
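The general idea of reusing a canonical-ensemble chain at a neighboring temperature can be illustrated with standard Boltzmann reweighting, in which each stored configuration with potential energy U is weighted by exp(-(β_new - β_old) U). The sketch below shows only this textbook temperature reweighting under that assumption; it is not the authors' full technique, which also reconstructs chains along density paths, and the function name and sample values are hypothetical.

```python
import math

def reweighted_average(observable, potential_energy, beta_old, beta_new):
    """Estimate <A> at beta_new from configurations sampled at beta_old."""
    delta = beta_new - beta_old
    log_w = [-delta * u for u in potential_energy]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(lw - shift) for lw in log_w]
    return sum(w * a for w, a in zip(weights, observable)) / sum(weights)

# Hypothetical two-sample chain: the observable is the energy itself.
energies = [0.0, 1.0]
same_temperature = reweighted_average(energies, energies, 1.0, 1.0)
colder = reweighted_average(energies, energies, 1.0, 1.0 + math.log(2.0))
```

Reweighting of this kind is only reliable when the energy distributions at the two temperatures overlap well, which is why it is usually restricted to nearby conditions.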
NASA Astrophysics Data System (ADS)
Peng, C.; Zhou, X.
2015-12-01
To reduce simulation uncertainties due to inaccurate model parameters, the Markov Chain Monte Carlo (MCMC) method was applied in this study to improve the estimations of four key parameters used in the process-based ecosystem model of TRIPLEX-FLUX. These four key parameters include a maximum photosynthetic carboxylation rate at 25°C (Vcmax), a light-saturated electron transport rate within the photosynthetic carbon reduction cycle of leaves (Jmax), a coefficient of stomatal conductance (m), and a reference respiration rate at 10°C (R10). Seven forest flux tower sites located across North America were used to investigate and facilitate understanding of the daily variation in model parameters for three deciduous forests, three evergreen temperate forests, and one evergreen boreal forest. Eddy covariance CO2 exchange measurements were assimilated to optimize the parameters in the year 2006. After parameter optimization and adjustment took place, net ecosystem production prediction significantly improved (by approximately 25%) compared to the CO2 flux measurements taken at the seven forest ecosystem sites.
Nishizawa, Manami; Nishizawa, Kazuhisa
2002-12-01
To study the mechanisms for local evolutionary changes in DNA sequences involving slippage-type insertions and deletions, an alignment approach is explored that can consider the posterior probabilities of alignment models. Various patterns of insertion and deletion that can link the ancestor and descendant sequences are proposed and evaluated by simulation and compared by the Markov chain Monte Carlo (MCMC) method. Analyses of pseudogenes reveal that the introduction of the parameters that control the probability of slippage-type events markedly augments the probability of the observed sequence evolution, arguing that a cryptic involvement of slippage occurrences is manifested as insertions and deletions of short nucleotide segments. Strikingly, approximately 80% of insertions in human pseudogenes and approximately 50% of insertions in murid pseudogenes are likely to be caused by the slippage-mediated process, as represented by BC in ABCD --> ABCBCD. We suggest that, in both humans and murids, even very short repetitive motifs, such as CAGCAG, CACACA, and CCCC, have approximately 10- to 15-fold susceptibility to insertions and deletions, compared to nonrepetitive sequences. Our protocol, namely, indel-MCMC, thus seems to be a reasonable approach for statistical analyses of the early phase of microsatellite evolution.
Li, Hong-Dong; Xu, Qing-Song; Liang, Yi-Zeng
2012-08-31
The identification of disease-relevant genes represents a challenge in microarray-based disease diagnosis where the sample size is often limited. Among established methods, reversible jump Markov Chain Monte Carlo (RJMCMC) methods have proven to be quite promising for variable selection. However, the design and application of an RJMCMC algorithm requires, for example, special criteria for prior distributions. Also, the simulation from joint posterior distributions of models is computationally intensive, and may even be mathematically intractable. These disadvantages may limit the applications of RJMCMC algorithms. Therefore, the development of algorithms that possess the advantages of RJMCMC methods and are also efficient and easy to follow for selecting disease-associated genes is required. Here we report an RJMCMC-like method, called random frog, that possesses the advantages of RJMCMC methods and is much easier to implement. Using the colon and the estrogen gene expression datasets, we show that random frog is effective in identifying discriminating genes. The top 2 ranked genes for colon and estrogen are Z50753, U00968, and Y10871_at, Z22536_at, respectively. (The source codes with GNU General Public License Version 2.0 are freely available to non-commercial users at: http://code.google.com/p/randomfrog/.).
Karalidi, Theodora; Apai, Dániel; Schneider, Glenn; Hanson, Jake R.; Pasachoff, Jay M.
2015-11-20
Deducing the cloud cover and its temporal evolution from the observed planetary spectra and phase curves can give us major insight into the atmospheric dynamics. In this paper, we present Aeolus, a Markov chain Monte Carlo code that maps the structure of brown dwarf and other ultracool atmospheres. We validated Aeolus on a set of unique Jupiter Hubble Space Telescope (HST) light curves. Aeolus accurately retrieves the properties of the major features of the Jovian atmosphere, such as the Great Red Spot and a major 5 μm hot spot. Aeolus is the first mapping code validated on actual observations of a giant planet over a full rotational period. For this study, we applied Aeolus to J- and H-band HST light curves of 2MASS J21392676+0220226 and 2MASS J0136565+093347. Aeolus retrieves three spots at the top of the atmosphere (per observational wavelength) of these two brown dwarfs, with a surface coverage of 21% ± 3% and 20.3% ± 1.5%, respectively. The Jupiter HST light curves will be publicly available via ADS/VIZIR.
Chen, Jinsong; Kemna, Andreas; Hubbard, Susan S.
2008-05-15
We develop a Bayesian model to invert spectral induced polarization (SIP) data for Cole-Cole parameters using Markov chain Monte Carlo (MCMC) sampling methods. We compare the performance of the MCMC-based stochastic method with an iterative Gauss-Newton-based deterministic method for Cole-Cole parameter estimation through inversion of synthetic and laboratory SIP data. The Gauss-Newton-based method can provide an optimal solution for given objective functions under constraints, but the obtained optimal solution generally depends on the choice of initial values and the estimated uncertainty information is often inaccurate or insufficient. In contrast, the MCMC-based inversion method provides extensive global information on unknown parameters, such as the marginal probability distribution functions, from which we can obtain better estimates and tighter uncertainty bounds of the parameters than with the deterministic method. Additionally, the results obtained with the MCMC method are independent of the choice of initial values. Because the MCMC-based method does not explicitly offer a single optimal solution for given objective functions, the deterministic and stochastic methods can complement each other. For example, the stochastic method can first be used to obtain the means of the unknown parameters by starting from an arbitrary set of initial values and the deterministic method can then be initiated using the means as starting values to obtain the optimal estimates of the Cole-Cole parameters.
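A stochastic inversion of this kind can be sketched with a random-walk Metropolis sampler applied to the Cole-Cole forward model in its common Pelton form, ρ(ω) = ρ0 [1 - m (1 - 1/(1 + (iωτ)^c))]. The sketch below is illustrative only: the Gaussian likelihood, the flat box priors, the step sizes, the noise level, and the synthetic noise-free data are assumptions made here and are not the authors' Bayesian model or sampler.

```python
import math
import random

def cole_cole(omega, rho0, m, tau, c):
    """Complex resistivity from the Cole-Cole model (Pelton form)."""
    return rho0 * (1.0 - m * (1.0 - 1.0 / (1.0 + (1j * omega * tau) ** c)))

def log_posterior(params, omegas, data, sigma):
    """Gaussian log-likelihood plus flat priors on a box (both assumed here)."""
    rho0, m, tau, c = params
    if not (rho0 > 0 and 0 < m < 1 and tau > 0 and 0 < c <= 1):
        return -math.inf
    misfit = sum(abs(cole_cole(w, rho0, m, tau, c) - d) ** 2
                 for w, d in zip(omegas, data))
    return -0.5 * misfit / sigma ** 2

def metropolis(log_post, start, step_sizes, n_samples, rng):
    """Random-walk Metropolis; `start` must have a finite log posterior."""
    current, current_lp = list(start), log_post(start)
    chain = []
    for _ in range(n_samples):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step_sizes)]
        lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if lp > -math.inf and rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
        chain.append(tuple(current))
    return chain

# Hypothetical "true" parameters (rho0, m, tau, c) and synthetic spectra.
true_params = (100.0, 0.1, 0.01, 0.5)
omegas = [2 * math.pi * f for f in (0.1, 1.0, 10.0, 100.0, 1000.0)]
data = [cole_cole(w, *true_params) for w in omegas]

rng = random.Random(0)
chain = metropolis(lambda p: log_posterior(p, omegas, data, sigma=0.5),
                   start=(90.0, 0.2, 0.02, 0.6),
                   step_sizes=(0.5, 0.01, 0.002, 0.02),
                   n_samples=2000, rng=rng)
```

Marginal distributions for each parameter would be read off the columns of `chain` after discarding an initial burn-in portion; the chain mean could then seed a deterministic refinement, as the abstract suggests.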
Spahn, Philipp N; Hansen, Anders H; Hansen, Henning G; Arnsdorf, Johnny; Kildegaard, Helene F; Lewis, Nathan E
2016-01-01
Glycosylation is a critical quality attribute of most recombinant biotherapeutics. Consequently, drug development requires careful control of glycoforms to meet bioactivity and biosafety requirements. However, glycoengineering can be extraordinarily difficult given the complex reaction networks underlying glycosylation and the vast number of different glycans that can be synthesized in a host cell. Computational modeling offers an intriguing option to rationally guide glycoengineering, but the high parametric demands of current modeling approaches pose challenges to their application. Here we present a novel low-parameter approach to describe glycosylation using flux-balance and Markov chain modeling. The model recapitulates the biological complexity of glycosylation, but does not require user-provided kinetic information. We use this method to predict and experimentally validate glycoprofiles on EPO, IgG as well as the endogenous secretome following glycosyltransferase knock-out in different Chinese hamster ovary (CHO) cell lines. Our approach offers a flexible and user-friendly platform that can serve as a basis for powerful computational engineering efforts in mammalian cell factories for biopharmaceutical production.
NASA Astrophysics Data System (ADS)
Berradja, Khadidja; Boughanmi, Nabil
2016-09-01
In dynamic cardiac PET FDG studies, the assessment of the myocardial metabolic rate of glucose (MMRG) requires knowledge of the blood input function (IF). The IF can be obtained by manual or automatic blood sampling and cross-calibrated with PET. These procedures are cumbersome and invasive, and they generate uncertainties. The IF is contaminated by spillover of radioactivity from the adjacent myocardium, and this could cause substantial error in the estimated MMRG. In this study, we show that the IF can be extracted from the images in a rat heart study with 18F-fluorodeoxyglucose (18F-FDG) by means of Independent Component Analysis (ICA) based on Bayesian theory and the Markov Chain Monte Carlo (MCMC) sampling method (BICA). Images of the heart from rats were acquired with the Sherbrooke small animal PET scanner. A region of interest (ROI) was drawn around the rat image and decomposed into blood and tissue using BICA. The statistical analysis showed a significant difference (p < 0.05) between the MMRG obtained with the IF extracted by BICA and that obtained with the IF extracted from measured images corrupted by spillover.
NASA Astrophysics Data System (ADS)
Liu, Boda; Liang, Yan
2017-04-01
Markov chain Monte Carlo (MCMC) simulation is a powerful statistical method in solving inverse problems that arise from a wide range of applications. In Earth sciences applications of MCMC simulations are primarily in the field of geophysics. The purpose of this study is to introduce MCMC methods to geochemical inverse problems related to trace element fractionation during mantle melting. MCMC methods have several advantages over least squares methods in deciphering melting processes from trace element abundances in basalts and mantle rocks. Here we use an MCMC method to invert for extent of melting, fraction of melt present during melting, and extent of chemical disequilibrium between the melt and residual solid from REE abundances in clinopyroxene in abyssal peridotites from Mid-Atlantic Ridge, Central Indian Ridge, Southwest Indian Ridge, Lena Trough, and American-Antarctic Ridge. We consider two melting models: one with exact analytical solution and the other without. We solve the latter numerically in a chain of melting models according to the Metropolis-Hastings algorithm. The probability distribution of inverted melting parameters depends on assumptions of the physical model, knowledge of mantle source composition, and constraints from the REE data. Results from MCMC inversion are consistent with and provide more reliable uncertainty estimates than results based on nonlinear least squares inversion. We show that chemical disequilibrium is likely to play an important role in fractionating LREE in residual peridotites during partial melting beneath mid-ocean ridge spreading centers. MCMC simulation is well suited for more complicated but physically more realistic melting problems that do not have analytical solutions.
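The Metropolis-Hastings step mentioned in the abstract above can be illustrated with a minimal random-walk sampler. This is only a sketch under stated assumptions: the one-dimensional Gaussian target `toy_log_post`, the step size, and all function names are hypothetical stand-ins and have nothing to do with the melting models or REE data of the study.

```python
import math
import random

def metropolis_hastings(log_post, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log-posterior."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_steps):
        cand = x + rng.gauss(0.0, step)      # symmetric Gaussian proposal
        lp_cand = log_post(cand)
        # Accept with probability min(1, p(cand) / p(x)); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
        samples.append(x)                     # a rejected move repeats the old state
    return samples

# Hypothetical toy target: unnormalised Gaussian with mean 2 and unit variance.
def toy_log_post(x):
    return -0.5 * (x - 2.0) ** 2

chain = metropolis_hastings(toy_log_post, x0=0.0, n_steps=20000)
burned = chain[2000:]                         # discard burn-in
post_mean = sum(burned) / len(burned)
```

In a real inversion the log-posterior would combine a forward model with a data misfit and priors; only the accept/reject logic carries over from this sketch.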
Stochastic algorithms for Markov models estimation with intermittent missing data.
Deltour, I; Richardson, S; Le Hesran, J Y
1999-06-01
Multistate Markov models are frequently used to characterize disease processes, but their estimation from longitudinal data is often hampered by complex patterns of incompleteness. Two algorithms for estimating Markov chain models in the case of intermittent missing data in longitudinal studies, a stochastic EM algorithm and the Gibbs sampler, are described. The first can be viewed as a random perturbation of the EM algorithm and is appropriate when the M step is straightforward but the E step is computationally burdensome. It leads to a good approximation of the maximum likelihood estimates. The Gibbs sampler is used for a full Bayesian inference. The performances of the two algorithms are illustrated on two simulated data sets. A motivating example concerned with the modelling of the evolution of parasitemia by Plasmodium falciparum (malaria) in a cohort of 105 young children in Cameroon is described and briefly analyzed.
BIE: Bayesian Inference Engine
NASA Astrophysics Data System (ADS)
Weinberg, Martin D.
2013-12-01
The Bayesian Inference Engine (BIE) is an object-oriented library of tools written in C++ designed explicitly to enable Bayesian update and model comparison for astronomical problems. To facilitate "what if" exploration, BIE provides a command line interface (written with Bison and Flex) to run input scripts. The output of the code is a simulation of the Bayesian posterior distribution from which summary statistics, e.g. moments, confidence intervals and so forth, can be determined. All of these quantities are fundamentally integrals, and the Markov chain approach produces variates θ distributed according to P(θ|D), so moments are trivially obtained by summing over the ensemble of variates.
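The remark that moments follow from summing over the ensemble of variates can be sketched in a few lines. The example below does not use BIE; as a stand-in for sampler output it draws θ from a hypothetical normal posterior (mean 1.5, standard deviation 0.3), and every name in it is an assumption made for illustration.

```python
import random
import statistics

rng = random.Random(1)
# Stand-in for MCMC output: draws of theta from a hypothetical posterior N(1.5, 0.3^2).
theta = [rng.gauss(1.5, 0.3) for _ in range(50000)]

# Posterior moments are plain ensemble averages over the variates.
post_mean = statistics.fmean(theta)
post_var = statistics.pvariance(theta, mu=post_mean)

# A central 95% interval read off the empirical quantiles of the ensemble.
ordered = sorted(theta)
lower = ordered[int(0.025 * len(ordered))]
upper = ordered[int(0.975 * len(ordered)) - 1]
```

With correlated MCMC draws the same averages apply, but the effective sample size is smaller than the raw number of variates.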
Aberer, Andre J; Stamatakis, Alexandros; Ronquist, Fredrik
2016-01-01
Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a [Formula: see text] distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes.
Efficient Markov Network Structure Discovery Using Independence Tests
Bromberg, Facundo; Margaritis, Dimitris; Honavar, Vasant
2011-01-01
We present two algorithms for learning the structure of a Markov network from data: GSMN* and GSIMN. Both algorithms use statistical independence tests to infer the structure by successively constraining the set of structures consistent with the results of these tests. Until very recently, algorithms for structure learning were based on maximum likelihood estimation, which has been proved to be NP-hard for Markov networks due to the difficulty of estimating the parameters of the network, needed for the computation of the data likelihood. The independence-based approach does not require the computation of the likelihood, and thus both GSMN* and GSIMN can compute the structure efficiently (as shown in our experiments). GSMN* is an adaptation of the Grow-Shrink algorithm of Margaritis and Thrun for learning the structure of Bayesian networks. GSIMN extends GSMN* by additionally exploiting Pearl’s well-known properties of the conditional independence relation to infer novel independences from known ones, thus avoiding the performance of statistical tests to estimate them. To accomplish this efficiently GSIMN uses the Triangle theorem, also introduced in this work, which is a simplified version of the set of Markov axioms. Experimental comparisons on artificial and real-world data sets show GSIMN can yield significant savings with respect to GSMN*, while generating a Markov network with comparable or in some cases improved quality. We also compare GSIMN to a forward-chaining implementation, called GSIMN-FCH, that produces all possible conditional independences resulting from repeatedly applying Pearl’s theorems on the known conditional independence tests. The results of this comparison show that GSIMN, by the sole use of the Triangle theorem, is nearly optimal in terms of the set of independence tests that it infers. PMID:22822297
Bayesian inference for agreement measures.
Vidal, Ignacio; de Castro, Mário
2016-08-25
The agreement of different measurement methods is an important issue in several disciplines like, for example, Medicine, Metrology, and Engineering. In this article, some agreement measures, common in the literature, were analyzed from a Bayesian point of view. Posterior inferences for such agreement measures were obtained based on well-known Bayesian inference procedures for the bivariate normal distribution. As a consequence, a general, simple, and effective method is presented, which does not require Markov Chain Monte Carlo methods and can be applied considering a great variety of prior distributions. Illustratively, the method was exemplified using five objective priors for the bivariate normal distribution. A tool for assessing the adequacy of the model is discussed. Results from a simulation study and an application to a real dataset are also reported.
Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.
Spade, David A; Herbei, Radu; Kubatko, Laura S
2015-10-01
One of the fundamental goals in phylogenetics is to make inferences about the evolutionary pattern among a group of individuals, such as genes or species, using present-day genetic material. This pattern is represented by a phylogenetic tree, and as computational methods have caught up to the statistical theory, Bayesian methods of making inferences about phylogenetic trees have become increasingly popular. Bayesian inference of phylogenetic trees requires sampling from intractable probability distributions. Common methods of sampling from these distributions include Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods, and one way that both of these methods can proceed is by first simulating a tree topology and then taking a sample from the posterior distribution of the branch lengths given the tree topology and the data set. In many MCMC methods, it is difficult to verify that the underlying Markov chain is geometrically ergodic, and thus, it is necessary to rely on output-based convergence diagnostics in order to assess convergence on an ad hoc basis. These diagnostics suffer from several important limitations, so in an effort to circumvent these limitations, this work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.
NASA Technical Reports Server (NTRS)
Smith, R. M.
1991-01-01
Numerous applications in the area of computer system analysis can be effectively studied with Markov reward models. These models describe the behavior of the system with a continuous-time Markov chain, where a reward rate is associated with each state. In a reliability/availability model, up states may have reward rate 1 and down states may have reward rate zero associated with them. In a queueing model, the number of jobs of a certain type in a given state may be the reward rate attached to that state. In a combined model of performance and reliability, the reward rate of a state may be the computational capacity, or a related performance measure. Expected steady-state reward rate and expected instantaneous reward rate are clearly useful measures of the Markov reward model. More generally, the distribution of accumulated reward or time-averaged reward over a finite time interval may be determined from the solution of the Markov reward model. This information is of great practical significance in situations where the workload can be well characterized (deterministically, or by continuous functions, e.g., distributions). The design process in the development of a computer system is an expensive and long-term endeavor. For aerospace applications the reliability of the computer system is essential, as is the ability to complete critical workloads in a well-defined real-time interval. Consequently, effective modeling of such systems must take into account both performance and reliability. This fact motivates our use of Markov reward models to aid in the development and evaluation of fault-tolerant computer systems.
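The expected steady-state reward rate described above can be sketched for the simplest availability model: one up state with reward rate 1 and one down state with reward rate 0. The failure and repair rates below are hypothetical, and the solver (uniformization followed by power iteration) is one convenient choice for a small chain, not the method of the report.

```python
def steady_state(Q, iters=1000):
    """Stationary distribution of a CTMC with generator Q via uniformization."""
    n = len(Q)
    lam = 1.1 * max(-Q[i][i] for i in range(n))   # uniformization rate above every exit rate
    # P = I + Q / lam is a stochastic matrix with the same stationary distribution as Q.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iters):                         # power iteration: pi <- pi P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical rates: failures at 0.01 per hour, repairs at 0.5 per hour.
Q = [[-0.01, 0.01],    # state 0 = up
     [0.5, -0.5]]      # state 1 = down
reward = [1.0, 0.0]    # reward rate 1 when up, 0 when down

pi = steady_state(Q)
expected_reward = sum(p * r for p, r in zip(pi, reward))   # steady-state availability
```

For this two-state chain the result can be checked against the closed form, repair rate divided by the sum of the two rates; larger or stiff models would normally use a direct linear solve instead of a fixed number of iterations.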
Wang, Huiyuan; Mo, H. J.; Yang, Xiaohu; Lin, W. P.; Jing, Y. P.
2014-10-10
Simulating the evolution of the local universe is important for studying galaxies and the intergalactic medium in a way free of cosmic variance. Here we present a method to reconstruct the initial linear density field from an input nonlinear density field, employing the Hamiltonian Markov Chain Monte Carlo (HMC) algorithm combined with Particle-mesh (PM) dynamics. The HMC+PM method is applied to cosmological simulations, and the reconstructed linear density fields are then evolved to the present day with N-body simulations. These constrained simulations accurately reproduce both the amplitudes and phases of the input simulations at various z. Using a PM model with a grid cell size of 0.75 h⁻¹ Mpc and 40 time steps in the HMC can recover more than half of the phase information down to a scale k ∼ 0.85 h Mpc⁻¹ at high z and to k ∼ 3.4 h Mpc⁻¹ at z = 0, which represents a significant improvement over similar reconstruction models in the literature, and indicates that our model can reconstruct the formation histories of cosmic structures over a large dynamical range. Adopting PM models with higher spatial and temporal resolutions yields even better reconstructions, suggesting that our method is limited more by the availability of computing resources than by principle. Dynamic models of structure evolution adopted in many earlier investigations can induce non-Gaussianity in the reconstructed linear density field, which in turn can cause large systematic deviations in the predicted halo mass function. Such deviations are greatly reduced or absent in our reconstruction.
Eells, Samantha J.; Bharadwa, Kiran; McKinnell, James A.; Miller, Loren G.
2014-01-01
Background. Recurrent urinary tract infections (UTIs) are a common problem among women. However, comparative effectiveness strategies for managing recurrent UTIs are lacking. Methods. We performed a systematic literature review of management of women experiencing ≥3 UTIs per year. We then developed a Markov chain Monte Carlo model of recurrent UTI for each management strategy with ≥2 adequate trials published. We simulated a cohort that experienced 3 UTIs/year and a secondary cohort that experienced 8 UTIs/year. Model outcomes were treatment efficacy, patient and payer cost, and health-related quality of life. Results. Five strategies had ≥2 clinical trials published: (1) daily antibiotic (nitrofurantoin) prophylaxis; (2) daily estrogen prophylaxis; (3) daily cranberry prophylaxis; (4) acupuncture prophylaxis; and (5) symptomatic self-treatment. In the 3 UTIs/year model, nitrofurantoin prophylaxis was most effective, reducing the UTI rate to 0.4 UTIs/year, and the most expensive to the payer ($821/year). All other strategies resulted in payer cost savings but were less efficacious. Symptomatic self-treatment was the only strategy that resulted in patient cost savings, and was the most favorable strategy in terms of cost per quality-adjusted life-year (QALY) gained. Conclusions. Daily antibiotic use is the most effective strategy for recurrent UTI prevention compared to daily cranberry pills, daily estrogen therapy, and acupuncture. Cost savings to payers and patients were seen for most regimens, and improvements in QALYs were seen with all. Our findings provide clinically meaningful data to guide the physician–patient partnership in determining a preferred method of prevention for this common clinical problem. PMID:24065333
Schmidt, Philip J; Pintar, Katarina D M; Fazil, Aamir M; Topp, Edward
2013-09-01
Dose-response models are the essential link between exposure assessment and computed risk values in quantitative microbial risk assessment, yet the uncertainty that is inherent to computed risks because the dose-response model parameters are estimated using limited epidemiological data is rarely quantified. Second-order risk characterization approaches incorporating uncertainty in dose-response model parameters can provide more complete information to decisionmakers by separating variability and uncertainty to quantify the uncertainty in computed risks. Therefore, the objective of this work is to develop procedures to sample from posterior distributions describing uncertainty in the parameters of exponential and beta-Poisson dose-response models using Bayes's theorem and Markov Chain Monte Carlo (in OpenBUGS). The theoretical origins of the beta-Poisson dose-response model are used to identify a decomposed version of the model that enables Bayesian analysis without the need to evaluate Kummer confluent hypergeometric functions. Herein, it is also established that the beta distribution in the beta-Poisson dose-response model cannot address variation among individual pathogens, criteria to validate use of the conventional approximation to the beta-Poisson model are proposed, and simple algorithms to evaluate actual beta-Poisson probabilities of infection are investigated. The developed MCMC procedures are applied to analysis of a case study data set, and it is demonstrated that an important region of the posterior distribution of the beta-Poisson dose-response model parameters is attributable to the absence of low-dose data. This region includes beta-Poisson models for which the conventional approximation is especially invalid and in which many beta distributions have an extreme shape with questionable plausibility.
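The two dose-response families named in this abstract have simple textbook forms, sketched below: the exponential model and the conventional approximation to the beta-Poisson model. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the study, and the exact beta-Poisson model (which involves the Kummer confluent hypergeometric function) is not evaluated here.

```python
import math

def p_exponential(dose, r):
    """Exponential model: P(infection) = 1 - exp(-r * dose)."""
    return 1.0 - math.exp(-r * dose)

def p_beta_poisson_approx(dose, alpha, beta):
    """Conventional approximation to beta-Poisson: 1 - (1 + dose/beta)^(-alpha)."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)

# Hypothetical mean doses and parameter values, for illustration only.
doses = [1, 10, 100, 1000]
exp_curve = [p_exponential(d, r=0.01) for d in doses]
bp_curve = [p_beta_poisson_approx(d, alpha=0.2, beta=50.0) for d in doses]
```

In a Bayesian treatment such as the one the abstract describes, `r`, `alpha` and `beta` would be sampled from their posterior rather than fixed, giving a distribution of risk at each dose.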
Fransson, Martin Niclas; Barregard, Lars; Sallsten, Gerd; Akerstrom, Magnus; Johanson, Gunnar
2014-10-01
The health effects of low-level chronic exposure to cadmium are increasingly recognized. To improve the risk assessment, it is essential to know the relation between cadmium intake, body burden, and biomarker levels of cadmium. We combined a physiologically-based toxicokinetic (PBTK) model for cadmium with a data set from healthy kidney donors to re-estimate the model parameters and to test the effects of gender and serum ferritin on systemic uptake. Cadmium levels in whole blood, blood plasma, kidney cortex, and urinary excretion from 82 men and women were used to calculate posterior distributions for model parameters using Markov-chain Monte Carlo analysis. For never- and ever-smokers combined, the daily systemic uptake was estimated at 0.0063 μg cadmium/kg body weight in men, with 35% increased uptake in women and a daily uptake of 1.2 μg for each pack-year per calendar year of smoking. The rate of urinary excretion from cadmium accumulated in the kidney was estimated at 0.000042 day⁻¹, corresponding to a half-life of 45 years in the kidneys. We have provided an improved model of cadmium kinetics. As the new parameter estimates derive from a single study with measurements in several compartments in each individual, these new estimates are likely to be more accurate than the previous ones, where the data used originated from unrelated data sets. The estimated urinary excretion of cadmium accumulated in the kidneys was much lower than previous estimates; neglecting this finding may result in a marked under-prediction of the true kidney burden.
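The correspondence between the excretion rate and the half-life quoted above follows from first-order elimination, where the half-life equals ln 2 divided by the rate constant. A short check, assuming simple first-order kinetics and a 365.25-day year (both assumptions of this sketch, not statements from the paper):

```python
import math

rate_per_day = 0.000042                      # excretion rate quoted in the abstract
half_life_days = math.log(2) / rate_per_day  # first-order kinetics: t_half = ln 2 / k
half_life_years = half_life_days / 365.25    # assumed year length
```

This gives a half-life of roughly 45 years, in line with the figure stated in the abstract.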
NASA Astrophysics Data System (ADS)
Chaudhuri, Sutapa; Goswami, Sayantika; Das, Debanjana; Middey, Anirban
2014-05-01
Forecasting summer monsoon rainfall with precision is crucial for farmers planning their harvest in a country like India, where the national economy is mostly based on regional agriculture. The forecast of monsoon rainfall based on artificial neural networks is a well-researched problem. In the present study, the meta-heuristic ant colony optimization (ACO) technique is implemented to forecast the amount of summer monsoon rainfall for the next day over Kolkata (22.6°N, 88.4°E), India. The ACO technique belongs to swarm intelligence and simulates the decision-making processes of an ant colony, similar to other adaptive learning techniques. The ACO technique takes inspiration from the foraging behaviour of some ant species. The ants deposit pheromone on the ground in order to mark a favourable path that should be followed by other members of the colony. A range of rainfall amounts replicating the pheromone concentration is evaluated during the summer monsoon season. The maximum amount of rainfall during the summer monsoon season (June to September) is observed to be within the range of 7.5-35 mm during the period from 1998 to 2007, which is in the range 4 category set by the India Meteorological Department (IMD). The result reveals that the accuracy in forecasting the amount of rainfall for the next day during the summer monsoon season using the ACO technique is 95%, whereas the forecast accuracy is 83% with the Markov chain model (MCM). The forecasts from ACO and MCM are compared with other existing models and validated with IMD observations from 2008 to 2012.
Building Simple Hidden Markov Models. Classroom Notes
ERIC Educational Resources Information Center
Ching, Wai-Ki; Ng, Michael K.
2004-01-01
Hidden Markov models (HMMs) are widely used in bioinformatics, speech recognition and many other areas. This note presents HMMs via the framework of classical Markov chain models. A simple example is given to illustrate the model. An estimation method for the transition probabilities of the hidden states is also discussed.
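As a small companion to the note above, the forward recursion below computes the likelihood of an observation sequence under a two-state HMM. It is a generic textbook sketch, not the example or the estimation method of the note, and all probabilities are hypothetical.

```python
from itertools import product

def forward_likelihood(init, trans, emit, obs):
    """P(obs) under an HMM, computed with the forward recursion."""
    n = len(init)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)

# Hypothetical two-state model with two observable symbols (0 and 1).
init = [0.6, 0.4]                      # initial hidden-state distribution
trans = [[0.7, 0.3], [0.4, 0.6]]       # hidden-state transition probabilities
emit = [[0.9, 0.1], [0.2, 0.8]]        # emission probabilities per hidden state

lik = forward_likelihood(init, trans, emit, [0, 1, 0])
# Sanity check: the likelihoods of all length-3 sequences sum to one.
total = sum(forward_likelihood(init, trans, emit, list(seq))
            for seq in product([0, 1], repeat=3))
```

The hidden states themselves form an ordinary Markov chain with matrix `trans`, which is the link to classical Markov chain models that the note emphasises.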
Unmixing hyperspectral images using Markov random fields
Eches, Olivier; Dobigeon, Nicolas; Tourneret, Jean-Yves
2011-03-14
This paper proposes a new spectral unmixing strategy based on the normal compositional model that exploits the spatial correlations between the image pixels. The pure materials (referred to as endmembers) contained in the image are assumed to be available (they can be obtained by using an appropriate endmember extraction algorithm), while the corresponding fractions (referred to as abundances) are estimated by the proposed algorithm. Due to physical constraints, the abundances have to satisfy positivity and sum-to-one constraints. The image is divided into homogeneous distinct regions having the same statistical properties for the abundance coefficients. The spatial dependencies within each class are modeled thanks to Potts-Markov random fields. Within a Bayesian framework, prior distributions for the abundances and the associated hyperparameters are introduced. A reparametrization of the abundance coefficients is proposed to handle the physical constraints (positivity and sum-to-one) inherent to hyperspectral imagery. The parameters (abundances), hyperparameters (abundance mean and variance for each class) and the classification map indicating the classes of all pixels in the image are inferred from the resulting joint posterior distribution. To overcome the complexity of the joint posterior distribution, Markov chain Monte Carlo methods are used to generate samples asymptotically distributed according to the joint posterior of interest. Simulations conducted on synthetic and real data are presented to illustrate the performance of the proposed algorithm.
Generator estimation of Markov jump processes
NASA Astrophysics Data System (ADS)
Metzner, P.; Dittmer, E.; Jahnke, T.; Schütte, Ch.
2007-11-01
Estimating the generator of a continuous-time Markov jump process based on incomplete data is a problem which arises in various applications ranging from machine learning to molecular dynamics. Several methods have been devised for this purpose: a quadratic programming approach (cf. [D.T. Crommelin, E. Vanden-Eijnden, Fitting timeseries by continuous-time Markov chains: a quadratic programming approach, J. Comp. Phys. 217 (2006) 782-805]), a resolvent method (cf. [T. Müller, Modellierung von Proteinevolution, PhD thesis, Heidelberg, 2001]), and various implementations of an expectation-maximization algorithm ([S. Asmussen, O. Nerman, M. Olsson, Fitting phase-type distributions via the EM algorithm, Scand. J. Stat. 23 (1996) 419-441; I. Holmes, G.M. Rubin, An expectation maximization algorithm for training hidden substitution models, J. Mol. Biol. 317 (2002) 753-764; U. Nodelman, C.R. Shelton, D. Koller, Expectation maximization and complex duration distributions for continuous time Bayesian networks, in: Proceedings of the twenty-first conference on uncertainty in AI (UAI), 2005, pp. 421-430; M. Bladt, M. Sørensen, Statistical inference for discretely observed Markov jump processes, J.R. Statist. Soc. B 67 (2005) 395-410]). Some of these methods, however, seem to be known only in a particular research community, and have later been reinvented in a different context. The purpose of this paper is to compile a catalogue of existing approaches, to compare the strengths and weaknesses, and to test their performance in a series of numerical examples. These examples include carefully chosen model problems and an application to a time series from molecular dynamics.
Allen, Bruce C; Hack, C Eric; Clewell, Harvey J
2007-08-01
A Bayesian approach, implemented using Markov Chain Monte Carlo (MCMC) analysis, was applied with a physiologically-based pharmacokinetic (PBPK) model of methylmercury (MeHg) to evaluate the variability of MeHg exposure in women of childbearing age in the U.S. population. The analysis made use of the newly available National Health and Nutrition Survey (NHANES) blood and hair mercury concentration data for women of age 16-49 years (sample size, 1,582). Bayesian analysis was performed to estimate the population variability in MeHg exposure (daily ingestion rate) implied by the variation in blood and hair concentrations of mercury in the NHANES database. The measured variability in the NHANES blood and hair data represents the result of a process that includes interindividual variation in exposure to MeHg and interindividual variation in the pharmacokinetics (distribution, clearance) of MeHg. The PBPK model includes a number of pharmacokinetic parameters (e.g., tissue volumes, partition coefficients, rate constants for metabolism and elimination) that can vary from individual to individual within the subpopulation of interest. Using MCMC analysis, it was possible to combine prior distributions of the PBPK model parameters with the NHANES blood and hair data, as well as with kinetic data from controlled human exposures to MeHg, to derive posterior distributions that refine the estimates of both the population exposure distribution and the pharmacokinetic parameters. In general, based on the populations surveyed by NHANES, the results of the MCMC analysis indicate that a small fraction, less than 1%, of the U.S. population of women of childbearing age may have mercury exposures greater than the EPA RfD for MeHg of 0.1 μg/kg/day, and that there are few, if any, exposures greater than the ATSDR MRL of 0.3 μg/kg/day. The analysis also indicates that typical exposures may be greater than previously estimated from food consumption surveys, but that the variability
Destri, C.; Vega, H. J. de; Sanchez, N. G.
2008-02-15
We perform a Monte Carlo Markov chains (MCMC) analysis of the available cosmic microwave background (CMB) and large scale structure (LSS) data (including the three-year WMAP data) with single field slow-roll new inflation and chaotic inflation models. We do this within our approach to inflation as an effective field theory in the Ginzburg-Landau spirit with fourth degree trinomial potentials in the inflaton field φ. We derive explicit formulae and study in detail the spectral index n_s of the adiabatic fluctuations, the ratio r of tensor to scalar fluctuations, and the running index dn_s/d ln k. We use these analytic formulas as hard constraints on n_s and r in the MCMC analysis. Our analysis differs in this crucial aspect from previous MCMC studies in the literature involving the WMAP3 data. Our results are as follows: (i) The data strongly indicate the breaking (whether spontaneous or explicit) of the φ → −φ symmetry of the inflaton potentials both for new and for chaotic inflation. (ii) Trinomial new inflation naturally satisfies this requirement and provides an excellent fit to the data. (iii) Trinomial chaotic inflation produces the best fit in a very narrow corner of the parameter space. (iv) The chaotic symmetric trinomial potential is almost certainly ruled out (at 95% C.L.). In trinomial chaotic inflation the MCMC runs go towards a potential on the boundary of the parameter space which resembles a spontaneously symmetry-broken potential of new inflation. (v) The above results and further physical analysis here lead us to conclude that new inflation gives the best description of the data. (vi) We find a lower bound for r within trinomial new inflation potentials: r > 0.016 (95% CL) and r > 0.049 (68% CL). (vii) The preferred new inflation trinomial potential is a double well, even function of the field with a moderate quartic coupling yielding as most probable values: n_s ≈ 0.958, r ≈ 0.055. This value
Spectral likelihood expansions for Bayesian inference
NASA Astrophysics Data System (ADS)
Nagel, Joseph B.; Sudret, Bruno
2016-03-01
A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Exact significance test for Markov order
NASA Astrophysics Data System (ADS)
Pethel, S. D.; Hahs, D. W.
2014-02-01
We describe an exact significance test of the null hypothesis that a Markov chain is nth order. The procedure utilizes surrogate data to yield an exact test statistic distribution valid for any sample size. Surrogate data are generated using a novel algorithm that guarantees, per shot, a uniform sampling from the set of sequences that exactly match the nth order properties of the observed data. Using the test, the Markov order of Tel Aviv rainfall data is examined.
NASA Astrophysics Data System (ADS)
MacBean, Natasha; Disney, Mathias; Lewis, Philip; Ineson, Phil
2010-05-01
profile as a whole. We present results from an Observing System Simulation Experiment (OSSE) designed to investigate the impact of management and climate change on peatland carbon fluxes, as well as how observations from satellites may be able to constrain modeled carbon fluxes. We use an adapted version of the Carnegie-Ames-Stanford Approach (CASA) model (Potter et al., 1993) that includes a representation of methane dynamics (Potter, 1997). The model formulation is further modified to allow for assimilation of satellite observations of surface soil moisture and land surface temperature. The observations are used to update model estimates using a Metropolis-Hastings Markov Chain Monte Carlo (MCMC) approach. We examine the effect of temporal frequency and precision of satellite observations with a view to establishing how, and at what level, such observations would make a significant improvement in model uncertainty. We compare this with the system characteristics of existing and future satellites. We believe this is the first attempt to assimilate surface soil moisture and land surface temperature into an ecosystem model that includes a full representation of CH4 flux. Bubier, J., and T. Moore (1994), An ecological perspective on methane emissions from northern wetlands, TREE, 9, 460-464. Charman, D. (2002), Peatlands and Environmental Change, John Wiley and Sons, Ltd, England. Gorham, E. (1991), Northern peatlands: Role in the carbon cycle and probable responses to climatic warming, Ecological Applications, 1, 182-195. Lai, D. (2009), Methane dynamics in northern peatlands: A review, Pedosphere, 19, 409-421. Le Mer, J., and P. Roger (2001), Production, oxidation, emission and consumption of methane by soils: A review, European Journal of Soil Biology, 37, 25-50. Limpens, J., F. Berendse, J. Canadell, C. Freeman, J. Holden, N. Roulet, H. Rydin, and Potter, C. 
(1997), An ecosystem simulation model for methane production and emission from wetlands, Global Biogeochemical
Computationally efficient Bayesian inference for inverse problems.
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
Markov stochasticity coordinates
NASA Astrophysics Data System (ADS)
Eliazar, Iddo
2017-01-01
Markov dynamics constitute one of the most fundamental models of random motion between the states of a system of interest. Markov dynamics have diverse applications in many fields of science and engineering, and are particularly applicable in the context of random motion in networks. In this paper we present a two-dimensional gauging method of the randomness of Markov dynamics. The method, termed Markov Stochasticity Coordinates, is established, discussed, and exemplified. Also, the method is tweaked to quantify the stochasticity of the first-passage-times of Markov dynamics, and the socioeconomic equality and mobility in human societies.
Application of Markov Graphs in Marketing
NASA Astrophysics Data System (ADS)
Bešić, C.; Sajfert, Z.; Đorđević, D.; Sajfert, V.
2007-04-01
The applications of Markov process theory in marketing are discussed. It turns out that Markov processes have a wide field of applications. The advancement of marketing through the convolution of stationary Markov distributions is analysed. The convolution distribution gives an average net profit that is two times higher than the one obtained with the usual Markov distribution. This can be achieved if one selling chain is divided into two parts with different ratios of output and input frequencies. The stability of the marketing system was examined using conforming coefficients. It was shown, by means of the Jensen inequality, that the system remains stable if the initial capital is higher than the averaged losses.
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
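How MCMC sampling turns a posterior into both an estimate and an uncertainty can be illustrated with a minimal sketch. The code below is a generic random-walk Metropolis sampler on a toy problem, not the refinement method of the abstract: the data values, the known noise level `sigma`, the flat prior, and all function names are assumptions made for illustration.

```python
import math
import random


def metropolis(log_post, x0, step, n_samples, rng):
    """Random-walk Metropolis sampler for a scalar parameter."""
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, post(proposal) / post(x)).
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        samples.append(x)
    return samples


def log_post(mu, data=(4.8, 5.1, 5.3, 4.9, 5.4), sigma=0.5):
    # Toy model (assumption): Gaussian likelihood with known sigma and a
    # flat prior on mu, so the log posterior is the log likelihood up to
    # an additive constant.
    return -sum((d - mu) ** 2 for d in data) / (2.0 * sigma ** 2)


rng = random.Random(0)
draws = metropolis(log_post, x0=0.0, step=0.5, n_samples=20000, rng=rng)
draws = draws[2000:]  # discard burn-in

post_mean = sum(draws) / len(draws)
post_sd = math.sqrt(sum((d - post_mean) ** 2 for d in draws) / len(draws))
print(f"posterior mean ~ {post_mean:.3f}, posterior sd ~ {post_sd:.3f}")
```

For this toy model the posterior is Gaussian with mean equal to the sample mean (5.1) and standard deviation sigma/sqrt(5), so the sampled summaries can be checked against the closed form.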
Bayesian inference for OPC modeling
NASA Astrophysics Data System (ADS)
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
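The walker-based exploration described above can be sketched with the Goodman-Weare "stretch move", the standard update used by affine invariant ensemble samplers. This is a minimal pure-Python illustration, not the authors' OPC code: the badly scaled two-dimensional Gaussian target, the number of walkers, and all names are assumptions.

```python
import math
import random


def stretch_move_sweep(walkers, log_prob, rng, a=2.0):
    """Update every walker once, in place, with the stretch move."""
    n, dim = len(walkers), len(walkers[0])
    for k in range(n):
        # Pick a partner walker j different from k.
        j = rng.randrange(n - 1)
        if j >= k:
            j += 1
        # Draw z from g(z) proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        proposal = [walkers[j][d] + z * (walkers[k][d] - walkers[j][d])
                    for d in range(dim)]
        log_ratio = ((dim - 1) * math.log(z)
                     + log_prob(proposal) - log_prob(walkers[k]))
        if math.log(1.0 - rng.random()) < log_ratio:
            walkers[k] = proposal


def log_prob(x):
    # Toy target (assumption): independent Gaussians with std 1 and 10.
    return -0.5 * (x[0] ** 2 + (x[1] / 10.0) ** 2)


def std(values):
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


rng = random.Random(42)
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
chain = []
for sweep in range(3000):
    stretch_move_sweep(walkers, log_prob, rng)
    if sweep >= 500:  # discard burn-in sweeps
        chain.extend(list(w) for w in walkers)

sd0 = std([x[0] for x in chain])
sd1 = std([x[1] for x in chain])
mean1 = sum(x[1] for x in chain) / len(chain)
print(f"sd along axis 0 ~ {sd0:.2f}, sd along axis 1 ~ {sd1:.2f}")
```

Because each proposal is built from the positions of other walkers, the move adapts to the scale of the target without hand-tuned step sizes, which is the practical appeal of this family of samplers.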
BAYESIAN INFERENCE OF CMB GRAVITATIONAL LENSING
Anderes, Ethan; Wandelt, Benjamin D.; Lavaux, Guilhem
2015-08-01
The Planck satellite, along with several ground-based telescopes, has mapped the cosmic microwave background (CMB) at sufficient resolution and signal-to-noise so as to allow a detection of the subtle distortions due to the gravitational influence of the intervening matter distribution. A natural modeling approach is to write a Bayesian hierarchical model for the lensed CMB in terms of the unlensed CMB and the lensing potential. So far there has been no feasible algorithm for inferring the posterior distribution of the lensing potential from the lensed CMB map. We propose a solution that allows efficient Markov Chain Monte Carlo sampling from the joint posterior of the lensing potential and the unlensed CMB map using the Hamiltonian Monte Carlo technique. The main conceptual step in the solution is a re-parameterization of CMB lensing in terms of the lensed CMB and the “inverse lensing” potential. We demonstrate a fast implementation on simulated data, including noise and a sky cut, that uses a further acceleration based on a very mild approximation of the inverse lensing potential. We find that the resulting Markov Chain has short correlation lengths and excellent convergence properties, making it promising for applications to high-resolution CMB data sets in the future.
Evolutionary inference via the Poisson Indel Process.
Bouchard-Côté, Alexandre; Jordan, Michael I
2013-01-22
We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
BAYESIAN HIERARCHICAL MODELING FOR SIGNALING PATHWAY INFERENCE FROM SINGLE CELL INTERVENTIONAL DATA
Luo, Ruiyan; Zhao, Hongyu
2011-01-01
Recent technological advances have made it possible to simultaneously measure multiple protein activities at the single cell level. With such data collected under different stimulatory or inhibitory conditions, it is possible to infer the causal relationships among proteins from single cell interventional data. In this article we propose a Bayesian hierarchical modeling framework to infer the signaling pathway based on the posterior distributions of parameters in the model. Under this framework, we consider network sparsity and model the existence of an association between two proteins both at the overall level across all experiments and at each individual experimental level. This allows us to infer the pairs of proteins that are associated with each other and their causal relationships. We also explicitly consider both intrinsic noise and measurement error. Markov chain Monte Carlo is implemented for statistical inference. We demonstrate that this hierarchical modeling can effectively pool information from different interventional experiments through simulation studies and real data analysis. PMID:22162986
Phase transitions in Hidden Markov Models
NASA Astrophysics Data System (ADS)
Bechhoefer, John; Lathouwers, Emma
In Hidden Markov Models (HMMs), a Markov process is not directly accessible. In the simplest case, a two-state Markov model "emits" one of two "symbols" at each time step. We can think of these symbols as noisy measurements of the underlying state. With some probability, the symbol implies that the system is in one state when it is actually in the other. The ability to judge which state the system is in sets the efficiency of a Maxwell demon that observes state fluctuations in order to extract heat from a coupled reservoir. The state-inference problem is to infer the underlying state from such noisy measurements at each time step. We show that there can be a phase transition in such measurements: for measurement error rates below a certain threshold, the inferred state always matches the observation. For higher error rates, there can be continuous or discontinuous transitions to situations where keeping a memory of past observations improves the state estimate. We can partly understand this behavior by mapping the HMM onto a 1d random-field Ising model at zero temperature. We also present more recent work that explores a larger parameter space and more states. Research funded by NSERC, Canada.
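The state-inference problem for the simplest case can be sketched with a Bayesian forward filter. The code below is an illustrative toy, not the authors' analysis: it assumes a symmetric two-state chain that flips with probability `flip` per step and a symmetric measurement error rate `err`, and it counts how often the filtered state estimate differs from the raw observation. All names and parameter values are assumptions.

```python
import random


def simulate(n, flip, err, rng):
    """Simulate hidden states and noisy observations of a symmetric HMM."""
    states, obs = [], []
    x = rng.randrange(2)
    for _ in range(n):
        if rng.random() < flip:
            x = 1 - x
        y = 1 - x if rng.random() < err else x
        states.append(x)
        obs.append(y)
    return states, obs


def filter_states(obs, flip, err):
    """Forward filter: most probable current state given observations so far."""
    p = 0.5  # P(state = 1 | observations so far)
    estimates = []
    for y in obs:
        # Predict: propagate the belief through one transition.
        q = p * (1.0 - flip) + (1.0 - p) * flip
        # Update: weight by the likelihood of the new observation.
        like1 = 1.0 - err if y == 1 else err
        like0 = err if y == 1 else 1.0 - err
        p = q * like1 / (q * like1 + (1.0 - q) * like0)
        estimates.append(1 if p > 0.5 else 0)
    return estimates


def disagreements(flip, err, n=5000, seed=1):
    """Number of steps where the filtered estimate differs from the observation."""
    rng = random.Random(seed)
    _, obs = simulate(n, flip, err, rng)
    est = filter_states(obs, flip, err)
    return sum(e != y for e, y in zip(est, obs))


low_error = disagreements(flip=0.2, err=0.05)
high_error = disagreements(flip=0.05, err=0.3)
print(low_error, high_error)
```

In this toy model the filter simply echoes the observation when the error rate is small relative to the flip rate, and starts overriding individual observations (that is, using memory) when the error rate is large, which is the qualitative behaviour the abstract describes.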
Inferring Ancestral Recombination Graphs from Bacterial Genomic Data
Vaughan, Timothy G.; Welch, David; Drummond, Alexei J.; Biggs, Patrick J.; George, Tessy; French, Nigel P.
2017-01-01
Homologous recombination is a central feature of bacterial evolution, yet it confounds traditional phylogenetic methods. While a number of methods specific to bacterial evolution have been developed, none of these permit joint inference of a bacterial recombination graph and associated parameters. In this article, we present a new method which addresses this shortcoming. Our method uses a novel Markov chain Monte Carlo algorithm to perform phylogenetic inference under the ClonalOrigin model. We demonstrate the utility of our method by applying it to ribosomal multilocus sequence typing data sequenced from pathogenic and nonpathogenic Escherichia coli serotype O157 and O26 isolates collected in rural New Zealand. The method is implemented as an open source BEAST 2 package, Bacter, which is available via the project web page at http://tgvaughan.github.io/bacter. PMID:28007885
Bayesian Inference for Skewed Stable Distributions
NASA Astrophysics Data System (ADS)
Shokripour, Mona; Nassiri, Vahid; Mohammadpour, Adel
2011-03-01
Stable distributions are a class of distributions that allow skewness and heavy tails. Non-Gaussian stable random variables play the role of the normal distribution in the central limit theorem for normalized sums of random variables with infinite variance. The lack of an analytic formula for the density and distribution functions of stable random variables has been a major drawback to the use of stable distributions, including for inference in the Bayesian framework. Buckle introduced priors for the parameters of stable random variables to obtain an analytic form of the posterior distribution. However, many researchers have tried to solve the problem through Markov chain Monte Carlo methods, e.g. [8] and the references therein. In this paper a new class of heavy-tailed distributions, called skewed stable, is introduced. This class has two main advantages: it is a member of the exponential family, so Bayesian inference can be drawn as for the exponential family of distributions, and modelling skewed data with stable distributions is dominated by this family. Finally, Bayesian inference for skewed stable distributions is compared with that for stable distributions through a few simulation studies.
Test to determine the Markov order of a time series.
Racca, E; Laio, F; Poggi, D; Ridolfi, L
2007-01-01
The Markov order of a time series is an important measure of the "memory" of a process, and its knowledge is fundamental for the correct simulation of the characteristics of the process. For this reason, several techniques have been proposed in the past for its estimation. However, most of these methods are rather complex and can often be applied only in the case of Markov chains. Here we propose a simple and robust test to evaluate the Markov order of a time series. Only the first-order moment of the conditional probability density function characterizing the process is used to evaluate the memory of the process itself. This measure is called the "expected value Markov (EVM) order." We show that there is good agreement between the EVM order and the known Markov order of some synthetic time series.
Genome-Wide Inference of Ancestral Recombination Graphs
Rasmussen, Matthew D.; Hubisz, Melissa J.; Gronau, Ilan; Siepel, Adam
2014-01-01
The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the “ancestral recombination graph” (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally intensive, highly approximate, or limited to small numbers of sequences, and, as a consequence, explicit ARG inference is rarely used in applied population genomics. Here, we introduce a new algorithm for ARG inference that is efficient enough to apply to dozens of complete mammalian genomes. The key idea of our approach is to sample an ARG of n chromosomes conditional on an ARG of n − 1 chromosomes, an operation we call “threading.” Using techniques based on hidden Markov models, we can perform this threading operation exactly, up to the assumptions of the sequentially Markov coalescent and a discretization of time. An extension allows for threading of subtrees instead of individual sequences. Repeated application of these threading operations results in highly efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these methods in a computer program called ARGweaver. Experiments with simulated data indicate that ARGweaver converges rapidly to the posterior distribution over ARGs and is effective in recovering various features of the ARG for dozens of sequences generated under realistic parameters for human populations. In applications of ARGweaver to 54 human genome sequences from Complete Genomics, we find clear signatures of natural selection, including regions of unusually ancient ancestry associated with balancing selection and reductions in allele age in sites under directional selection. The patterns we observe near protein-coding genes are consistent with a primary influence from background selection rather than hitchhiking, although we cannot rule out a contribution from recurrent selective sweeps. PMID:24831947
Samah, S; Valadez-Moctezuma, E; Peláez-Luna, K S; Morales-Manzano, S; Meza-Carrera, P; Cid-Contreras, R C
2016-06-03
Molecular methods are powerful tools in characterizing and determining relationships between plants. The aim of this study was to study genetic divergence between 103 accessions of Mexican Opuntia. To accomplish this, polymerase chain reaction (PCR)-restriction fragment length polymorphism analysis of three chloroplast intergenic spacers (atpB-rbcL, trnL-trnF, and psbA-trnH), one chloroplast gene (ycf1), two nuclear genes (ppc and PhyC), and one mitochondrial gene (cox3) was conducted. The amplified products from all the samples had very similar molecular sizes, and there were only very small differences between the undigested PCR amplicons for all regions, with the exception of ppc. We obtained 5850 bp from the seven regions, and 136 fragments were detected with eight enzymes, 37 of which (27.2%) were polymorphic. We found that 40% of the fragments from the chloroplast regions were polymorphic, 9.8% of the bands detected in the nuclear genes were polymorphic, and 20% of the bands in the mitochondrial locus were polymorphic. trnL-trnF and psbA-trnH were the most variable regions. The Nei and Li/Dice distance was very short, and ranged from 0 to 0.12; indeed, 77 of the 103 genotypes had the same genetic profile. All the xoconostle accessions (acidic fruits) were grouped together without being separated from three genotypes of prickly pear (sweet fruits). We assume that the genetic divergence between prickly pears and xoconostles is very low, and question the number of Opuntia species currently considered in Mexico.
Performability analysis using semi-Markov reward processes
NASA Technical Reports Server (NTRS)
Ciardo, Gianfranco; Marie, Raymond A.; Sericola, Bruno; Trivedi, Kishor S.
1990-01-01
Beaudry (1978) proposed a simple method of computing the distribution of performability in a Markov reward process. Two extensions of Beaudry's approach are presented. The method is generalized to a semi-Markov reward process by removing the restriction requiring the association of zero reward to absorbing states only. The algorithm proceeds by replacing zero-reward nonabsorbing states by a probabilistic switch; it is therefore related to the elimination of vanishing states from the reachability graph of a generalized stochastic Petri net and to the elimination of fast transient states in a decomposition approach to stiff Markov chains. The use of the approach is illustrated with three applications.
Universal recovery map for approximate Markov chains
Sutter, David; Fawzi, Omar; Renner, Renato
2016-01-01
A central question in quantum information theory is to determine how well lost information can be reconstructed. Crucially, the corresponding recovery operation should perform well without knowing the information to be reconstructed. In this work, we show that the quantum conditional mutual information measures the performance of such recovery operations. More precisely, we prove that the conditional mutual information I(A:C|B) of a tripartite quantum state ρ_{ABC} can be bounded from below by its distance to the closest recovered state R_{B→BC}(ρ_{AB}), where the C-part is reconstructed from the B-part only and the recovery map R_{B→BC} merely depends on ρ_{BC}. One particular application of this result implies the equivalence between two different approaches to define topological order in quantum systems. PMID:27118889
Markov Chain Monte Carlo from Lagrangian Dynamics
Lan, Shiwei; Stathopoulos, Vasileios; Shahbaba, Babak; Girolami, Mark
2014-01-01
Hamiltonian Monte Carlo (HMC) improves the computational efficiency of the Metropolis-Hastings algorithm by reducing its random walk behavior. Riemannian HMC (RHMC) further improves the performance of HMC by exploiting the geometric properties of the parameter space. However, the geometric integrator used for RHMC involves implicit equations that require fixed-point iterations. In some cases, the computational overhead for solving implicit equations undermines RHMC's benefits. In an attempt to circumvent this problem, we propose an explicit integrator that replaces the momentum variable in RHMC by velocity. We show that the resulting transformation is equivalent to transforming Riemannian Hamiltonian dynamics to Lagrangian dynamics. Experimental results suggest that our method improves RHMC's overall computational efficiency in the cases considered. All computer programs and data sets are available online (http://www.ics.uci.edu/~babaks/Site/Codes.html) in order to allow replication of the results reported in this paper. PMID:26240515
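For orientation, the baseline that RHMC and the Lagrangian variant build on can be sketched as plain HMC with an explicit leapfrog integrator. This is standard Euclidean HMC with a unit mass matrix, not the Riemannian or Lagrangian integrator proposed in the paper; the standard-normal target and all names and step settings are assumptions.

```python
import math
import random


def leapfrog(q, p, grad_u, eps, n_steps):
    """Explicit leapfrog integration of Hamiltonian dynamics (unit mass)."""
    p = p - 0.5 * eps * grad_u(q)          # initial half step in momentum
    for i in range(n_steps):
        q = q + eps * p                    # full step in position
        if i < n_steps - 1:
            p = p - eps * grad_u(q)        # full step in momentum
    p = p - 0.5 * eps * grad_u(q)          # final half step in momentum
    return q, p


def hmc(u, grad_u, q0, eps, n_steps, n_samples, rng):
    """Basic HMC for a scalar parameter with potential u = -log density."""
    q = q0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)           # resample momentum
        q_new, p_new = leapfrog(q, p0, grad_u, eps, n_steps)
        h_old = u(q) + 0.5 * p0 * p0
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integration error.
        if math.log(1.0 - rng.random()) < h_old - h_new:
            q = q_new
        samples.append(q)
    return samples


def u(q):
    return 0.5 * q * q                     # toy target: standard normal


def grad_u(q):
    return q


rng = random.Random(7)
samples = hmc(u, grad_u, q0=3.0, eps=0.15, n_steps=10, n_samples=5000, rng=rng)
samples = samples[200:]                    # discard burn-in
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean ~ {mean:.3f}, variance ~ {var:.3f}")
```

Every update here is explicit, which is the property the proposed Lagrangian integrator aims to recover in the Riemannian setting, where the usual integrator requires fixed-point iterations.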
sick: The Spectroscopic Inference Crank
NASA Astrophysics Data System (ADS)
Casey, Andrew R.
2016-03-01
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
Spatiotemporal Bayesian inference dipole analysis for MEG neuroimaging data.
Jun, Sung C; George, John S; Paré-Blagoev, Juliana; Plis, Sergey M; Ranken, Doug M; Schmidt, David M; Wood, C C
2005-10-15
Recently, we described a Bayesian inference approach to the MEG/EEG inverse problem that used numerical techniques to estimate the full posterior probability distributions of likely solutions upon which all inferences were based [Schmidt, D.M., George, J.S., Wood, C.C., 1999. Bayesian inference applied to the electromagnetic inverse problem. Human Brain Mapping 7, 195; Schmidt, D.M., George, J.S., Ranken, D.M., Wood, C.C., 2001. Spatial-temporal bayesian inference for MEG/EEG. In: Nenonen, J., Ilmoniemi, R. J., Katila, T. (Eds.), Biomag 2000: 12th International Conference on Biomagnetism. Espoo, Norway, p. 671]. Schmidt et al. (1999) focused on the analysis of data at a single point in time employing an extended region source model. They subsequently extended their work to a spatiotemporal Bayesian inference analysis of the full spatiotemporal MEG/EEG data set. Here, we formulate spatiotemporal Bayesian inference analysis using a multi-dipole model of neural activity. This approach is faster than the extended region model, does not require use of the subject's anatomical information, does not require prior determination of the number of dipoles, and yields quantitative probabilistic inferences. In addition, we have incorporated the ability to handle much more complex and realistic estimates of the background noise, which may be represented as a sum of Kronecker products of temporal and spatial noise covariance components. This reduces the effects of undermodeling noise. In order to reduce the rigidity of the multi-dipole formulation which commonly causes problems due to multiple local minima, we treat the given covariance of the background as uncertain and marginalize over it in the analysis. Markov Chain Monte Carlo (MCMC) was used to sample the many possible likely solutions. The spatiotemporal Bayesian dipole analysis is demonstrated using simulated and empirical whole-head MEG data.
Probabilistic Resilience in Hidden Markov Models
NASA Astrophysics Data System (ADS)
Panerati, Jacopo; Beltrame, Giovanni; Schwind, Nicolas; Zeltner, Stefan; Inoue, Katsumi
2016-05-01
Originally defined in the context of ecological systems and environmental sciences, resilience has grown to be a property of major interest for the design and analysis of many other complex systems: resilient networks and robotics systems offer the desirable capability of absorbing disruption and transforming in response to external shocks, while still providing the services they were designed for. Starting from an existing formalization of resilience for constraint-based systems, we develop a probabilistic framework based on hidden Markov models. In doing so, we introduce two new important features: stochastic evolution and partial observability. Using our framework, we formalize a methodology for the evaluation of probabilities associated with generic properties, we describe an efficient algorithm for the computation of its essential inference step, and show that its complexity is comparable to other state-of-the-art inference algorithms.
Estimation and uncertainty of reversible Markov models
NASA Astrophysics Data System (ADS)
Trendelkamp-Schroer, Benjamin; Wu, Hao; Paul, Fabian; Noé, Frank
2015-11-01
Reversibility is a key concept in Markov models and master-equation models of molecular kinetics. The analysis and interpretation of the transition matrix encoding the kinetic properties of the model rely heavily on the reversibility property. The estimation of a reversible transition matrix from simulation data is, therefore, crucial to the successful application of the previously developed theory. In this work, we discuss methods for the maximum likelihood estimation of transition matrices from finite simulation data and present a new algorithm for the estimation if reversibility with respect to a given stationary vector is desired. We also develop new methods for the Bayesian posterior inference of reversible transition matrices with and without given stationary vector taking into account the need for a suitable prior distribution preserving the meta-stable features of the observed process during posterior inference. All algorithms here are implemented in the PyEMMA software — http://pyemma.org — as of version 2.0.
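What reversibility means for an estimated transition matrix can be made concrete with a small sketch. The code below uses the simple count-symmetrization estimator, which always yields a reversible matrix but is in general not the maximum likelihood estimator and is not the algorithm of the paper or the PyEMMA API; the count matrix and all names are hypothetical.

```python
def symmetrized_estimate(counts):
    """Reversible transition matrix from symmetrized transition counts.

    Returns (P, pi), where pi is the stationary distribution of P.
    """
    n = len(counts)
    sym = [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in sym]
    total = sum(row_sums)
    P = [[sym[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
    pi = [r / total for r in row_sums]
    return P, pi


def max_detailed_balance_violation(P, pi):
    """Largest |pi_i P_ij - pi_j P_ji| over all pairs of states."""
    n = len(P)
    return max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
               for i in range(n) for j in range(n))


# Hypothetical transition counts between three metastable states.
counts = [[90, 8, 2],
          [5, 80, 15],
          [1, 20, 79]]

P, pi = symmetrized_estimate(counts)
violation = max_detailed_balance_violation(P, pi)
print(pi, violation)
```

Detailed balance, pi_i P_ij = pi_j P_ji, holds here by construction because pi_i P_ij is proportional to the symmetric quantity c_ij + c_ji.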
Abstraction Augmented Markov Models.
Caragea, Cornelia; Silvescu, Adrian; Caragea, Doina; Honavar, Vasant
2010-12-13
High accuracy sequence classification often requires the use of higher order Markov models (MMs). However, the number of MM parameters increases exponentially with the range of direct dependencies between sequence elements, thereby increasing the risk of overfitting when the data set is limited in size. We present abstraction augmented Markov models (AAMMs) that effectively reduce the number of numeric parameters of kth order MMs by successively grouping strings of length k (i.e., k-grams) into abstraction hierarchies. We evaluate AAMMs on three protein subcellular localization prediction tasks. The results of our experiments show that abstraction makes it possible to construct predictive models that use a significantly smaller number of features (by one to three orders of magnitude) as compared to MMs. AAMMs are competitive with and, in some cases, significantly outperform MMs. Moreover, the results show that AAMMs often perform significantly better than variable order Markov models, such as decomposed context tree weighting, prediction by partial match, and probabilistic suffix trees.
Overstall, Antony M; Woods, David C
2013-06-01
Bayesian inference is considered for statistical models that depend on the evaluation of a computationally expensive computer code or simulator. For such situations, the number of evaluations of the likelihood function, and hence of the unnormalized posterior probability density function, is determined by the available computational resource and may be extremely limited. We present a new example of such a simulator that describes the properties of human embryonic stem cells using data from optical trapping experiments. This application is used to motivate a novel strategy for Bayesian inference which exploits a Gaussian process approximation of the simulator and allows computationally efficient Markov chain Monte Carlo inference. The advantages of this strategy over previous methodology are that it is less reliant on the determination of tuning parameters and allows the application of model diagnostic procedures that require no additional evaluations of the simulator. We show the advantages of our method on synthetic examples and demonstrate its application on stem cell experiments.
Degradation monitoring using probabilistic inference
NASA Astrophysics Data System (ADS)
Alpay, Bulent
In order to increase safety and improve economy and performance in a nuclear power plant (NPP), the source and extent of component degradations should be identified before failures and breakdowns occur. It is also crucial for the next generation of NPPs, which are designed to have a long core life and high fuel burnup to have a degradation monitoring system in order to keep the reactor in a safe state, to meet the designed reactor core lifetime and to optimize the scheduled maintenance. Model-based methods are based on determining the inconsistencies between the actual and expected behavior of the plant, and use these inconsistencies for detection and diagnostics of degradations. By defining degradation as a random abrupt change from the nominal to a constant degraded state of a component, we employed nonlinear filtering techniques based on state/parameter estimation. We utilized a Bayesian recursive estimation formulation in the sequential probabilistic inference framework and constructed a hidden Markov model to represent a general physical system. By addressing the problem of a filter's inability to estimate an abrupt change, which is called the oblivious filter problem in nonlinear extensions of Kalman filtering, and the sample impoverishment problem in particle filtering, we developed techniques to modify filtering algorithms by utilizing additional data sources to improve the filter's response to this problem. We utilized a reliability degradation database that can be constructed from plant specific operational experience and test and maintenance reports to generate proposal densities for probable degradation modes. These are used in a multiple hypothesis testing algorithm. We then test samples drawn from these proposal densities with the particle filtering estimates based on the Bayesian recursive estimation formulation with the Metropolis Hastings algorithm, which is a well-known Markov chain Monte Carlo method (MCMC). This multiple hypothesis testing
Sun, Shuying; Yu, Xiaoqing
2016-03-01
DNA methylation is an epigenetic event that plays an important role in regulating gene expression. It is important to study DNA methylation, especially differential methylation patterns between two groups of samples (e.g. patients vs. normal individuals). With next generation sequencing technologies, it is now possible to identify differential methylation patterns by considering methylation at the single CG site level in an entire genome. However, it is challenging to analyze large and complex NGS data. In order to address this difficult question, we have developed a new statistical method using a hidden Markov model and Fisher's exact test (HMM-Fisher) to identify differentially methylated cytosines and regions. We first use a hidden Markov chain to model the methylation signals to infer the methylation state as Not methylated (N), Partly methylated (P), and Fully methylated (F) for each individual sample. We then use Fisher's exact test to identify differentially methylated CG sites. We show the HMM-Fisher method and compare it with commonly cited methods using both simulated data and real sequencing data. The results show that HMM-Fisher outperforms the current available methods to which we have compared. HMM-Fisher is efficient and robust in identifying heterogeneous DM regions.
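The second stage described above, testing a single CG site for a difference between two groups, can be sketched with a plain two-sided Fisher's exact test on a 2x2 table. This is only a minimal illustration: how HMM-Fisher actually builds its tables from the inferred N/P/F states is not given in the abstract, so the table layout (methylated versus unmethylated counts in each group), the function name, and the example counts are all assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value is the total hypergeometric probability of all tables with
    the same margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical site: group 1 has 9 methylated / 1 unmethylated calls,
    # group 2 has 2 methylated / 8 unmethylated calls.
    print(fisher_exact_two_sided(9, 1, 2, 8))
```

In a genome-wide setting the per-site p-values would normally also need a multiple-testing correction, which this sketch leaves out.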
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Bayesian inference in processing experimental data: principles and basic applications
NASA Astrophysics Data System (ADS)
D'Agostini, G.
2003-09-01
This paper introduces general ideas and some basic methods of the Bayesian probability theory applied to physics measurements. Our aim is to make the reader familiar, through examples rather than rigorous formalism, with concepts such as the following: model comparison (including the automatic Ockham's Razor filter provided by the Bayesian approach); parametric inference; quantification of the uncertainty about the value of physical quantities, also taking into account systematic effects; role of marginalization; posterior characterization; predictive distributions; hierarchical modelling and hyperparameters; Gaussian approximation of the posterior and recovery of conventional methods, especially maximum likelihood and chi-square fits under well-defined conditions; conjugate priors, transformation invariance and maximum entropy motivated priors; and Monte Carlo (MC) estimates of expectation, including a short introduction to Markov Chain MC methods.
Nonparametric inference of network structure and dynamics
NASA Astrophysics Data System (ADS)
Peixoto, Tiago P.
The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Signal inference with unknown response: Calibration-uncertainty renormalized estimator
NASA Astrophysics Data System (ADS)
Dorn, Sebastian; Enßlin, Torsten A.; Greiner, Maksim; Selig, Marco; Boehm, Vanessa
2015-01-01
The calibration of a measurement device is crucial for every scientific experiment, where a signal has to be inferred from data. We present CURE, the calibration-uncertainty renormalized estimator, to reconstruct a signal and simultaneously the instrument's calibration from the same data without knowing the exact calibration, but its covariance structure. The idea of the CURE method, developed in the framework of information field theory, is to start with an assumed calibration to successively include more and more portions of calibration uncertainty into the signal inference equations and to absorb the resulting corrections into renormalized signal (and calibration) solutions. Thereby, the signal inference and calibration problem turns into a problem of solving a single system of ordinary differential equations and can be identified with common resummation techniques used in field theories. We verify the CURE method by applying it to a simplistic toy example and compare it against existent self-calibration schemes, Wiener filter solutions, and Markov chain Monte Carlo sampling. We conclude that the method is able to keep up in accuracy with the best self-calibration methods and serves as a noniterative alternative to them.
NASA Astrophysics Data System (ADS)
King, Gary; Rosen, Ori; Tanner, Martin A.
2004-09-01
This collection of essays brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half-decade has witnessed an explosion of research in ecological inference--the process of trying to infer individual behavior from aggregate data. Although uncertainties and information lost in aggregation make ecological inference one of the most problematic types of research to rely on, these inferences are required in many academic fields, as well as by legislatures and the Courts in redistricting, by business in marketing research, and by governments in policy analysis.
Quantum hidden Markov models based on transition operation matrices
NASA Astrophysics Data System (ADS)
Cholewa, Michał; Gawron, Piotr; Głomb, Przemysław; Kurzyk, Dariusz
2017-04-01
In this work, we extend the idea of quantum Markov chains (Gudder in J Math Phys 49(7):072105 [3]) in order to propose quantum hidden Markov models (QHMMs). For that, we use the notions of transition operation matrices and vector states, which are an extension of classical stochastic matrices and probability distributions. Our main result is the Mealy QHMM formulation and proofs of algorithms needed for application of this model: Forward for general case and Viterbi for a restricted class of QHMMs. We show the relations of the proposed model to other quantum HMM propositions and present an example of application.
The Manhattan Frame Model - Manhattan World Inference in the Space of Surface Normals.
Straub, Julian; Freifeld, Oren; Rosman, Guy; Leonard, John J; Fisher, John W
2017-02-01
Objects and structures within man-made environments typically exhibit a high degree of organization in the form of orthogonal and parallel planes. Traditional approaches utilize these regularities via the restrictive, and rather local, Manhattan World (MW) assumption which posits that every plane is perpendicular to one of the axes of a single coordinate system. The aforementioned regularities are especially evident in the surface normal distribution of a scene where they manifest as orthogonally-coupled clusters. This motivates the introduction of the Manhattan-Frame (MF) model which captures the notion of a MW in the surface normals space, the unit sphere, and two probabilistic MF models over this space. First, for a single MF we propose novel real-time MAP inference algorithms, evaluate their performance and their use in drift-free rotation estimation. Second, to capture the complexity of real-world scenes at a global scale, we extend the MF model to a probabilistic mixture of Manhattan Frames (MMF). For MMF inference we propose a simple MAP inference algorithm and an adaptive Markov-Chain Monte-Carlo sampling algorithm with Metropolis-Hastings split/merge moves that let us infer the unknown number of mixture components. We demonstrate the versatility of the MMF model and inference algorithm across several scales of man-made environments.
Sunspots and ENSO relationship using Markov method
NASA Astrophysics Data System (ADS)
Hassan, Danish; Iqbal, Asif; Ahmad Hassan, Syed; Abbas, Shaheen; Ansari, Muhammad Rashid Kamal
2016-01-01
Various techniques have been used to establish the existence of significant relations between the number of Sunspots and different terrestrial climate parameters such as rainfall, temperature, dewdrops, aerosol and ENSO. This study uses a Markov chain method to find the relations between monthly Sunspots and ENSO data of two epochs (1996-2009 and 1950-2014). The corresponding transition matrices of the two data sets appear similar, which is evaluated qualitatively through the high values of the 2-dimensional correlation found between the transition matrices of ENSO and Sunspots. The associated transition diagrams show that each state communicates with the others. The presence of stronger self-communication (between same states) confirms periodic behaviour among the states. Moreover, the closeness found in the expected number of visits from one state to the other shows the existence of a possible relation between Sunspots and ENSO data. Furthermore, the dependency and stationarity tests are passed, which endorses the applicability of Markov chain analysis to the Sunspots and ENSO data. This shows that a significant relation between Sunspots and ENSO data exists. Improved understanding and modelling of Sunspots variations can help to explore the information about the related variables. This study can be useful to explore the influence of ENSO related local climatic variability.
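The comparison of transition matrices described in this abstract can be sketched as follows. This is a minimal illustration, assuming each monthly series has already been discretised into integer state labels and that the "2-dimensional correlation" is the Pearson correlation over all matrix entries; the function names and the toy state sequences are hypothetical and are not the study's data.

```python
from math import sqrt


def transition_matrix(states, n_states):
    """Row-normalised first-order transition matrix from a state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states that are never left stay all-zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def corr2(a, b):
    """Pearson correlation between the entries of two equally sized matrices."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = sqrt(sum((x - mean_a) ** 2 for x in flat_a)
               * sum((y - mean_b) ** 2 for y in flat_b))
    return num / den


if __name__ == "__main__":
    # Made-up three-state sequences (0 = low, 1 = medium, 2 = high).
    series_a = [0, 0, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2, 2, 1, 0]
    series_b = [0, 1, 1, 1, 2, 2, 1, 1, 0, 0, 1, 2, 2, 1, 0]
    p_a = transition_matrix(series_a, 3)
    p_b = transition_matrix(series_b, 3)
    print(corr2(p_a, p_b))
```

A high correlation between two such matrices only says that the two series move between their own states in a similar way; on its own it does not establish dependence between the series.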
Markov sequential pattern recognition : dependency and the unknown class.
Malone, Kevin Thomas; Haschke, Greg Benjamin; Koch, Mark William
2004-10-01
The sequential probability ratio test (SPRT) minimizes the expected number of observations to a decision and can solve problems in sequential pattern recognition. Some problems have dependencies between the observations, and Markov chains can model dependencies where the state occupancy probability is geometric. For a non-geometric process we show how to use the effective amount of independent information to modify the decision process, so that we can account for the remaining dependencies. Along with dependencies between observations, a successful system needs to handle the unknown class in unconstrained environments. For example, in an acoustic pattern recognition problem any sound source not belonging to the target set is in the unknown class. We show how to incorporate goodness of fit (GOF) classifiers into the Markov SPRT, and determine the worst-case nontarget model. We also develop a multiclass Markov SPRT using the GOF concept.
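To make the idea of an SPRT over Markov-dependent observations concrete, here is a minimal sketch of a two-hypothesis test in which each hypothesis is a first-order transition matrix and the log-likelihood ratio is accumulated over observed transitions. It uses Wald's standard approximate thresholds and covers neither the goodness-of-fit extension nor the multiclass case from the abstract; the function name, the error rates, and the two transition matrices are assumptions chosen for illustration.

```python
from math import log


def markov_sprt(observations, p0, p1, alpha=0.01, beta=0.01):
    """Sequential test between two first-order Markov chain models.

    p0[i][j] and p1[i][j] are transition probabilities under H0 and H1
    (all entries that can occur are assumed to be strictly positive).
    Returns (decision, number_of_transitions_used).
    """
    upper = log((1 - beta) / alpha)   # crossing it accepts H1
    lower = log(beta / (1 - alpha))   # crossing it accepts H0
    llr = 0.0
    steps = 0
    for previous, current in zip(observations, observations[1:]):
        steps += 1
        llr += log(p1[previous][current] / p0[previous][current])
        if llr >= upper:
            return "H1", steps
        if llr <= lower:
            return "H0", steps
    return "undecided", steps


if __name__ == "__main__":
    sticky = [[0.9, 0.1], [0.1, 0.9]]      # H0: states tend to persist
    memoryless = [[0.5, 0.5], [0.5, 0.5]]  # H1: no dependence on the past
    print(markov_sprt([0, 1, 0, 1, 0, 1], sticky, memoryless))
    print(markov_sprt([0] * 20, sticky, memoryless))
```

The test stops as soon as the accumulated evidence crosses either threshold, which is what lets it decide with fewer observations on average than a fixed-sample test with the same error rates.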
Constructing Dynamic Event Trees from Markov Models
Paolo Bucci; Jason Kirschenbaum; Tunc Aldemir; Curtis Smith; Ted Wood
2006-05-01
In the probabilistic risk assessment (PRA) of process plants, Markov models can be used to model accurately the complex dynamic interactions between plant physical process variables (e.g., temperature, pressure, etc.) and the instrumentation and control system that monitors and manages the process. One limitation of this approach that has prevented its use in nuclear power plant PRAs is the difficulty of integrating the results of a Markov analysis into an existing PRA. In this paper, we explore a new approach to the generation of failure scenarios and their compilation into dynamic event trees from a Markov model of the system. These event trees can be integrated into an existing PRA using software tools such as SAPHIRE. To implement our approach, we first construct a discrete-time Markov chain modeling the system of interest by: a) partitioning the process variable state space into magnitude intervals (cells), b) using analytical equations or a system simulator to determine the transition probabilities between the cells through the cell-to-cell mapping technique, and, c) using given failure/repair data for all the components of interest. The Markov transition matrix thus generated can be thought of as a process model describing the stochastic dynamic behavior of the finite-state system. We can therefore search the state space starting from a set of initial states to explore all possible paths to failure (scenarios) with associated probabilities. We can also construct event trees of arbitrary depth by tracing paths from a chosen initiating event and recording the following events while keeping track of the probabilities associated with each branch in the tree. As an example of our approach, we use the simple level control system often used as benchmark in the literature with one process variable (liquid level in a tank), and three control units: a drain unit and two supply units. Each unit includes a separate level sensor to observe the liquid level in the tank
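The state-space search described above, tracing paths from an initial state to failure while tracking branch probabilities, can be sketched on a toy discrete-time Markov chain. This is an illustrative depth-limited enumeration only, not the authors' tool and not the SAPHIRE interface; the state names, transition probabilities, and the function `failure_scenarios` are hypothetical.

```python
def failure_scenarios(transition, start, failure_states, max_depth):
    """Enumerate paths from `start` that reach a failure state.

    transition[state] maps each successor state to its one-step probability.
    A path stops as soon as it hits a failure state or after max_depth steps.
    Returns a list of (path, probability) sorted by decreasing probability.
    """
    scenarios = []
    stack = [([start], 1.0)]
    while stack:
        path, prob = stack.pop()
        state = path[-1]
        if state in failure_states:
            scenarios.append((path, prob))
            continue
        if len(path) - 1 >= max_depth:
            continue  # depth budget exhausted without reaching failure
        for successor, p in transition[state].items():
            if p > 0.0:
                stack.append((path + [successor], prob * p))
    return sorted(scenarios, key=lambda item: -item[1])


if __name__ == "__main__":
    # Hypothetical three-state model of a controlled tank.
    chain = {
        "nominal": {"nominal": 0.9, "degraded": 0.1},
        "degraded": {"nominal": 0.5, "overflow": 0.5},
        "overflow": {"overflow": 1.0},
    }
    for path, prob in failure_scenarios(chain, "nominal", {"overflow"}, 3):
        print(" -> ".join(path), prob)
```

Each returned path corresponds to one branch sequence of an event tree rooted at the initial state, and summing the path probabilities gives the probability of reaching failure within the depth limit. The number of paths can grow exponentially with depth, so a practical tool would likely prune low-probability branches.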
Indexed semi-Markov process for wind speed modeling.
NASA Astrophysics Data System (ADS)
Petroni, F.; D'Amico, G.; Prattico, F.
2012-04-01
The increasing interest in renewable energy drives scientific research to find better ways to recover as much of the available energy as possible. In particular, the maximum energy recoverable from wind is equal to 59.3% of that available (Betz law), attained at a specific pitch angle and when the ratio between the output and input wind speeds is equal to 1/3. The pitch angle is the angle formed between the airfoil of the wind turbine blade and the wind direction. Older turbines, and many of those currently on the market, have a fixed airfoil geometry, so they work with an efficiency lower than 59.3%. New-generation wind turbines instead have a system that varies the pitch angle by rotating the blades. This system enables the turbines to recover the maximum energy at different wind speeds, working at the Betz limit at different speed ratios. A powerful control system for the pitch angle allows the wind turbine to recover energy better in the transient regime. A good stochastic model for wind speed is therefore needed both to help optimize turbine design and to help the control system predict the wind speed so that the blades can be positioned quickly and correctly. The ability to generate synthetic wind speed data is a powerful tool that helps designers verify the structures of wind turbines or estimate the energy recoverable from a specific site. To generate synthetic data, Markov chains of first or higher order are often used [1,2,3]. In particular, [1] presents a comparison between a first-order Markov chain and a second-order Markov chain. A similar work, but only for the first-order Markov chain, is conducted in [2], presenting the probability transition matrix and comparing the energy spectral density and autocorrelation of real and synthetic wind speed data. An attempt to jointly model wind speed and direction is presented in [3], using two models, first
Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.
2011-01-01
Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348
2012-03-01
3. Third Category of Data of HN Officers ... B. CONCEPTUAL FRAMEWORK/DESCRIPTION OF MARKOV... performance (Barrick & Mount, 1991). Factor 4, Emotional Adjustment, is often labeled by its opposite, Neuroticism, which is the tendency to be... category of officers comprises the pool. B. CONCEPTUAL FRAMEWORK/DESCRIPTION OF MARKOV-CHAIN MODELS Military organizations, as has been stated
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
NASA Astrophysics Data System (ADS)
Ticchi, Alessandro; Faisal, Aldo A.; Brain; Behaviour Lab Team
2015-03-01
Experimental evidence at the behavioural level shows that the brain is able to make Bayes-optimal inferences and decisions (Kording and Wolpert 2004, Nature; Ernst and Banks, 2002, Nature), yet at the circuit level little is known about how neural circuits may implement Bayesian learning and inference (but see (Ma et al. 2006, Nat Neurosci)). Molecular sources of noise are clearly established to be powerful enough to pose limits to neural function and structure in the brain (Faisal et al. 2008, Nat Rev Neurosci; Faisal et al. 2005, Curr Biol). We propose a spiking neuron model where we exploit molecular noise as a useful resource to implement close-to-optimal inference by sampling. Specifically, we derive a synaptic plasticity rule which, coupled with integrate-and-fire neural dynamics and recurrent inhibitory connections, enables a neural population to learn the statistical properties of the received sensory input (prior). Moreover, the proposed model makes it possible to combine prior knowledge with additional sources of information (likelihood) from another neural population, and to implement in spiking neurons a Markov Chain Monte Carlo algorithm which generates samples from the inferred posterior distribution.
Inference on the Strength of Balancing Selection for Epistatically Interacting Loci
Buzbas, Erkan Ozge; Joyce, Paul; Rosenberg, Noah A.
2011-01-01
Existing inference methods for estimating the strength of balancing selection in multi-locus genotypes rely on the assumption that there are no epistatic interactions between loci. Complex systems in which balancing selection is prevalent, such as sets of human immune system genes, are known to contain components that interact epistatically. Therefore, current methods may not produce reliable inference on the strength of selection at these loci. In this paper, we address this problem by presenting statistical methods that can account for epistatic interactions in making inference about balancing selection. A theoretical result due to Fearnhead (2006) is used to build a multi-locus Wright-Fisher model of balancing selection, allowing for epistatic interactions among loci. Antagonistic and synergistic types of interactions are examined. The joint posterior distribution of the selection and mutation parameters is sampled by Markov chain Monte Carlo methods, and the plausibility of models is assessed via Bayes factors. As a component of the inference process, an algorithm to generate multi-locus allele frequencies under balancing selection models with epistasis is also presented. Recent evidence on interactions among a set of human immune system genes is introduced as a motivating biological system for the epistatic model, and data on these genes are used to demonstrate the methods. PMID:21277883
Bayesian Inference for Time Trends in Parameter Values using Weighted Evidence Sets
D. L. Kelly; A. Malkhasyan
2010-09-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performed for the U.S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates an approach to incorporating multiple sources of data via applicability weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.
On Markov parameters in system identification
NASA Technical Reports Server (NTRS)
Phan, Minh; Juang, Jer-Nan; Longman, Richard W.
1991-01-01
A detailed discussion of Markov parameters in system identification is given. Different forms of input-output representation of linear discrete-time systems are reviewed and discussed. Interpretation of sampled response data as Markov parameters is presented. Relations between the state-space model and particular linear difference models via the Markov parameters are formulated. A generalization of Markov parameters to observer and Kalman filter Markov parameters for system identification is explained. These extended Markov parameters play an important role in providing not only a state-space realization, but also an observer/Kalman filter for the system of interest.
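As an illustration of the relation between a state-space model and its Markov parameters, the sketch below assumes the standard discrete-time form x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k), for which the Markov parameters are the pulse-response samples Y_0 = D and Y_k = C A^(k-1) B. The matrices and the helper names `matmul` and `markov_parameters` are hypothetical and chosen only for this example; they are not taken from the report.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def markov_parameters(A, B, C, D, count):
    """Return the first `count` Markov parameters: D, CB, CAB, CA^2B, ..."""
    params = [D]
    a_power_b = B  # holds A^(k-1) B for the current k
    for _ in range(count - 1):
        params.append(matmul(C, a_power_b))
        a_power_b = matmul(A, a_power_b)
    return params


# Hypothetical two-state, single-input, single-output system.
A = [[0.5, 0.0], [0.0, 0.25]]
B = [[1.0], [1.0]]
C = [[1.0, 2.0]]
D = [[0.0]]

Y = markov_parameters(A, B, C, D, 4)
for k, y_k in enumerate(Y):
    print(k, y_k)
```

The printed sequence is the system's response to a unit pulse, which is why sampled response data can be interpreted as Markov parameters.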
Latent mixed Markov modelling of smoking transitions using Monte Carlo bootstrapping.
Mannan, Haider R; Koval, John J
2003-03-01
It has been established that measures and reports of smoking behaviours are subject to substantial measurement errors. Thus, the manifest Markov model which does not consider measurement error in observed responses may not be adequate to mathematically model changes in adolescent smoking behaviour over time. For this purpose we fit several Mixed Markov Latent Class (MMLC) models using data sets from two longitudinal panel studies--the third Waterloo Smoking Prevention study and the UWO smoking study, which have varying numbers of measurements on adolescent smoking behaviour. However, the conventional statistics used for testing goodness of fit of these models do not follow the theoretical chi-square distribution when there is data sparsity. The two data sets analysed had varying degrees of sparsity. This problem can be solved by estimating the proper distribution of fit measures using Monte Carlo bootstrap simulation. In this study, we showed that incorporating response uncertainty in smoking behaviour significantly improved the fit of a single Markov chain model. However, the single chain latent Markov model did not adequately fit the two data sets indicating that the smoking process was heterogeneous with regard to latent Markov chains. It was found that a higher percentage of students (except for never smokers) changed their smoking behaviours over time at the manifest level compared to the latent or true level. The smoking process generally accelerated with time. The students had a tendency to underreport their smoking behaviours while response uncertainty was estimated to be considerably less for the Waterloo smoking study which adopted the 'bogus pipeline' method for reducing measurement error while the UWO study did not. For the two-chain latent mixed Markov models, incorporating a 'stayer' chain to an unrestricted Markov chain led to a significant improvement in model fit for the UWO study only. For both data sets, the assumption for the existence of an
Aggelopoulos, Nikolaos C
2015-08-01
Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience.
Markov Analysis of Sleep Dynamics
NASA Astrophysics Data System (ADS)
Kim, J. W.; Lee, J.-S.; Robinson, P. A.; Jeong, D.-U.
2009-05-01
A new approach, based on a Markov transition matrix, is proposed to explain frequent sleep and wake transitions during sleep. The matrix is determined by analyzing hypnograms of 113 obstructive sleep apnea patients. Our approach shows that the statistics of sleep can be constructed via a single Markov process and that durations of all states have modified exponential distributions, in contrast to recent reports of a scale-free form for the wake stage and an exponential form for the sleep stage. Hypnograms of the same subjects, but treated with Continuous Positive Airway Pressure, are analyzed and compared quantitatively with the pretreatment ones, suggesting potential clinical applications.
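A minimal sketch of how a Markov transition matrix can be estimated from a staged sequence such as a hypnogram, assuming a first-order chain and simple transition counting (the maximum-likelihood estimate). The two-state labels and the toy sequence are hypothetical; the study's own staging scheme and estimation details are not given in the abstract.

```python
from collections import Counter


def estimate_transition_matrix(sequence, states):
    """Estimate first-order transition probabilities by counting transitions."""
    counts = Counter(zip(sequence[:-1], sequence[1:]))
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # Rows for states that are never left are set to zero rather than guessed.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return matrix


# Hypothetical toy hypnogram: one label per epoch (W = wake, S = sleep).
toy_hypnogram = ["W", "W", "S", "S", "S", "W", "S", "S"]
P = estimate_transition_matrix(toy_hypnogram, ["W", "S"])
for state, row in P.items():
    print(state, row)
```

Each row of `P` sums to one whenever the corresponding state is left at least once in the sequence.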
The Moments of Matched and Mismatched Hidden Markov Models
1987-06-11
preprocessor. Denote by h_kj the probability that the observation symbol V_k is altered to symbol V_j by the noise mechanism and define the m-by-m noise...probability matrix H = [h_kj]. It is assumed that H is independent of the state of the Markov chain and of time t. Consequently, the output of a given HMM...chain is i and that symbol j is produced, given that symbol k was the output of the given HMM. The sum over k of b_ik h_kj gives the component b.j of the
NASA Astrophysics Data System (ADS)
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr.
2016-07-01
In this communication, a novel optical technique is proposed for reconstructing the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of the bubble surface. A theoretical model of the deviation of a beam as a Taylor bubble traverses it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference and a Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high-speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
NASA Astrophysics Data System (ADS)
Hu, Zixi; Yao, Zhewei; Li, Jinglai
2017-03-01
Many scientific and engineering problems require Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, which is referred to as being dimension dependent. To address this, a family of dimension-independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, was proposed to sample the infinite-dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on the sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally, we provide numerical examples to demonstrate the performance of the proposed method.
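For orientation, the following is a minimal sketch of the standard (non-adaptive) pCN step, not the adaptive algorithm developed in the paper. It assumes a finite-dimensional discretisation with a standard Gaussian prior N(0, I), a user-supplied negative log-likelihood, and a fixed step parameter `beta`; all names are hypothetical. The proposal is v = sqrt(1 - beta^2) u + beta xi with xi drawn from the prior, and the acceptance probability min(1, exp(Phi(u) - Phi(v))) involves only the negative log-likelihood Phi.

```python
import math
import random


def pcn_sample(neg_log_likelihood, dim, n_steps, beta=0.2, seed=0):
    """Run a basic pCN chain under an assumed N(0, I) prior."""
    rng = random.Random(seed)
    u = [0.0] * dim
    phi_u = neg_log_likelihood(u)
    scale = math.sqrt(1.0 - beta * beta)
    samples, accepted = [], 0
    for _ in range(n_steps):
        # The proposal preserves the Gaussian prior, so the prior cancels
        # in the acceptance ratio.
        v = [scale * ui + beta * rng.gauss(0.0, 1.0) for ui in u]
        phi_v = neg_log_likelihood(v)
        # Accept with probability min(1, exp(phi_u - phi_v)).
        if math.log(1.0 - rng.random()) <= phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        samples.append(list(u))
    return samples, accepted / n_steps


# Hypothetical Gaussian likelihood centred at 1 in every coordinate.
def toy_potential(u):
    return 0.5 * sum((x - 1.0) ** 2 for x in u)


chain, acceptance_rate = pcn_sample(toy_potential, dim=2, n_steps=200)
print(len(chain), acceptance_rate)
```

An adaptive variant would replace the fixed identity proposal covariance with one tuned from the sampling history, as the abstract describes.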
Inference on cancer screening exam accuracy using population-level administrative data.
Jiang, H; Brown, P E; Walter, S D
2016-01-15
This paper develops a model for cancer screening and cancer incidence data, accommodating the partially unobserved disease status, clustered data structures, general covariate effects, and dependence between exams. The true unobserved cancer and detection status of screening participants are treated as latent variables, and a Markov Chain Monte Carlo algorithm is used to estimate the Bayesian posterior distributions of the diagnostic error rates and disease prevalence. We show how the Bayesian approach can be used to draw inferences about screening exam properties and disease prevalence while allowing for the possibility of conditional dependence between two exams. The techniques are applied to the estimation of the diagnostic accuracy of mammography and clinical breast examination using data from the Ontario Breast Screening Program in Canada.
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-10-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the technique also allows one to compute informative "error bars" on the volume estimates of individual structures.
NASA Astrophysics Data System (ADS)
Khan, Shahjahan
Often scientific information on various data generating processes is presented in the form of numerical and categorical data. Except for some very rare occasions, generally such data represent a small part of the population, or selected outcomes of any data generating process. Although valuable and useful information is lurking in the array of scientific data, it is generally unavailable to the users. Appropriate statistical methods are essential to reveal the hidden "jewels" in the mess of the raw data. Exploratory data analysis methods are used to uncover such valuable characteristics of the observed data. Statistical inference provides techniques to make valid conclusions about the unknown characteristics or parameters of the population from which scientifically drawn sample data are selected. Usually, statistical inference includes estimation of population parameters as well as performing tests of hypotheses on the parameters. However, prediction of future responses and determining the prediction distributions are also part of statistical inference. Both Classical (Frequentist) and Bayesian approaches are used in statistical inference. The commonly used Classical approach is based on the sample data alone. In contrast, the increasingly popular Bayesian approach uses a prior distribution on the parameters along with the sample data to make inferences. Non-parametric and robust methods are also used in situations where commonly used model assumptions are unsupported. In this chapter, we cover the philosophical and methodological aspects of both the Classical and Bayesian approaches. Moreover, some aspects of predictive inference are also included. In the absence of any evidence to support assumptions regarding the distribution of the underlying population, or if the variable is measured only in ordinal scale, non-parametric methods are used. Robust methods are employed to avoid any significant changes in the results due to deviations from the model
Statistical inference for noisy nonlinear ecological dynamic systems.
Wood, Simon N
2010-08-26
Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
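The construction of a synthetic likelihood can be sketched as follows. This is a deliberately simplified illustration with a single scalar summary statistic, so the mean vector and covariance matrix described in the abstract reduce to a mean and a variance; the toy simulator, the summary function, and all names are hypothetical and unrelated to the blowfly model.

```python
import math
import random
import statistics


def synthetic_log_likelihood(observed_stat, simulate, summarise, theta,
                             n_sims=200, seed=0):
    """Gaussian log-density of the observed summary under simulated summaries."""
    rng = random.Random(seed)
    sims = [summarise(simulate(theta, rng)) for _ in range(n_sims)]
    mu = statistics.mean(sims)
    var = statistics.variance(sims, mu)
    return (-0.5 * math.log(2.0 * math.pi * var)
            - 0.5 * (observed_stat - mu) ** 2 / var)


# Hypothetical simulator: 50 noisy observations centred on theta.
def toy_simulate(theta, rng):
    return [rng.gauss(theta, 1.0) for _ in range(50)]


def toy_summarise(series):
    return statistics.mean(series)


ll_near = synthetic_log_likelihood(1.0, toy_simulate, toy_summarise, theta=1.0)
ll_far = synthetic_log_likelihood(1.0, toy_simulate, toy_summarise, theta=3.0)
print(ll_near, ll_far)
```

In an MCMC sampler, this function would be evaluated at each proposed parameter value in place of the intractable likelihood.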
Weighted-indexed semi-Markov models for modeling financial returns
NASA Astrophysics Data System (ADS)
D'Amico, Guglielmo; Petroni, Filippo
2012-07-01
In this paper we propose a new stochastic model based on a generalization of semi-Markov chains for studying the high frequency price dynamics of traded stocks. We assume that the financial returns are described by a weighted-indexed semi-Markov chain model. We show, through Monte Carlo simulations, that the model is able to reproduce important stylized facts of financial time series such as the first-passage-time distributions and the persistence of volatility. The model is applied to data from the Italian and German stock markets from 1 January 2007 until the end of December 2010.
Inferring transient particle transport dynamics in live cells.
Monnier, Nilah; Barry, Zachary; Park, Hye Yoon; Su, Kuan-Chung; Katz, Zachary; English, Brian P; Dey, Arkajit; Pan, Keyao; Cheeseman, Iain M; Singer, Robert H; Bathe, Mark
2015-09-01
Live-cell imaging and particle tracking provide rich information on mechanisms of intracellular transport. However, trajectory analysis procedures to infer complex transport dynamics involving stochastic switching between active transport and diffusive motion are lacking. We applied Bayesian model selection to hidden Markov modeling to infer transient transport states from trajectories of mRNA-protein complexes in live mouse hippocampal neurons and metaphase kinetochores in dividing human cells. The software is available at http://hmm-bayes.org/.
The generalization ability of online SVM classification based on Markov sampling.
Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang
2015-03-01
In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish a bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present numerical studies of the learning ability of online SVM classification based on Markov sampling on benchmark repository data. The numerical studies show that the online SVM classification algorithm based on Markov sampling learns better than classical online SVM classification based on random sampling when the training sample size is larger.
Extended Dissipative State Estimation for Markov Jump Neural Networks With Unreliable Links.
Shen, Hao; Zhu, Yanzheng; Zhang, Lixian; Park, Ju H
2017-02-01
This paper is concerned with the problem of extended dissipativity-based state estimation for discrete-time Markov jump neural networks (NNs), where the variation of the piecewise time-varying transition probabilities of the Markov chain is subject to a set of switching signals satisfying an average dwell-time property. The communication links between the NNs and the estimator are assumed to be imperfect, with signal quantization and data packet dropouts occurring simultaneously. The aim of this paper is to contribute a Markov switching estimator design method that ensures that the resulting error system is extended stochastically dissipative in the simultaneous presence of packet dropouts and signal quantization stemming from unreliable communication links. Sufficient conditions for the solvability of such a problem are established. Based on the derived conditions, an explicit expression of the desired Markov switching estimator is presented. Finally, two illustrative examples are given to show the effectiveness of the proposed design method.
Markov reliability models for digital flight control systems
NASA Technical Reports Server (NTRS)
Mcgough, John; Reibman, Andrew; Trivedi, Kishor
1989-01-01
The reliability of digital flight control systems can often be accurately predicted using Markov chain models. The cost of numerical solution depends on a model's size and stiffness. Acyclic Markov models, a useful special case, are particularly amenable to efficient numerical solution. Even in the general case, instantaneous coverage approximation allows the reduction of some cyclic models to more readily solvable acyclic models. After considering the solution of single-phase models, the discussion is extended to phased-mission models. Phased-mission reliability models are classified based on the state restoration behavior that occurs between mission phases. As an economical approach for the solution of such models, the mean failure rate solution method is introduced. A numerical example is used to show the influence of fault-model parameters and interphase behavior on system unreliability.
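To make the idea of an acyclic Markov reliability model concrete, the sketch below uses a hypothetical three-state chain (operational, degraded, failed) with constant transition rates. For this chain the time to failure is the sum of two independent exponential sojourn times, so the unreliability has a closed form when the two rates differ. The rates and the function name are assumptions made for illustration; the paper's own models and solution methods are not reproduced here.

```python
import math


def unreliability(t, rate_1, rate_2):
    """P(failed by time t) for the acyclic chain operational -> degraded -> failed."""
    if rate_1 == rate_2:
        raise ValueError("this closed form assumes distinct rates")
    # Probability that the sum of the two exponential sojourn times exceeds t.
    survival = (rate_2 * math.exp(-rate_1 * t)
                - rate_1 * math.exp(-rate_2 * t)) / (rate_2 - rate_1)
    return 1.0 - survival


# Hypothetical failure rates per hour.
for hours in (1.0, 10.0, 100.0):
    print(hours, unreliability(hours, rate_1=1e-3, rate_2=1e-2))
```

Because the chain has no cycles, each state is visited at most once, which is what allows a simple closed-form solution of this kind.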
Multivariate longitudinal data analysis with mixed effects hidden Markov models.
Raffa, Jesse D; Dubin, Joel A
2015-09-01
Multiple longitudinal responses are often collected as a means to capture relevant features of the true outcome of interest, which is often hidden and not directly measurable. We outline an approach which models these multivariate longitudinal responses as generated from a hidden disease process. We propose a class of models which uses a hidden Markov model with separate but correlated random effects between multiple longitudinal responses. This approach was motivated by a smoking cessation clinical trial, where a bivariate longitudinal response involving both a continuous and a binomial response was collected for each participant to monitor smoking behavior. A Bayesian method using Markov chain Monte Carlo is used. Comparison of separate univariate response models to the bivariate response models was undertaken. Our methods are demonstrated on the smoking cessation clinical trial dataset, and properties of our approach are examined through extensive simulation studies.
Choice of Units and the Causal Markov Condition
NASA Astrophysics Data System (ADS)
Zhang, Jiji; Spirtes, Peter
2014-03-01
Elliott Sober's well-known challenge to the principle of the common cause -- and to its generalization, the causal Markov condition -- appeals to the apparent positive correlation between two causally unconnected quantities: Venetian sea levels and British bread prices. In this paper we examine Kevin Hoover's and Daniel Steel's opposite evaluations of Sober's case. We argue that the difference in their assessments results from a difference in their choice of units and populations for statistical modeling. Our analysis suggests yet another diagnosis of Sober's counterexample: the failure of the causal Markov condition in the population chosen by Sober and Steel is due to the presence of causal relations that hold between the relevant properties across units. Such inter-unit causation is left unrepresented in causal models congenial to statistical analysis, because statistics does not deal with inter-unit relationships once the units are fixed. Accordingly, the causal Markov condition is formulated in terms of causal structures that depict intra-unit causal relations only. It is therefore worth highlighting a methodological principle for causal inference: the units should be so chosen that they do not interfere with each other, a principle that, fortunately, is often observed in practice.
Markov Boundary Discovery with Ridge Regularized Linear Models
Visweswaran, Shyam
2016-01-01
Ridge regularized linear models (RRLMs), such as ridge regression and the SVM, are a popular group of methods that are used in conjunction with coefficient hypothesis testing to discover explanatory variables with a significant multivariate association to a response. However, many investigators are reluctant to draw causal interpretations of the selected variables due to the incomplete knowledge of the capabilities of RRLMs in causal inference. Under reasonable assumptions, we show that a modified form of RRLMs can get “very close” to identifying a subset of the Markov boundary by providing a worst-case bound on the space of possible solutions. The results hold for any convex loss, even when the underlying functional relationship is nonlinear, and the solution is not unique. Our approach combines ideas in Markov boundary and sufficient dimension reduction theory. Experimental results show that the modified RRLMs are competitive against state-of-the-art algorithms in discovering part of the Markov boundary from gene expression data. PMID:27170915
Bayesian Inference for Latent Biologic Structure with Determinantal Point Processes (DPP)
Xu, Yanxun; Müller, Peter; Telesca, Donatello
2016-01-01
We discuss the use of the determinantal point process (DPP) as a prior for latent structure in biomedical applications, where inference often centers on the interpretation of latent features as biologically or clinically meaningful structure. Typical examples include mixture models, when the terms of the mixture are meant to represent clinically meaningful subpopulations (of patients, genes, etc.). Another class of examples is feature allocation models. We propose the DPP prior as a repulsive prior on latent mixture components in the first example, and as a prior on feature-specific parameters in the second case. We argue that the DPP is in general an attractive prior model for latent structure when biologically relevant interpretation of such structure is desired. We illustrate the advantages of the DPP prior in three case studies, including inference in mixture models for magnetic resonance images (MRI) and for protein expression, and a feature allocation model for gene expression using data from The Cancer Genome Atlas. An important part of our argument is the availability of efficient and straightforward posterior simulation methods. We implement a variation of reversible jump Markov chain Monte Carlo simulation for inference under the DPP prior, using a density with respect to the unit rate Poisson process. PMID:26873271
A Markov model for longitudinal studies with incomplete dichotomous outcomes.
Efthimiou, Orestis; Welton, Nicky; Samara, Myrto; Leucht, Stefan; Salanti, Georgia
2017-03-01
Missing outcome data constitute a serious threat to the validity and precision of inferences from randomized controlled trials. In this paper, we propose the use of a multistate Markov model for the analysis of incomplete individual patient data for a dichotomous outcome reported over a period of time. The model accounts for patients dropping out of the study and also for patients relapsing. The time of each observation is accounted for, and the model allows the estimation of time-dependent relative treatment effects. We apply our methods to data from a study comparing the effectiveness of 2 pharmacological treatments for schizophrenia. The model jointly estimates the relative efficacy and the dropout rate and also allows for a wide range of clinically interesting inferences to be made. Assumptions about the missingness mechanism and the unobserved outcomes of patients dropping out can be incorporated into the analysis. The presented method constitutes a viable candidate for analyzing longitudinal, incomplete binary data.
A Markov model for longitudinal studies with incomplete dichotomous outcomes
Welton, Nicky; Samara, Myrto; Leucht, Stefan; Salanti, Georgia
2016-01-01
Missing outcome data constitute a serious threat to the validity and precision of inferences from randomized controlled trials. In this paper, we propose the use of a multistate Markov model for the analysis of incomplete individual patient data for a dichotomous outcome reported over a period of time. The model accounts for patients dropping out of the study and also for patients relapsing. The time of each observation is accounted for, and the model allows the estimation of time‐dependent relative treatment effects. We apply our methods to data from a study comparing the effectiveness of 2 pharmacological treatments for schizophrenia. The model jointly estimates the relative efficacy and the dropout rate and also allows for a wide range of clinically interesting inferences to be made. Assumptions about the missingness mechanism and the unobserved outcomes of patients dropping out can be incorporated into the analysis. The presented method constitutes a viable candidate for analyzing longitudinal, incomplete binary data. PMID:27917593
Markov Tracking for Agent Coordination
NASA Technical Reports Server (NTRS)
Washington, Richard; Lau, Sonie (Technical Monitor)
1998-01-01
Partially observable Markov decision processes (POMDPs) are an attractive representation of agent behavior, since they capture uncertainty in both the agent's state and its actions. However, finding an optimal policy for POMDPs in general is computationally difficult. In this paper we present Markov Tracking, a restricted problem of coordinating actions with an agent or process represented as a POMDP. Because the actions coordinate with the agent rather than influence its behavior, the optimal solution to this problem can be computed locally and quickly. We also demonstrate the use of the technique on sequential POMDPs, which can be used to model a behavior that follows a linear, acyclic trajectory through a series of states. By imposing a "windowing" restriction that limits the number of possible alternatives considered at any moment to a fixed size, a coordinating action can be calculated in constant time, making this amenable to coordination with complex agents.
Joint inference of identity by descent along multiple chromosomes from population samples.
Zheng, Chaozhi; Kuhner, Mary K; Thompson, Elizabeth A
2014-03-01
There has been much interest in detecting genomic identity by descent (IBD) segments from modern dense genetic marker data and in using them to identify human disease susceptibility loci. Here we present a novel Bayesian framework using Markov chain Monte Carlo (MCMC) realizations to jointly infer IBD states among multiple individuals not known to be related, together with the allelic typing error rate and the IBD process parameters. The data are phased single nucleotide polymorphism (SNP) haplotypes. We model changes in latent IBD state along homologous chromosomes by a continuous time Markov model having the Ewens sampling formula as its stationary distribution. We show by simulation that this model for the IBD process fits quite well with the coalescent predictions. Using simulation data sets of 40 haplotypes over regions of 1 and 10 million base pairs (Mbp), we show that the jointly estimated IBD states are very close to the true values, although the presence of linkage disequilibrium decreases the accuracy. We also present comparisons with the ibd_haplo program, which estimates IBD among sets of four haplotypes. Our new IBD detection method focuses on the scale between genome-wide methods using simple IBD models and complex coalescent-based methods that are limited to short genome segments. At the scale of a few Mbp, our approach offers potentially more power for fine-scale IBD association mapping.
Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.
Neuwald, Andrew F; Altschul, Stephen F
2016-12-01
Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).
Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations
Neuwald, Andrew F.
2016-01-01
Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes’ theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu). PMID:28002465
Dana L. Kelly; Albert Malkhasyan
2010-06-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performed for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates the development of a generic prior distribution, which incorporates multiple sources of generic data via weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.
A Fuzzy Markov approach for assessing groundwater pollution potential for landfill siting.
Chen, Wei-Yea; Kao, Jehng-Jung
2002-04-01
This study presents a Fuzzy Markov groundwater pollution potential assessment approach to facilitate landfill siting analysis. Landfill siting is constrained by various regulations and is complicated by the uncertainty of groundwater related factors. The conventional static rating method cannot properly depict the potential impact of pollution on a groundwater table because the groundwater table level fluctuates. A Markov chain model is a dynamic model that can be viewed as a hybrid of probability and matrix models. The probability matrix of the Markov chain model is determined based on the groundwater table elevation time series. The probability reflects the likelihood of the groundwater table changing between levels. A fuzzy set method is applied to estimate the degree of pollution potential, and a case study demonstrates the applicability of the proposed approach. The short- and long-term pollution potential information provided by the proposed approach is expected to enhance landfill siting decisions.
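The step of deriving a transition probability matrix from a groundwater table elevation time series can be sketched as follows. This is a minimal illustration under stated assumptions: the elevations are taken to be already discretised into integer levels, the series shown is invented, and the fuzzy-set scoring of pollution potential is not covered.

```python
import numpy as np

def transition_matrix(levels, n_states):
    """Row-normalised counts of observed transitions between discrete levels."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(levels[:-1], levels[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    safe = np.where(row_sums == 0, 1, row_sums)
    # Levels that are never left in the data fall back to a uniform row.
    return np.where(row_sums > 0, counts / safe, 1.0 / n_states)

# Hypothetical discretised groundwater levels: 0 = low, 1 = medium, 2 = high.
levels = [0, 0, 1, 2, 1, 1, 0, 1, 2, 2, 1, 0]
P = transition_matrix(levels, 3)

# Multi-step behaviour follows from matrix powers of P.
two_step = np.linalg.matrix_power(P, 2)
```

Short-term information comes from low powers of P and long-term information from high powers, which approach the stationary behaviour when the chain is ergodic.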
Measuring the Inference Load of a Text.
ERIC Educational Resources Information Center
Kemper, Susan
1983-01-01
A new approach to measuring readability is proposed based on the analysis of texts as causally connected chains of actions, physical states, and mental states. Using the inference load formula reflecting the difficulty readers have in inferring causal connections, the difficulty of texts can be adjusted for readers differing in skill or knowledge.…
Markov and semi-Markov processes as a failure rate
NASA Astrophysics Data System (ADS)
Grabski, Franciszek
2016-06-01
In this paper the reliability function is defined by a stochastic failure rate process with non-negative and right-continuous trajectories. Equations for the conditional reliability functions of an object are derived under the assumption that the failure rate is a semi-Markov process with an at most countable state space. A corresponding theorem is presented. The linear systems of equations for the appropriate Laplace transforms make it possible to find the reliability functions for the alternating, the Poisson and the Furry-Yule failure rate processes.
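A common way to write a reliability function driven by a random failure rate is shown below. The notation is an assumption made for illustration (the abstract does not give formulas): \lambda(u) denotes the failure rate process and \lambda_i its i-th state.

```latex
% Reliability function for a stochastic failure rate process \lambda(u), u \ge 0
R(t) = \mathbb{E}\!\left[\exp\!\left(-\int_0^t \lambda(u)\,\mathrm{d}u\right)\right],
\qquad
R_i(t) = \mathbb{E}\!\left[\exp\!\left(-\int_0^t \lambda(u)\,\mathrm{d}u\right)\,\middle|\,\lambda(0)=\lambda_i\right].

% Sanity check: a constant rate \lambda(u) \equiv \lambda recovers the exponential law
R(t) = e^{-\lambda t}.
```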
Trans-dimensional Bayesian inference for gravitational lens substructures
NASA Astrophysics Data System (ADS)
Brewer, Brendon J.; Huijser, David; Lewis, Geraint F.
2016-01-01
We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or 'blobs') whose form was chosen with speed as a primary criterion. The prior distribution for the blobs' properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo within Diffusive Nested Sampling to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on two simulated data sets: one with a single substructure, and the other with 10. We also apply the method to the g-band image of the 'Cosmic Horseshoe' system, and find evidence for more than zero substructures. However, these have large spatial extent and probably only point to misspecifications in the model (such as the shape of the smooth lens component or the point-spread function), which are difficult to guard against in full generality.
A Test of the Need Hierarchy Concept by a Markov Model of Change in Need Strength.
ERIC Educational Resources Information Center
Rauschenberger, John; And Others
1980-01-01
In this study of 547 high school graduates, Alderfer's and Maslow's need hierarchy theories were expressed in Markov chain form and were subjected to empirical test. Both models were disconfirmed. Corroborative multiwave correlational analysis also failed to support the need hierarchy concept. (Author/IRT)
A protein-dependent side-chain rotamer library
2011-01-01
Background: The protein side-chain packing problem remains one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries quantitatively summarize existing knowledge of experimentally determined structures. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since the backbone structure is given in the side-chain prediction problem, spatially local information should ideally be encoded into the rotamer libraries. Methods: In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Results: Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of side-chain prediction accuracy and rotamer ranking ability. Furthermore, without global optimization or search, the side-chain prediction power of the protein-dependent library is still comparable to that of global-search-based side-chain prediction methods. PMID:22373394
Schaeffer, Marie-Caroline; Aksenova, Tetiana
2017-03-10
Brain-Computer Interfaces (BCIs) are systems which translate brain neural activity into commands for external devices. BCI users generally alternate between No-Control (NC) and Intentional Control (IC) periods. NC/IC discrimination is crucial for clinical BCIs, particularly when they provide neural control over complex effectors such as exoskeletons. Numerous BCI decoders focus on the estimation of continuously-valued limb trajectories from neural signals. The integration of NC support into continuous decoders is investigated in the present article. Most discrete/continuous BCI hybrid decoders rely on static state models which do not exploit the dynamics of NC/IC state succession. A hybrid decoder, referred to as the Markov Switching Linear Model (MSLM), is proposed in the present article. The MSLM assumes that the NC/IC state sequence is generated by a first-order Markov chain, and performs dynamic NC/IC state detection. Linear continuous movement models are probabilistically combined using the NC and IC state posterior probabilities yielded by the state decoder. The proposed decoder is evaluated for the task of asynchronous wrist position decoding from high-dimensional space-time-frequency ElectroCorticoGraphic (ECoG) features in monkeys. The MSLM is compared with another dynamic hybrid decoder proposed in the literature, namely a Switching Kalman Filter (SKF). A comparison is additionally drawn with a Wiener filter decoder which infers NC states by thresholding trajectory estimates. The MSLM decoder is found to outperform both the SKF and the thresholded Wiener filter decoder in terms of False Positive Ratio and NC/IC state detection error. It additionally surpasses the SKF with respect to the Pearson Correlation Coefficient and Root Mean Squared Error between true and estimated continuous trajectories.
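The two ingredients named in the abstract, state detection under a first-order Markov chain and a posterior-weighted combination of linear models, can be sketched roughly as below. This is not the authors' implementation: the per-state likelihoods, the transition matrix, the weight matrices and the choice of a zero-output NC model are all hypothetical placeholders.

```python
import numpy as np

def filter_states(lik, P, init):
    """Forward filtering: post[t, k] = p(state_t = k | observations up to t)."""
    post = np.zeros_like(lik, dtype=float)
    belief = init * lik[0]
    post[0] = belief / belief.sum()
    for t in range(1, len(lik)):
        # Propagate through the Markov chain, then weight by the new likelihood.
        belief = (post[t - 1] @ P) * lik[t]
        post[t] = belief / belief.sum()
    return post

def combine(x, weights, post):
    """Posterior-weighted mixture of per-state linear predictions."""
    preds = np.einsum("td,kdo->tko", x, weights)   # preds[t, k] = x[t] @ weights[k]
    return np.einsum("tk,tko->to", post, preds)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                        # hypothetical neural features
weights = np.stack([np.zeros((3, 2)),              # state 0 (NC): no movement, an assumption
                    rng.normal(size=(3, 2))])      # state 1 (IC): linear movement model
lik = rng.uniform(0.1, 1.0, size=(5, 2))           # hypothetical per-state likelihoods
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])                         # hypothetical NC/IC transition matrix
init = np.array([0.5, 0.5])

post = filter_states(lik, P, init)
y = combine(x, weights, post)
```

Because the NC model here outputs zero, the combined trajectory shrinks toward rest whenever the filtered NC probability is high, which is one simple way to read "probabilistically combined".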
Efficient Bayesian species tree inference under the multispecies coalescent.
Rannala, Bruce; Yang, Ziheng
2017-01-04
We develop a Bayesian method for inferring the species phylogeny under the multispecies coalescent (MSC) model. To improve the mixing properties of the Markov chain Monte Carlo (MCMC) algorithm that traverses the space of species trees, we implement two efficient MCMC proposals: the first is based on the Subtree Pruning and Regrafting (SPR) algorithm and the second is based on a node-slider algorithm. Like the Nearest-Neighbor Interchange (NNI) algorithm we implemented previously, both new algorithms propose changes to the species tree while simultaneously altering the gene trees at multiple genetic loci to automatically avoid conflicts with the newly proposed species tree. The method integrates over gene trees, naturally taking account of the uncertainty of gene tree topology and branch lengths given the sequence data. A simulation study was performed to examine the statistical properties of the new method. The method was found to show excellent statistical performance, inferring the correct species tree with near certainty when 10 loci were included in the dataset. The prior on species trees has some impact, particularly for small numbers of loci. We analyzed several previously published datasets (both real and simulated) for rattlesnakes and Philippine shrews, in comparison with alternative methods. The results suggest that the Bayesian coalescent-based method is statistically more efficient than heuristic methods based on summary statistics, and that our implementation is computationally more efficient than alternative full-likelihood methods under the MSC. Parameter estimates for the rattlesnake data suggest drastically different evolutionary dynamics between the nuclear and mitochondrial loci, even though they support largely consistent species trees. We discuss the different challenges facing the marginal likelihood calculation and transmodel MCMC as alternative strategies for estimating posterior probabilities for species trees.
A Hidden Markov Approach to Modeling Interevent Earthquake Times
NASA Astrophysics Data System (ADS)
Chambers, D.; Ebel, J. E.; Kafka, A. L.; Baglivo, J.
2003-12-01
A hidden Markov process, in which the interevent time distribution is a mixture of exponential distributions with different rates, is explored as a model for seismicity that does not follow a Poisson process. In a general hidden Markov model, one assumes that a system can be in any of a finite number k of states and there is a random variable of interest whose distribution depends on the state in which the system resides. The system moves probabilistically among the states according to a Markov chain; that is, given the history of visited states up to the present, the conditional probability that the next state is a specified one depends only on the present state. Thus the transition probabilities are specified by a k by k stochastic matrix. Furthermore, it is assumed that the actual states are unobserved (hidden) and that only the values of the random variable are seen. From these values, one wishes to estimate the sequence of states, the transition probability matrix, and any parameters used in the state-specific distributions. The hidden Markov process was applied to a data set of 110 interevent times for earthquakes in New England from 1975 to 2000. Using the Baum-Welch method (Baum et al., Ann. Math. Statist. 41, 164-171), we estimate the transition probabilities, find the most likely sequence of states, and estimate the k means of the exponential distributions. Using k=2 states, we found the data were fit well by a mixture of two exponential distributions, with means of approximately 5 days and 95 days. The steady state model indicates that after approximately one fourth of the earthquakes, the waiting time until the next event had the first exponential distribution and three fourths of the time it had the second. Three and four state models were also fit to the data; the data were inconsistent with a three state model but were well fit by a four state model.
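The model described above can be illustrated with a short sketch of a hidden Markov model with exponential emissions. The state means of 5 and 95 days come from the abstract; the transition matrix, the interevent times and the function names are hypothetical, and the code evaluates a likelihood with the scaled forward algorithm rather than performing the Baum-Welch estimation used in the study.

```python
import numpy as np

def stationary(P):
    """Stationary distribution: left eigenvector of P for eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

def log_likelihood(times, P, means, init):
    """Scaled forward algorithm for an HMM with exponential emissions."""
    times = np.asarray(times, dtype=float)
    rates = 1.0 / np.asarray(means, dtype=float)
    # dens[t, k] = exponential density of times[t] under state k
    dens = rates * np.exp(-np.outer(times, rates))
    alpha = init * dens[0]
    loglik = 0.0
    for t in range(1, len(times)):
        c = alpha.sum()
        loglik += np.log(c)                 # accumulate the scaling factors
        alpha = (alpha / c) @ P * dens[t]   # propagate, then weight by emission
    return loglik + np.log(alpha.sum())

# Hypothetical transition matrix, chosen so the stationary distribution is
# (1/4, 3/4), in line with the steady-state proportions quoted in the abstract.
P = np.array([[0.4, 0.6],
              [0.2, 0.8]])
means = [5.0, 95.0]                         # state means in days, from the abstract
pi = stationary(P)

times = [3.0, 7.5, 120.0, 60.0, 2.0, 88.0]  # invented interevent times in days
ll = log_likelihood(times, P, means, pi)
```

When all state means coincide the hidden states no longer matter and the function reduces to the ordinary i.i.d. exponential log-likelihood, which is a convenient sanity check.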
Semi-Markov Unreliability-Range Evaluator
NASA Technical Reports Server (NTRS)
Butler, Ricky W.
1988-01-01
Reconfigurable, fault-tolerant systems modeled. Semi-Markov unreliability-range evaluator (SURE) computer program is software tool for analysis of reliability of reconfigurable, fault-tolerant systems. Based on new method for computing death-state probabilities of semi-Markov model. Computes accurate upper and lower bounds on probability of failure of system. Written in PASCAL.
An introduction to hidden Markov models.
Schuster-Böckler, Benjamin; Bateman, Alex
2007-06-01
This unit introduces the concept of hidden Markov models in computational biology. It describes them using simple biological examples, requiring as little mathematical knowledge as possible. The unit also presents a brief history of hidden Markov models and an overvie