Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
Bayesian approach to inverse statistical mechanics.
Habeck, Michael
2014-05-01
Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.
Bayesian approach to inverse statistical mechanics
NASA Astrophysics Data System (ADS)
Habeck, Michael
2014-05-01
Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Universal Darwinism As a Process of Bayesian Inference
Campbell, John O.
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an “experiment” in the external world environment, and the results of that “experiment” or the “surprise” entailed by predicted and actual outcomes of the “experiment.” Minimization of free energy implies that the implicit measure of “surprise” experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438
On a full Bayesian inference for force reconstruction problems
NASA Astrophysics Data System (ADS)
Aucejo, M.; De Smet, O.
2018-05-01
In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
Uwano, Ikuko; Sasaki, Makoto; Kudo, Kohsuke; Boutelier, Timothé; Kameda, Hiroyuki; Mori, Futoshi; Yamashita, Fumio
2017-01-10
The Bayesian estimation algorithm improves the precision of bolus tracking perfusion imaging. However, this algorithm cannot directly calculate Tmax, the time scale widely used to identify ischemic penumbra, because Tmax is a non-physiological, artificial index that reflects the tracer arrival delay (TD) and other parameters. We calculated Tmax from the TD and mean transit time (MTT) obtained by the Bayesian algorithm and determined its accuracy in comparison with Tmax obtained by singular value decomposition (SVD) algorithms. The TD and MTT maps were generated by the Bayesian algorithm applied to digital phantoms with time-concentration curves that reflected a range of values for various perfusion metrics using a global arterial input function. Tmax was calculated from the TD and MTT using constants obtained by a linear least-squares fit to Tmax obtained from the two SVD algorithms that showed the best benchmarks in a previous study. Correlations between the Tmax values obtained by the Bayesian and SVD methods were examined. The Bayesian algorithm yielded accurate TD and MTT values relative to the true values of the digital phantom. Tmax calculated from the TD and MTT values with the least-squares fit constants showed excellent correlation (Pearson's correlation coefficient = 0.99) and agreement (intraclass correlation coefficient = 0.99) with Tmax obtained from SVD algorithms. Quantitative analyses of Tmax values calculated from Bayesian-estimation algorithm-derived TD and MTT from a digital phantom correlated and agreed well with Tmax values determined using SVD algorithms.
Win-Stay, Lose-Sample: a simple sequential algorithm for approximating Bayesian inference.
Bonawitz, Elizabeth; Denison, Stephanie; Gopnik, Alison; Griffiths, Thomas L
2014-11-01
People can behave in a way that is consistent with Bayesian models of cognition, despite the fact that performing exact Bayesian inference is computationally challenging. What algorithms could people be using to make this possible? We show that a simple sequential algorithm "Win-Stay, Lose-Sample", inspired by the Win-Stay, Lose-Shift (WSLS) principle, can be used to approximate Bayesian inference. We investigate the behavior of adults and preschoolers on two causal learning tasks to test whether people might use a similar algorithm. These studies use a "mini-microgenetic method", investigating how people sequentially update their beliefs as they encounter new evidence. Experiment 1 investigates a deterministic causal learning scenario and Experiments 2 and 3 examine how people make inferences in a stochastic scenario. The behavior of adults and preschoolers in these experiments is consistent with our Bayesian version of the WSLS principle. This algorithm provides both a practical method for performing Bayesian inference and a new way to understand people's judgments. Copyright © 2014 Elsevier Inc. All rights reserved.
New seismogenic stress fields for southern Italy from a Bayesian approach
NASA Astrophysics Data System (ADS)
Totaro, Cristina; Orecchio, Barbara; Presti, Debora; Scolaro, Silvia; Neri, Giancarlo
2017-04-01
A new database of high-quality waveform inversion focal mechanism has been compiled for southern Italy by integrating the highest quality solutions, available from literature and catalogues, and 146 newly-computed ones. All the selected focal mechanisms are (i) coming from the Italian CMT, Regional CMT and TDMT catalogues (Pondrelli et al., PEPI 2006, PEPI 2011; http://www.ingv.it), or (ii) computed by using the Cut And Paste (CAP) method (Zhao & Helmberger, BSSA 1994; Zhu & Helmberger, BSSA 1996). Specific tests have been carried out in order to evaluate the robustness of the obtained solutions (e.g., by varying both seismic network configuration and Earth structure parameters) and to estimate uncertainties on the focal mechanism parameters. Only the resulting highest-quality solutions have been enclosed in the database, that has then been used for computation of posterior density distributions of stress tensor components by a Bayesian method (Arnold & Townend, GJI 2007). This algorithm furnishes the posterior density function of the principal components of stress tensor (maximum σ1, intermediate σ2, and minimum σ3 compressive stress, respectively) and the stress-magnitude ratio (R). Before stress computation, we applied the k-means clustering algorithm to subdivide the focal mechanism catalog on the basis of earthquake locations. This approach allows identifying the sectors to be investigated without any "a priori" constraint from faulting type distribution. The large amount of data and the application of the Bayesian algorithm allowed us to provide a more accurate local-to-regional scale stress distribution that has shed new light on the kinematics and dynamics of this very complex area, where lithospheric unit configuration and geodynamic engines are still strongly debated. The new high-quality information here furnished will then represent very useful tools and constraints for future geophysical analyses and geodynamic modeling.
Bayesian Analysis of High Dimensional Classification
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Subhadeep; Liang, Faming
2009-12-01
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. In these cases , there is a lot of interest in searching for sparse model in High Dimensional regression(/classification) setup. we first discuss two common challenges for analyzing high dimensional data. The first one is the curse of dimensionality. The complexity of many existing algorithms scale exponentially with the dimensionality of the space and by virtue of that algorithms soon become computationally intractable and therefore inapplicable in many real applications. secondly, multicollinearities among the predictors which severely slowdown the algorithm. In order to make Bayesian analysis operational in high dimension we propose a novel 'Hierarchical stochastic approximation monte carlo algorithm' (HSAMC), which overcomes the curse of dimensionality, multicollinearity of predictors in high dimension and also it possesses the self-adjusting mechanism to avoid the local minima separated by high energy barriers. Models and methods are illustrated by simulation inspired from from the feild of genomics. Numerical results indicate that HSAMC can work as a general model selection sampler in high dimensional complex model space.
Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor. PMID:25089832
Search Parameter Optimization for Discrete, Bayesian, and Continuous Search Algorithms
2017-09-01
NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA THESIS SEARCH PARAMETER OPTIMIZATION FOR DISCRETE , BAYESIAN, AND CONTINUOUS SEARCH ALGORITHMS by...to 09-22-2017 4. TITLE AND SUBTITLE SEARCH PARAMETER OPTIMIZATION FOR DISCRETE , BAYESIAN, AND CON- TINUOUS SEARCH ALGORITHMS 5. FUNDING NUMBERS 6...simple search and rescue acts to prosecuting aerial/surface/submersible targets on mission. This research looks at varying the known discrete and
Approximate string matching algorithms for limited-vocabulary OCR output correction
NASA Astrophysics Data System (ADS)
Lasko, Thomas A.; Hauser, Susan E.
2000-12-01
Five methods for matching words mistranslated by optical character recognition to their most likely match in a reference dictionary were tested on data from the archives of the National Library of Medicine. The methods, including an adaptation of the cross correlation algorithm, the generic edit distance algorithm, the edit distance algorithm with a probabilistic substitution matrix, Bayesian analysis, and Bayesian analysis on an actively thinned reference dictionary were implemented and their accuracy rates compared. Of the five, the Bayesian algorithm produced the most correct matches (87%), and had the advantage of producing scores that have a useful and practical interpretation.
Exemplar Models as a Mechanism for Performing Bayesian Inference
2010-01-01
Feldman Department of Cognitive and Linguistic Sciences Brown University Adam N. Sanborn Gatsby Computational Neuroscience Unit University College London...problem. As noted above, particle filters are another instance of a rational process model, but the great diversity of efficient approximation algorithms
Order priors for Bayesian network discovery with an application to malware phylogeny
Oyen, Diane; Anderson, Blake; Sentz, Kari; ...
2017-09-15
Here, Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure with a combination of human knowledge about the partial ordering of variables and statistical inference of conditional dependencies from observed data. Our approach leverages complementary information from human knowledge and inference from observed data to produce networks that reflect human beliefs about the system as well as to fit the observed data. Applying prior beliefs about partial orderings of variables is an approach distinctly different from existing methods that incorporate prior beliefs about direct dependencies (or edges)more » in a Bayesian network. We provide an efficient implementation of the partial-order prior in a Bayesian structure discovery learning algorithm, as well as an edge prior, showing that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior, even though order priors are more general. Our primary motivation is in characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm, compared to existing malware phylogeny algorithms, more accurately discovers true dependencies that are missed by other algorithms.« less
Order priors for Bayesian network discovery with an application to malware phylogeny
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oyen, Diane; Anderson, Blake; Sentz, Kari
Here, Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure with a combination of human knowledge about the partial ordering of variables and statistical inference of conditional dependencies from observed data. Our approach leverages complementary information from human knowledge and inference from observed data to produce networks that reflect human beliefs about the system as well as to fit the observed data. Applying prior beliefs about partial orderings of variables is an approach distinctly different from existing methods that incorporate prior beliefs about direct dependencies (or edges)more » in a Bayesian network. We provide an efficient implementation of the partial-order prior in a Bayesian structure discovery learning algorithm, as well as an edge prior, showing that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior, even though order priors are more general. Our primary motivation is in characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm, compared to existing malware phylogeny algorithms, more accurately discovers true dependencies that are missed by other algorithms.« less
Extreme-Scale Bayesian Inference for Uncertainty Quantification of Complex Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Biros, George
Uncertainty quantification (UQ)—that is, quantifying uncertainties in complex mathematical models and their large-scale computational implementations—is widely viewed as one of the outstanding challenges facing the field of CS&E over the coming decade. The EUREKA project set to address the most difficult class of UQ problems: those for which both the underlying PDE model as well as the uncertain parameters are of extreme scale. In the project we worked on these extreme-scale challenges in the following four areas: 1. Scalable parallel algorithms for sampling and characterizing the posterior distribution that exploit the structure of the underlying PDEs and parameter-to-observable map. Thesemore » include structure-exploiting versions of the randomized maximum likelihood method, which aims to overcome the intractability of employing conventional MCMC methods for solving extreme-scale Bayesian inversion problems by appealing to and adapting ideas from large-scale PDE-constrained optimization, which have been very successful at exploring high-dimensional spaces. 2. Scalable parallel algorithms for construction of prior and likelihood functions based on learning methods and non-parametric density estimation. Constructing problem-specific priors remains a critical challenge in Bayesian inference, and more so in high dimensions. Another challenge is construction of likelihood functions that capture unmodeled couplings between observations and parameters. We will create parallel algorithms for non-parametric density estimation using high dimensional N-body methods and combine them with supervised learning techniques for the construction of priors and likelihood functions. 3. Bayesian inadequacy models, which augment physics models with stochastic models that represent their imperfections. The success of the Bayesian inference framework depends on the ability to represent the uncertainty due to imperfections of the mathematical model of the phenomena of interest. This is a central challenge in UQ, especially for large-scale models. We propose to develop the mathematical tools to address these challenges in the context of extreme-scale problems. 4. Parallel scalable algorithms for Bayesian optimal experimental design (OED). Bayesian inversion yields quantified uncertainties in the model parameters, which can be propagated forward through the model to yield uncertainty in outputs of interest. This opens the way for designing new experiments to reduce the uncertainties in the model parameters and model predictions. Such experimental design problems have been intractable for large-scale problems using conventional methods; we will create OED algorithms that exploit the structure of the PDE model and the parameter-to-output map to overcome these challenges. Parallel algorithms for these four problems were created, analyzed, prototyped, implemented, tuned, and scaled up for leading-edge supercomputers, including UT-Austin’s own 10 petaflops Stampede system, ANL’s Mira system, and ORNL’s Titan system. While our focus is on fundamental mathematical/computational methods and algorithms, we will assess our methods on model problems derived from several DOE mission applications, including multiscale mechanics and ice sheet dynamics.« less
Fuzzy Naive Bayesian model for medical diagnostic decision support.
Wagholikar, Kavishwar B; Vijayraghavan, Sundararajan; Deshpande, Ashok W
2009-01-01
This work relates to the development of computational algorithms to provide decision support to physicians. The authors propose a Fuzzy Naive Bayesian (FNB) model for medical diagnosis, which extends the Fuzzy Bayesian approach proposed by Okuda. A physician's interview based method is described to define a orthogonal fuzzy symptom information system, required to apply the model. For the purpose of elaboration and elicitation of characteristics, the algorithm is applied to a simple simulated dataset, and compared with conventional Naive Bayes (NB) approach. As a preliminary evaluation of FNB in real world scenario, the comparison is repeated on a real fuzzy dataset of 81 patients diagnosed with infectious diseases. The case study on simulated dataset elucidates that FNB can be optimal over NB for diagnosing patients with imprecise-fuzzy information, on account of the following characteristics - 1) it can model the information that, values of some attributes are semantically closer than values of other attributes, and 2) it offers a mechanism to temper exaggerations in patient information. Although the algorithm requires precise training data, its utility for fuzzy training data is argued for. This is supported by the case study on infectious disease dataset, which indicates optimality of FNB over NB for the infectious disease domain. Further case studies on large datasets are required to establish utility of FNB.
Bayesian Retrieval of Complete Posterior PDFs of Oceanic Rain Rate From Microwave Observations
NASA Technical Reports Server (NTRS)
Chiu, J. Christine; Petty, Grant W.
2005-01-01
This paper presents a new Bayesian algorithm for retrieving surface rain rate from Tropical Rainfall Measurements Mission (TRMM) Microwave Imager (TMI) over the ocean, along with validations against estimates from the TRMM Precipitation Radar (PR). The Bayesian approach offers a rigorous basis for optimally combining multichannel observations with prior knowledge. While other rain rate algorithms have been published that are based at least partly on Bayesian reasoning, this is believed to be the first self-contained algorithm that fully exploits Bayes Theorem to yield not just a single rain rate, but rather a continuous posterior probability distribution of rain rate. To advance our understanding of theoretical benefits of the Bayesian approach, we have conducted sensitivity analyses based on two synthetic datasets for which the true conditional and prior distribution are known. Results demonstrate that even when the prior and conditional likelihoods are specified perfectly, biased retrievals may occur at high rain rates. This bias is not the result of a defect of the Bayesian formalism but rather represents the expected outcome when the physical constraint imposed by the radiometric observations is weak, due to saturation effects. It is also suggested that the choice of the estimators and the prior information are both crucial to the retrieval. In addition, the performance of our Bayesian algorithm is found to be comparable to that of other benchmark algorithms in real-world applications, while having the additional advantage of providing a complete continuous posterior probability distribution of surface rain rate.
Semisupervised learning using Bayesian interpretation: application to LS-SVM.
Adankon, Mathias M; Cheriet, Mohamed; Biem, Alain
2011-04-01
Bayesian reasoning provides an ideal basis for representing and manipulating uncertain knowledge, with the result that many interesting algorithms in machine learning are based on Bayesian inference. In this paper, we use the Bayesian approach with one and two levels of inference to model the semisupervised learning problem and give its application to the successful kernel classifier support vector machine (SVM) and its variant least-squares SVM (LS-SVM). Taking advantage of Bayesian interpretation of LS-SVM, we develop a semisupervised learning algorithm for Bayesian LS-SVM using our approach based on two levels of inference. Experimental results on both artificial and real pattern recognition problems show the utility of our method.
Bayesian cloud detection for MERIS, AATSR, and their combination
NASA Astrophysics Data System (ADS)
Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.
2014-11-01
A broad range of different of Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud masks were designed to be numerically efficient and suited for the processing of large amounts of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient amounts of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and on setting up new classification schemes based on manually classified scenes.
Bayesian cloud detection for MERIS, AATSR, and their combination
NASA Astrophysics Data System (ADS)
Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.
2015-04-01
A broad range of different of Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud detection schemes were designed to be numerically efficient and suited for the processing of large numbers of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient numbers of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and on setting up new classification schemes based on manually classified scenes.
NASA Technical Reports Server (NTRS)
Buntine, Wray
1991-01-01
Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4 and Breiman et al. Cart show the full Bayesian algorithm is consistently as good, or more accurate than these other approaches though at a computational price.
Applying dynamic Bayesian networks to perturbed gene expression data.
Dojer, Norbert; Gambin, Anna; Mizera, Andrzej; Wilczyński, Bartek; Tiuryn, Jerzy
2006-05-08
A central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Because of their solid basis in statistics, allowing to deal with the stochastic aspects of gene expressions and noisy measurements in a natural way, Bayesian networks appear attractive in the field of inferring gene interactions structure from microarray experiments data. However, the basic formalism has some disadvantages, e.g. it is sometimes hard to distinguish between the origin and the target of an interaction. Two kinds of microarray experiments yield data particularly rich in information regarding the direction of interactions: time series and perturbation experiments. In order to correctly handle them, the basic formalism must be modified. For example, dynamic Bayesian networks (DBN) apply to time series microarray data. To our knowledge the DBN technique has not been applied in the context of perturbation experiments. We extend the framework of dynamic Bayesian networks in order to incorporate perturbations. Moreover, an exact algorithm for inferring an optimal network is proposed and a discretization method specialized for time series data from perturbation experiments is introduced. We apply our procedure to realistic simulations data. The results are compared with those obtained by standard DBN learning techniques. Moreover, the advantages of using exact learning algorithm instead of heuristic methods are analyzed. We show that the quality of inferred networks dramatically improves when using data from perturbation experiments. We also conclude that the exact algorithm should be used when it is possible, i.e. when considered set of genes is small enough.
Learning Instance-Specific Predictive Models
Visweswaran, Shyam; Cooper, Gregory F.
2013-01-01
This paper introduces a Bayesian algorithm for constructing predictive models from data that are optimized to predict a target variable well for a particular instance. This algorithm learns Markov blanket models, carries out Bayesian model averaging over a set of models to predict a target variable of the instance at hand, and employs an instance-specific heuristic to locate a set of suitable models to average over. We call this method the instance-specific Markov blanket (ISMB) algorithm. The ISMB algorithm was evaluated on 21 UCI data sets using five different performance measures and its performance was compared to that of several commonly used predictive algorithms, including nave Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor, Lazy Bayesian Rules, and AdaBoost. Over all the data sets, the ISMB algorithm performed better on average on all performance measures against all the comparison algorithms. PMID:25045325
A Bayesian approach to tracking patients having changing pharmacokinetic parameters
NASA Technical Reports Server (NTRS)
Bayard, David S.; Jelliffe, Roger W.
2004-01-01
This paper considers the updating of Bayesian posterior densities for pharmacokinetic models associated with patients having changing parameter values. For estimation purposes it is proposed to use the Interacting Multiple Model (IMM) estimation algorithm, which is currently a popular algorithm in the aerospace community for tracking maneuvering targets. The IMM algorithm is described, and compared to the multiple model (MM) and Maximum A-Posteriori (MAP) Bayesian estimation methods, which are presently used for posterior updating when pharmacokinetic parameters do not change. Both the MM and MAP Bayesian estimation methods are used in their sequential forms, to facilitate tracking of changing parameters. Results indicate that the IMM algorithm is well suited for tracking time-varying pharmacokinetic parameters in acutely ill and unstable patients, incurring only about half of the integrated error compared to the sequential MM and MAP methods on the same example.
BCM: toolkit for Bayesian analysis of Computational Models using samplers.
Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A
2016-10-21
Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics, however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.
Fundamentals and Recent Developments in Approximate Bayesian Computation
Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka
2017-01-01
Abstract Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
Quantum Mechanics, Pattern Recognition, and the Mammalian Brain
NASA Astrophysics Data System (ADS)
Chapline, George
2008-10-01
Although the usual way of representing Markov processes is time asymmetric, there is a way of describing Markov processes, due to Schrodinger, which is time symmetric. This observation provides a link between quantum mechanics and the layered Bayesian networks that are often used in automated pattern recognition systems. In particular, there is a striking formal similarity between quantum mechanics and a particular type of Bayesian network, the Helmholtz machine, which provides a plausible model for how the mammalian brain recognizes important environmental situations. One interesting aspect of this relationship is that the "wake-sleep" algorithm for training a Helmholtz machine is very similar to the problem of finding the potential for the multi-channel Schrodinger equation. As a practical application of this insight it may be possible to use inverse scattering techniques to study the relationship between human brain wave patterns, pattern recognition, and learning. We also comment on whether there is a relationship between quantum measurements and consciousness.
Du, Yuanwei; Guo, Yubin
2015-01-01
The intrinsic mechanism of multimorbidity is difficult to recognize and prediction and diagnosis are difficult to carry out accordingly. Bayesian networks can help to diagnose multimorbidity in health care, but it is difficult to obtain the conditional probability table (CPT) because of the lack of clinically statistical data. Today, expert knowledge and experience are increasingly used in training Bayesian networks in order to help predict or diagnose diseases, but the CPT in Bayesian networks is usually irrational or ineffective for ignoring realistic constraints especially in multimorbidity. In order to solve these problems, an evidence reasoning (ER) approach is employed to extract and fuse inference data from experts using a belief distribution and recursive ER algorithm, based on which evidence reasoning method for constructing conditional probability tables in Bayesian network of multimorbidity is presented step by step. A multimorbidity numerical example is used to demonstrate the method and prove its feasibility and application. Bayesian network can be determined as long as the inference assessment is inferred by each expert according to his/her knowledge or experience. Our method is more effective than existing methods for extracting expert inference data accurately and is fused effectively for constructing CPTs in a Bayesian network of multimorbidity.
Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks
NASA Astrophysics Data System (ADS)
Sun, Wei; Chang, K. C.
2005-05-01
Probabilistic inference for Bayesian networks is in general NP-hard using either exact algorithms or approximate methods. However, for very complex networks, only the approximate methods such as stochastic sampling could be used to provide a solution given any time constraint. There are several simulation methods currently available. They include logic sampling (the first proposed stochastic method for Bayesian networks, the likelihood weighting algorithm) the most commonly used simulation method because of its simplicity and efficiency, the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these available simulation methods, then we propose an improved importance sampling algorithm called linear Gaussian importance sampling algorithm for general hybrid model (LGIS). LGIS is aimed for hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses linear function and Gaussian additive noise to approximate the true conditional probability distribution for continuous variable given both its parents and evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal important function from the previous samples. We test the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. The performance comparison with other well-known methods such as Junction tree (JT) and likelihood weighting (LW) shows that LGIS-GHM is very promising.
Immune allied genetic algorithm for Bayesian network structure learning
NASA Astrophysics Data System (ADS)
Song, Qin; Lin, Feng; Sun, Wei; Chang, KC
2012-06-01
Bayesian network (BN) structure learning is a NP-hard problem. In this paper, we present an improved approach to enhance efficiency of BN structure learning. To avoid premature convergence in traditional single-group genetic algorithm (GA), we propose an immune allied genetic algorithm (IAGA) in which the multiple-population and allied strategy are introduced. Moreover, in the algorithm, we apply prior knowledge by injecting immune operator to individuals which can effectively prevent degeneration. To illustrate the effectiveness of the proposed technique, we present some experimental results.
Research on Bayes matting algorithm based on Gaussian mixture model
NASA Astrophysics Data System (ADS)
Quan, Wei; Jiang, Shan; Han, Cheng; Zhang, Chao; Jiang, Zhengang
2015-12-01
The digital matting problem is a classical problem of imaging. It aims at separating non-rectangular foreground objects from a background image, and compositing with a new background image. Accurate matting determines the quality of the compositing image. A Bayesian matting Algorithm Based on Gaussian Mixture Model is proposed to solve this matting problem. Firstly, the traditional Bayesian framework is improved by introducing Gaussian mixture model. Then, a weighting factor is added in order to suppress the noises of the compositing images. Finally, the effect is further improved by regulating the user's input. This algorithm is applied to matting jobs of classical images. The results are compared to the traditional Bayesian method. It is shown that our algorithm has better performance in detail such as hair. Our algorithm eliminates the noise well. And it is very effectively in dealing with the kind of work, such as interested objects with intricate boundaries.
STARBLADE: STar and Artefact Removal with a Bayesian Lightweight Algorithm from Diffuse Emission
NASA Astrophysics Data System (ADS)
Knollmüller, Jakob; Frank, Philipp; Ensslin, Torsten A.
2018-05-01
STARBLADE (STar and Artefact Removal with a Bayesian Lightweight Algorithm from Diffuse Emission) separates superimposed point-like sources from a diffuse background by imposing physically motivated models as prior knowledge. The algorithm can also be used on noisy and convolved data, though performing a proper reconstruction including a deconvolution prior to the application of the algorithm is advised; the algorithm could also be used within a denoising imaging method. STARBLADE learns the correlation structure of the diffuse emission and takes it into account to determine the occurrence and strength of a superimposed point source.
Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A
2017-01-01
The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60-90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms perform equally well when the time series has multiple bursts of muscle activity.
NASA Astrophysics Data System (ADS)
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and always never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to most sophisticated methods that have been developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automatized manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
Bayesian Community Detection in the Space of Group-Level Functional Differences
Venkataraman, Archana; Yang, Daniel Y.-J.; Pelphrey, Kevin A.; Duncan, James S.
2017-01-01
We propose a unified Bayesian framework to detect both hyper- and hypo-active communities within whole-brain fMRI data. Specifically, our model identifies dense subgraphs that exhibit population-level differences in functional synchrony between a control and clinical group. We derive a variational EM algorithm to solve for the latent posterior distributions and parameter estimates, which subsequently inform us about the afflicted network topology. We demonstrate that our method provides valuable insights into the neural mechanisms underlying social dysfunction in autism, as verified by the Neurosynth meta-analytic database. In contrast, both univariate testing and community detection via recursive edge elimination fail to identify stable functional communities associated with the disorder. PMID:26955022
Bayesian Community Detection in the Space of Group-Level Functional Differences.
Venkataraman, Archana; Yang, Daniel Y-J; Pelphrey, Kevin A; Duncan, James S
2016-08-01
We propose a unified Bayesian framework to detect both hyper- and hypo-active communities within whole-brain fMRI data. Specifically, our model identifies dense subgraphs that exhibit population-level differences in functional synchrony between a control and clinical group. We derive a variational EM algorithm to solve for the latent posterior distributions and parameter estimates, which subsequently inform us about the afflicted network topology. We demonstrate that our method provides valuable insights into the neural mechanisms underlying social dysfunction in autism, as verified by the Neurosynth meta-analytic database. In contrast, both univariate testing and community detection via recursive edge elimination fail to identify stable functional communities associated with the disorder.
Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bates, Cameron Russell; Mckigney, Edward Allen
The use of Bayesian inference in data analysis has become the standard for large scienti c experiments [1, 2]. The Monte Carlo Codes Group(XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform at-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on speci c application requirements. This document describes the algorithmic choices made and presents two use cases.
NASA Astrophysics Data System (ADS)
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.
NASA Astrophysics Data System (ADS)
Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith
2018-01-01
This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.
Tweedell, Andrew J.; Haynes, Courtney A.
2017-01-01
The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60–90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms perform equally well when the time series has multiple bursts of muscle activity. PMID:28489897
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2013-09-04
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
Kärkkäinen, Hanni P.; Sillanpää, Mikko J.
2013-01-01
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
Comparison of Co-Temporal Modeling Algorithms on Sparse Experimental Time Series Data Sets.
Allen, Edward E; Norris, James L; John, David J; Thomas, Stan J; Turkett, William H; Fetrow, Jacquelyn S
2010-01-01
Multiple approaches for reverse-engineering biological networks from time-series data have been proposed in the computational biology literature. These approaches can be classified by their underlying mathematical algorithms, such as Bayesian or algebraic techniques, as well as by their time paradigm, which includes next-state and co-temporal modeling. The types of biological relationships, such as parent-child or siblings, discovered by these algorithms are quite varied. It is important to understand the strengths and weaknesses of the various algorithms and time paradigms on actual experimental data. We assess how well the co-temporal implementations of three algorithms, continuous Bayesian, discrete Bayesian, and computational algebraic, can 1) identify two types of entity relationships, parent and sibling, between biological entities, 2) deal with experimental sparse time course data, and 3) handle experimental noise seen in replicate data sets. These algorithms are evaluated, using the shuffle index metric, for how well the resulting models match literature models in terms of siblings and parent relationships. Results indicate that all three co-temporal algorithms perform well, at a statistically significant level, at finding sibling relationships, but perform relatively poorly in finding parent relationships.
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe
2017-09-01
We propose a sparse Bayesian learning algorithm for improved estimation of white matter fiber parameters from compressed (under-sampled q-space) multi-shell diffusion MRI data. The multi-shell data is represented in a dictionary form using a non-monoexponential decay model of diffusion, based on continuous gamma distribution of diffusivities. The fiber volume fractions with predefined orientations, which are the unknown parameters, form the dictionary weights. These unknown parameters are estimated with a linear un-mixing framework, using a sparse Bayesian learning algorithm. A localized learning of hyperparameters at each voxel and for each possible fiber orientations improves the parameter estimation. Our experiments using synthetic data from the ISBI 2012 HARDI reconstruction challenge and in-vivo data from the Human Connectome Project demonstrate the improvements.
Bayesian Decision Theoretical Framework for Clustering
ERIC Educational Resources Information Center
Chen, Mo
2011-01-01
In this thesis, we establish a novel probabilistic framework for the data clustering problem from the perspective of Bayesian decision theory. The Bayesian decision theory view justifies the important questions: what is a cluster and what a clustering algorithm should optimize. We prove that the spectral clustering (to be specific, the…
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood
ERIC Educational Resources Information Center
Karabatsos, George
2017-01-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon…
Searching Algorithm Using Bayesian Updates
ERIC Educational Resources Information Center
Caudle, Kyle
2010-01-01
In late October 1967, the USS Scorpion was lost at sea, somewhere between the Azores and Norfolk Virginia. Dr. Craven of the U.S. Navy's Special Projects Division is credited with using Bayesian Search Theory to locate the submarine. Bayesian Search Theory is a straightforward and interesting application of Bayes' theorem which involves searching…
A new prior for bayesian anomaly detection: application to biosurveillance.
Shen, Y; Cooper, G F
2010-01-01
Bayesian anomaly detection computes posterior probabilities of anomalous events by combining prior beliefs and evidence from data. However, the specification of prior probabilities can be challenging. This paper describes a Bayesian prior in the context of disease outbreak detection. The goal is to provide a meaningful, easy-to-use prior that yields a posterior probability of an outbreak that performs at least as well as a standard frequentist approach. If this goal is achieved, the resulting posterior could be usefully incorporated into a decision analysis about how to act in light of a possible disease outbreak. This paper describes a Bayesian method for anomaly detection that combines learning from data with a semi-informative prior probability over patterns of anomalous events. A univariate version of the algorithm is presented here for ease of illustration of the essential ideas. The paper describes the algorithm in the context of disease-outbreak detection, but it is general and can be used in other anomaly detection applications. For this application, the semi-informative prior specifies that an increased count over baseline is expected for the variable being monitored, such as the number of respiratory chief complaints per day at a given emergency department. The semi-informative prior is derived based on the baseline prior, which is estimated from using historical data. The evaluation reported here used semi-synthetic data to evaluate the detection performance of the proposed Bayesian method and a control chart method, which is a standard frequentist algorithm that is closest to the Bayesian method in terms of the type of data it uses. The disease-outbreak detection performance of the Bayesian method was statistically significantly better than that of the control chart method when proper baseline periods were used to estimate the baseline behavior to avoid seasonal effects. When using longer baseline periods, the Bayesian method performed as well as the control chart method. The time complexity of the Bayesian algorithm is linear in the number of the observed events being monitored, due to a novel, closed-form derivation that is introduced in the paper. This paper introduces a novel prior probability for Bayesian outbreak detection that is expressive, easy-to-apply, computationally efficient, and performs as well or better than a standard frequentist method.
Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A
2017-12-01
The onset of muscle activity, as measured by electromyography (EMG), is a commonly applied metric in biomechanics. Intramuscular EMG is often used to examine deep musculature and there are currently no studies examining the effectiveness of algorithms for intramuscular EMG onset. The present study examines standard surface EMG onset algorithms (linear envelope, Teager-Kaiser Energy Operator, and sample entropy) and novel algorithms (time series mean-variance analysis, sequential/batch processing with parametric and nonparametric methods, and Bayesian changepoint analysis). Thirteen male and 5 female subjects had intramuscular EMG collected during isolated biceps brachii and vastus lateralis contractions, resulting in 103 trials. EMG onset was visually determined twice by 3 blinded reviewers. Since the reliability of visual onset was high (ICC (1,1) : 0.92), the mean of the 6 visual assessments was contrasted with the algorithmic approaches. Poorly performing algorithms were stepwise eliminated via (1) root mean square error analysis, (2) algorithm failure to identify onset/premature onset, (3) linear regression analysis, and (4) Bland-Altman plots. The top performing algorithms were all based on Bayesian changepoint analysis of rectified EMG and were statistically indistinguishable from visual analysis. Bayesian changepoint analysis has the potential to produce more reliable, accurate, and objective intramuscular EMG onset results than standard methodologies.
2D Bayesian automated tilted-ring fitting of disc galaxies in large H I galaxy surveys: 2DBAT
NASA Astrophysics Data System (ADS)
Oh, Se-Heon; Staveley-Smith, Lister; Spekkens, Kristine; Kamphuis, Peter; Koribalski, Bärbel S.
2018-01-01
We present a novel algorithm based on a Bayesian method for 2D tilted-ring analysis of disc galaxy velocity fields. Compared to the conventional algorithms based on a chi-squared minimization procedure, this new Bayesian-based algorithm suffers less from local minima of the model parameters even with highly multimodal posterior distributions. Moreover, the Bayesian analysis, implemented via Markov Chain Monte Carlo sampling, only requires broad ranges of posterior distributions of the parameters, which makes the fitting procedure fully automated. This feature will be essential when performing kinematic analysis on the large number of resolved galaxies expected to be detected in neutral hydrogen (H I) surveys with the Square Kilometre Array and its pathfinders. The so-called 2D Bayesian Automated Tilted-ring fitter (2DBAT) implements Bayesian fits of 2D tilted-ring models in order to derive rotation curves of galaxies. We explore 2DBAT performance on (a) artificial H I data cubes built based on representative rotation curves of intermediate-mass and massive spiral galaxies, and (b) Australia Telescope Compact Array H I data from the Local Volume H I Survey. We find that 2DBAT works best for well-resolved galaxies with intermediate inclinations (20° < i < 70°), complementing 3D techniques better suited to modelling inclined galaxies.
2017-01-01
Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing—with its unique statistical properties—became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca. PMID:28817636
Ramachandran, Parameswaran; Sánchez-Taltavull, Daniel; Perkins, Theodore J
2017-01-01
Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing-with its unique statistical properties-became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca.
NASA Astrophysics Data System (ADS)
Astuti, Ani Budi; Iriawan, Nur; Irhamah, Kuswanto, Heri
2017-12-01
In the Bayesian mixture modeling requires stages the identification number of the most appropriate mixture components thus obtained mixture models fit the data through data driven concept. Reversible Jump Markov Chain Monte Carlo (RJMCMC) is a combination of the reversible jump (RJ) concept and the Markov Chain Monte Carlo (MCMC) concept used by some researchers to solve the problem of identifying the number of mixture components which are not known with certainty number. In its application, RJMCMC using the concept of the birth/death and the split-merge with six types of movement, that are w updating, θ updating, z updating, hyperparameter β updating, split-merge for components and birth/death from blank components. The development of the RJMCMC algorithm needs to be done according to the observed case. The purpose of this study is to know the performance of RJMCMC algorithm development in identifying the number of mixture components which are not known with certainty number in the Bayesian mixture modeling for microarray data in Indonesia. The results of this study represent that the concept RJMCMC algorithm development able to properly identify the number of mixture components in the Bayesian normal mixture model wherein the component mixture in the case of microarray data in Indonesia is not known for certain number.
2010-01-01
Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788
Context Relevant Prediction Model for COPD Domain Using Bayesian Belief Network
Saleh, Lokman; Ajami, Hicham; Mili, Hafedh
2017-01-01
In the last three decades, researchers have examined extensively how context-aware systems can assist people, specifically those suffering from incurable diseases, to help them cope with their medical illness. Over the years, a huge number of studies on Chronic Obstructive Pulmonary Disease (COPD) have been published. However, how to derive relevant attributes and early detection of COPD exacerbations remains a challenge. In this research work, we will use an efficient algorithm to select relevant attributes where there is no proper approach in this domain. Such algorithm predicts exacerbations with high accuracy by adding discretization process, and organizes the pertinent attributes in priority order based on their impact to facilitate the emergency medical treatment. In this paper, we propose an extension of our existing Helper Context-Aware Engine System (HCES) for COPD. This project uses Bayesian network algorithm to depict the dependency between the COPD symptoms (attributes) in order to overcome the insufficiency and the independency hypothesis of naïve Bayesian. In addition, the dependency in Bayesian network is realized using TAN algorithm rather than consulting pneumologists. All these combined algorithms (discretization, selection, dependency, and the ordering of the relevant attributes) constitute an effective prediction model, comparing to effective ones. Moreover, an investigation and comparison of different scenarios of these algorithms are also done to verify which sequence of steps of prediction model gives more accurate results. Finally, we designed and validated a computer-aided support application to integrate different steps of this model. The findings of our system HCES has shown promising results using Area Under Receiver Operating Characteristic (AUC = 81.5%). PMID:28644419
NASA Astrophysics Data System (ADS)
Wang, Hongrui; Wang, Cheng; Wang, Ying; Gao, Xiong; Yu, Chen
2017-06-01
This paper presents a Bayesian approach using Metropolis-Hastings Markov Chain Monte Carlo algorithm and applies this method for daily river flow rate forecast and uncertainty quantification for Zhujiachuan River using data collected from Qiaotoubao Gage Station and other 13 gage stations in Zhujiachuan watershed in China. The proposed method is also compared with the conventional maximum likelihood estimation (MLE) for parameter estimation and quantification of associated uncertainties. While the Bayesian method performs similarly in estimating the mean value of daily flow rate, it performs over the conventional MLE method on uncertainty quantification, providing relatively narrower reliable interval than the MLE confidence interval and thus more precise estimation by using the related information from regional gage stations. The Bayesian MCMC method might be more favorable in the uncertainty analysis and risk management.
Computational statistics using the Bayesian Inference Engine
NASA Astrophysics Data System (ADS)
Weinberg, Martin D.
2013-09-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented and easily extended framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
The NIFTy way of Bayesian signal inference
NASA Astrophysics Data System (ADS)
Selig, Marco
2014-12-01
We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints
Shahdoust, Maryam; Mahjub, Hossein; Sadeghi, Mehdi
2017-01-01
The Common topological features of related species gene regulatory networks suggest reconstruction of the network of one species by using the further information from gene expressions profile of related species. We present an algorithm to reconstruct the gene regulatory network named; F-MAP, which applies the knowledge about gene interactions from related species. Our algorithm sets a Bayesian framework to estimate the precision matrix of one species microarray gene expressions dataset to infer the Gaussian Graphical model of the network. The conjugate Wishart prior is used and the information from related species is applied to estimate the hyperparameters of the prior distribution by using the factor analysis. Applying the proposed algorithm on six related species of drosophila shows that the precision of reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches. PMID:28938012
Bayesian estimation of realized stochastic volatility model by Hybrid Monte Carlo algorithm
NASA Astrophysics Data System (ADS)
Takaishi, Tetsuya
2014-03-01
The hybrid Monte Carlo algorithm (HMCA) is applied for Bayesian parameter estimation of the realized stochastic volatility (RSV) model. Using the 2nd order minimum norm integrator (2MNI) for the molecular dynamics (MD) simulation in the HMCA, we find that the 2MNI is more efficient than the conventional leapfrog integrator. We also find that the autocorrelation time of the volatility variables sampled by the HMCA is very short. Thus it is concluded that the HMCA with the 2MNI is an efficient algorithm for parameter estimations of the RSV model.
NASA Astrophysics Data System (ADS)
Hu, Zixi; Yao, Zhewei; Li, Jinglai
2017-03-01
Many scientific and engineering problems require to perform Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrary slow under the mesh refinement, which is referred to as being dimension dependent. To this end, a family of dimensional independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, were proposed to sample the infinite dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally we provide numerical examples to demonstrate the performance of the proposed method.
Semi-blind sparse image reconstruction with application to MRFM.
Park, Se Un; Dobigeon, Nicolas; Hero, Alfred O
2012-09-01
We propose a solution to the image deconvolution problem where the convolution kernel or point spread function (PSF) is assumed to be only partially known. Small perturbations generated from the model are exploited to produce a few principal components explaining the PSF uncertainty in a high-dimensional space. Unlike recent developments on blind deconvolution of natural images, we assume the image is sparse in the pixel basis, a natural sparsity arising in magnetic resonance force microscopy (MRFM). Our approach adopts a Bayesian Metropolis-within-Gibbs sampling framework. The performance of our Bayesian semi-blind algorithm for sparse images is superior to previously proposed semi-blind algorithms such as the alternating minimization algorithm and blind algorithms developed for natural images. We illustrate our myopic algorithm on real MRFM tobacco virus data.
Bayesian approach for peak detection in two-dimensional chromatography.
Vivó-Truyols, Gabriel
2012-03-20
A new method for peak detection in two-dimensional chromatography is presented. In a first step, the method starts with a conventional one-dimensional peak detection algorithm to detect modulated peaks. In a second step, a sophisticated algorithm is constructed to decide which of the individual one-dimensional peaks have been originated from the same compound and should then be arranged in a two-dimensional peak. The merging algorithm is based on Bayesian inference. The user sets prior information about certain parameters (e.g., second-dimension retention time variability, first-dimension band broadening, chromatographic noise). On the basis of these priors, the algorithm calculates the probability of myriads of peak arrangements (i.e., ways of merging one-dimensional peaks), finding which of them holds the highest value. Uncertainty in each parameter can be accounted by adapting conveniently its probability distribution function, which in turn may change the final decision of the most probable peak arrangement. It has been demonstrated that the Bayesian approach presented in this paper follows the chromatographers' intuition. The algorithm has been applied and tested with LC × LC and GC × GC data and takes around 1 min to process chromatograms with several thousands of peaks.
2017-09-01
efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to...Bayesian hierarchical modeling, Markov chain Monte Carlo methods , Metropolis algorithm, machine learning, atmospheric prediction 15. NUMBER OF PAGES...scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components
Wang, Hongrui; Wang, Cheng; Wang, Ying; ...
2017-04-05
This paper presents a Bayesian approach using Metropolis-Hastings Markov Chain Monte Carlo algorithm and applies this method for daily river flow rate forecast and uncertainty quantification for Zhujiachuan River using data collected from Qiaotoubao Gage Station and other 13 gage stations in Zhujiachuan watershed in China. The proposed method is also compared with the conventional maximum likelihood estimation (MLE) for parameter estimation and quantification of associated uncertainties. While the Bayesian method performs similarly in estimating the mean value of daily flow rate, it performs over the conventional MLE method on uncertainty quantification, providing relatively narrower reliable interval than the MLEmore » confidence interval and thus more precise estimation by using the related information from regional gage stations. As a result, the Bayesian MCMC method might be more favorable in the uncertainty analysis and risk management.« less
Update on Bayesian Blocks: Segmented Models for Sequential Data
NASA Technical Reports Server (NTRS)
Scargle, Jeff
2017-01-01
The Bayesian Block algorithm, in wide use in astronomy and other areas, has been improved in several ways. The model for block shape has been generalized to include other than constant signal rate - e.g., linear, exponential, or other parametric models. In addition the computational efficiency has been improved, so that instead of O(N**2) the basic algorithm is O(N) in most cases. Other improvements in the theory and application of segmented representations will be described.
Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data
2015-07-01
Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data Guy Van den Broeck∗ and Karthika Mohan∗ and Arthur Choi and Adnan ...notwithstanding any other provision of law , no person shall be subject to a penalty for failing to comply with a collection of information if it does...Wasserman, L. (2011). All of Statistics. Springer Science & Business Media. Yaramakala, S., & Margaritis, D. (2005). Speculative markov blanket discovery for optimal feature selection. In Proceedings of ICDM.
Slice sampling technique in Bayesian extreme of gold price modelling
NASA Astrophysics Data System (ADS)
Rostami, Mohammad; Adam, Mohd Bakri; Ibrahim, Noor Akma; Yahya, Mohamed Hisham
2013-09-01
In this paper, a simulation study of Bayesian extreme values by using Markov Chain Monte Carlo via slice sampling algorithm is implemented. We compared the accuracy of slice sampling with other methods for a Gumbel model. This study revealed that slice sampling algorithm offers more accurate and closer estimates with less RMSE than other methods . Finally we successfully employed this procedure to estimate the parameters of Malaysia extreme gold price from 2000 to 2011.
NASA Astrophysics Data System (ADS)
Hagemann, M. W.; Gleason, C. J.; Durand, M. T.
2017-11-01
The forthcoming Surface Water and Ocean Topography (SWOT) NASA satellite mission will measure water surface width, height, and slope of major rivers worldwide. The resulting data could provide an unprecedented account of river discharge at continental scales, but reliable methods need to be identified prior to launch. Here we present a novel algorithm for discharge estimation from only remotely sensed stream width, slope, and height at multiple locations along a mass-conserved river segment. The algorithm, termed the Bayesian AMHG-Manning (BAM) algorithm, implements a Bayesian formulation of streamflow uncertainty using a combination of Manning's equation and at-many-stations hydraulic geometry (AMHG). Bayesian methods provide a statistically defensible approach to generating discharge estimates in a physically underconstrained system but rely on prior distributions that quantify the a priori uncertainty of unknown quantities including discharge and hydraulic equation parameters. These were obtained from literature-reported values and from a USGS data set of acoustic Doppler current profiler (ADCP) measurements at USGS stream gauges. A data set of simulated widths, slopes, and heights from 19 rivers was used to evaluate the algorithms using a set of performance metrics. Results across the 19 rivers indicate an improvement in performance of BAM over previously tested methods and highlight a path forward in solving discharge estimation using solely satellite remote sensing.
Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel
2012-09-25
Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.
2012-01-01
Background Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363
A controllable sensor management algorithm capable of learning
NASA Astrophysics Data System (ADS)
Osadciw, Lisa A.; Veeramacheneni, Kalyan K.
2005-03-01
Sensor management technology progress is challenged by the geographic space it spans, the heterogeneity of the sensors, and the real-time timeframes within which plans controlling the assets are executed. This paper presents a new sensor management paradigm and demonstrates its application in a sensor management algorithm designed for a biometric access control system. This approach consists of an artificial intelligence (AI) algorithm focused on uncertainty measures, which makes the high level decisions to reduce uncertainties and interfaces with the user, integrated cohesively with a bottom up evolutionary algorithm, which optimizes the sensor network"s operation as determined by the AI algorithm. The sensor management algorithm presented is composed of a Bayesian network, the AI algorithm component, and a swarm optimization algorithm, the evolutionary algorithm. Thus, the algorithm can change its own performance goals in real-time and will modify its own decisions based on observed measures within the sensor network. The definition of the measures as well as the Bayesian network determine the robustness of the algorithm and its utility in reacting dynamically to changes in the global system.
Determining open cluster membership. A Bayesian framework for quantitative member classification
NASA Astrophysics Data System (ADS)
Stott, Jonathan J.
2018-01-01
Aims: My goal is to develop a quantitative algorithm for assessing open cluster membership probabilities. The algorithm is designed to work with single-epoch observations. In its simplest form, only one set of program images and one set of reference images are required. Methods: The algorithm is based on a two-stage joint astrometric and photometric assessment of cluster membership probabilities. The probabilities were computed within a Bayesian framework using any available prior information. Where possible, the algorithm emphasizes simplicity over mathematical sophistication. Results: The algorithm was implemented and tested against three observational fields using published survey data. M 67 and NGC 654 were selected as cluster examples while a third, cluster-free, field was used for the final test data set. The algorithm shows good quantitative agreement with the existing surveys and has a false-positive rate significantly lower than the astrometric or photometric methods used individually.
BayesMotif: de novo protein sorting motif discovery from impure datasets.
Hu, Jianjun; Zhang, Fan
2010-01-18
Protein sorting is the process that newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-terminals or C-terminals of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals is needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motifs in which a highly conserved anchor is present along with a less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence dataset. It also shows that the false positive removal procedure can help to identify true motifs even when there is only 20% of the input sequences containing true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of PWM (position weight matrix) motif model.
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls that have plagued previous theoretical movements.
Application of bayesian networks to real-time flood risk estimation
NASA Astrophysics Data System (ADS)
Garrote, L.; Molina, M.; Blasco, G.
2003-04-01
This paper presents the application of a computational paradigm taken from the field of artificial intelligence - the bayesian network - to model the behaviour of hydrologic basins during floods. The final goal of this research is to develop representation techniques for hydrologic simulation models in order to define, develop and validate a mechanism, supported by a software environment, oriented to build decision models for the prediction and management of river floods in real time. The emphasis is placed on providing decision makers with tools to incorporate their knowledge of basin behaviour, usually formulated in terms of rainfall-runoff models, in the process of real-time decision making during floods. A rainfall-runoff model is only a step in the process of decision making. If a reliable rainfall forecast is available and the rainfall-runoff model is well calibrated, decisions can be based mainly on model results. However, in most practical situations, uncertainties in rainfall forecasts or model performance have to be incorporated in the decision process. The computation paradigm adopted for the simulation of hydrologic processes is the bayesian network. A bayesian network is a directed acyclic graph that represents causal influences between linked variables. Under this representation, uncertain qualitative variables are related through causal relations quantified with conditional probabilities. The solution algorithm allows the computation of the expected probability distribution of unknown variables conditioned to the observations. An approach to represent hydrologic processes by bayesian networks with temporal and spatial extensions is presented in this paper, together with a methodology for the development of bayesian models using results produced by deterministic hydrologic simulation models
Inferring phenomenological models of Markov processes from data
NASA Astrophysics Data System (ADS)
Rivera, Catalina; Nemenman, Ilya
Microscopically accurate modeling of stochastic dynamics of biochemical networks is hard due to the extremely high dimensionality of the state space of such networks. Here we propose an algorithm for inference of phenomenological, coarse-grained models of Markov processes describing the network dynamics directly from data, without the intermediate step of microscopically accurate modeling. The approach relies on the linear nature of the Chemical Master Equation and uses Bayesian Model Selection for identification of parsimonious models that fit the data. When applied to synthetic data from the Kinetic Proofreading process (KPR), a common mechanism used by cells for increasing specificity of molecular assembly, the algorithm successfully uncovers the known coarse-grained description of the process. This phenomenological description has been notice previously, but this time it is derived in an automated manner by the algorithm. James S. McDonnell Foundation Grant No. 220020321.
Zaikin, Alexey; Míguez, Joaquín
2017-01-01
We compare three state-of-the-art Bayesian inference methods for the estimation of the unknown parameters in a stochastic model of a genetic network. In particular, we introduce a stochastic version of the paradigmatic synthetic multicellular clock model proposed by Ullner et al., 2007. By introducing dynamical noise in the model and assuming that the partial observations of the system are contaminated by additive noise, we enable a principled mechanism to represent experimental uncertainties in the synthesis of the multicellular system and pave the way for the design of probabilistic methods for the estimation of any unknowns in the model. Within this setup, we tackle the Bayesian estimation of a subset of the model parameters. Specifically, we compare three Monte Carlo based numerical methods for the approximation of the posterior probability density function of the unknown parameters given a set of partial and noisy observations of the system. The schemes we assess are the particle Metropolis-Hastings (PMH) algorithm, the nonlinear population Monte Carlo (NPMC) method and the approximate Bayesian computation sequential Monte Carlo (ABC-SMC) scheme. We present an extensive numerical simulation study, which shows that while the three techniques can effectively solve the problem there are significant differences both in estimation accuracy and computational efficiency. PMID:28797087
NASA Astrophysics Data System (ADS)
Jeffs, Brian D.; Christou, Julian C.
1998-09-01
This paper addresses post processing for resolution enhancement of sequences of short exposure adaptive optics (AO) images of space objects. The unknown residual blur is removed using Bayesian maximum a posteriori blind image restoration techniques. In the problem formulation, both the true image and the unknown blur psf's are represented by the flexible generalized Gaussian Markov random field (GGMRF) model. The GGMRF probability density function provides a natural mechanism for expressing available prior information about the image and blur. Incorporating such prior knowledge in the deconvolution optimization is crucial for the success of blind restoration algorithms. For example, space objects often contain sharp edge boundaries and geometric structures, while the residual blur psf in the corresponding partially corrected AO image is spectrally band limited, and exhibits while the residual blur psf in the corresponding partially corrected AO image is spectrally band limited, and exhibits smoothed, random , texture-like features on a peaked central core. By properly choosing parameters, GGMRF models can accurately represent both the blur psf and the object, and serve to regularize the deconvolution problem. These two GGMRF models also serve as discriminator functions to separate blur and object in the solution. Algorithm performance is demonstrated with examples from synthetic AO images. Results indicate significant resolution enhancement when applied to partially corrected AO images. An efficient computational algorithm is described.
Toward a probabilistic acoustic emission source location algorithm: A Bayesian approach
NASA Astrophysics Data System (ADS)
Schumacher, Thomas; Straub, Daniel; Higgins, Christopher
2012-09-01
Acoustic emissions (AE) are stress waves initiated by sudden strain releases within a solid body. These can be caused by internal mechanisms such as crack opening or propagation, crushing, or rubbing of crack surfaces. One application for the AE technique in the field of Structural Engineering is Structural Health Monitoring (SHM). With piezo-electric sensors mounted to the surface of the structure, stress waves can be detected, recorded, and stored for later analysis. An important step in quantitative AE analysis is the estimation of the stress wave source locations. Commonly, source location results are presented in a rather deterministic manner as spatial and temporal points, excluding information about uncertainties and errors. Due to variability in the material properties and uncertainty in the mathematical model, measures of uncertainty are needed beyond best-fit point solutions for source locations. This paper introduces a novel holistic framework for the development of a probabilistic source location algorithm. Bayesian analysis methods with Markov Chain Monte Carlo (MCMC) simulation are employed where all source location parameters are described with posterior probability density functions (PDFs). The proposed methodology is applied to an example employing data collected from a realistic section of a reinforced concrete bridge column. The selected approach is general and has the advantage that it can be extended and refined efficiently. Results are discussed and future steps to improve the algorithm are suggested.
NASA Astrophysics Data System (ADS)
Wen, Fang-Qing; Zhang, Gong; Ben, De
2015-11-01
This paper addresses the direction of arrival (DOA) estimation problem for the co-located multiple-input multiple-output (MIMO) radar with random arrays. The spatially distributed sparsity of the targets in the background makes compressive sensing (CS) desirable for DOA estimation. A spatial CS framework is presented, which links the DOA estimation problem to support recovery from a known over-complete dictionary. A modified statistical model is developed to accurately represent the intra-block correlation of the received signal. A structural sparsity Bayesian learning algorithm is proposed for the sparse recovery problem. The proposed algorithm, which exploits intra-signal correlation, is capable being applied to limited data support and low signal-to-noise ratio (SNR) scene. Furthermore, the proposed algorithm has less computation load compared to the classical Bayesian algorithm. Simulation results show that the proposed algorithm has a more accurate DOA estimation than the traditional multiple signal classification (MUSIC) algorithm and other CS recovery algorithms. Project supported by the National Natural Science Foundation of China (Grant Nos. 61071163, 61271327, and 61471191), the Funding for Outstanding Doctoral Dissertation in Nanjing University of Aeronautics and Astronautics, China (Grant No. BCXJ14-08), the Funding of Innovation Program for Graduate Education of Jiangsu Province, China (Grant No. KYLX 0277), the Fundamental Research Funds for the Central Universities, China (Grant No. 3082015NP2015504), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PADA), China.
Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...
NASA Astrophysics Data System (ADS)
Wilting, Jens; Lehnertz, Klaus
2015-08-01
We investigate a recently published analysis framework based on Bayesian inference for the time-resolved characterization of interaction properties of noisy, coupled dynamical systems. It promises wide applicability and a better time resolution than well-established methods. At the example of representative model systems, we show that the analysis framework has the same weaknesses as previous methods, particularly when investigating interacting, structurally different non-linear oscillators. We also inspect the tracking of time-varying interaction properties and propose a further modification of the algorithm, which improves the reliability of obtained results. We exemplarily investigate the suitability of this algorithm to infer strength and direction of interactions between various regions of the human brain during an epileptic seizure. Within the limitations of the applicability of this analysis tool, we show that the modified algorithm indeed allows a better time resolution through Bayesian inference when compared to previous methods based on least square fits.
Huang, Shuai; Li, Jing; Ye, Jieping; Fleisher, Adam; Chen, Kewei; Wu, Teresa; Reiman, Eric
2013-06-01
Structure learning of Bayesian Networks (BNs) is an important topic in machine learning. Driven by modern applications in genetics and brain sciences, accurate and efficient learning of large-scale BN structures from high-dimensional data becomes a challenging problem. To tackle this challenge, we propose a Sparse Bayesian Network (SBN) structure learning algorithm that employs a novel formulation involving one L1-norm penalty term to impose sparsity and another penalty term to ensure that the learned BN is a Directed Acyclic Graph--a required property of BNs. Through both theoretical analysis and extensive experiments on 11 moderate and large benchmark networks with various sample sizes, we show that SBN leads to improved learning accuracy, scalability, and efficiency as compared with 10 existing popular BN learning algorithms. We apply SBN to a real-world application of brain connectivity modeling for Alzheimer's disease (AD) and reveal findings that could lead to advancements in AD research.
Huang, Shuai; Li, Jing; Ye, Jieping; Fleisher, Adam; Chen, Kewei; Wu, Teresa; Reiman, Eric
2014-01-01
Structure learning of Bayesian Networks (BNs) is an important topic in machine learning. Driven by modern applications in genetics and brain sciences, accurate and efficient learning of large-scale BN structures from high-dimensional data becomes a challenging problem. To tackle this challenge, we propose a Sparse Bayesian Network (SBN) structure learning algorithm that employs a novel formulation involving one L1-norm penalty term to impose sparsity and another penalty term to ensure that the learned BN is a Directed Acyclic Graph (DAG)—a required property of BNs. Through both theoretical analysis and extensive experiments on 11 moderate and large benchmark networks with various sample sizes, we show that SBN leads to improved learning accuracy, scalability, and efficiency as compared with 10 existing popular BN learning algorithms. We apply SBN to a real-world application of brain connectivity modeling for Alzheimer’s disease (AD) and reveal findings that could lead to advancements in AD research. PMID:22665720
Resource-Constrained Project Scheduling Under Uncertainty: Models, Algorithms and Applications
2014-11-10
Make-to-Order (MTO) Production Planning using Bayesian Updating, International Journal of Production Economics (04 2014) Norman Keith Womer, Haitao...2013) Made-to-Order Production Scheduling using Bayesian Updating, Working Paper, under second-round review in International Journal of Production Economics . VI
Probabilistic Model for Untargeted Peak Detection in LC-MS Using Bayesian Statistics.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2015-07-21
We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography-mass spectroscopy (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a chromatogram are affected by a chromatographic peak and which ones are only affected by noise. The use of probabilities contrasts with the traditional method in which a binary answer is given, relying on a threshold. By contrast, with the Bayesian peak detection presented here, the values of probability can be further propagated into other preprocessing steps, which will increase (or decrease) the importance of chromatographic regions into the final results. The present work is based on the use of the statistical overlap theory of component overlap from Davis and Giddings (Davis, J. M.; Giddings, J. Anal. Chem. 1983, 55, 418-424) as prior probability in the Bayesian formulation. The algorithm was tested on LC-MS Orbitrap data and was able to successfully distinguish chemical noise from actual peaks without any data preprocessing.
MDTS: automatic complex materials design using Monte Carlo tree search.
M Dieb, Thaer; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji
2017-01-01
Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in computer Go game. Unlike evolutionary algorithms that require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously in various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.
MDTS: automatic complex materials design using Monte Carlo tree search
NASA Astrophysics Data System (ADS)
Dieb, Thaer M.; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji
2017-12-01
Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in computer Go game. Unlike evolutionary algorithms that require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously in various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.
Sparse Bayesian Learning for Nonstationary Data Sources
NASA Astrophysics Data System (ADS)
Fujimaki, Ryohei; Yairi, Takehisa; Machida, Kazuo
This paper proposes an online Sparse Bayesian Learning (SBL) algorithm for modeling nonstationary data sources. Although most learning algorithms implicitly assume that a data source does not change over time (stationary), one in the real world usually does due to such various factors as dynamically changing environments, device degradation, sudden failures, etc (nonstationary). The proposed algorithm can be made useable for stationary online SBL by setting time decay parameters to zero, and as such it can be interpreted as a single unified framework for online SBL for use with stationary and nonstationary data sources. Tests both on four types of benchmark problems and on actual stock price data have shown it to perform well.
Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals
Fourment, Mathieu; Claywell, Brian C; Dinh, Vu; McCoy, Connor; Matsen IV, Frederick A; Darling, Aaron E
2018-01-01
Abstract Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop “guided” proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy. PMID:29186587
Textual and visual content-based anti-phishing: a Bayesian approach.
Zhang, Haijun; Liu, Gang; Chow, Tommy W S; Liu, Wenyin
2011-10-01
A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual contents to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm fusing the results from classifiers are introduced. An outstanding feature of this paper is the exploration of a Bayesian model to estimate the matching threshold. This is required in the classifier for determining the class of the web page and identifying whether the web page is phishing or not. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure the visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, the Bayes theory is used to synthesize the classification results from textual and visual content. The effectiveness of our proposed approach was examined in a large-scale dataset collected from real phishing cases. Experimental results demonstrated that the text classifier and the image classifier we designed deliver promising results, the fusion algorithm outperforms either of the individual classifiers, and our model can be adapted to different phishing cases. © 2011 IEEE
Chen, Ming-Hui; Zeng, Donglin; Hu, Kuolung; Jia, Catherine
2014-01-01
Summary In many biomedical studies, patients may experience the same type of recurrent event repeatedly over time, such as bleeding, multiple infections and disease. In this article, we propose a Bayesian design to a pivotal clinical trial in which lower risk myelodysplastic syndromes (MDS) patients are treated with MDS disease modifying therapies. One of the key study objectives is to demonstrate the investigational product (treatment) effect on reduction of platelet transfusion and bleeding events while receiving MDS therapies. In this context, we propose a new Bayesian approach for the design of superiority clinical trials using recurrent events frailty regression models. Historical recurrent events data from an already completed phase 2 trial are incorporated into the Bayesian design via the partial borrowing power prior of Ibrahim et al. (2012, Biometrics 68, 578–586). An efficient Gibbs sampling algorithm, a predictive data generation algorithm, and a simulation-based algorithm are developed for sampling from the fitting posterior distribution, generating the predictive recurrent events data, and computing various design quantities such as the type I error rate and power, respectively. An extensive simulation study is conducted to compare the proposed method to the existing frequentist methods and to investigate various operating characteristics of the proposed design. PMID:25041037
NASA Astrophysics Data System (ADS)
Nunez, Jorge; Llacer, Jorge
1993-10-01
This paper describes a general Bayesian iterative algorithm with entropy prior for image reconstruction. It solves the cases of both pure Poisson data and Poisson data with Gaussian readout noise. The algorithm maintains positivity of the solution; it includes case-specific prior information (default map) and flatfield corrections; it removes background and can be accelerated to be faster than the Richardson-Lucy algorithm. In order to determine the hyperparameter that balances the entropy and liklihood terms in the Bayesian approach, we have used a liklihood cross-validation technique. Cross-validation is more robust than other methods because it is less demanding in terms of the knowledge of exact data characteristics and of the point-spread function. We have used the algorithm to reconstruct successfully images obtained in different space-and ground-based imaging situations. It has been possible to recover most of the original intended capabilities of the Hubble Space Telescope (HST) wide field and planetary camera (WFPC) and faint object camera (FOC) from images obtained in their present state. Semireal simulations for the future wide field planetary camera 2 show that even after the repair of the spherical abberration problem, image reconstruction can play a key role in improving the resolution of the cameras, well beyond the design of the Hubble instruments. We also show that ground-based images can be reconstructed successfully with the algorithm. A technique which consists of dividing the CCD observations into two frames, with one-half the exposure time each, emerges as a recommended procedure for the utilization of the described algorithms. We have compared our technique with two commonly used reconstruction algorithms: the Richardson-Lucy and the Cambridge maximum entropy algorithms.
Learning oncogenetic networks by reducing to mixed integer linear programming.
Shahrabi Farahani, Hossein; Lagergren, Jens
2013-01-01
Cancer can be a result of accumulation of different types of genetic mutations such as copy number aberrations. The data from tumors are cross-sectional and do not contain the temporal order of the genetic events. Finding the order in which the genetic events have occurred and progression pathways are of vital importance in understanding the disease. In order to model cancer progression, we propose Progression Networks, a special case of Bayesian networks, that are tailored to model disease progression. Progression networks have similarities with Conjunctive Bayesian Networks (CBNs) [1],a variation of Bayesian networks also proposed for modeling disease progression. We also describe a learning algorithm for learning Bayesian networks in general and progression networks in particular. We reduce the hard problem of learning the Bayesian and progression networks to Mixed Integer Linear Programming (MILP). MILP is a Non-deterministic Polynomial-time complete (NP-complete) problem for which very good heuristics exists. We tested our algorithm on synthetic and real cytogenetic data from renal cell carcinoma. We also compared our learned progression networks with the networks proposed in earlier publications. The software is available on the website https://bitbucket.org/farahani/diprog.
Traffic Video Image Segmentation Model Based on Bayesian and Spatio-Temporal Markov Random Field
NASA Astrophysics Data System (ADS)
Zhou, Jun; Bao, Xu; Li, Dawei; Yin, Yongwen
2017-10-01
Traffic video image is a kind of dynamic image and its background and foreground is changed at any time, which results in the occlusion. In this case, using the general method is more difficult to get accurate image segmentation. A segmentation algorithm based on Bayesian and Spatio-Temporal Markov Random Field is put forward, which respectively build the energy function model of observation field and label field to motion sequence image with Markov property, then according to Bayesian' rule, use the interaction of label field and observation field, that is the relationship of label field’s prior probability and observation field’s likelihood probability, get the maximum posterior probability of label field’s estimation parameter, use the ICM model to extract the motion object, consequently the process of segmentation is finished. Finally, the segmentation methods of ST - MRF and the Bayesian combined with ST - MRF were analyzed. Experimental results: the segmentation time in Bayesian combined with ST-MRF algorithm is shorter than in ST-MRF, and the computing workload is small, especially in the heavy traffic dynamic scenes the method also can achieve better segmentation effect.
Bayesian networks in neuroscience: a survey.
Bielza, Concha; Larrañaga, Pedro
2014-01-01
Bayesian networks are a type of probabilistic graphical models lie at the intersection between statistics and machine learning. They have been shown to be powerful tools to encode dependence relationships among the variables of a domain under uncertainty. Thanks to their generality, Bayesian networks can accommodate continuous and discrete variables, as well as temporal processes. In this paper we review Bayesian networks and how they can be learned automatically from data by means of structure learning algorithms. Also, we examine how a user can take advantage of these networks for reasoning by exact or approximate inference algorithms that propagate the given evidence through the graphical structure. Despite their applicability in many fields, they have been little used in neuroscience, where they have focused on specific problems, like functional connectivity analysis from neuroimaging data. Here we survey key research in neuroscience where Bayesian networks have been used with different aims: discover associations between variables, perform probabilistic reasoning over the model, and classify new observations with and without supervision. The networks are learned from data of any kind-morphological, electrophysiological, -omics and neuroimaging-, thereby broadening the scope-molecular, cellular, structural, functional, cognitive and medical- of the brain aspects to be studied.
Bayesian networks in neuroscience: a survey
Bielza, Concha; Larrañaga, Pedro
2014-01-01
Bayesian networks are a type of probabilistic graphical models lie at the intersection between statistics and machine learning. They have been shown to be powerful tools to encode dependence relationships among the variables of a domain under uncertainty. Thanks to their generality, Bayesian networks can accommodate continuous and discrete variables, as well as temporal processes. In this paper we review Bayesian networks and how they can be learned automatically from data by means of structure learning algorithms. Also, we examine how a user can take advantage of these networks for reasoning by exact or approximate inference algorithms that propagate the given evidence through the graphical structure. Despite their applicability in many fields, they have been little used in neuroscience, where they have focused on specific problems, like functional connectivity analysis from neuroimaging data. Here we survey key research in neuroscience where Bayesian networks have been used with different aims: discover associations between variables, perform probabilistic reasoning over the model, and classify new observations with and without supervision. The networks are learned from data of any kind–morphological, electrophysiological, -omics and neuroimaging–, thereby broadening the scope–molecular, cellular, structural, functional, cognitive and medical– of the brain aspects to be studied. PMID:25360109
Benchmarking for Bayesian Reinforcement Learning
Ernst, Damien; Couëtoux, Adrien
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed. PMID:27304891
Benchmarking for Bayesian Reinforcement Learning.
Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.
Bayesian Deconvolution for Angular Super-Resolution in Forward-Looking Scanning Radar
Zha, Yuebo; Huang, Yulin; Sun, Zhichao; Wang, Yue; Yang, Jianyu
2015-01-01
Scanning radar is of notable importance for ground surveillance, terrain mapping and disaster rescue. However, the angular resolution of a scanning radar image is poor compared to the achievable range resolution. This paper presents a deconvolution algorithm for angular super-resolution in scanning radar based on Bayesian theory, which states that the angular super-resolution can be realized by solving the corresponding deconvolution problem with the maximum a posteriori (MAP) criterion. The algorithm considers that the noise is composed of two mutually independent parts, i.e., a Gaussian signal-independent component and a Poisson signal-dependent component. In addition, the Laplace distribution is used to represent the prior information about the targets under the assumption that the radar image of interest can be represented by the dominant scatters in the scene. Experimental results demonstrate that the proposed deconvolution algorithm has higher precision for angular super-resolution compared with the conventional algorithms, such as the Tikhonov regularization algorithm, the Wiener filter and the Richardson–Lucy algorithm. PMID:25806871
NASA Astrophysics Data System (ADS)
Farrell, Kathryn; Oden, J. Tinsley; Faghihi, Danial
2015-08-01
A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.
Bayesian design of decision rules for failure detection
NASA Technical Reports Server (NTRS)
Chow, E. Y.; Willsky, A. S.
1984-01-01
The formulation of the decision making process of a failure detection algorithm as a Bayes sequential decision problem provides a simple conceptualization of the decision rule design problem. As the optimal Bayes rule is not computable, a methodology that is based on the Bayesian approach and aimed at a reduced computational requirement is developed for designing suboptimal rules. A numerical algorithm is constructed to facilitate the design and performance evaluation of these suboptimal rules. The result of applying this design methodology to an example shows that this approach is potentially a useful one.
Quantum mechanics: The Bayesian theory generalized to the space of Hermitian matrices
NASA Astrophysics Data System (ADS)
Benavoli, Alessio; Facchini, Alessandro; Zaffalon, Marco
2016-10-01
We consider the problem of gambling on a quantum experiment and enforce rational behavior by a few rules. These rules yield, in the classical case, the Bayesian theory of probability via duality theorems. In our quantum setting, they yield the Bayesian theory generalized to the space of Hermitian matrices. This very theory is quantum mechanics: in fact, we derive all its four postulates from the generalized Bayesian theory. This implies that quantum mechanics is self-consistent. It also leads us to reinterpret the main operations in quantum mechanics as probability rules: Bayes' rule (measurement), marginalization (partial tracing), independence (tensor product). To say it with a slogan, we obtain that quantum mechanics is the Bayesian theory in the complex numbers.
Approximate Bayesian Computation by Subset Simulation using hierarchical state-space models
NASA Astrophysics Data System (ADS)
Vakilzadeh, Majid K.; Huang, Yong; Beck, James L.; Abrahamsson, Thomas
2017-02-01
A new multi-level Markov Chain Monte Carlo algorithm for Approximate Bayesian Computation, ABC-SubSim, has recently appeared that exploits the Subset Simulation method for efficient rare-event simulation. ABC-SubSim adaptively creates a nested decreasing sequence of data-approximating regions in the output space that correspond to increasingly closer approximations of the observed output vector in this output space. At each level, multiple samples of the model parameter vector are generated by a component-wise Metropolis algorithm so that the predicted output corresponding to each parameter value falls in the current data-approximating region. Theoretically, if continued to the limit, the sequence of data-approximating regions would converge on to the observed output vector and the approximate posterior distributions, which are conditional on the data-approximation region, would become exact, but this is not practically feasible. In this paper we study the performance of the ABC-SubSim algorithm for Bayesian updating of the parameters of dynamical systems using a general hierarchical state-space model. We note that the ABC methodology gives an approximate posterior distribution that actually corresponds to an exact posterior where a uniformly distributed combined measurement and modeling error is added. We also note that ABC algorithms have a problem with learning the uncertain error variances in a stochastic state-space model and so we treat them as nuisance parameters and analytically integrate them out of the posterior distribution. In addition, the statistical efficiency of the original ABC-SubSim algorithm is improved by developing a novel strategy to regulate the proposal variance for the component-wise Metropolis algorithm at each level. We demonstrate that Self-regulated ABC-SubSim is well suited for Bayesian system identification by first applying it successfully to model updating of a two degree-of-freedom linear structure for three cases: globally, locally and un-identifiable model classes, and then to model updating of a two degree-of-freedom nonlinear structure with Duffing nonlinearities in its interstory force-deflection relationship.
With or without you: predictive coding and Bayesian inference in the brain
Aitchison, Laurence; Lengyel, Máté
2018-01-01
Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic / representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly. PMID:28942084
Hepatitis disease detection using Bayesian theory
NASA Astrophysics Data System (ADS)
Maseleno, Andino; Hidayati, Rohmah Zahroh
2017-02-01
This paper presents hepatitis disease diagnosis using a Bayesian theory for better understanding of the theory. In this research, we used a Bayesian theory for detecting hepatitis disease and displaying the result of diagnosis process. Bayesian algorithm theory is rediscovered and perfected by Laplace, the basic idea is using of the known prior probability and conditional probability density parameter, based on Bayes theorem to calculate the corresponding posterior probability, and then obtained the posterior probability to infer and make decisions. Bayesian methods combine existing knowledge, prior probabilities, with additional knowledge derived from new data, the likelihood function. The initial symptoms of hepatitis which include malaise, fever and headache. The probability of hepatitis given the presence of malaise, fever, and headache. The result revealed that a Bayesian theory has successfully identified the existence of hepatitis disease.
Language Evolution by Iterated Learning with Bayesian Agents
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Kalish, Michael L.
2007-01-01
Languages are transmitted from person to person and generation to generation via a process of iterated learning: people learn a language from other people who once learned that language themselves. We analyze the consequences of iterated learning for learning algorithms based on the principles of Bayesian inference, assuming that learners compute…
Monte Carlo Algorithms for a Bayesian Analysis of the Cosmic Microwave Background
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey B.; Eriksen, H. K.; ODwyer, I. J.; Wandelt, B. D.; Gorski, K.; Knox, L.; Chu, M.
2006-01-01
A viewgraph presentation on the review of Bayesian approach to Cosmic Microwave Background (CMB) analysis, numerical implementation with Gibbs sampling, a summary of application to WMAP I and work in progress with generalizations to polarization, foregrounds, asymmetric beams, and 1/f noise is given.
Model-based Bayesian signal extraction algorithm for peripheral nerves
NASA Astrophysics Data System (ADS)
Eggers, Thomas E.; Dweiri, Yazan M.; McCallum, Grant A.; Durand, Dominique M.
2017-10-01
Objective. Multi-channel cuff electrodes have recently been investigated for extracting fascicular-level motor commands from mixed neural recordings. Such signals could provide volitional, intuitive control over a robotic prosthesis for amputee patients. Recent work has demonstrated success in extracting these signals in acute and chronic preparations using spatial filtering techniques. These extracted signals, however, had low signal-to-noise ratios and thus limited their utility to binary classification. In this work a new algorithm is proposed which combines previous source localization approaches to create a model based method which operates in real time. Approach. To validate this algorithm, a saline benchtop setup was created to allow the precise placement of artificial sources within a cuff and interference sources outside the cuff. The artificial source was taken from five seconds of chronic neural activity to replicate realistic recordings. The proposed algorithm, hybrid Bayesian signal extraction (HBSE), is then compared to previous algorithms, beamforming and a Bayesian spatial filtering method, on this test data. An example chronic neural recording is also analyzed with all three algorithms. Main results. The proposed algorithm improved the signal to noise and signal to interference ratio of extracted test signals two to three fold, as well as increased the correlation coefficient between the original and recovered signals by 10-20%. These improvements translated to the chronic recording example and increased the calculated bit rate between the recovered signals and the recorded motor activity. Significance. HBSE significantly outperforms previous algorithms in extracting realistic neural signals, even in the presence of external noise sources. These results demonstrate the feasibility of extracting dynamic motor signals from a multi-fascicled intact nerve trunk, which in turn could extract motor command signals from an amputee for the end goal of controlling a prosthetic limb.
ERIC Educational Resources Information Center
Martin-Fernandez, Manuel; Revuelta, Javier
2017-01-01
This study compares the performance of two estimation algorithms of new usage, the Metropolis-Hastings Robins-Monro (MHRM) and the Hamiltonian MCMC (HMC), with two consolidated algorithms in the psychometric literature, the marginal likelihood via EM algorithm (MML-EM) and the Markov chain Monte Carlo (MCMC), in the estimation of multidimensional…
Spatial cluster detection using dynamic programming.
Sverchkov, Yuriy; Jiang, Xia; Cooper, Gregory F
2012-03-25
The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in the question of whether a cluster exists in the data, and if it exists, we are interested in finding the most accurate characterization of the cluster. We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used for both Bayesian maximum a-posteriori (MAP) estimation of the most likely spatial distribution of clusters and Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of Influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data. When compared to baseline methods, tests indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared our method to was found to be more sensitive to smaller outbreaks, while as the size of the outbreaks increases, in terms of area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on-par with baseline methods in the task of Bayesian model averaging. We conclude that the dynamic programming algorithm performs on-par with other available methods for spatial cluster detection and point to its low computational cost and extendability as advantages in favor of further research and use of the algorithm.
Spatial cluster detection using dynamic programming
2012-01-01
Background The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in the question of whether a cluster exists in the data, and if it exists, we are interested in finding the most accurate characterization of the cluster. Methods We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used for both Bayesian maximum a-posteriori (MAP) estimation of the most likely spatial distribution of clusters and Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of Influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data. Results When compared to baseline methods, tests indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared our method to was found to be more sensitive to smaller outbreaks, while as the size of the outbreaks increases, in terms of area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on-par with baseline methods in the task of Bayesian model averaging. Conclusions We conclude that the dynamic programming algorithm performs on-par with other available methods for spatial cluster detection and point to its low computational cost and extendability as advantages in favor of further research and use of the algorithm. PMID:22443103
Sarigiannis, Dimosthenis A; Karakitsios, Spyros P; Gotti, Alberto; Papaloukas, Costas L; Kassomenos, Pavlos A; Pilidis, Georgios A
2009-01-01
The objective of the current study was the development of a reliable modeling platform to calculate in real time the personal exposure and the associated health risk for filling station employees evaluating current environmental parameters (traffic, meteorological and amount of fuel traded) determined by the appropriate sensor network. A set of Artificial Neural Networks (ANNs) was developed to predict benzene exposure pattern for the filling station employees. Furthermore, a Physiology Based Pharmaco-Kinetic (PBPK) risk assessment model was developed in order to calculate the lifetime probability distribution of leukemia to the employees, fed by data obtained by the ANN model. Bayesian algorithm was involved in crucial points of both model sub compartments. The application was evaluated in two filling stations (one urban and one rural). Among several algorithms available for the development of the ANN exposure model, Bayesian regularization provided the best results and seemed to be a promising technique for prediction of the exposure pattern of that occupational population group. On assessing the estimated leukemia risk under the scope of providing a distribution curve based on the exposure levels and the different susceptibility of the population, the Bayesian algorithm was a prerequisite of the Monte Carlo approach, which is integrated in the PBPK-based risk model. In conclusion, the modeling system described herein is capable of exploiting the information collected by the environmental sensors in order to estimate in real time the personal exposure and the resulting health risk for employees of gasoline filling stations.
Sarigiannis, Dimosthenis A.; Karakitsios, Spyros P.; Gotti, Alberto; Papaloukas, Costas L.; Kassomenos, Pavlos A.; Pilidis, Georgios A.
2009-01-01
The objective of the current study was the development of a reliable modeling platform to calculate in real time the personal exposure and the associated health risk for filling station employees evaluating current environmental parameters (traffic, meteorological and amount of fuel traded) determined by the appropriate sensor network. A set of Artificial Neural Networks (ANNs) was developed to predict benzene exposure pattern for the filling station employees. Furthermore, a Physiology Based Pharmaco-Kinetic (PBPK) risk assessment model was developed in order to calculate the lifetime probability distribution of leukemia to the employees, fed by data obtained by the ANN model. Bayesian algorithm was involved in crucial points of both model sub compartments. The application was evaluated in two filling stations (one urban and one rural). Among several algorithms available for the development of the ANN exposure model, Bayesian regularization provided the best results and seemed to be a promising technique for prediction of the exposure pattern of that occupational population group. On assessing the estimated leukemia risk under the scope of providing a distribution curve based on the exposure levels and the different susceptibility of the population, the Bayesian algorithm was a prerequisite of the Monte Carlo approach, which is integrated in the PBPK-based risk model. In conclusion, the modeling system described herein is capable of exploiting the information collected by the environmental sensors in order to estimate in real time the personal exposure and the resulting health risk for employees of gasoline filling stations. PMID:22399936
Bayesian Peptide Peak Detection for High Resolution TOF Mass Spectrometry.
Zhang, Jianqiu; Zhou, Xiaobo; Wang, Honghui; Suffredini, Anthony; Zhang, Lin; Huang, Yufei; Wong, Stephen
2010-11-01
In this paper, we address the issue of peptide ion peak detection for high resolution time-of-flight (TOF) mass spectrometry (MS) data. A novel Bayesian peptide ion peak detection method is proposed for TOF data with resolution of 10 000-15 000 full width at half-maximum (FWHW). MS spectra exhibit distinct characteristics at this resolution, which are captured in a novel parametric model. Based on the proposed parametric model, a Bayesian peak detection algorithm based on Markov chain Monte Carlo (MCMC) sampling is developed. The proposed algorithm is tested on both simulated and real datasets. The results show a significant improvement in detection performance over a commonly employed method. The results also agree with expert's visual inspection. Moreover, better detection consistency is achieved across MS datasets from patients with identical pathological condition.
Bayesian Peptide Peak Detection for High Resolution TOF Mass Spectrometry
Zhang, Jianqiu; Zhou, Xiaobo; Wang, Honghui; Suffredini, Anthony; Zhang, Lin; Huang, Yufei; Wong, Stephen
2011-01-01
In this paper, we address the issue of peptide ion peak detection for high resolution time-of-flight (TOF) mass spectrometry (MS) data. A novel Bayesian peptide ion peak detection method is proposed for TOF data with resolution of 10 000–15 000 full width at half-maximum (FWHW). MS spectra exhibit distinct characteristics at this resolution, which are captured in a novel parametric model. Based on the proposed parametric model, a Bayesian peak detection algorithm based on Markov chain Monte Carlo (MCMC) sampling is developed. The proposed algorithm is tested on both simulated and real datasets. The results show a significant improvement in detection performance over a commonly employed method. The results also agree with expert’s visual inspection. Moreover, better detection consistency is achieved across MS datasets from patients with identical pathological condition. PMID:21544266
Missing value imputation: with application to handwriting data
NASA Astrophysics Data System (ADS)
Xu, Zhen; Srihari, Sargur N.
2015-01-01
Missing values make pattern analysis difficult, particularly with limited available data. In longitudinal research, missing values accumulate, thereby aggravating the problem. Here we consider how to deal with temporal data with missing values in handwriting analysis. In the task of studying development of individuality of handwriting, we encountered the fact that feature values are missing for several individuals at several time instances. Six algorithms, i.e., random imputation, mean imputation, most likely independent value imputation, and three methods based on Bayesian network (static Bayesian network, parameter EM, and structural EM), are compared with children's handwriting data. We evaluate the accuracy and robustness of the algorithms under different ratios of missing data and missing values, and useful conclusions are given. Specifically, static Bayesian network is used for our data which contain around 5% missing data to provide adequate accuracy and low computational cost.
Modelling maximum river flow by using Bayesian Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Cheong, R. Y.; Gabda, D.
2017-09-01
Analysis of flood trends is vital since flooding threatens human living in terms of financial, environment and security. The data of annual maximum river flows in Sabah were fitted into generalized extreme value (GEV) distribution. Maximum likelihood estimator (MLE) raised naturally when working with GEV distribution. However, previous researches showed that MLE provide unstable results especially in small sample size. In this study, we used different Bayesian Markov Chain Monte Carlo (MCMC) based on Metropolis-Hastings algorithm to estimate GEV parameters. Bayesian MCMC method is a statistical inference which studies the parameter estimation by using posterior distribution based on Bayes’ theorem. Metropolis-Hastings algorithm is used to overcome the high dimensional state space faced in Monte Carlo method. This approach also considers more uncertainty in parameter estimation which then presents a better prediction on maximum river flow in Sabah.
Experimental Bayesian Quantum Phase Estimation on a Silicon Photonic Chip.
Paesani, S; Gentile, A A; Santagati, R; Wang, J; Wiebe, N; Tew, D P; O'Brien, J L; Thompson, M G
2017-03-10
Quantum phase estimation is a fundamental subroutine in many quantum algorithms, including Shor's factorization algorithm and quantum simulation. However, so far results have cast doubt on its practicability for near-term, nonfault tolerant, quantum devices. Here we report experimental results demonstrating that this intuition need not be true. We implement a recently proposed adaptive Bayesian approach to quantum phase estimation and use it to simulate molecular energies on a silicon quantum photonic device. The approach is verified to be well suited for prethreshold quantum processors by investigating its superior robustness to noise and decoherence compared to the iterative phase estimation algorithm. This shows a promising route to unlock the power of quantum phase estimation much sooner than previously believed.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
NASA Astrophysics Data System (ADS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran.
Modeling and Bayesian parameter estimation for shape memory alloy bending actuators
NASA Astrophysics Data System (ADS)
Crews, John H.; Smith, Ralph C.
2012-04-01
In this paper, we employ a homogenized energy model (HEM) for shape memory alloy (SMA) bending actuators. Additionally, we utilize a Bayesian method for quantifying parameter uncertainty. The system consists of a SMA wire attached to a flexible beam. As the actuator is heated, the beam bends, providing endoscopic motion. The model parameters are fit to experimental data using an ordinary least-squares approach. The uncertainty in the fit model parameters is then quantified using Markov Chain Monte Carlo (MCMC) methods. The MCMC algorithm provides bounds on the parameters, which will ultimately be used in robust control algorithms. One purpose of the paper is to test the feasibility of the Random Walk Metropolis algorithm, the MCMC method used here.
NASA Astrophysics Data System (ADS)
Kopka, Piotr; Wawrzynczak, Anna; Borysiewicz, Mieczyslaw
2016-11-01
In this paper the Bayesian methodology, known as Approximate Bayesian Computation (ABC), is applied to the problem of the atmospheric contamination source identification. The algorithm input data are on-line arriving concentrations of the released substance registered by the distributed sensors network. This paper presents the Sequential ABC algorithm in detail and tests its efficiency in estimation of probabilistic distributions of atmospheric release parameters of a mobile contamination source. The developed algorithms are tested using the data from Over-Land Atmospheric Diffusion (OLAD) field tracer experiment. The paper demonstrates estimation of seven parameters characterizing the contamination source, i.e.: contamination source starting position (x,y), the direction of the motion of the source (d), its velocity (v), release rate (q), start time of release (ts) and its duration (td). The online-arriving new concentrations dynamically update the probability distributions of search parameters. The atmospheric dispersion Second-order Closure Integrated PUFF (SCIPUFF) Model is used as the forward model to predict the concentrations at the sensors locations.
Collaborative autonomous sensing with Bayesians in the loop
NASA Astrophysics Data System (ADS)
Ahmed, Nisar
2016-10-01
There is a strong push to develop intelligent unmanned autonomy that complements human reasoning for applications as diverse as wilderness search and rescue, military surveillance, and robotic space exploration. More than just replacing humans for `dull, dirty and dangerous' work, autonomous agents are expected to cope with a whole host of uncertainties while working closely together with humans in new situations. The robotics revolution firmly established the primacy of Bayesian algorithms for tackling challenging perception, learning and decision-making problems. Since the next frontier of autonomy demands the ability to gather information across stretches of time and space that are beyond the reach of a single autonomous agent, the next generation of Bayesian algorithms must capitalize on opportunities to draw upon the sensing and perception abilities of humans-in/on-the-loop. This work summarizes our recent research toward harnessing `human sensors' for information gathering tasks. The basic idea behind is to allow human end users (i.e. non-experts in robotics, statistics, machine learning, etc.) to directly `talk to' the information fusion engine and perceptual processes aboard any autonomous agent. Our approach is grounded in rigorous Bayesian modeling and fusion of flexible semantic information derived from user-friendly interfaces, such as natural language chat and locative hand-drawn sketches. This naturally enables `plug and play' human sensing with existing probabilistic algorithms for planning and perception, and has been successfully demonstrated with human-robot teams in target localization applications.
On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang
2015-02-01
The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of linear model of Coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesianmore » inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and multiphase flow computer experiment in a full scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.« less
NASA Astrophysics Data System (ADS)
Volkov, D.
2017-12-01
We introduce an algorithm for the simultaneous reconstruction of faults and slip fields on those faults. We define a regularized functional to be minimized for the reconstruction. We prove that the minimum of that functional converges to the unique solution of the related fault inverse problem. Due to inherent uncertainties in measurements, rather than seeking a deterministic solution to the fault inverse problem, we consider a Bayesian approach. The advantage of such an approach is that we obtain a way of quantifying uncertainties as part of our final answer. On the downside, this Bayesian approach leads to a very large computation. To contend with the size of this computation we developed an algorithm for the numerical solution to the stochastic minimization problem which can be easily implemented on a parallel multi-core platform and we discuss techniques to save on computational time. After showing how this algorithm performs on simulated data and assessing the effect of noise, we apply it to measured data. The data was recorded during a slow slip event in Guerrero, Mexico.
Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm.
Bai, Li-Yue; Dai, Hao; Xu, Qin; Junaid, Muhammad; Peng, Shao-Liang; Zhu, Xiaolei; Xiong, Yi; Wei, Dong-Qing
2018-02-05
Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs in the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathway in which the target protein of a drug was involved in, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.
Nonparametric Bayesian Modeling for Automated Database Schema Matching
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferragut, Erik M; Laska, Jason A
2015-01-01
The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.
Youngs, Noah; Penfold-Brown, Duncan; Drew, Kevin; Shasha, Dennis; Bonneau, Richard
2013-05-01
Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction. We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested. Code and Data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html
Decision-making without a brain: how an amoeboid organism solves the two-armed bandit.
Reid, Chris R; MacDonald, Hannelore; Mann, Richard P; Marshall, James A R; Latty, Tanya; Garnier, Simon
2016-06-01
Several recent studies hint at shared patterns in decision-making between taxonomically distant organisms, yet few studies demonstrate and dissect mechanisms of decision-making in simpler organisms. We examine decision-making in the unicellular slime mould Physarum polycephalum using a classical decision problem adapted from human and animal decision-making studies: the two-armed bandit problem. This problem has previously only been used to study organisms with brains, yet here we demonstrate that a brainless unicellular organism compares the relative qualities of multiple options, integrates over repeated samplings to perform well in random environments, and combines information on reward frequency and magnitude in order to make correct and adaptive decisions. We extend our inquiry by using Bayesian model selection to determine the most likely algorithm used by the cell when making decisions. We deduce that this algorithm centres around a tendency to exploit environments in proportion to their reward experienced through past sampling. The algorithm is intermediate in computational complexity between simple, reactionary heuristics and calculation-intensive optimal performance algorithms, yet it has very good relative performance. Our study provides insight into ancestral mechanisms of decision-making and suggests that fundamental principles of decision-making, information processing and even cognition are shared among diverse biological systems. © 2016 The Authors.
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
NASA Technical Reports Server (NTRS)
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-ecient approach for damage determination that quanti es uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modi cations are made to the traditional Bayesian inference approach to provide signi cant computational speedup. First, an ecient surrogate model is constructed using sparse grid interpolation to replace a costly nite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modi ed using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more eciently. Numerical examples demonstrate that the proposed framework e ectively provides damage estimates with uncertainty quanti cation and can yield orders of magnitude speedup over standard Bayesian approaches.
NASA Technical Reports Server (NTRS)
Olson, William S.; Kummerow, Christian D.; Yang, Song; Petty, Grant W.; Tao, Wei-Kuo; Bell, Thomas L.; Braun, Scott A.; Wang, Yansen; Lang, Stephen E.; Johnson, Daniel E.
2004-01-01
A revised Bayesian algorithm for estimating surface rain rate, convective rain proportion, and latent heating/drying profiles from satellite-borne passive microwave radiometer observations over ocean backgrounds is described. The algorithm searches a large database of cloud-radiative model simulations to find cloud profiles that are radiatively consistent with a given set of microwave radiance measurements. The properties of these radiatively consistent profiles are then composited to obtain best estimates of the observed properties. The revised algorithm is supported by an expanded and more physically consistent database of cloud-radiative model simulations. The algorithm also features a better quantification of the convective and non-convective contributions to total rainfall, a new geographic database, and an improved representation of background radiances in rain-free regions. Bias and random error estimates are derived from applications of the algorithm to synthetic radiance data, based upon a subset of cloud resolving model simulations, and from the Bayesian formulation itself. Synthetic rain rate and latent heating estimates exhibit a trend of high (low) bias for low (high) retrieved values. The Bayesian estimates of random error are propagated to represent errors at coarser time and space resolutions, based upon applications of the algorithm to TRMM Microwave Imager (TMI) data. Errors in instantaneous rain rate estimates at 0.5 deg resolution range from approximately 50% at 1 mm/h to 20% at 14 mm/h. These errors represent about 70-90% of the mean random deviation between collocated passive microwave and spaceborne radar rain rate estimates. The cumulative algorithm error in TMI estimates at monthly, 2.5 deg resolution is relatively small (less than 6% at 5 mm/day) compared to the random error due to infrequent satellite temporal sampling (8-35% at the same rain rate).
Fan, Yue; Wang, Xiao; Peng, Qinke
2017-01-01
Gene regulatory networks (GRNs) play an important role in cellular systems and are important for understanding biological processes. Many algorithms have been developed to infer the GRNs. However, most algorithms only pay attention to the gene expression data but do not consider the topology information in their inference process, while incorporating this information can partially compensate for the lack of reliable expression data. Here we develop a Bayesian group lasso with spike and slab priors to perform gene selection and estimation for nonparametric models. B-spline basis functions are used to capture the nonlinear relationships flexibly and penalties are used to avoid overfitting. Further, we incorporate the topology information into the Bayesian method as a prior. We present the application of our method on DREAM3 and DREAM4 datasets and two real biological datasets. The results show that our method performs better than existing methods and the topology information prior can improve the result.
A probabilistic, distributed, recursive mechanism for decision-making in the brain
Gurney, Kevin N.
2018-01-01
Decision formation recruits many brain regions, but the procedure they jointly execute is unknown. Here we characterize its essential composition, using as a framework a novel recursive Bayesian algorithm that makes decisions based on spike-trains with the statistics of those in sensory cortex (MT). Using it to simulate the random-dot-motion task, we demonstrate it quantitatively replicates the choice behaviour of monkeys, whilst predicting losses of otherwise usable information from MT. Its architecture maps to the recurrent cortico-basal-ganglia-thalamo-cortical loops, whose components are all implicated in decision-making. We show that the dynamics of its mapped computations match those of neural activity in the sensorimotor cortex and striatum during decisions, and forecast those of basal ganglia output and thalamus. This also predicts which aspects of neural dynamics are and are not part of inference. Our single-equation algorithm is probabilistic, distributed, recursive, and parallel. Its success at capturing anatomy, behaviour, and electrophysiology suggests that the mechanism implemented by the brain has these same characteristics. PMID:29614077
Bayesian reconstruction of projection reconstruction NMR (PR-NMR).
Yoon, Ji Won
2014-11-01
Projection reconstruction nuclear magnetic resonance (PR-NMR) is a technique for generating multidimensional NMR spectra. A small number of projections from lower-dimensional NMR spectra are used to reconstruct the multidimensional NMR spectra. In our previous work, it was shown that multidimensional NMR spectra are efficiently reconstructed using peak-by-peak based reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. We propose an extended and generalized RJMCMC algorithm replacing a simple linear model with a linear mixed model to reconstruct close NMR spectra into true spectra. This statistical method generates samples in a Bayesian scheme. Our proposed algorithm is tested on a set of six projections derived from the three-dimensional 700 MHz HNCO spectrum of a protein HasA. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Khawaja, Taimoor Saleem
A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-01-01
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs. PMID:29113310
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-10-06
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe
2018-02-15
We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
Bayesian parameter estimation for nonlinear modelling of biological pathways.
Ghasemi, Omid; Lindsey, Merry L; Yang, Tianyi; Nguyen, Nguyen; Huang, Yufei; Jin, Yu-Fang
2011-01-01
The availability of temporal measurements on biological experiments has significantly promoted research areas in systems biology. To gain insight into the interaction and regulation of biological systems, mathematical frameworks such as ordinary differential equations have been widely applied to model biological pathways and interpret the temporal data. Hill equations are the preferred formats to represent the reaction rate in differential equation frameworks, due to their simple structures and their capabilities for easy fitting to saturated experimental measurements. However, Hill equations are highly nonlinearly parameterized functions, and parameters in these functions cannot be measured easily. Additionally, because of its high nonlinearity, adaptive parameter estimation algorithms developed for linear parameterized differential equations cannot be applied. Therefore, parameter estimation in nonlinearly parameterized differential equation models for biological pathways is both challenging and rewarding. In this study, we propose a Bayesian parameter estimation algorithm to estimate parameters in nonlinear mathematical models for biological pathways using time series data. We used the Runge-Kutta method to transform differential equations to difference equations assuming a known structure of the differential equations. This transformation allowed us to generate predictions dependent on previous states and to apply a Bayesian approach, namely, the Markov chain Monte Carlo (MCMC) method. We applied this approach to the biological pathways involved in the left ventricle (LV) response to myocardial infarction (MI) and verified our algorithm by estimating two parameters in a Hill equation embedded in the nonlinear model. We further evaluated our estimation performance with different parameter settings and signal to noise ratios. Our results demonstrated the effectiveness of the algorithm for both linearly and nonlinearly parameterized dynamic systems. Our proposed Bayesian algorithm successfully estimated parameters in nonlinear mathematical models for biological pathways. This method can be further extended to high order systems and thus provides a useful tool to analyze biological dynamics and extract information using temporal data.
Cawley, Gavin C; Talbot, Nicola L C
2006-10-01
Gene selection algorithms for cancer classification, based on the expression of a small number of biomarker genes, have been the subject of considerable research in recent years. Shevade and Keerthi propose a gene selection algorithm based on sparse logistic regression (SLogReg) incorporating a Laplace prior to promote sparsity in the model parameters, and provide a simple but efficient training procedure. The degree of sparsity obtained is determined by the value of a regularization parameter, which must be carefully tuned in order to optimize performance. This normally involves a model selection stage, based on a computationally intensive search for the minimizer of the cross-validation error. In this paper, we demonstrate that a simple Bayesian approach can be taken to eliminate this regularization parameter entirely, by integrating it out analytically using an uninformative Jeffrey's prior. The improved algorithm (BLogReg) is then typically two or three orders of magnitude faster than the original algorithm, as there is no longer a need for a model selection step. The BLogReg algorithm is also free from selection bias in performance estimation, a common pitfall in the application of machine learning algorithms in cancer classification. The SLogReg, BLogReg and Relevance Vector Machine (RVM) gene selection algorithms are evaluated over the well-studied colon cancer and leukaemia benchmark datasets. The leave-one-out estimates of the probability of test error and cross-entropy of the BLogReg and SLogReg algorithms are very similar, however the BlogReg algorithm is found to be considerably faster than the original SLogReg algorithm. Using nested cross-validation to avoid selection bias, performance estimation for SLogReg on the leukaemia dataset takes almost 48 h, whereas the corresponding result for BLogReg is obtained in only 1 min 24 s, making BLogReg by far the more practical algorithm. BLogReg also demonstrates better estimates of conditional probability than the RVM, which are of great importance in medical applications, with similar computational expense. A MATLAB implementation of the sparse logistic regression algorithm with Bayesian regularization (BLogReg) is available from http://theoval.cmp.uea.ac.uk/~gcc/cbl/blogreg/
ERIC Educational Resources Information Center
Kim, Seohyun; Lu, Zhenqiu; Cohen, Allan S.
2018-01-01
Bayesian algorithms have been used successfully in the social and behavioral sciences to analyze dichotomous data particularly with complex structural equation models. In this study, we investigate the use of the Polya-Gamma data augmentation method with Gibbs sampling to improve estimation of structural equation models with dichotomous variables.…
ERIC Educational Resources Information Center
Galbraith, Craig S.; Merrill, Gregory B.; Kline, Doug M.
2012-01-01
In this study we investigate the underlying relational structure between student evaluations of teaching effectiveness (SETEs) and achievement of student learning outcomes in 116 business related courses. Utilizing traditional statistical techniques, a neural network analysis and a Bayesian data reduction and classification algorithm, we find…
Bonello, Nicolas; Sampson, James; Burn, John; Wilson, Ian J; McGrown, Gail; Margison, Geoff P; Thorncroft, Mary; Crossbie, Philip; Povey, Andrew C; Santibanez-Koref, Mauro; Walters, Kevin
2013-11-07
We exploit model-based Bayesian inference methodologies to analyse lung tumour-derived methylation data from a CpG island in the O6-methylguanine-DNA methyltransferase (MGMT) promoter. Interest is in modelling the changes in methylation patterns in a CpG island in the first exon of the promoter during lung tumour development. We propose four competils of methylation state propagation based on two mechanisms. The first is the location-dependence mechanism in which the probability of a gain or loss of methylation at a CpG within the promoter depends upon its location in the CpG sequence. The second mechanism is that of neighbour-dependence in which gain or loss of methylation at a CpG depends upon the methylation status of the immediately preceding CpG. Our data comprises the methylation status at 12 CpGs near the 5' end of the CpG island in two lung tumour samples for both alleles of a nearby polymorphism. We use approximate Bayesian computation, a computationally intensive rejection-sampling algorithm to infer model parameters and compare models without the need to evaluate the likelihood function. We compare the four proposed models using two criteria: the approximate Bayes factors and the distribution of the Euclidean distance between the summary statistics of the observed and simulated datasets. Our model-based analysis demonstrates compelling evidence for both location and neighbour dependence in the process of aberrant DNA methylation of this MGMT promoter CpG island in lung tumours. We find equivocal evidence to support the hypothesis that the methylation patterns of the two alleles evolve independently. © 2013 Published by Elsevier Ltd. All rights reserved.
Bayesian modeling of flexible cognitive control
Jiang, Jiefeng; Heller, Katherine; Egner, Tobias
2014-01-01
“Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218
Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng
2014-01-01
A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping.
Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng
2014-01-01
A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping. PMID:25254227
Mean Field Variational Bayesian Data Assimilation
NASA Astrophysics Data System (ADS)
Vrettas, M.; Cornford, D.; Opper, M.
2012-04-01
Current data assimilation schemes propose a range of approximate solutions to the classical data assimilation problem, particularly state estimation. Broadly there are three main active research areas: ensemble Kalman filter methods which rely on statistical linearization of the model evolution equations, particle filters which provide a discrete point representation of the posterior filtering or smoothing distribution and 4DVAR methods which seek the most likely posterior smoothing solution. In this paper we present a recent extension to our variational Bayesian algorithm which seeks the most probably posterior distribution over the states, within the family of non-stationary Gaussian processes. Our original work on variational Bayesian approaches to data assimilation sought the best approximating time varying Gaussian process to the posterior smoothing distribution for stochastic dynamical systems. This approach was based on minimising the Kullback-Leibler divergence between the true posterior over paths, and our Gaussian process approximation. So long as the observation density was sufficiently high to bring the posterior smoothing density close to Gaussian the algorithm proved very effective, on lower dimensional systems. However for higher dimensional systems, the algorithm was computationally very demanding. We have been developing a mean field version of the algorithm which treats the state variables at a given time as being independent in the posterior approximation, but still accounts for their relationships between each other in the mean solution arising from the original dynamical system. In this work we present the new mean field variational Bayesian approach, illustrating its performance on a range of classical data assimilation problems. We discuss the potential and limitations of the new approach. We emphasise that the variational Bayesian approach we adopt, in contrast to other variational approaches, provides a bound on the marginal likelihood of the observations given parameters in the model which also allows inference of parameters such as observation errors, and parameters in the model and model error representation, particularly if this is written as a deterministic form with small additive noise. We stress that our approach can address very long time window and weak constraint settings. However like traditional variational approaches our Bayesian variational method has the benefit of being posed as an optimisation problem. We finish with a sketch of the future directions for our approach.
Primal-dual techniques for online algorithms and mechanisms
NASA Astrophysics Data System (ADS)
Liaghat, Vahid
An offline algorithm is one that knows the entire input in advance. An online algorithm, however, processes its input in a serial fashion. In contrast to offline algorithms, an online algorithm works in a local fashion and has to make irrevocable decisions without having the entire input. Online algorithms are often not optimal since their irrevocable decisions may turn out to be inefficient after receiving the rest of the input. For a given online problem, the goal is to design algorithms which are competitive against the offline optimal solutions. In a classical offline scenario, it is often common to see a dual analysis of problems that can be formulated as a linear or convex program. Primal-dual and dual-fitting techniques have been successfully applied to many such problems. Unfortunately, the usual tricks come short in an online setting since an online algorithm should make decisions without knowing even the whole program. In this thesis, we study the competitive analysis of fundamental problems in the literature such as different variants of online matching and online Steiner connectivity, via online dual techniques. Although there are many generic tools for solving an optimization problem in the offline paradigm, in comparison, much less is known for tackling online problems. The main focus of this work is to design generic techniques for solving integral linear optimization problems where the solution space is restricted via a set of linear constraints. A general family of these problems are online packing/covering problems. Our work shows that for several seemingly unrelated problems, primal-dual techniques can be successfully applied as a unifying approach for analyzing these problems. We believe this leads to generic algorithmic frameworks for solving online problems. In the first part of the thesis, we show the effectiveness of our techniques in the stochastic settings and their applications in Bayesian mechanism design. In particular, we introduce new techniques for solving a fundamental linear optimization problem, namely, the stochastic generalized assignment problem (GAP). This packing problem generalizes various problems such as online matching, ad allocation, bin packing, etc. We furthermore show applications of such results in the mechanism design by introducing Prophet Secretary, a novel Bayesian model for online auctions. In the second part of the thesis, we focus on the covering problems. We develop the framework of "Disk Painting" for a general class of network design problems that can be characterized by proper functions. This class generalizes the node-weighted and edge-weighted variants of several well-known Steiner connectivity problems. We furthermore design a generic technique for solving the prize-collecting variants of these problems when there exists a dual analysis for the non-prize-collecting counterparts. Hence, we solve the online prize-collecting variants of several network design problems for the first time. Finally we focus on designing techniques for online problems with mixed packing/covering constraints. We initiate the study of degree-bounded graph optimization problems in the online setting by designing an online algorithm with a tight competitive ratio for the degree-bounded Steiner forest problem. We hope these techniques establishes a starting point for the analysis of the important class of online degree-bounded optimization on graphs.
NASA Astrophysics Data System (ADS)
Bukhari, W.; Hong, S.-M.
2015-01-01
Motion-adaptive radiotherapy aims to deliver a conformal dose to the target tumour with minimal normal tissue exposure by compensating for tumour motion in real time. The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting and gating respiratory motion that utilizes a model-based and a model-free Bayesian framework by combining them in a cascade structure. The algorithm, named EKF-GPR+, implements a gating function without pre-specifying a particular region of the patient’s breathing cycle. The algorithm first employs an extended Kalman filter (LCM-EKF) to predict the respiratory motion and then uses a model-free Gaussian process regression (GPR) to correct the error of the LCM-EKF prediction. The GPR is a non-parametric Bayesian algorithm that yields predictive variance under Gaussian assumptions. The EKF-GPR+ algorithm utilizes the predictive variance from the GPR component to capture the uncertainty in the LCM-EKF prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification allows us to pause the treatment beam over such instances. EKF-GPR+ implements the gating function by using simple calculations based on the predictive variance with no additional detection mechanism. A sparse approximation of the GPR algorithm is employed to realize EKF-GPR+ in real time. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPR+. The experimental results show that the EKF-GPR+ algorithm effectively reduces the prediction error in a root-mean-square (RMS) sense by employing the gating function, albeit at the cost of a reduced duty cycle. As an example, EKF-GPR+ reduces the patient-wise RMS error to 37%, 39% and 42% in percent ratios relative to no prediction for a duty cycle of 80% at lookahead lengths of 192 ms, 384 ms and 576 ms, respectively. The experiments also confirm that EKF-GPR+ controls the duty cycle with reasonable accuracy.
Mishra, Arabinda; Anderson, Adam W; Wu, Xi; Gore, John C; Ding, Zhaohua
2010-08-01
The purpose of this work is to design a neuronal fiber tracking algorithm, which will be more suitable for reconstruction of fibers associated with functionally important regions in the human brain. The functional activations in the brain normally occur in the gray matter regions. Hence the fibers bordering these regions are weakly myelinated, resulting in poor performance of conventional tractography methods to trace the fiber links between them. A lower fractional anisotropy in this region makes it even difficult to track the fibers in the presence of noise. In this work, the authors focused on a stochastic approach to reconstruct these fiber pathways based on a Bayesian regularization framework. To estimate the true fiber direction (propagation vector), the a priori and conditional probability density functions are calculated in advance and are modeled as multivariate normal. The variance of the estimated tensor element vector is associated with the uncertainty due to noise and partial volume averaging (PVA). An adaptive and multiple sampling of the estimated tensor element vector, which is a function of the pre-estimated variance, overcomes the effect of noise and PVA in this work. The algorithm has been rigorously tested using a variety of synthetic data sets. The quantitative comparison of the results to standard algorithms motivated the authors to implement it for in vivo DTI data analysis. The algorithm has been implemented to delineate fibers in two major language pathways (Broca's to SMA and Broca's to Wernicke's) across 12 healthy subjects. Though the mean of standard deviation was marginally bigger than conventional (Euler's) approach [P. J. Basser et al., "In vivo fiber tractography using DT-MRI data," Magn. Reson. Med. 44(4), 625-632 (2000)], the number of extracted fibers in this approach was significantly higher. The authors also compared the performance of the proposed method to Lu's method [Y. Lu et al., "Improved fiber tractography with Bayesian tensor regularization," Neuroimage 31(3), 1061-1074 (2006)] and Friman's stochastic approach [O. Friman et al., "A Bayesian approach for stochastic white matter tractography," IEEE Trans. Med. Imaging 25(8), 965-978 (2006)]. Overall performance of the approach is found to be superior to above two methods, particularly when the signal-to-noise ratio was low. The authors observed that an adaptive sampling of the tensor element vectors, estimated as a function of the variance in a Bayesian framework, can effectively delineate neuronal fibers to analyze the structure-function relationship in human brain. The simulated and in vivo results are in good agreement with the theoretical aspects of the algorithm.
A Bayesian analysis of HAT-P-7b using the EXONEST algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Placek, Ben; Knuth, Kevin H.
2015-01-13
The study of exoplanets (planets orbiting other stars) is revolutionizing the way we view our universe. High-precision photometric data provided by the Kepler Space Telescope (Kepler) enables not only the detection of such planets, but also their characterization. This presents a unique opportunity to apply Bayesian methods to better characterize the multitude of previously confirmed exoplanets. This paper focuses on applying the EXONEST algorithm to characterize the transiting short-period-hot-Jupiter, HAT-P-7b (also referred to as Kepler-2b). EXONEST evaluates a suite of exoplanet photometric models by applying Bayesian Model Selection, which is implemented with the MultiNest algorithm. These models take into accountmore » planetary effects, such as reflected light and thermal emissions, as well as the effect of the planetary motion on the host star, such as Doppler beaming, or boosting, of light from the reflex motion of the host star, and photometric variations due to the planet-induced ellipsoidal shape of the host star. By calculating model evidences, one can determine which model best describes the observed data, thus identifying which effects dominate the planetary system. Presented are parameter estimates and model evidences for HAT-P-7b.« less
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1973-01-01
An algorithm and computer program are presented for generating all the distinct 2(p-q) fractional factorial designs. Some applications of this algorithm to the construction of tables of designs and of designs for nonstandard situations and its use in Bayesian design are discussed. An appendix includes a discussion of an actual experiment whose design was facilitated by the algorithm.
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
Archambeau, Cédric; Verleysen, Michel
2007-01-01
A new variational Bayesian learning algorithm for Student-t mixture models is introduced. This algorithm leads to (i) robust density estimation, (ii) robust clustering and (iii) robust automatic model selection. Gaussian mixture models are learning machines which are based on a divide-and-conquer approach. They are commonly used for density estimation and clustering tasks, but are sensitive to outliers. The Student-t distribution has heavier tails than the Gaussian distribution and is therefore less sensitive to any departure of the empirical distribution from Gaussianity. As a consequence, the Student-t distribution is suitable for constructing robust mixture models. In this work, we formalize the Bayesian Student-t mixture model as a latent variable model in a different way from Svensén and Bishop [Svensén, M., & Bishop, C. M. (2005). Robust Bayesian mixture modelling. Neurocomputing, 64, 235-252]. The main difference resides in the fact that it is not necessary to assume a factorized approximation of the posterior distribution on the latent indicator variables and the latent scale variables in order to obtain a tractable solution. Not neglecting the correlations between these unobserved random variables leads to a Bayesian model having an increased robustness. Furthermore, it is expected that the lower bound on the log-evidence is tighter. Based on this bound, the model complexity, i.e. the number of components in the mixture, can be inferred with a higher confidence.
Perception as Evidence Accumulation and Bayesian Inference: Insights from Masked Priming
ERIC Educational Resources Information Center
Norris, Dennis; Kinoshita, Sachiko
2008-01-01
The authors argue that perception is Bayesian inference based on accumulation of noisy evidence and that, in masked priming, the perceptual system is tricked into treating the prime and the target as a single object. Of the 2 algorithms considered for formalizing how the evidence sampled from a prime and target is combined, only 1 was shown to be…
ERIC Educational Resources Information Center
Tsutakawa, Robert K.; Lin, Hsin Ying
Item response curves for a set of binary responses are studied from a Bayesian viewpoint of estimating the item parameters. For the two-parameter logistic model with normally distributed ability, restricted bivariate beta priors are used to illustrate the computation of the posterior mode via the EM algorithm. The procedure is illustrated by data…
Precise Network Modeling of Systems Genetics Data Using the Bayesian Network Webserver.
Ziebarth, Jesse D; Cui, Yan
2017-01-01
The Bayesian Network Webserver (BNW, http://compbio.uthsc.edu/BNW ) is an integrated platform for Bayesian network modeling of biological datasets. It provides a web-based network modeling environment that seamlessly integrates advanced algorithms for probabilistic causal modeling and reasoning with Bayesian networks. BNW is designed for precise modeling of relatively small networks that contain less than 20 nodes. The structure learning algorithms used by BNW guarantee the discovery of the best (most probable) network structure given the data. To facilitate network modeling across multiple biological levels, BNW provides a very flexible interface that allows users to assign network nodes into different tiers and define the relationships between and within the tiers. This function is particularly useful for modeling systems genetics datasets that often consist of multiscalar heterogeneous genotype-to-phenotype data. BNW enables users to, within seconds or minutes, go from having a simply formatted input file containing a dataset to using a network model to make predictions about the interactions between variables and the potential effects of experimental interventions. In this chapter, we will introduce the functions of BNW and show how to model systems genetics datasets with BNW.
NASA Astrophysics Data System (ADS)
Kopka, P.; Wawrzynczak, A.; Borysiewicz, M.
2015-09-01
In many areas of application, a central problem is a solution to the inverse problem, especially estimation of the unknown model parameters to model the underlying dynamics of a physical system precisely. In this situation, the Bayesian inference is a powerful tool to combine observed data with prior knowledge to gain the probability distribution of searched parameters. We have applied the modern methodology named Sequential Approximate Bayesian Computation (S-ABC) to the problem of tracing the atmospheric contaminant source. The ABC is technique commonly used in the Bayesian analysis of complex models and dynamic system. Sequential methods can significantly increase the efficiency of the ABC. In the presented algorithm, the input data are the on-line arriving concentrations of released substance registered by distributed sensor network from OVER-LAND ATMOSPHERIC DISPERSION (OLAD) experiment. The algorithm output are the probability distributions of a contamination source parameters i.e. its particular location, release rate, speed and direction of the movement, start time and duration. The stochastic approach presented in this paper is completely general and can be used in other fields where the parameters of the model bet fitted to the observable data should be found.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
An experimental phylogeny to benchmark ancestral sequence reconstruction
Randall, Ryan N.; Radford, Caelan E.; Roof, Kelsey A.; Natarajan, Divya K.; Gaucher, Eric A.
2016-01-01
Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as ‘modern' sequences that we subject to ASR analyses using various algorithms and to benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had minor effect on the inference of ancestral sequences. PMID:27628687
Automatic inference of multicellular regulatory networks using informative priors.
Sun, Xiaoyun; Hong, Pengyu
2009-01-01
To fully understand the mechanisms governing animal development, computational models and algorithms are needed to enable quantitative studies of the underlying regulatory networks. We developed a mathematical model based on dynamic Bayesian networks to model multicellular regulatory networks that govern cell differentiation processes. A machine-learning method was developed to automatically infer such a model from heterogeneous data. We show that the model inference procedure can be greatly improved by incorporating interaction data across species. The proposed approach was applied to C. elegans vulval induction to reconstruct a model capable of simulating C. elegans vulval induction under 73 different genetic conditions.
Bayesian Correlation Analysis for Sequence Count Data
Lau, Nelson; Perkins, Theodore J.
2016-01-01
Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities’ measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low—especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities’ signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset. PMID:27701449
Online Variational Bayesian Filtering-Based Mobile Target Tracking in Wireless Sensor Networks
Zhou, Bingpeng; Chen, Qingchun; Li, Tiffany Jing; Xiao, Pei
2014-01-01
The received signal strength (RSS)-based online tracking for a mobile node in wireless sensor networks (WSNs) is investigated in this paper. Firstly, a multi-layer dynamic Bayesian network (MDBN) is introduced to characterize the target mobility with either directional or undirected movement. In particular, it is proposed to employ the Wishart distribution to approximate the time-varying RSS measurement precision's randomness due to the target movement. It is shown that the proposed MDBN offers a more general analysis model via incorporating the underlying statistical information of both the target movement and observations, which can be utilized to improve the online tracking capability by exploiting the Bayesian statistics. Secondly, based on the MDBN model, a mean-field variational Bayesian filtering (VBF) algorithm is developed to realize the online tracking of a mobile target in the presence of nonlinear observations and time-varying RSS precision, wherein the traditional Bayesian filtering scheme cannot be directly employed. Thirdly, a joint optimization between the real-time velocity and its prior expectation is proposed to enable online velocity tracking in the proposed online tacking scheme. Finally, the associated Bayesian Cramer–Rao Lower Bound (BCRLB) analysis and numerical simulations are conducted. Our analysis unveils that, by exploiting the potential state information via the general MDBN model, the proposed VBF algorithm provides a promising solution to the online tracking of a mobile node in WSNs. In addition, it is shown that the final tracking accuracy linearly scales with its expectation when the RSS measurement precision is time-varying. PMID:25393784
Detection of multiple damages employing best achievable eigenvectors under Bayesian inference
NASA Astrophysics Data System (ADS)
Prajapat, Kanta; Ray-Chaudhuri, Samit
2018-05-01
A novel approach is presented in this work to localize simultaneously multiple damaged elements in a structure along with the estimation of damage severity for each of the damaged elements. For detection of damaged elements, a best achievable eigenvector based formulation has been derived. To deal with noisy data, Bayesian inference is employed in the formulation wherein the likelihood of the Bayesian algorithm is formed on the basis of errors between the best achievable eigenvectors and the measured modes. In this approach, the most probable damage locations are evaluated under Bayesian inference by generating combinations of various possible damaged elements. Once damage locations are identified, damage severities are estimated using a Bayesian inference Markov chain Monte Carlo simulation. The efficiency of the proposed approach has been demonstrated by carrying out a numerical study involving a 12-story shear building. It has been found from this study that damage scenarios involving as low as 10% loss of stiffness in multiple elements are accurately determined (localized and severities quantified) even when 2% noise contaminated modal data are utilized. Further, this study introduces a term parameter impact (evaluated based on sensitivity of modal parameters towards structural parameters) to decide the suitability of selecting a particular mode, if some idea about the damaged elements are available. It has been demonstrated here that the accuracy and efficiency of the Bayesian quantification algorithm increases if damage localization is carried out a-priori. An experimental study involving a laboratory scale shear building and different stiffness modification scenarios shows that the proposed approach is efficient enough to localize the stories with stiffness modification.
Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler.
Jin, Ick Hoon; Yuan, Ying; Liang, Faming
2013-10-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the intractable normalizing constant and model degeneracy. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the intractable normalizing constant and model degeneracy issues encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stinnett, Jacob; Sullivan, Clair J.; Xiong, Hao
Low-resolution isotope identifiers are widely deployed for nuclear security purposes, but these detectors currently demonstrate problems in making correct identifications in many typical usage scenarios. While there are many hardware alternatives and improvements that can be made, performance on existing low resolution isotope identifiers should be able to be improved by developing new identification algorithms. We have developed a wavelet-based peak extraction algorithm and an implementation of a Bayesian classifier for automated peak-based identification. The peak extraction algorithm has been extended to compute uncertainties in the peak area calculations. To build empirical joint probability distributions of the peak areas andmore » uncertainties, a large set of spectra were simulated in MCNP6 and processed with the wavelet-based feature extraction algorithm. Kernel density estimation was then used to create a new component of the likelihood function in the Bayesian classifier. Furthermore, identification performance is demonstrated on a variety of real low-resolution spectra, including Category I quantities of special nuclear material.« less
Robust Bayesian Algorithm for Targeted Compound Screening in Forensic Toxicology.
Woldegebriel, Michael; Gonsalves, John; van Asten, Arian; Vivó-Truyols, Gabriel
2016-02-16
As part of forensic toxicological investigation of cases involving unexpected death of an individual, targeted or untargeted xenobiotic screening of post-mortem samples is normally conducted. To this end, liquid chromatography (LC) coupled to high-resolution mass spectrometry (MS) is typically employed. For data analysis, almost all commonly applied algorithms are threshold-based (frequentist). These algorithms examine the value of a certain measurement (e.g., peak height) to decide whether a certain xenobiotic of interest (XOI) is present/absent, yielding a binary output. Frequentist methods pose a problem when several sources of information [e.g., shape of the chromatographic peak, isotopic distribution, estimated mass-to-charge ratio (m/z), adduct, etc.] need to be combined, requiring the approach to make arbitrary decisions at substep levels of data analysis. We hereby introduce a novel Bayesian probabilistic algorithm for toxicological screening. The method tackles the problem with a different strategy. It is not aimed at reaching a final conclusion regarding the presence of the XOI, but it estimates its probability. The algorithm effectively and efficiently combines all possible pieces of evidence from the chromatogram and calculates the posterior probability of the presence/absence of XOI features. This way, the model can accommodate more information by updating the probability if extra evidence is acquired. The final probabilistic result assists the end user to make a final decision with respect to the presence/absence of the xenobiotic. The Bayesian method was validated and found to perform better (in terms of false positives and false negatives) than the vendor-supplied software package.
Recursive algorithms for phylogenetic tree counting.
Gavryushkina, Alexandra; Welch, David; Drummond, Alexei J
2013-10-28
In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult. We describe an algorithm that is polynomial in the number of sampled individuals for counting of resolutions of a constraint tree assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals. These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
Engelhardt, Benjamin; Kschischo, Maik; Fröhlich, Holger
2017-06-01
Ordinary differential equations (ODEs) are a popular approach to quantitatively model molecular networks based on biological knowledge. However, such knowledge is typically restricted. Wrongly modelled biological mechanisms as well as relevant external influence factors that are not included into the model are likely to manifest in major discrepancies between model predictions and experimental data. Finding the exact reasons for such observed discrepancies can be quite challenging in practice. In order to address this issue, we suggest a Bayesian approach to estimate hidden influences in ODE-based models. The method can distinguish between exogenous and endogenous hidden influences. Thus, we can detect wrongly specified as well as missed molecular interactions in the model. We demonstrate the performance of our Bayesian dynamic elastic-net with several ordinary differential equation models from the literature, such as human JAK-STAT signalling, information processing at the erythropoietin receptor, isomerization of liquid α -Pinene, G protein cycling in yeast and UV-B triggered signalling in plants. Moreover, we investigate a set of commonly known network motifs and a gene-regulatory network. Altogether our method supports the modeller in an algorithmic manner to identify possible sources of errors in ODE-based models on the basis of experimental data. © 2017 The Author(s).
NASA Astrophysics Data System (ADS)
Nguyen, Emmanuel; Antoni, Jerome; Grondin, Olivier
2009-12-01
In the automotive industry, the necessary reduction of pollutant emission for new Diesel engines requires the control of combustion events. This control is efficient provided combustion parameters such as combustion occurrence and combustion energy are relevant. Combustion parameters are traditionally measured from cylinder pressure sensors. However this kind of sensor is expensive and has a limited lifetime. Thus this paper proposes to use only one cylinder pressure on a multi-cylinder engine and to extract combustion parameters from the other cylinders with low cost knock sensors. Knock sensors measure the vibration circulating on the engine block, hence they do not all contain the information on the combustion processes, but they are also contaminated by other mechanical noises that disorder the signal. The question is how to combine the information coming from one cylinder pressure and knock sensors to obtain the most relevant combustion parameters in all engine cylinders. In this paper, the issue is addressed trough the Bayesian inference formalism. In that cylinder where a cylinder pressure sensor is mounted, combustion parameters will be measured directly. In the other cylinders, they will be measured indirectly from Bayesian inference. Experimental results obtained on a four cylinder Diesel engine demonstrate the effectiveness of the proposed algorithm toward that purpose.
Testing Bayesian and heuristic predictions of mass judgments of colliding objects
Sanborn, Adam N.
2014-01-01
Mass judgments of colliding objects have been used to explore people's understanding of the physical world because they are ecologically relevant, yet people display biases that are most easily explained by a small set of heuristics. Recent work has challenged the heuristic explanation, by producing the same biases from a model that copes with perceptual uncertainty by using Bayesian inference with a prior based on the correct combination rules from Newtonian mechanics (noisy Newton). Here I test the predictions of the leading heuristic model (Gilden and Proffitt, 1989) against the noisy Newton model using a novel manipulation of the standard mass judgment task: making one of the objects invisible post-collision. The noisy Newton model uses the remaining information to predict above-chance performance, while the leading heuristic model predicts chance performance when one or the other final velocity is occluded. An experiment using two different types of occlusion showed better-than-chance performance and response patterns that followed the predictions of the noisy Newton model. The results demonstrate that people can make sensible physical judgments even when information critical for the judgment is missing, and that a Bayesian model can serve as a guide in these situations. Possible algorithmic-level accounts of this task that more closely correspond to the noisy Newton model are explored. PMID:25206345
Gaussian process tomography for soft x-ray spectroscopy at WEST without equilibrium information
NASA Astrophysics Data System (ADS)
Wang, T.; Mazon, D.; Svensson, J.; Li, D.; Jardin, A.; Verdoolaege, G.
2018-06-01
Gaussian process tomography (GPT) is a recently developed tomography method based on the Bayesian probability theory [J. Svensson, JET Internal Report EFDA-JET-PR(11)24, 2011 and Li et al., Rev. Sci. Instrum. 84, 083506 (2013)]. By modeling the soft X-ray (SXR) emissivity field in a poloidal cross section as a Gaussian process, the Bayesian SXR tomography can be carried out in a robust and extremely fast way. Owing to the short execution time of the algorithm, GPT is an important candidate for providing real-time reconstructions with a view to impurity transport and fast magnetohydrodynamic control. In addition, the Bayesian formalism allows quantifying uncertainty on the inferred parameters. In this paper, the GPT technique is validated using a synthetic data set expected from the WEST tokamak, and the results are shown of its application to the reconstruction of SXR emissivity profiles measured on Tore Supra. The method is compared with the standard algorithm based on minimization of the Fisher information.
pyblocxs: Bayesian Low-Counts X-ray Spectral Analysis in Sherpa
NASA Astrophysics Data System (ADS)
Siemiginowska, A.; Kashyap, V.; Refsdal, B.; van Dyk, D.; Connors, A.; Park, T.
2011-07-01
Typical X-ray spectra have low counts and should be modeled using the Poisson distribution. However, χ2 statistic is often applied as an alternative and the data are assumed to follow the Gaussian distribution. A variety of weights to the statistic or a binning of the data is performed to overcome the low counts issues. However, such modifications introduce biases or/and a loss of information. Standard modeling packages such as XSPEC and Sherpa provide the Poisson likelihood and allow computation of rudimentary MCMC chains, but so far do not allow for setting a full Bayesian model. We have implemented a sophisticated Bayesian MCMC-based algorithm to carry out spectral fitting of low counts sources in the Sherpa environment. The code is a Python extension to Sherpa and allows to fit a predefined Sherpa model to high-energy X-ray spectral data and other generic data. We present the algorithm and discuss several issues related to the implementation, including flexible definition of priors and allowing for variations in the calibration information.
A Bayesian Approach for Sensor Optimisation in Impact Identification
Mallardo, Vincenzo; Sharif Khodaei, Zahra; Aliabadi, Ferri M. H.
2016-01-01
This paper presents a Bayesian approach for optimizing the position of sensors aimed at impact identification in composite structures under operational conditions. The uncertainty in the sensor data has been represented by statistical distributions of the recorded signals. An optimisation strategy based on the genetic algorithm is proposed to find the best sensor combination aimed at locating impacts on composite structures. A Bayesian-based objective function is adopted in the optimisation procedure as an indicator of the performance of meta-models developed for different sensor combinations to locate various impact events. To represent a real structure under operational load and to increase the reliability of the Structural Health Monitoring (SHM) system, the probability of malfunctioning sensors is included in the optimisation. The reliability and the robustness of the procedure is tested with experimental and numerical examples. Finally, the proposed optimisation algorithm is applied to a composite stiffened panel for both the uniform and non-uniform probability of impact occurrence. PMID:28774064
Bayesian image reconstruction for improving detection performance of muon tomography.
Wang, Guobao; Schultz, Larry J; Qi, Jinyi
2009-05-01
Muon tomography is a novel technology that is being developed for detecting high-Z materials in vehicles or cargo containers. Maximum likelihood methods have been developed for reconstructing the scattering density image from muon measurements. However, the instability of maximum likelihood estimation often results in noisy images and low detectability of high-Z targets. In this paper, we propose using regularization to improve the image quality of muon tomography. We formulate the muon reconstruction problem in a Bayesian framework by introducing a prior distribution on scattering density images. An iterative shrinkage algorithm is derived to maximize the log posterior distribution. At each iteration, the algorithm obtains the maximum a posteriori update by shrinking an unregularized maximum likelihood update. Inverse quadratic shrinkage functions are derived for generalized Laplacian priors and inverse cubic shrinkage functions are derived for generalized Gaussian priors. Receiver operating characteristic studies using simulated data demonstrate that the Bayesian reconstruction can greatly improve the detection performance of muon tomography.
NASA Astrophysics Data System (ADS)
Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.
2015-03-01
We present Π4U, an extensible framework, for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models, that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.
Bayesian decoding using unsorted spikes in the rat hippocampus
Layton, Stuart P.; Chen, Zhe; Wilson, Matthew A.
2013-01-01
A fundamental task in neuroscience is to understand how neural ensembles represent information. Population decoding is a useful tool to extract information from neuronal populations based on the ensemble spiking activity. We propose a novel Bayesian decoding paradigm to decode unsorted spikes in the rat hippocampus. Our approach uses a direct mapping between spike waveform features and covariates of interest and avoids accumulation of spike sorting errors. Our decoding paradigm is nonparametric, encoding model-free for representing stimuli, and extracts information from all available spikes and their waveform features. We apply the proposed Bayesian decoding algorithm to a position reconstruction task for freely behaving rats based on tetrode recordings of rat hippocampal neuronal activity. Our detailed decoding analyses demonstrate that our approach is efficient and better utilizes the available information in the nonsortable hash than the standard sorting-based decoding algorithm. Our approach can be adapted to an online encoding/decoding framework for applications that require real-time decoding, such as brain-machine interfaces. PMID:24089403
Bayesian Networks for Modeling Dredging Decisions
2011-10-01
change scenarios. Arctic Expert elicitation Netica Bacon et al . 2002 Identify factors that might lead to a change in land use from farming to...tree) algorithms developed by Lauritzen and Spiegelhalter (1988) and Jensen et al . (1990). Statistical inference is simply the process of...causality when constructing a Bayesian network (Kjaerulff and Madsen 2008, Darwiche 2009, Marcot et al . 2006). A knowledge representation approach is the
BaTMAn: Bayesian Technique for Multi-image Analysis
NASA Astrophysics Data System (ADS)
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.
Informed Source Separation: A Bayesian Tutorial
NASA Technical Reports Server (NTRS)
Knuth, Kevin H.
2005-01-01
Source separation problems are ubiquitous in the physical sciences; any situation where signals are superimposed calls for source separation to estimate the original signals. In h s tutorial I will discuss the Bayesian approach to the source separation problem. This approach has a specific advantage in that it requires the designer to explicitly describe the signal model in addition to any other information or assumptions that go into the problem description. This leads naturally to the idea of informed source separation, where the algorithm design incorporates relevant information about the specific problem. This approach promises to enable researchers to design their own high-quality algorithms that are specifically tailored to the problem at hand.
Markov Chain Monte Carlo Methods for Bayesian Data Analysis in Astronomy
NASA Astrophysics Data System (ADS)
Sharma, Sanjib
2017-08-01
Markov Chain Monte Carlo based Bayesian data analysis has now become the method of choice for analyzing and interpreting data in almost all disciplines of science. In astronomy, over the last decade, we have also seen a steady increase in the number of papers that employ Monte Carlo based Bayesian analysis. New, efficient Monte Carlo based methods are continuously being developed and explored. In this review, we first explain the basics of Bayesian theory and discuss how to set up data analysis problems within this framework. Next, we provide an overview of various Monte Carlo based methods for performing Bayesian data analysis. Finally, we discuss advanced ideas that enable us to tackle complex problems and thus hold great promise for the future. We also distribute downloadable computer software (available at https://github.com/sanjibs/bmcmc/ ) that implements some of the algorithms and examples discussed here.
Unsupervised Unmixing of Hyperspectral Images Accounting for Endmember Variability.
Halimi, Abderrahim; Dobigeon, Nicolas; Tourneret, Jean-Yves
2015-12-01
This paper presents an unsupervised Bayesian algorithm for hyperspectral image unmixing, accounting for endmember variability. The pixels are modeled by a linear combination of endmembers weighted by their corresponding abundances. However, the endmembers are assumed random to consider their variability in the image. An additive noise is also considered in the proposed model, generalizing the normal compositional model. The proposed algorithm exploits the whole image to benefit from both spectral and spatial information. It estimates both the mean and the covariance matrix of each endmember in the image. This allows the behavior of each material to be analyzed and its variability to be quantified in the scene. A spatial segmentation is also obtained based on the estimated abundances. In order to estimate the parameters associated with the proposed Bayesian model, we propose to use a Hamiltonian Monte Carlo algorithm. The performance of the resulting unmixing strategy is evaluated through simulations conducted on both synthetic and real data.
A Bayesian Nonparametric Approach to Image Super-Resolution.
Polatkan, Gungor; Zhou, Mingyuan; Carin, Lawrence; Blei, David; Daubechies, Ingrid
2015-02-01
Super-resolution methods form high-resolution images from low-resolution images. In this paper, we develop a new Bayesian nonparametric model for super-resolution. Our method uses a beta-Bernoulli process to learn a set of recurring visual patterns, called dictionary elements, from the data. Because it is nonparametric, the number of elements found is also determined from the data. We test the results on both benchmark and natural images, comparing with several other models from the research literature. We perform large-scale human evaluation experiments to assess the visual quality of the results. In a first implementation, we use Gibbs sampling to approximate the posterior. However, this algorithm is not feasible for large-scale data. To circumvent this, we then develop an online variational Bayes (VB) algorithm. This algorithm finds high quality dictionaries in a fraction of the time needed by the Gibbs sampler.
Tan, Bingyao; Wong, Alexander; Bizheva, Kostadinka
2018-01-01
A novel image processing algorithm based on a modified Bayesian residual transform (MBRT) was developed for the enhancement of morphological and vascular features in optical coherence tomography (OCT) and OCT angiography (OCTA) images. The MBRT algorithm decomposes the original OCT image into multiple residual images, where each image presents information at a unique scale. Scale selective residual adaptation is used subsequently to enhance morphological features of interest, such as blood vessels and tissue layers, and to suppress irrelevant image features such as noise and motion artefacts. The performance of the proposed MBRT algorithm was tested on a series of cross-sectional and enface OCT and OCTA images of retina and brain tissue that were acquired in-vivo. Results show that the MBRT reduces speckle noise and motion-related imaging artefacts locally, thus improving significantly the contrast and visibility of morphological features in the OCT and OCTA images. PMID:29760996
Bayesian analyses of time-interval data for environmental radiation monitoring.
Luo, Peng; Sharp, Julia L; DeVol, Timothy A
2013-01-01
Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and a conventional frequentist analysis of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performances of the three methods were compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode. Simulated data were obtained using Monte Carlo techniques to obtain a random sampling of the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a similar detection probability as Bayesian analysis of count information, but the authors were able to make a decision with fewer pulses at relatively higher radiation levels. In addition, for the cases with very short presence of the source (< count time), time-interval information is more sensitive to detect a change than count information since the source data is averaged by the background data over the entire count time. The relationships of the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
NASA Astrophysics Data System (ADS)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic systemmore » leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.« less
NASA Astrophysics Data System (ADS)
An, M.; Assumpcao, M.
2003-12-01
The joint inversion of receiver function and surface wave is an effective way to diminish the influences of the strong tradeoff among parameters and the different sensitivity to the model parameters in their respective inversions, but the inversion problem becomes more complex. Multi-objective problems can be much more complicated than single-objective inversion in the model selection and optimization. If objectives are involved and conflicting, models can be ordered only partially. In this case, Pareto-optimal preference should be used to select solutions. On the other hand, the inversion to get only a few optimal solutions can not deal properly with the strong tradeoff between parameters, the uncertainties in the observation, the geophysical complexities and even the incompetency of the inversion technique. The effective way is to retrieve the geophysical information statistically from many acceptable solutions, which requires more competent global algorithms. Competent genetic algorithms recently proposed are far superior to the conventional genetic algorithm and can solve hard problems quickly, reliably and accurately. In this work we used one of competent genetic algorithms, Bayesian Optimization Algorithm as the main inverse procedure. This algorithm uses Bayesian networks to draw out inherited information and can use Pareto-optimal preference in the inversion. With this algorithm, the lithospheric structure of Paran"› basin is inverted to fit both the observations of inter-station surface wave dispersion and receiver function.
A comparison of machine learning and Bayesian modelling for molecular serotyping.
Newton, Richard; Wernisch, Lorenz
2017-08-11
Streptococcus pneumoniae is a human pathogen that is a major cause of infant mortality. Identifying the pneumococcal serotype is an important step in monitoring the impact of vaccines used to protect against disease. Genomic microarrays provide an effective method for molecular serotyping. Previously we developed an empirical Bayesian model for the classification of serotypes from a molecular serotyping array. With only few samples available, a model driven approach was the only option. In the meanwhile, several thousand samples have been made available to us, providing an opportunity to investigate serotype classification by machine learning methods, which could complement the Bayesian model. We compare the performance of the original Bayesian model with two machine learning algorithms: Gradient Boosting Machines and Random Forests. We present our results as an example of a generic strategy whereby a preliminary probabilistic model is complemented or replaced by a machine learning classifier once enough data are available. Despite the availability of thousands of serotyping arrays, a problem encountered when applying machine learning methods is the lack of training data containing mixtures of serotypes; due to the large number of possible combinations. Most of the available training data comprises samples with only a single serotype. To overcome the lack of training data we implemented an iterative analysis, creating artificial training data of serotype mixtures by combining raw data from single serotype arrays. With the enhanced training set the machine learning algorithms out perform the original Bayesian model. However, for serotypes currently lacking sufficient training data the best performing implementation was a combination of the results of the Bayesian Model and the Gradient Boosting Machine. As well as being an effective method for classifying biological data, machine learning can also be used as an efficient method for revealing subtle biological insights, which we illustrate with an example.
Efficient Mean Field Variational Algorithm for Data Assimilation (Invited)
NASA Astrophysics Data System (ADS)
Vrettas, M. D.; Cornford, D.; Opper, M.
2013-12-01
Data assimilation algorithms combine available observations of physical systems with the assumed model dynamics in a systematic manner, to produce better estimates of initial conditions for prediction. Broadly they can be categorized in three main approaches: (a) sequential algorithms, (b) sampling methods and (c) variational algorithms which transform the density estimation problem to an optimization problem. However, given finite computational resources, only a handful of ensemble Kalman filters and 4DVar algorithms have been applied operationally to very high dimensional geophysical applications, such as weather forecasting. In this paper we present a recent extension to our variational Bayesian algorithm which seeks the ';optimal' posterior distribution over the continuous time states, within a family of non-stationary Gaussian processes. Our initial work on variational Bayesian approaches to data assimilation, unlike the well-known 4DVar method which seeks only the most probable solution, computes the best time varying Gaussian process approximation to the posterior smoothing distribution for dynamical systems that can be represented by stochastic differential equations. This approach was based on minimising the Kullback-Leibler divergence, over paths, between the true posterior and our Gaussian process approximation. Whilst the observations were informative enough to keep the posterior smoothing density close to Gaussian the algorithm proved very effective on low dimensional systems (e.g. O(10)D). However for higher dimensional systems, the high computational demands make the algorithm prohibitively expensive. To overcome the difficulties presented in the original framework and make our approach more efficient in higher dimensional systems we have been developing a new mean field version of the algorithm which treats the state variables at any given time as being independent in the posterior approximation, while still accounting for their relationships in the mean solution arising from the original system dynamics. Here we present this new mean field approach, illustrating its performance on a range of benchmark data assimilation problems whose dimensionality varies from O(10) to O(10^3)D. We emphasise that the variational Bayesian approach we adopt, unlike other variational approaches, provides a natural bound on the marginal likelihood of the observations given the model parameters which also allows for inference of (hyper-) parameters such as observational errors, parameters in the dynamical model and model error representation. We also stress that since our approach is intrinsically parallel it can be implemented very efficiently to address very long data assimilation time windows. Moreover, like most traditional variational approaches our Bayesian variational method has the benefit of being posed as an optimisation problem therefore its complexity can be tuned to the available computational resources. We finish with a sketch of possible future directions.
Scheduling multimedia services in cloud computing environment
NASA Astrophysics Data System (ADS)
Liu, Yunchang; Li, Chunlin; Luo, Youlong; Shao, Yanling; Zhang, Jing
2018-02-01
Currently, security is a critical factor for multimedia services running in the cloud computing environment. As an effective mechanism, trust can improve security level and mitigate attacks within cloud computing environments. Unfortunately, existing scheduling strategy for multimedia service in the cloud computing environment do not integrate trust mechanism when making scheduling decisions. In this paper, we propose a scheduling scheme for multimedia services in multi clouds. At first, a novel scheduling architecture is presented. Then, We build a trust model including both subjective trust and objective trust to evaluate the trust degree of multimedia service providers. By employing Bayesian theory, the subjective trust degree between multimedia service providers and users is obtained. According to the attributes of QoS, the objective trust degree of multimedia service providers is calculated. Finally, a scheduling algorithm integrating trust of entities is proposed by considering the deadline, cost and trust requirements of multimedia services. The scheduling algorithm heuristically hunts for reasonable resource allocations and satisfies the requirement of trust and meets deadlines for the multimedia services. Detailed simulated experiments demonstrate the effectiveness and feasibility of the proposed trust scheduling scheme.
Hu, Ying S; Zhu, Quan; Elkins, Keri; Tse, Kevin; Li, Yu; Fitzpatrick, James A J; Verma, Inder M; Cang, Hu
2013-01-01
Heterochromatin in the nucleus of human embryonic cells plays an important role in the epigenetic regulation of gene expression. The architecture of heterochromatin and its dynamic organization remain elusive because of the lack of fast and high-resolution deep-cell imaging tools. We enable this task by advancing instrumental and algorithmic implementation of the localization-based super-resolution technique. We present light-sheet Bayesian super-resolution microscopy (LSBM). We adapt light-sheet illumination for super-resolution imaging by using a novel prism-coupled condenser design to illuminate a thin slice of the nucleus with high signal-to-noise ratio. Coupled with a Bayesian algorithm that resolves overlapping fluorophores from high-density areas, we show, for the first time, nanoscopic features of the heterochromatin structure in both fixed and live human embryonic stem cells. The enhanced temporal resolution allows capturing the dynamic change of heterochromatin with a lateral resolution of 50-60 nm on a time scale of 2.3 s. Light-sheet Bayesian microscopy opens up broad new possibilities of probing nanometer-scale nuclear structures and real-time sub-cellular processes and other previously difficult-to-access intracellular regions of living cells at the single-molecule, and single cell level.
Hu, Ying S; Zhu, Quan; Elkins, Keri; Tse, Kevin; Li, Yu; Fitzpatrick, James A J; Verma, Inder M; Cang, Hu
2016-01-01
Background Heterochromatin in the nucleus of human embryonic cells plays an important role in the epigenetic regulation of gene expression. The architecture of heterochromatin and its dynamic organization remain elusive because of the lack of fast and high-resolution deep-cell imaging tools. We enable this task by advancing instrumental and algorithmic implementation of the localization-based super-resolution technique. Results We present light-sheet Bayesian super-resolution microscopy (LSBM). We adapt light-sheet illumination for super-resolution imaging by using a novel prism-coupled condenser design to illuminate a thin slice of the nucleus with high signal-to-noise ratio. Coupled with a Bayesian algorithm that resolves overlapping fluorophores from high-density areas, we show, for the first time, nanoscopic features of the heterochromatin structure in both fixed and live human embryonic stem cells. The enhanced temporal resolution allows capturing the dynamic change of heterochromatin with a lateral resolution of 50–60 nm on a time scale of 2.3 s. Conclusion Light-sheet Bayesian microscopy opens up broad new possibilities of probing nanometer-scale nuclear structures and real-time sub-cellular processes and other previously difficult-to-access intracellular regions of living cells at the single-molecule, and single cell level. PMID:27795878
Iterative updating of model error for Bayesian inversion
NASA Astrophysics Data System (ADS)
Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew
2018-02-01
In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.
Finite element model updating using the shadow hybrid Monte Carlo technique
NASA Astrophysics Data System (ADS)
Boulkaibet, I.; Mthembu, L.; Marwala, T.; Friswell, M. I.; Adhikari, S.
2015-02-01
Recent research in the field of finite element model updating (FEM) advocates the adoption of Bayesian analysis techniques to dealing with the uncertainties associated with these models. However, Bayesian formulations require the evaluation of the Posterior Distribution Function which may not be available in analytical form. This is the case in FEM updating. In such cases sampling methods can provide good approximations of the Posterior distribution when implemented in the Bayesian context. Markov Chain Monte Carlo (MCMC) algorithms are the most popular sampling tools used to sample probability distributions. However, the efficiency of these algorithms is affected by the complexity of the systems (the size of the parameter space). The Hybrid Monte Carlo (HMC) offers a very important MCMC approach to dealing with higher-dimensional complex problems. The HMC uses the molecular dynamics (MD) steps as the global Monte Carlo (MC) moves to reach areas of high probability where the gradient of the log-density of the Posterior acts as a guide during the search process. However, the acceptance rate of HMC is sensitive to the system size as well as the time step used to evaluate the MD trajectory. To overcome this limitation we propose the use of the Shadow Hybrid Monte Carlo (SHMC) algorithm. The SHMC algorithm is a modified version of the Hybrid Monte Carlo (HMC) and designed to improve sampling for large-system sizes and time steps. This is done by sampling from a modified Hamiltonian function instead of the normal Hamiltonian function. In this paper, the efficiency and accuracy of the SHMC method is tested on the updating of two real structures; an unsymmetrical H-shaped beam structure and a GARTEUR SM-AG19 structure and is compared to the application of the HMC algorithm on the same structures.
al3c: high-performance software for parameter inference using Approximate Bayesian Computation.
Stram, Alexander H; Marjoram, Paul; Chen, Gary K
2015-11-01
The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) help address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. al3c is offered as a static binary for Linux and OS-X computing environments. The user completes an XML configuration file and C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. astram@usc.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Advanced obstacle avoidance for a laser based wheelchair using optimised Bayesian neural networks.
Trieu, Hoang T; Nguyen, Hung T; Willey, Keith
2008-01-01
In this paper we present an advanced method of obstacle avoidance for a laser based intelligent wheelchair using optimized Bayesian neural networks. Three neural networks are designed for three separate sub-tasks: passing through a door way, corridor and wall following and general obstacle avoidance. The accurate usable accessible space is determined by including the actual wheelchair dimensions in a real-time map used as inputs to each networks. Data acquisitions are performed separately to collect the patterns required for specified sub-tasks. Bayesian frame work is used to determine the optimal neural network structure in each case. Then these networks are trained under the supervision of Bayesian rule. Experiment results showed that compare to the VFH algorithm our neural networks navigated a smoother path following a near optimum trajectory.
NASA Astrophysics Data System (ADS)
Li, L.; Xu, C.-Y.; Engeland, K.
2012-04-01
With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models, which incorporates different sources of information into a single analysis through Bayesian theorem. However, none of these applications can well treat the uncertainty in extreme flows of hydrological models' simulations. This study proposes a Bayesian modularization method approach in uncertainty assessment of conceptual hydrological models by considering the extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by a new Bayesian modularization method approach and traditional Bayesian models using the Metropolis Hasting (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with traditional Bayesian: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from Bayesian modularization method are more accurate with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of the application and computational efficiency. The study thus introduces a new approach for reducing the extreme flow's effect on the discharge uncertainty assessment of hydrological models via Bayesian. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
Dang, Shilpa; Chaudhury, Santanu; Lall, Brejesh; Roy, Prasun Kumar
2017-06-15
Determination of effective connectivity (EC) among brain regions using fMRI is helpful in understanding the underlying neural mechanisms. Dynamic Bayesian Networks (DBNs) are an appropriate class of probabilistic graphical temporal-models that have been used in past to model EC from fMRI, specifically order-one. High-order DBNs (HO-DBNs) have still not been explored for fMRI data. A fundamental problem faced in the structure-learning of HO-DBN is high computational-burden and low accuracy by the existing heuristic search techniques used for EC detection from fMRI. In this paper, we propose using dynamic programming (DP) principle along with integration of properties of scoring-function in a way to reduce search space for structure-learning of HO-DBNs and finally, for identifying EC from fMRI which has not been done yet to the best of our knowledge. The proposed exact search-&-score learning approach HO-DBN-DP is an extension of the technique which was originally devised for learning a BN's structure from static data (Singh and Moore, 2005). The effectiveness in structure-learning is shown on synthetic fMRI dataset. The algorithm reaches globally-optimal solution in appreciably reduced time-complexity than the static counterpart due to integration of properties. The proof of optimality is provided. The results demonstrate that HO-DBN-DP is comparably more accurate and faster than currently used structure-learning algorithms used for identifying EC from fMRI. The real data EC from HO-DBN-DP shows consistency with previous literature than the classical Granger Causality method. Hence, the DP algorithm can be employed for reliable EC estimates from experimental fMRI data. Copyright © 2017 Elsevier B.V. All rights reserved.
An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor.
Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu
2017-08-05
The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergence navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The Traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K -Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
NASA Technical Reports Server (NTRS)
Mengshoel, Ole J.; Wilkins, David C.; Roth, Dan
2010-01-01
For hard computational problems, stochastic local search has proven to be a competitive approach to finding optimal or approximately optimal problem solutions. Two key research questions for stochastic local search algorithms are: Which algorithms are effective for initialization? When should the search process be restarted? In the present work we investigate these research questions in the context of approximate computation of most probable explanations (MPEs) in Bayesian networks (BNs). We introduce a novel approach, based on the Viterbi algorithm, to explanation initialization in BNs. While the Viterbi algorithm works on sequences and trees, our approach works on BNs with arbitrary topologies. We also give a novel formalization of stochastic local search, with focus on initialization and restart, using probability theory and mixture models. Experimentally, we apply our methods to the problem of MPE computation, using a stochastic local search algorithm known as Stochastic Greedy Search. By carefully optimizing both initialization and restart, we reduce the MPE search time for application BNs by several orders of magnitude compared to using uniform at random initialization without restart. On several BNs from applications, the performance of Stochastic Greedy Search is competitive with clique tree clustering, a state-of-the-art exact algorithm used for MPE computation in BNs.
Teaching Markov Chain Monte Carlo: Revealing the Basic Ideas behind the Algorithm
ERIC Educational Resources Information Center
Stewart, Wayne; Stewart, Sepideh
2014-01-01
For many scientists, researchers and students Markov chain Monte Carlo (MCMC) simulation is an important and necessary tool to perform Bayesian analyses. The simulation is often presented as a mathematical algorithm and then translated into an appropriate computer program. However, this can result in overlooking the fundamental and deeper…
Stinnett, Jacob; Sullivan, Clair J.; Xiong, Hao
2017-03-02
Low-resolution isotope identifiers are widely deployed for nuclear security purposes, but these detectors currently demonstrate problems in making correct identifications in many typical usage scenarios. While there are many hardware alternatives and improvements that can be made, performance on existing low resolution isotope identifiers should be able to be improved by developing new identification algorithms. We have developed a wavelet-based peak extraction algorithm and an implementation of a Bayesian classifier for automated peak-based identification. The peak extraction algorithm has been extended to compute uncertainties in the peak area calculations. To build empirical joint probability distributions of the peak areas andmore » uncertainties, a large set of spectra were simulated in MCNP6 and processed with the wavelet-based feature extraction algorithm. Kernel density estimation was then used to create a new component of the likelihood function in the Bayesian classifier. Furthermore, identification performance is demonstrated on a variety of real low-resolution spectra, including Category I quantities of special nuclear material.« less
Zeng, Xueqiang; Luo, Gang
2017-12-01
Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the bar to machine learning, miscellaneous automatic selection methods for algorithms and/or hyper-parameter values have been proposed. Existing automatic selection methods are inefficient on large data sets. This poses a challenge for using machine learning in the clinical big data era. To address the challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values. We report an implementation of the method. We show that compared to a state of the art automatic selection method, our method can significantly reduce search time, classification error rate, and standard deviation of error rate due to randomization. This is major progress towards enabling fast turnaround in identifying high-quality solutions required by many machine learning-based clinical data analysis tasks.
Uncertainty aggregation and reduction in structure-material performance prediction
NASA Astrophysics Data System (ADS)
Hu, Zhen; Mahadevan, Sankaran; Ao, Dan
2018-02-01
An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.
The image recognition based on neural network and Bayesian decision
NASA Astrophysics Data System (ADS)
Wang, Chugege
2018-04-01
The artificial neural network began in 1940, which is an important part of artificial intelligence. At present, it has become a hot topic in the fields of neuroscience, computer science, brain science, mathematics, and psychology. Thomas Bayes firstly reported the Bayesian theory in 1763. After the development in the twentieth century, it has been widespread in all areas of statistics. In recent years, due to the solution of the problem of high-dimensional integral calculation, Bayesian Statistics has been improved theoretically, which solved many problems that cannot be solved by classical statistics and is also applied to the interdisciplinary fields. In this paper, the related concepts and principles of the artificial neural network are introduced. It also summarizes the basic content and principle of Bayesian Statistics, and combines the artificial neural network technology and Bayesian decision theory and implement them in all aspects of image recognition, such as enhanced face detection method based on neural network and Bayesian decision, as well as the image classification based on the Bayesian decision. It can be seen that the combination of artificial intelligence and statistical algorithms has always been the hot research topic.
Bayesian ensemble refinement by replica simulations and reweighting.
Hummer, Gerhard; Köfinger, Jürgen
2015-12-28
We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
Bayesian ensemble refinement by replica simulations and reweighting
NASA Astrophysics Data System (ADS)
Hummer, Gerhard; Köfinger, Jürgen
2015-12-01
We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
NASA Astrophysics Data System (ADS)
Meillier, Céline; Chatelain, Florent; Michel, Olivier; Bacon, Roland; Piqueras, Laure; Bacher, Raphael; Ayasso, Hacheme
2016-04-01
We present SELFI, the Source Emission Line FInder, a new Bayesian method optimized for detection of faint galaxies in Multi Unit Spectroscopic Explorer (MUSE) deep fields. MUSE is the new panoramic integral field spectrograph at the Very Large Telescope (VLT) that has unique capabilities for spectroscopic investigation of the deep sky. It has provided data cubes with 324 million voxels over a single 1 arcmin2 field of view. To address the challenge of faint-galaxy detection in these large data cubes, we developed a new method that processes 3D data either for modeling or for estimation and extraction of source configurations. This object-based approach yields a natural sparse representation of the sources in massive data fields, such as MUSE data cubes. In the Bayesian framework, the parameters that describe the observed sources are considered random variables. The Bayesian model leads to a general and robust algorithm where the parameters are estimated in a fully data-driven way. This detection algorithm was applied to the MUSE observation of Hubble Deep Field-South. With 27 h total integration time, these observations provide a catalog of 189 sources of various categories and with secured redshift. The algorithm retrieved 91% of the galaxies with only 9% false detection. This method also allowed the discovery of three new Lyα emitters and one [OII] emitter, all without any Hubble Space Telescope counterpart. We analyzed the reasons for failure for some targets, and found that the most important limitation of the method is when faint sources are located in the vicinity of bright spatially resolved galaxies that cannot be approximated by the Sérsic elliptical profile. The software and its documentation are available on the MUSE science web service (muse-vlt.eu/science).
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
The Bayesian Decoding of Force Stimuli from Slowly Adapting Type I Fibers in Humans.
Kasi, Patrick; Wright, James; Khamis, Heba; Birznieks, Ingvars; van Schaik, André
2016-01-01
It is well known that signals encoded by mechanoreceptors facilitate precise object manipulation in humans. It is therefore of interest to study signals encoded by the mechanoreceptors because this will contribute further towards the understanding of fundamental sensory mechanisms that are responsible for coordinating force components during object manipulation. From a practical point of view, this may suggest strategies for designing sensory-controlled biomedical devices and robotic manipulators. We use a two-stage nonlinear decoding paradigm to reconstruct the force stimulus given signals from slowly adapting type one (SA-I) tactile afferents. First, we describe a nonhomogeneous Poisson encoding model which is a function of the force stimulus and the force's rate of change. In the decoding phase, we use a recursive nonlinear Bayesian filter to reconstruct the force profile, given the SA-I spike patterns and parameters described by the encoding model. Under the current encoding model, the mode ratio of force to its derivative is: 1.26 to 1.02. This indicates that the force derivative contributes significantly to the rate of change to the SA-I afferent spike modulation. Furthermore, using recursive Bayesian decoding algorithms is advantageous because it can incorporate past and current information in order to make predictions--consistent with neural systems--with little computational resources. This makes it suitable for interfacing with prostheses.
The Bayesian Decoding of Force Stimuli from Slowly Adapting Type I Fibers in Humans
Wright, James; Khamis, Heba; Birznieks, Ingvars; van Schaik, André
2016-01-01
It is well known that signals encoded by mechanoreceptors facilitate precise object manipulation in humans. It is therefore of interest to study signals encoded by the mechanoreceptors because this will contribute further towards the understanding of fundamental sensory mechanisms that are responsible for coordinating force components during object manipulation. From a practical point of view, this may suggest strategies for designing sensory-controlled biomedical devices and robotic manipulators. We use a two-stage nonlinear decoding paradigm to reconstruct the force stimulus given signals from slowly adapting type one (SA-I) tactile afferents. First, we describe a nonhomogeneous Poisson encoding model which is a function of the force stimulus and the force’s rate of change. In the decoding phase, we use a recursive nonlinear Bayesian filter to reconstruct the force profile, given the SA-I spike patterns and parameters described by the encoding model. Under the current encoding model, the mode ratio of force to its derivative is: 1.26 to 1.02. This indicates that the force derivative contributes significantly to the rate of change to the SA-I afferent spike modulation. Furthermore, using recursive Bayesian decoding algorithms is advantageous because it can incorporate past and current information in order to make predictions—consistent with neural systems—with little computational resources. This makes it suitable for interfacing with prostheses. PMID:27077750
A novel approach for pilot error detection using Dynamic Bayesian Networks.
Saada, Mohamad; Meng, Qinggang; Huang, Tingwen
2014-06-01
In the last decade Dynamic Bayesian Networks (DBNs) have become one type of the most attractive probabilistic modelling framework extensions of Bayesian Networks (BNs) for working under uncertainties from a temporal perspective. Despite this popularity not many researchers have attempted to study the use of these networks in anomaly detection or the implications of data anomalies on the outcome of such models. An abnormal change in the modelled environment's data at a given time, will cause a trailing chain effect on data of all related environment variables in current and consecutive time slices. Albeit this effect fades with time, it still can have an ill effect on the outcome of such models. In this paper we propose an algorithm for pilot error detection, using DBNs as the modelling framework for learning and detecting anomalous data. We base our experiments on the actions of an aircraft pilot, and a flight simulator is created for running the experiments. The proposed anomaly detection algorithm has achieved good results in detecting pilot errors and effects on the whole system.
2013-05-28
those of the support vector machine and relevance vector machine, and the model runs more quickly than the other algorithms . When one class occurs...incremental support vector machine algorithm for online learning when fewer than 50 data points are available. (a) Papers published in peer-reviewed journals...learning environments, where data processing occurs one observation at a time and the classification algorithm improves over time with new
Nonlinear and non-Gaussian Bayesian based handwriting beautification
NASA Astrophysics Data System (ADS)
Shi, Cao; Xiao, Jianguo; Xu, Canhui; Jia, Wenhua
2013-03-01
A framework is proposed in this paper to effectively and efficiently beautify handwriting by means of a novel nonlinear and non-Gaussian Bayesian algorithm. In the proposed framework, format and size of handwriting image are firstly normalized, and then typeface in computer system is applied to optimize vision effect of handwriting. The Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters to translate, rotate and scale typeface in computer system are controlled by state equation, and the matching optimization between handwriting and transformed typeface is employed by measurement equation. Finally, the new typeface, which is transformed from the original one and gains the best nonlinear and non-Gaussian optimization, is the beautification result of handwriting. Experimental results demonstrate the proposed framework provides a creative handwriting beautification methodology to improve visual acceptance.
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-10-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the technique also allows one to compute informative "error bars" on the volume estimates of individual structures. Copyright © 2013 Elsevier B.V. All rights reserved.
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Leemput, Koen Van
2013-01-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer’s disease classification task. As an additional benefit, the technique also allows one to compute informative “error bars” on the volume estimates of individual structures. PMID:23773521
Zeng, Jianyang; Roberts, Kyle E.; Zhou, Pei
2011-01-01
Abstract A major bottleneck in protein structure determination via nuclear magnetic resonance (NMR) is the lengthy and laborious process of assigning resonances and nuclear Overhauser effect (NOE) cross peaks. Recent studies have shown that accurate backbone folds can be determined using sparse NMR data, such as residual dipolar couplings (RDCs) or backbone chemical shifts. This opens a question of whether we can also determine the accurate protein side-chain conformations using sparse or unassigned NMR data. We attack this question by using unassigned nuclear Overhauser effect spectroscopy (NOESY) data, which records the through-space dipolar interactions between protons nearby in three-dimensional (3D) space. We propose a Bayesian approach with a Markov random field (MRF) model to integrate the likelihood function derived from observed experimental data, with prior information (i.e., empirical molecular mechanics energies) about the protein structures. We unify the side-chain structure prediction problem with the side-chain structure determination problem using unassigned NMR data, and apply the deterministic dead-end elimination (DEE) and A* search algorithms to provably find the global optimum solution that maximizes the posterior probability. We employ a Hausdorff-based measure to derive the likelihood of a rotamer or a pairwise rotamer interaction from unassigned NOESY data. In addition, we apply a systematic and rigorous approach to estimate the experimental noise in NMR data, which also determines the weighting factor of the data term in the scoring function derived from the Bayesian framework. We tested our approach on real NMR data of three proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), and human ubiquitin. The promising results indicate that our algorithm can be applied in high-resolution protein structure determination. Since our approach does not require any NOE assignment, it can accelerate the NMR structure determination process. PMID:21970619
Bayesian estimation of multicomponent relaxation parameters in magnetic resonance fingerprinting.
McGivney, Debra; Deshmane, Anagha; Jiang, Yun; Ma, Dan; Badve, Chaitra; Sloan, Andrew; Gulani, Vikas; Griswold, Mark
2018-07-01
To estimate multiple components within a single voxel in magnetic resonance fingerprinting when the number and types of tissues comprising the voxel are not known a priori. Multiple tissue components within a single voxel are potentially separable with magnetic resonance fingerprinting as a result of differences in signal evolutions of each component. The Bayesian framework for inverse problems provides a natural and flexible setting for solving this problem when the tissue composition per voxel is unknown. Assuming that only a few entries from the dictionary contribute to a mixed signal, sparsity-promoting priors can be placed upon the solution. An iterative algorithm is applied to compute the maximum a posteriori estimator of the posterior probability density to determine the magnetic resonance fingerprinting dictionary entries that contribute most significantly to mixed or pure voxels. Simulation results show that the algorithm is robust in finding the component tissues of mixed voxels. Preliminary in vivo data confirm this result, and show good agreement in voxels containing pure tissue. The Bayesian framework and algorithm shown provide accurate solutions for the partial-volume problem in magnetic resonance fingerprinting. The flexibility of the method will allow further study into different priors and hyperpriors that can be applied in the model. Magn Reson Med 80:159-170, 2018. © 2017 International Society for Magnetic Resonance in Medicine. © 2017 International Society for Magnetic Resonance in Medicine.
USDA-ARS?s Scientific Manuscript database
An algorithm using a Bayesian classifier was developed to automatically detect olive fruit fly infestations in x-ray images of olives. The data set consisted of 249 olives with various degrees of infestation and 161 non-infested olives. Each olive was x-rayed on film and digital images were acquired...
A Bayesian Approach to Period Searching in Solar Coronal Loops
NASA Astrophysics Data System (ADS)
Scherrer, Bryan; McKenzie, David
2017-03-01
We have applied a Bayesian generalized Lomb-Scargle period searching algorithm to movies of coronal loop images obtained with the Hinode X-ray Telescope (XRT) to search for evidence of periodicities that would indicate resonant heating of the loops. The algorithm makes as its only assumption that there is a single sinusoidal signal within each light curve of the data. Both the amplitudes and noise are taken as free parameters. It is argued that this procedure should be used alongside Fourier and wavelet analyses to more accurately extract periodic intensity modulations in coronal loops. The data analyzed are from XRT Observation Program #129C: “MHD Wave Heating (Thin Filters),” which occurred during 2006 November 13 and focused on active region 10293, which included coronal loops. The first data set spans approximately 10 min with an average cadence of 2 s, 2″ per pixel resolution, and used the Al-mesh analysis filter. The second data set spans approximately 4 min with a 3 s average cadence, 1″ per pixel resolution, and used the Al-poly analysis filter. The final data set spans approximately 22 min at a 6 s average cadence, and used the Al-poly analysis filter. In total, 55 periods of sinusoidal coronal loop oscillations between 5.5 and 59.6 s are discussed, supporting proposals in the literature that resonant absorption of magnetic waves is a viable mechanism for depositing energy in the corona.
A Bayesian Approach to Period Searching in Solar Coronal Loops
DOE Office of Scientific and Technical Information (OSTI.GOV)
Scherrer, Bryan; McKenzie, David
2017-03-01
We have applied a Bayesian generalized Lomb–Scargle period searching algorithm to movies of coronal loop images obtained with the Hinode X-ray Telescope (XRT) to search for evidence of periodicities that would indicate resonant heating of the loops. The algorithm makes as its only assumption that there is a single sinusoidal signal within each light curve of the data. Both the amplitudes and noise are taken as free parameters. It is argued that this procedure should be used alongside Fourier and wavelet analyses to more accurately extract periodic intensity modulations in coronal loops. The data analyzed are from XRT Observation Programmore » 129C: “MHD Wave Heating (Thin Filters),” which occurred during 2006 November 13 and focused on active region 10293, which included coronal loops. The first data set spans approximately 10 min with an average cadence of 2 s, 2″ per pixel resolution, and used the Al-mesh analysis filter. The second data set spans approximately 4 min with a 3 s average cadence, 1″ per pixel resolution, and used the Al-poly analysis filter. The final data set spans approximately 22 min at a 6 s average cadence, and used the Al-poly analysis filter. In total, 55 periods of sinusoidal coronal loop oscillations between 5.5 and 59.6 s are discussed, supporting proposals in the literature that resonant absorption of magnetic waves is a viable mechanism for depositing energy in the corona.« less
MultiNest: Efficient and Robust Bayesian Inference
NASA Astrophysics Data System (ADS)
Feroz, F.; Hobson, M. P.; Bridges, M.
2011-09-01
We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm is demonstrated by application to two toy problems and to a cosmological inference problem focusing on the extension of the vanilla LambdaCDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software is fully parallelized using MPI and includes an interface to CosmoMC. It will also be released as part of the SuperBayeS package, for the analysis of supersymmetric theories of particle physics, at this http URL.
Searching for efficient Markov chain Monte Carlo proposal kernels
Yang, Ziheng; Rodríguez, Carlos E.
2013-01-01
Markov chain Monte Carlo (MCMC) or the Metropolis–Hastings algorithm is a simulation algorithm that has made modern Bayesian statistical inference possible. Nevertheless, the efficiency of different Metropolis–Hastings proposal kernels has rarely been studied except for the Gaussian proposal. Here we propose a unique class of Bactrian kernels, which avoid proposing values that are very close to the current value, and compare their efficiency with a number of proposals for simulating different target distributions, with efficiency measured by the asymptotic variance of a parameter estimate. The uniform kernel is found to be more efficient than the Gaussian kernel, whereas the Bactrian kernel is even better. When optimal scales are used for both, the Bactrian kernel is at least 50% more efficient than the Gaussian. Implementation in a Bayesian program for molecular clock dating confirms the general applicability of our results to generic MCMC algorithms. Our results refute a previous claim that all proposals had nearly identical performance and will prompt further research into efficient MCMC proposals. PMID:24218600
Logarithmic Laplacian Prior Based Bayesian Inverse Synthetic Aperture Radar Imaging.
Zhang, Shuanghui; Liu, Yongxiang; Li, Xiang; Bi, Guoan
2016-04-28
This paper presents a novel Inverse Synthetic Aperture Radar Imaging (ISAR) algorithm based on a new sparse prior, known as the logarithmic Laplacian prior. The newly proposed logarithmic Laplacian prior has a narrower main lobe with higher tail values than the Laplacian prior, which helps to achieve performance improvement on sparse representation. The logarithmic Laplacian prior is used for ISAR imaging within the Bayesian framework to achieve better focused radar image. In the proposed method of ISAR imaging, the phase errors are jointly estimated based on the minimum entropy criterion to accomplish autofocusing. The maximum a posterior (MAP) estimation and the maximum likelihood estimation (MLE) are utilized to estimate the model parameters to avoid manually tuning process. Additionally, the fast Fourier Transform (FFT) and Hadamard product are used to minimize the required computational efficiency. Experimental results based on both simulated and measured data validate that the proposed algorithm outperforms the traditional sparse ISAR imaging algorithms in terms of resolution improvement and noise suppression.
Weiss, Scott T.
2014-01-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com. PMID:24922310
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Recursive Bayesian recurrent neural networks for time-series modeling.
Mirikitani, Derrick T; Nikolaev, Nikolay
2010-02-01
This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg-Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
Application of Bayesian a Priori Distributions for Vehicles' Video Tracking Systems
NASA Astrophysics Data System (ADS)
Mazurek, Przemysław; Okarma, Krzysztof
Intelligent Transportation Systems (ITS) helps to improve the quality and quantity of many car traffic parameters. The use of the ITS is possible when the adequate measuring infrastructure is available. Video systems allow for its implementation with relatively low cost due to the possibility of simultaneous video recording of a few lanes of the road at a considerable distance from the camera. The process of tracking can be realized through different algorithms, the most attractive algorithms are Bayesian, because they use the a priori information derived from previous observations or known limitations. Use of this information is crucial for improving the quality of tracking especially for difficult observability conditions, which occur in the video systems under the influence of: smog, fog, rain, snow and poor lighting conditions.
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood.
Karabatsos, George
2018-06-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon previous methods because it provides an omnibus test of the entire hierarchy of cancellation axioms, beyond double cancellation. It does so while accounting for the posterior uncertainty that is inherent in the empirical orderings that are implied by these axioms, together. The new method is illustrated through a test of the cancellation axioms on a classic survey data set, and through the analysis of simulated data.
NASA Technical Reports Server (NTRS)
Olson, William S.; Kummerow, Christian D.; Yang, Song; Petty, Grant W.; Tao, Wei-Kuo; Bell, Thomas L.; Braun, Scott A.; Wang, Yansen; Lang, Stephen E.; Johnson, Daniel E.;
2006-01-01
A revised Bayesian algorithm for estimating surface rain rate, convective rain proportion, and latent heating profiles from satellite-borne passive microwave radiometer observations over ocean backgrounds is described. The algorithm searches a large database of cloud-radiative model simulations to find cloud profiles that are radiatively consistent with a given set of microwave radiance measurements. The properties of these radiatively consistent profiles are then composited to obtain best estimates of the observed properties. The revised algorithm is supported by an expanded and more physically consistent database of cloud-radiative model simulations. The algorithm also features a better quantification of the convective and nonconvective contributions to total rainfall, a new geographic database, and an improved representation of background radiances in rain-free regions. Bias and random error estimates are derived from applications of the algorithm to synthetic radiance data, based upon a subset of cloud-resolving model simulations, and from the Bayesian formulation itself. Synthetic rain-rate and latent heating estimates exhibit a trend of high (low) bias for low (high) retrieved values. The Bayesian estimates of random error are propagated to represent errors at coarser time and space resolutions, based upon applications of the algorithm to TRMM Microwave Imager (TMI) data. Errors in TMI instantaneous rain-rate estimates at 0.5 -resolution range from approximately 50% at 1 mm/h to 20% at 14 mm/h. Errors in collocated spaceborne radar rain-rate estimates are roughly 50%-80% of the TMI errors at this resolution. The estimated algorithm random error in TMI rain rates at monthly, 2.5deg resolution is relatively small (less than 6% at 5 mm day.1) in comparison with the random error resulting from infrequent satellite temporal sampling (8%-35% at the same rain rate). Percentage errors resulting from sampling decrease with increasing rain rate, and sampling errors in latent heating rates follow the same trend. Averaging over 3 months reduces sampling errors in rain rates to 6%-15% at 5 mm day.1, with proportionate reductions in latent heating sampling errors.
Small-Noise Analysis and Symmetrization of Implicit Monte Carlo Samplers
Goodman, Jonathan; Lin, Kevin K.; Morzfeld, Matthias
2015-07-06
Implicit samplers are algorithms for producing independent, weighted samples from multivariate probability distributions. These are often applied in Bayesian data assimilation algorithms. We use Laplace asymptotic expansions to analyze two implicit samplers in the small noise regime. Our analysis suggests a symmetrization of the algorithms that leads to improved implicit sampling schemes at a relatively small additional cost. Here, computational experiments confirm the theory and show that symmetrization is effective for small noise sampling problems.
NASA Astrophysics Data System (ADS)
Noh, Hae Young; Rajagopal, Ram; Kiremidjian, Anne S.
2012-04-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method for the cases where the post-damage feature distribution is unknown a priori. This algorithm extracts features from structural vibration data using time-series analysis and then declares damage using the change point detection method. The change point detection method asymptotically minimizes detection delay for a given false alarm rate. The conventional method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori. Therefore, our algorithm estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using multiple sets of simulated data and a set of experimental data collected from a four-story steel special moment-resisting frame. Our algorithm was able to estimate the post-damage distribution consistently and resulted in detection delays only a few seconds longer than the delays from the conventional method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
Improving diagnostic recognition of primary hyperparathyroidism with machine learning.
Somnay, Yash R; Craven, Mark; McCoy, Kelly L; Carty, Sally E; Wang, Tracy S; Greenberg, Caprice C; Schneider, David F
2017-04-01
Parathyroidectomy offers the only cure for primary hyperparathyroidism, but today only 50% of primary hyperparathyroidism patients are referred for operation, in large part, because the condition is widely under-recognized. The diagnosis of primary hyperparathyroidism can be especially challenging with mild biochemical indices. Machine learning is a collection of methods in which computers build predictive algorithms based on labeled examples. With the aim of facilitating diagnosis, we tested the ability of machine learning to distinguish primary hyperparathyroidism from normal physiology using clinical and laboratory data. This retrospective cohort study used a labeled training set and 10-fold cross-validation to evaluate accuracy of the algorithm. Measures of accuracy included area under the receiver operating characteristic curve, precision (sensitivity), and positive and negative predictive value. Several different algorithms and ensembles of algorithms were tested using the Weka platform. Among 11,830 patients managed operatively at 3 high-volume endocrine surgery programs from March 2001 to August 2013, 6,777 underwent parathyroidectomy for confirmed primary hyperparathyroidism, and 5,053 control patients without primary hyperparathyroidism underwent thyroidectomy. Test-set accuracies for machine learning models were determined using 10-fold cross-validation. Age, sex, and serum levels of preoperative calcium, phosphate, parathyroid hormone, vitamin D, and creatinine were defined as potential predictors of primary hyperparathyroidism. Mild primary hyperparathyroidism was defined as primary hyperparathyroidism with normal preoperative calcium or parathyroid hormone levels. After testing a variety of machine learning algorithms, Bayesian network models proved most accurate, classifying correctly 95.2% of all primary hyperparathyroidism patients (area under receiver operating characteristic = 0.989). Omitting parathyroid hormone from the model did not decrease the accuracy significantly (area under receiver operating characteristic = 0.985). In mild disease cases, however, the Bayesian network model classified correctly 71.1% of patients with normal calcium and 92.1% with normal parathyroid hormone levels preoperatively. Bayesian networking and AdaBoost improved the accuracy of all parathyroid hormone patients to 97.2% cases (area under receiver operating characteristic = 0.994), and 91.9% of primary hyperparathyroidism patients with mild disease. This was significantly improved relative to Bayesian networking alone (P < .0001). Machine learning can diagnose accurately primary hyperparathyroidism without human input even in mild disease. Incorporation of this tool into electronic medical record systems may aid in recognition of this under-diagnosed disorder. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn
2013-04-01
SummaryWith respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provide the best in uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian method.
Iyer, Swathi; Shafran, Izhak; Grayson, David; Gates, Kathleen; Nigg, Joel; Fair, Damien
2013-01-01
Resting state functional connectivity MRI (rs-fcMRI) is a popular technique used to gauge the functional relatedness between regions in the brain for typical and special populations. Most of the work to date determines this relationship by using Pearson's correlation on BOLD fMRI timeseries. However, it has been recognized that there are at least two key limitations to this method. First, it is not possible to resolve the direct and indirect connections/influences. Second, the direction of information flow between the regions cannot be differentiated. In the current paper, we follow-up on recent work by Smith et al (2011), and apply a Bayesian approach called the PC algorithm to both simulated data and empirical data to determine whether these two factors can be discerned with group average, as opposed to single subject, functional connectivity data. When applied on simulated individual subjects, the algorithm performs well determining indirect and direct connection but fails in determining directionality. However, when applied at group level, PC algorithm gives strong results for both indirect and direct connections and the direction of information flow. Applying the algorithm on empirical data, using a diffusion-weighted imaging (DWI) structural connectivity matrix as the baseline, the PC algorithm outperformed the direct correlations. We conclude that, under certain conditions, the PC algorithm leads to an improved estimate of brain network structure compared to the traditional connectivity analysis based on correlations. PMID:23501054
Bayesian aerosol retrieval algorithm for MODIS AOD retrieval over land
NASA Astrophysics Data System (ADS)
Lipponen, Antti; Mielonen, Tero; Pitkänen, Mikko R. A.; Levy, Robert C.; Sawyer, Virginia R.; Romakkaniemi, Sami; Kolehmainen, Ville; Arola, Antti
2018-03-01
We have developed a Bayesian aerosol retrieval (BAR) algorithm for the retrieval of aerosol optical depth (AOD) over land from the Moderate Resolution Imaging Spectroradiometer (MODIS). In the BAR algorithm, we simultaneously retrieve all dark land pixels in a granule, utilize spatial correlation models for the unknown aerosol parameters, use a statistical prior model for the surface reflectance, and take into account the uncertainties due to fixed aerosol models. The retrieved parameters are total AOD at 0.55 µm, fine-mode fraction (FMF), and surface reflectances at four different wavelengths (0.47, 0.55, 0.64, and 2.1 µm). The accuracy of the new algorithm is evaluated by comparing the AOD retrievals to Aerosol Robotic Network (AERONET) AOD. The results show that the BAR significantly improves the accuracy of AOD retrievals over the operational Dark Target (DT) algorithm. A reduction of about 29 % in the AOD root mean square error and decrease of about 80 % in the median bias of AOD were found globally when the BAR was used instead of the DT algorithm. Furthermore, the fraction of AOD retrievals inside the ±(0.05+15 %) expected error envelope increased from 55 to 76 %. In addition to retrieving the values of AOD, FMF, and surface reflectance, the BAR also gives pixel-level posterior uncertainty estimates for the retrieved parameters. The BAR algorithm always results in physical, non-negative AOD values, and the average computation time for a single granule was less than a minute on a modern personal computer.
Bayesian just-so stories in psychology and neuroscience.
Bowers, Jeffrey S; Davis, Colin J
2012-05-01
According to Bayesian theories in psychology and neuroscience, minds and brains are (near) optimal in solving a wide range of tasks. We challenge this view and argue that more traditional, non-Bayesian approaches are more promising. We make 3 main arguments. First, we show that the empirical evidence for Bayesian theories in psychology is weak. This weakness relates to the many arbitrary ways that priors, likelihoods, and utility functions can be altered in order to account for the data that are obtained, making the models unfalsifiable. It further relates to the fact that Bayesian theories are rarely better at predicting data compared with alternative (and simpler) non-Bayesian theories. Second, we show that the empirical evidence for Bayesian theories in neuroscience is weaker still. There are impressive mathematical analyses showing how populations of neurons could compute in a Bayesian manner but little or no evidence that they do. Third, we challenge the general scientific approach that characterizes Bayesian theorizing in cognitive science. A common premise is that theories in psychology should largely be constrained by a rational analysis of what the mind ought to do. We question this claim and argue that many of the important constraints come from biological, evolutionary, and processing (algorithmic) considerations that have no adaptive relevance to the problem per se. In our view, these factors have contributed to the development of many Bayesian "just so" stories in psychology and neuroscience; that is, mathematical analyses of cognition that can be used to explain almost any behavior as optimal. 2012 APA, all rights reserved.
Using Bayesian Networks for Candidate Generation in Consistency-based Diagnosis
NASA Technical Reports Server (NTRS)
Narasimhan, Sriram; Mengshoel, Ole
2008-01-01
Consistency-based diagnosis relies heavily on the assumption that discrepancies between model predictions and sensor observations can be detected accurately. When sources of uncertainty like sensor noise and model abstraction exist robust schemes have to be designed to make a binary decision on whether predictions are consistent with observations. This risks the occurrence of false alarms and missed alarms when an erroneous decision is made. Moreover when multiple sensors (with differing sensing properties) are available the degree of match between predictions and observations can be used to guide the search for fault candidates. In this paper we propose a novel approach to handle this problem using Bayesian networks. In the consistency- based diagnosis formulation, automatically generated Bayesian networks are used to encode a probabilistic measure of fit between predictions and observations. A Bayesian network inference algorithm is used to compute most probable fault candidates.
Bayesian least squares deconvolution
NASA Astrophysics Data System (ADS)
Asensio Ramos, A.; Petit, P.
2015-11-01
Aims: We develop a fully Bayesian least squares deconvolution (LSD) that can be applied to the reliable detection of magnetic signals in noise-limited stellar spectropolarimetric observations using multiline techniques. Methods: We consider LSD under the Bayesian framework and we introduce a flexible Gaussian process (GP) prior for the LSD profile. This prior allows the result to automatically adapt to the presence of signal. We exploit several linear algebra identities to accelerate the calculations. The final algorithm can deal with thousands of spectral lines in a few seconds. Results: We demonstrate the reliability of the method with synthetic experiments and we apply it to real spectropolarimetric observations of magnetic stars. We are able to recover the magnetic signals using a small number of spectral lines, together with the uncertainty at each velocity bin. This allows the user to consider if the detected signal is reliable. The code to compute the Bayesian LSD profile is freely available.
Bayesian Group Bridge for Bi-level Variable Selection.
Mallick, Himel; Yi, Nengjun
2017-06-01
A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.
Uncertainty Management for Diagnostics and Prognostics of Batteries using Bayesian Techniques
NASA Technical Reports Server (NTRS)
Saha, Bhaskar; Goebel, kai
2007-01-01
Uncertainty management has always been the key hurdle faced by diagnostics and prognostics algorithms. A Bayesian treatment of this problem provides an elegant and theoretically sound approach to the modern Condition- Based Maintenance (CBM)/Prognostic Health Management (PHM) paradigm. The application of the Bayesian techniques to regression and classification in the form of Relevance Vector Machine (RVM), and to state estimation as in Particle Filters (PF), provides a powerful tool to integrate the diagnosis and prognosis of battery health. The RVM, which is a Bayesian treatment of the Support Vector Machine (SVM), is used for model identification, while the PF framework uses the learnt model, statistical estimates of noise and anticipated operational conditions to provide estimates of remaining useful life (RUL) in the form of a probability density function (PDF). This type of prognostics generates a significant value addition to the management of any operation involving electrical systems.
NASA Astrophysics Data System (ADS)
Ebrahimian, Hamed; Astroza, Rodrigo; Conte, Joel P.; de Callafon, Raymond A.
2017-02-01
This paper presents a framework for structural health monitoring (SHM) and damage identification of civil structures. This framework integrates advanced mechanics-based nonlinear finite element (FE) modeling and analysis techniques with a batch Bayesian estimation approach to estimate time-invariant model parameters used in the FE model of the structure of interest. The framework uses input excitation and dynamic response of the structure and updates a nonlinear FE model of the structure to minimize the discrepancies between predicted and measured response time histories. The updated FE model can then be interrogated to detect, localize, classify, and quantify the state of damage and predict the remaining useful life of the structure. As opposed to recursive estimation methods, in the batch Bayesian estimation approach, the entire time history of the input excitation and output response of the structure are used as a batch of data to estimate the FE model parameters through a number of iterations. In the case of non-informative prior, the batch Bayesian method leads to an extended maximum likelihood (ML) estimation method to estimate jointly time-invariant model parameters and the measurement noise amplitude. The extended ML estimation problem is solved efficiently using a gradient-based interior-point optimization algorithm. Gradient-based optimization algorithms require the FE response sensitivities with respect to the model parameters to be identified. The FE response sensitivities are computed accurately and efficiently using the direct differentiation method (DDM). The estimation uncertainties are evaluated based on the Cramer-Rao lower bound (CRLB) theorem by computing the exact Fisher Information matrix using the FE response sensitivities with respect to the model parameters. The accuracy of the proposed uncertainty quantification approach is verified using a sampling approach based on the unscented transformation. Two validation studies, based on realistic structural FE models of a bridge pier and a moment resisting steel frame, are performed to validate the performance and accuracy of the presented nonlinear FE model updating approach and demonstrate its application to SHM. These validation studies show the excellent performance of the proposed framework for SHM and damage identification even in the presence of high measurement noise and/or way-out initial estimates of the model parameters. Furthermore, the detrimental effects of the input measurement noise on the performance of the proposed framework are illustrated and quantified through one of the validation studies.
Link, W.A.; Barker, R.J.
2008-01-01
Judicious choice of candidate generating distributions improves efficiency of the Metropolis-Hastings algorithm. In Bayesian applications, it is sometimes possible to identify an approximation to the target posterior distribution; this approximate posterior distribution is a good choice for candidate generation. These observations are applied to analysis of the Cormack?Jolly?Seber model and its extensions.
Non-proliferative diabetic retinopathy symptoms detection and classification using neural network.
Al-Jarrah, Mohammad A; Shatnawi, Hadeel
2017-08-01
Diabetic retinopathy (DR) causes blindness in the working age for people with diabetes in most countries. The increasing number of people with diabetes worldwide suggests that DR will continue to be major contributors to vision loss. Early detection of retinopathy progress in individuals with diabetes is critical for preventing visual loss. Non-proliferative DR (NPDR) is an early stage of DR. Moreover, NPDR can be classified into mild, moderate and severe. This paper proposes a novel morphology-based algorithm for detecting retinal lesions and classifying each case. First, the proposed algorithm detects the three DR lesions, namely haemorrhages, microaneurysms and exudates. Second, we defined and extracted a set of features from detected lesions. The set of selected feature emulates what physicians looked for in classifying NPDR case. Finally, we designed an artificial neural network (ANN) classifier with three layers to classify NPDR to normal, mild, moderate and severe. Bayesian regularisation and resilient backpropagation algorithms are used to train ANN. The accuracy for the proposed classifiers based on Bayesian regularisation and resilient backpropagation algorithms are 96.6 and 89.9, respectively. The obtained results are compared with results of the recent published classifier. Our proposed classifier outperforms the best in terms of sensitivity and specificity.
Zhang, Zhilin; Jung, Tzyy-Ping; Makeig, Scott; Rao, Bhaskar D
2013-02-01
Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graves, Todd L; Hamada, Michael S
2008-01-01
Good estimates of the reliability of a system make use of test data and expert knowledge at all available levels. Furthermore, by integrating all these information sources, one can determine how best to allocate scarce testing resources to reduce uncertainty. Both of these goals are facilitated by modern Bayesian computational methods. We apply these tools to examples that were previously solvable only through the use of ingenious approximations, and use genetic algorithms to guide resource allocation.
SOMBI: Bayesian identification of parameter relations in unstructured cosmological data
NASA Astrophysics Data System (ADS)
Frank, Philipp; Jasche, Jens; Enßlin, Torsten A.
2016-11-01
This work describes the implementation and application of a correlation determination method based on self organizing maps and Bayesian inference (SOMBI). SOMBI aims to automatically identify relations between different observed parameters in unstructured cosmological or astrophysical surveys by automatically identifying data clusters in high-dimensional datasets via the self organizing map neural network algorithm. Parameter relations are then revealed by means of a Bayesian inference within respective identified data clusters. Specifically such relations are assumed to be parametrized as a polynomial of unknown order. The Bayesian approach results in a posterior probability distribution function for respective polynomial coefficients. To decide which polynomial order suffices to describe correlation structures in data, we include a method for model selection, the Bayesian information criterion, to the analysis. The performance of the SOMBI algorithm is tested with mock data. As illustration we also provide applications of our method to cosmological data. In particular, we present results of a correlation analysis between galaxy and active galactic nucleus (AGN) properties provided by the SDSS catalog with the cosmic large-scale-structure (LSS). The results indicate that the combined galaxy and LSS dataset indeed is clustered into several sub-samples of data with different average properties (for example different stellar masses or web-type classifications). The majority of data clusters appear to have a similar correlation structure between galaxy properties and the LSS. In particular we revealed a positive and linear dependency between the stellar mass, the absolute magnitude and the color of a galaxy with the corresponding cosmic density field. A remaining subset of data shows inverted correlations, which might be an artifact of non-linear redshift distortions.
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
NASA Technical Reports Server (NTRS)
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN s non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
Extensions and applications of ensemble-of-trees methods in machine learning
NASA Astrophysics Data System (ADS)
Bleich, Justin
Ensemble-of-trees algorithms have emerged to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate its superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of violence during probation hearings in court systems.
Markov Chain Monte Carlo Bayesian Learning for Neural Networks
NASA Technical Reports Server (NTRS)
Goodrich, Michael S.
2011-01-01
Conventional training methods for neural networks involve starting al a random location in the solution space of the network weights, navigating an error hyper surface to reach a minimum, and sometime stochastic based techniques (e.g., genetic algorithms) to avoid entrapment in a local minimum. It is further typically necessary to preprocess the data (e.g., normalization) to keep the training algorithm on course. Conversely, Bayesian based learning is an epistemological approach concerned with formally updating the plausibility of competing candidate hypotheses thereby obtaining a posterior distribution for the network weights conditioned on the available data and a prior distribution. In this paper, we developed a powerful methodology for estimating the full residual uncertainty in network weights and therefore network predictions by using a modified Jeffery's prior combined with a Metropolis Markov Chain Monte Carlo method.
Hyperspectral Image Classification With Markov Random Fields and a Convolutional Neural Network
NASA Astrophysics Data System (ADS)
Cao, Xiangyong; Zhou, Feng; Xu, Lin; Meng, Deyu; Xu, Zongben; Paisley, John
2018-05-01
This paper presents a new supervised classification algorithm for remotely sensed hyperspectral image (HSI) which integrates spectral and spatial information in a unified Bayesian framework. First, we formulate the HSI classification problem from a Bayesian perspective. Then, we adopt a convolutional neural network (CNN) to learn the posterior class distributions using a patch-wise training strategy to better use the spatial information. Next, spatial information is further considered by placing a spatial smoothness prior on the labels. Finally, we iteratively update the CNN parameters using stochastic gradient decent (SGD) and update the class labels of all pixel vectors using an alpha-expansion min-cut-based algorithm. Compared with other state-of-the-art methods, the proposed classification method achieves better performance on one synthetic dataset and two benchmark HSI datasets in a number of experimental settings.
BATMAN: Bayesian Technique for Multi-image Analysis
NASA Astrophysics Data System (ADS)
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2017-04-01
This paper describes the Bayesian Technique for Multi-image Analysis (BATMAN), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical data set containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (I.e. identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real integral-field spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of 2. Our analysis reveals that the algorithm prioritizes conservation of all the statistically significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BaTMAn is not to be used as a 'black box' to improve the signal-to-noise ratio, but as a new approach to characterize spatially resolved data prior to its analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn.
Enhancements of Bayesian Blocks; Application to Large Light Curve Databases
NASA Technical Reports Server (NTRS)
Scargle, Jeff
2015-01-01
Bayesian Blocks are optimal piecewise linear representations (step function fits) of light-curves. The simple algorithm implementing this idea, using dynamic programming, has been extended to include more data modes and fitness metrics, multivariate analysis, and data on the circle (Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations, Scargle, Norris, Jackson and Chiang 2013, ApJ, 764, 167), as well as new results on background subtraction and refinement of the procedure for precise timing of transient events in sparse data. Example demonstrations will include exploratory analysis of the Kepler light curve archive in a search for "star-tickling" signals from extraterrestrial civilizations. (The Cepheid Galactic Internet, Learned, Kudritzki, Pakvasa1, and Zee, 2008, arXiv: 0809.0339; Walkowicz et al., in progress).
BANYAN_Sigma: Bayesian classifier for members of young stellar associations
NASA Astrophysics Data System (ADS)
Gagné, Jonathan; Mamajek, Eric E.; Malo, Lison; Riedel, Adric; Rodriguez, David; Lafrenière, David; Faherty, Jacqueline K.; Roy-Loubier, Olivier; Pueyo, Laurent; Robin, Annie C.; Doyon, René
2018-01-01
BANYAN_Sigma calculates the membership probability that a given astrophysical object belongs to one of the currently known 27 young associations within 150 pc of the Sun, using Bayesian inference. This tool uses the sky position and proper motion measurements of an object, with optional radial velocity (RV) and distance (D) measurements, to derive a Bayesian membership probability. By default, the priors are adjusted such that a probability threshold of 90% will recover 50%, 68%, 82% or 90% of true association members depending on what observables are input (only sky position and proper motion, with RV, with D, with both RV and D, respectively). The algorithm is implemented in a Python package, in IDL, and is also implemented as an interactive web page.
Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Dynamic Bayesian wavelet transform: New methodology for extraction of repetitive transients
NASA Astrophysics Data System (ADS)
Wang, Dong; Tsui, Kwok-Leung
2017-05-01
Thanks to some recent research works, dynamic Bayesian wavelet transform as new methodology for extraction of repetitive transients is proposed in this short communication to reveal fault signatures hidden in rotating machine. The main idea of the dynamic Bayesian wavelet transform is to iteratively estimate posterior parameters of wavelet transform via artificial observations and dynamic Bayesian inference. First, a prior wavelet parameter distribution can be established by one of many fast detection algorithms, such as the fast kurtogram, the improved kurtogram, the enhanced kurtogram, the sparsogram, the infogram, continuous wavelet transform, discrete wavelet transform, wavelet packets, multiwavelets, empirical wavelet transform, empirical mode decomposition, local mean decomposition, etc.. Second, artificial observations can be constructed based on one of many metrics, such as kurtosis, the sparsity measurement, entropy, approximate entropy, the smoothness index, a synthesized criterion, etc., which are able to quantify repetitive transients. Finally, given artificial observations, the prior wavelet parameter distribution can be posteriorly updated over iterations by using dynamic Bayesian inference. More importantly, the proposed new methodology can be extended to establish the optimal parameters required by many other signal processing methods for extraction of repetitive transients.
Hierarchical Bayesian sparse image reconstruction with application to MRFM.
Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves
2009-09-01
This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.
Feature Reinforcement Learning: Part I. Unstructured MDPs
NASA Astrophysics Data System (ADS)
Hutter, Marcus
2009-12-01
General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small finite state Markov decision processes (MDPs). Up to now, extracting the right state representations out of bare observations, that is, reducing the general agent setup to the MDP framework, is an art that involves significant effort by designers. The primary goal of this work is to automate the reduction process and thereby significantly expand the scope of many existing reinforcement learning algorithms and the agents that employ them. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in Part II (Hutter, 2009c). The role of POMDPs is also considered there.
Models and simulation of 3D neuronal dendritic trees using Bayesian networks.
López-Cruz, Pedro L; Bielza, Concha; Larrañaga, Pedro; Benavides-Piccione, Ruth; DeFelipe, Javier
2011-12-01
Neuron morphology is crucial for neuronal connectivity and brain information processing. Computational models are important tools for studying dendritic morphology and its role in brain function. We applied a class of probabilistic graphical models called Bayesian networks to generate virtual dendrites from layer III pyramidal neurons from three different regions of the neocortex of the mouse. A set of 41 morphological variables were measured from the 3D reconstructions of real dendrites and their probability distributions used in a machine learning algorithm to induce the model from the data. A simulation algorithm is also proposed to obtain new dendrites by sampling values from Bayesian networks. The main advantage of this approach is that it takes into account and automatically locates the relationships between variables in the data instead of using predefined dependencies. Therefore, the methodology can be applied to any neuronal class while at the same time exploiting class-specific properties. Also, a Bayesian network was defined for each part of the dendrite, allowing the relationships to change in the different sections and to model heterogeneous developmental factors or spatial influences. Several univariate statistical tests and a novel multivariate test based on Kullback-Leibler divergence estimation confirmed that virtual dendrites were similar to real ones. The analyses of the models showed relationships that conform to current neuroanatomical knowledge and support model correctness. At the same time, studying the relationships in the models can help to identify new interactions between variables related to dendritic morphology.
Inferring the most probable maps of underground utilities using Bayesian mapping model
NASA Astrophysics Data System (ADS)
Bilal, Muhammad; Khan, Wasiq; Muggleton, Jennifer; Rustighi, Emiliano; Jenks, Hugo; Pennock, Steve R.; Atkins, Phil R.; Cohn, Anthony
2018-03-01
Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing social, environmental and economic consequences raised from the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of MTU device is to locate different types of buried assets in real time with the use of automated data processing techniques and statutory records. The statutory records, even though typically being inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, the integration of information from multiple sensors (raw data) with these qualitative maps and their visualization is challenging and requires the implementation of robust machine learning/data fusion approaches. An approach for automated creation of revised maps was developed as a Bayesian Mapping model in this paper by integrating the knowledge extracted from sensors raw data and available statutory records. The combination of statutory records with the hypotheses from sensors was for initial estimation of what might be found underground and roughly where. The maps were (re)constructed using automated image segmentation techniques for hypotheses extraction and Bayesian classification techniques for segment-manhole connections. The model consisting of image segmentation algorithm and various Bayesian classification techniques (segment recognition and expectation maximization (EM) algorithm) provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps.
NASA Astrophysics Data System (ADS)
Ojha, Maheswar; Maiti, Saumen
2016-03-01
A novel approach based on the concept of Bayesian neural network (BNN) has been implemented for classifying sediment boundaries using downhole log data obtained during Integrated Ocean Drilling Program (IODP) Expedition 323 in the Bering Sea slope region. The Bayesian framework in conjunction with Markov Chain Monte Carlo (MCMC)/hybrid Monte Carlo (HMC) learning paradigm has been applied to constrain the lithology boundaries using density, density porosity, gamma ray, sonic P-wave velocity and electrical resistivity at the Hole U1344A. We have demonstrated the effectiveness of our supervised classification methodology by comparing our findings with a conventional neural network and a Bayesian neural network optimized by scaled conjugate gradient method (SCG), and tested the robustness of the algorithm in the presence of red noise in the data. The Bayesian results based on the HMC algorithm (BNN.HMC) resolve detailed finer structures at certain depths in addition to main lithology such as silty clay, diatom clayey silt and sandy silt. Our method also recovers the lithology information from a depth ranging between 615 and 655 m Wireline log Matched depth below Sea Floor of no core recovery zone. Our analyses demonstrate that the BNN based approach renders robust means for the classification of complex lithology successions at the Hole U1344A, which could be very useful for other studies and understanding the oceanic crustal inhomogeneity and structural discontinuities.
Staatz, Christine E; Tett, Susan E
2011-12-01
This review seeks to summarize the available data about Bayesian estimation of area under the plasma concentration-time curve (AUC) and dosage prediction for mycophenolic acid (MPA) and evaluate whether sufficient evidence is available for routine use of Bayesian dosage prediction in clinical practice. A literature search identified 14 studies that assessed the predictive performance of maximum a posteriori Bayesian estimation of MPA AUC and one report that retrospectively evaluated how closely dosage recommendations based on Bayesian forecasting achieved targeted MPA exposure. Studies to date have mostly been undertaken in renal transplant recipients, with limited investigation in patients treated with MPA for autoimmune disease or haematopoietic stem cell transplantation. All of these studies have involved use of the mycophenolate mofetil (MMF) formulation of MPA, rather than the enteric-coated mycophenolate sodium (EC-MPS) formulation. Bias associated with estimation of MPA AUC using Bayesian forecasting was generally less than 10%. However some difficulties with imprecision was evident, with values ranging from 4% to 34% (based on estimation involving two or more concentration measurements). Evaluation of whether MPA dosing decisions based on Bayesian forecasting (by the free website service https://pharmaco.chu-limoges.fr) achieved target drug exposure has only been undertaken once. When MMF dosage recommendations were applied by clinicians, a higher proportion (72-80%) of subsequent estimated MPA AUC values were within the 30-60 mg · h/L target range, compared with when dosage recommendations were not followed (only 39-57% within target range). Such findings provide evidence that Bayesian dosage prediction is clinically useful for achieving target MPA AUC. This study, however, was retrospective and focussed only on adult renal transplant recipients. Furthermore, in this study, Bayesian-generated AUC estimations and dosage predictions were not compared with a later full measured AUC but rather with a further AUC estimate based on a second Bayesian analysis. This study also provided some evidence that a useful monitoring schedule for MPA AUC following adult renal transplant would be every 2 weeks during the first month post-transplant, every 1-3 months between months 1 and 12, and each year thereafter. It will be interesting to see further validations in different patient groups using the free website service. In summary, the predictive performance of Bayesian estimation of MPA, comparing estimated with measured AUC values, has been reported in several studies. However, the next step of predicting dosages based on these Bayesian-estimated AUCs, and prospectively determining how closely these predicted dosages give drug exposure matching targeted AUCs, remains largely unaddressed. Further prospective studies are required, particularly in non-renal transplant patients and with the EC-MPS formulation. Other important questions remain to be answered, such as: do Bayesian forecasting methods devised to date use the best population pharmacokinetic models or most accurate algorithms; are the methods simple to use for routine clinical practice; do the algorithms actually improve dosage estimations beyond empirical recommendations in all groups that receive MPA therapy; and, importantly, do the dosage predictions, when followed, improve patient health outcomes?
An agglomerative hierarchical clustering approach to visualisation in Bayesian clustering problems
Dawson, Kevin J.; Belkhir, Khalid
2009-01-01
Clustering problems (including the clustering of individuals into outcrossing populations, hybrid generations, full-sib families and selfing lines) have recently received much attention in population genetics. In these clustering problems, the parameter of interest is a partition of the set of sampled individuals, - the sample partition. In a fully Bayesian approach to clustering problems of this type, our knowledge about the sample partition is represented by a probability distribution on the space of possible sample partitions. Since the number of possible partitions grows very rapidly with the sample size, we can not visualise this probability distribution in its entirety, unless the sample is very small. As a solution to this visualisation problem, we recommend using an agglomerative hierarchical clustering algorithm, which we call the exact linkage algorithm. This algorithm is a special case of the maximin clustering algorithm that we introduced previously. The exact linkage algorithm is now implemented in our software package Partition View. The exact linkage algorithm takes the posterior co-assignment probabilities as input, and yields as output a rooted binary tree, - or more generally, a forest of such trees. Each node of this forest defines a set of individuals, and the node height is the posterior co-assignment probability of this set. This provides a useful visual representation of the uncertainty associated with the assignment of individuals to categories. It is also a useful starting point for a more detailed exploration of the posterior distribution in terms of the co-assignment probabilities. PMID:19337306
NASA Astrophysics Data System (ADS)
Sheng, Zheng
2013-02-01
The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov Chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to the global optimization algorithm, the Bayesian—MCMC can obtain not only the approximate solutions, but also the probability distributions of the solutions, that is, uncertainty analyses of solutions. The Bayesian—MCMC algorithm is implemented on the simulation radar sea-clutter data and the real radar sea-clutter data. Reference data are assumed to be simulation data and refractivity profiles are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles from the assumed simulation and the helicopter sounding data; (ii) the one-dimensional (1D) and two-dimensional (2D) posterior probability distribution of solutions.
Buried landmine detection using multivariate normal clustering
NASA Astrophysics Data System (ADS)
Duston, Brian M.
2001-10-01
A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criteria (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
Quantum Graphical Models and Belief Propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leifer, M.S.; Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo Ont., N2L 2Y5; Poulin, D.
Belief Propagation algorithms acting on Graphical Models of classical probability distributions, such as Markov Networks, Factor Graphs and Bayesian Networks, are amongst the most powerful known methods for deriving probabilistic inferences amongst large numbers of random variables. This paper presents a generalization of these concepts and methods to the quantum case, based on the idea that quantum theory can be thought of as a noncommutative, operator-valued, generalization of classical probability theory. Some novel characterizations of quantum conditional independence are derived, and definitions of Quantum n-Bifactor Networks, Markov Networks, Factor Graphs and Bayesian Networks are proposed. The structure of Quantum Markovmore » Networks is investigated and some partial characterization results are obtained, along the lines of the Hammersley-Clifford theorem. A Quantum Belief Propagation algorithm is presented and is shown to converge on 1-Bifactor Networks and Markov Networks when the underlying graph is a tree. The use of Quantum Belief Propagation as a heuristic algorithm in cases where it is not known to converge is discussed. Applications to decoding quantum error correcting codes and to the simulation of many-body quantum systems are described.« less
Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation
Hao, Jiucang; Attias, Hagai; Nagarajan, Srikantan; Lee, Te-Won; Sejnowski, Terrence J.
2010-01-01
This paper presents a new approximate Bayesian estimator for enhancing a noisy speech signal. The speech model is assumed to be a Gaussian mixture model (GMM) in the log-spectral domain. This is in contrast to most current models in frequency domain. Exact signal estimation is a computationally intractable problem. We derive three approximations to enhance the efficiency of signal estimation. The Gaussian approximation transforms the log-spectral domain GMM into the frequency domain using minimal Kullback–Leiber (KL)-divergency criterion. The frequency domain Laplace method computes the maximum a posteriori (MAP) estimator for the spectral amplitude. Correspondingly, the log-spectral domain Laplace method computes the MAP estimator for the log-spectral amplitude. Further, the gain and noise spectrum adaptation are implemented using the expectation–maximization (EM) algorithm within the GMM under Gaussian approximation. The proposed algorithms are evaluated by applying them to enhance the speeches corrupted by the speech-shaped noise (SSN). The experimental results demonstrate that the proposed algorithms offer improved signal-to-noise ratio, lower word recognition error rate, and less spectral distortion. PMID:20428253
Experimental Verification of Bayesian Planet Detection Algorithms with a Shaped Pupil Coronagraph
NASA Astrophysics Data System (ADS)
Savransky, D.; Groff, T. D.; Kasdin, N. J.
2010-10-01
We evaluate the feasibility of applying Bayesian detection techniques to discovering exoplanets using high contrast laboratory data with simulated planetary signals. Background images are generated at the Princeton High Contrast Imaging Lab (HCIL), with a coronagraphic system utilizing a shaped pupil and two deformable mirrors (DMs) in series. Estimates of the electric field at the science camera are used to correct for quasi-static speckle and produce symmetric high contrast dark regions in the image plane. Planetary signals are added in software, or via a physical star-planet simulator which adds a second off-axis point source before the coronagraph with a beam recombiner, calibrated to a fixed contrast level relative to the source. We produce a variety of images, with varying integration times and simulated planetary brightness. We then apply automated detection algorithms such as matched filtering to attempt to extract the planetary signals. This allows us to evaluate the efficiency of these techniques in detecting planets in a high noise regime and eliminating false positives, as well as to test existing algorithms for calculating the required integration times for these techniques to be applicable.
Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM☆
López, J.D.; Litvak, V.; Espinosa, J.J.; Friston, K.; Barnes, G.R.
2014-01-01
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (Minimum Norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy—an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardise the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational Free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm. PMID:24041874
Sequential structural damage diagnosis algorithm using a change point detection method
NASA Astrophysics Data System (ADS)
Noh, H.; Rajagopal, R.; Kiremidjian, A. S.
2013-11-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method. The general change point detection method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori, unless we are looking for a known specific type of damage. Therefore, we introduce an additional algorithm that estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using a set of experimental data collected from a four-story steel special moment-resisting frame and multiple sets of simulated data. Various features of different dimensions have been explored, and the algorithm was able to identify damage, particularly when it uses multidimensional damage sensitive features and lower false alarm rates, with a known post-damage feature distribution. For unknown feature distribution cases, the post-damage distribution was consistently estimated and the detection delays were only a few time steps longer than the delays from the general method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
NASA Astrophysics Data System (ADS)
Arabzadeh, Vida; Niaki, S. T. A.; Arabzadeh, Vahid
2017-10-01
One of the most important processes in the early stages of construction projects is to estimate the cost involved. This process involves a wide range of uncertainties, which make it a challenging task. Because of unknown issues, using the experience of the experts or looking for similar cases are the conventional methods to deal with cost estimation. The current study presents data-driven methods for cost estimation based on the application of artificial neural network (ANN) and regression models. The learning algorithms of the ANN are the Levenberg-Marquardt and the Bayesian regulated. Moreover, regression models are hybridized with a genetic algorithm to obtain better estimates of the coefficients. The methods are applied in a real case, where the input parameters of the models are assigned based on the key issues involved in a spherical tank construction. The results reveal that while a high correlation between the estimated cost and the real cost exists; both ANNs could perform better than the hybridized regression models. In addition, the ANN with the Levenberg-Marquardt learning algorithm (LMNN) obtains a better estimation than the ANN with the Bayesian-regulated learning algorithm (BRNN). The correlation between real data and estimated values is over 90%, while the mean square error is achieved around 0.4. The proposed LMNN model can be effective to reduce uncertainty and complexity in the early stages of the construction project.
Pooseh, Shakoor; Bernhardt, Nadine; Guevara, Alvaro; Huys, Quentin J M; Smolka, Michael N
2018-02-01
Using simple mathematical models of choice behavior, we present a Bayesian adaptive algorithm to assess measures of impulsive and risky decision making. Practically, these measures are characterized by discounting rates and are used to classify individuals or population groups, to distinguish unhealthy behavior, and to predict developmental courses. However, a constant demand for improved tools to assess these constructs remains unanswered. The algorithm is based on trial-by-trial observations. At each step, a choice is made between immediate (certain) and delayed (risky) options. Then the current parameter estimates are updated by the likelihood of observing the choice, and the next offers are provided from the indifference point, so that they will acquire the most informative data based on the current parameter estimates. The procedure continues for a certain number of trials in order to reach a stable estimation. The algorithm is discussed in detail for the delay discounting case, and results from decision making under risk for gains, losses, and mixed prospects are also provided. Simulated experiments using prescribed parameter values were performed to justify the algorithm in terms of the reproducibility of its parameters for individual assessments, and to test the reliability of the estimation procedure in a group-level analysis. The algorithm was implemented as an experimental battery to measure temporal and probability discounting rates together with loss aversion, and was tested on a healthy participant sample.
Development of microwave rainfall retrieval algorithm for climate applications
NASA Astrophysics Data System (ADS)
KIM, J. H.; Shin, D. B.
2014-12-01
With the accumulated satellite datasets for decades, it is possible that satellite-based data could contribute to sustained climate applications. Level-3 products from microwave sensors for climate applications can be obtained from several algorithms. For examples, the Microwave Emission brightness Temperature Histogram (METH) algorithm produces level-3 rainfalls directly, whereas the Goddard profiling (GPROF) algorithm first generates instantaneous rainfalls and then temporal and spatial averaging process leads to level-3 products. The rainfall algorithm developed in this study follows a similar approach to averaging instantaneous rainfalls. However, the algorithm is designed to produce instantaneous rainfalls at an optimal resolution showing reduced non-linearity in brightness temperature (TB)-rain rate(R) relations. It is found that the resolution tends to effectively utilize emission channels whose footprints are relatively larger than those of scattering channels. This algorithm is mainly composed of a-priori databases (DBs) and a Bayesian inversion module. The DB contains massive pairs of simulated microwave TBs and rain rates, obtained by WRF (version 3.4) and RTTOV (version 11.1) simulations. To improve the accuracy and efficiency of retrieval process, data mining technique is additionally considered. The entire DB is classified into eight types based on Köppen climate classification criteria using reanalysis data. Among these sub-DBs, only one sub-DB which presents the most similar physical characteristics is selected by considering the thermodynamics of input data. When the Bayesian inversion is applied to the selected DB, instantaneous rain rate with 6 hours interval is retrieved. The retrieved monthly mean rainfalls are statistically compared with CMAP and GPCP, respectively.
Karaca, Sefayet; Erge, Sema; Cesuroglu, Tomris; Polimanti, Renato
2016-06-01
Cardiovascular and metabolic traits (CMT) are influenced by complex interactive processes including diet, lifestyle, and genetic predisposition. The present study investigated the interactions of these risk factors in relation to CMTs in the Turkish population. We applied bootstrap agglomerative hierarchical clustering and Bayesian network learning algorithms to identify the causative relationships among genes involved in different biological mechanisms (i.e., lipid metabolism, hormone metabolism, cellular detoxification, aging, and energy metabolism), lifestyle (i.e., physical activity, smoking behavior, and metropolitan residency), anthropometric traits (i.e., body mass index, body fat ratio, and waist-to-hip ratio), and dietary habits (i.e., daily intakes of macro- and micronutrients) in relation to CMTs (i.e., health conditions and blood parameters). We identified significant correlations between dietary habits (soybean and vitamin B12 intakes) and different cardiometabolic diseases that were confirmed by the Bayesian network-learning algorithm. Genetic factors contributed to these disease risks also through the pleiotropy of some genetic variants (i.e., F5 rs6025 and MTR rs180508). However, we also observed that certain genetic associations are indirect since they are due to the causative relationships among the CMTs (e.g., APOC3 rs5128 is associated with low-density lipoproteins cholesterol and, by extension, total cholesterol). Our study applied a novel approach to integrate various sources of information and dissect the complex interactive processes related to CMTs. Our data indicated that complex causative networks are present: causative relationships exist among CMTs and are affected by genetic factors (with pleiotropic and non-pleiotropic effects) and dietary habits. Copyright © 2016 Elsevier Inc. All rights reserved.
THREAT ANTICIPATION AND DECEPTIVE REASONING USING BAYESIAN BELIEF NETWORKS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Allgood, Glenn O; Olama, Mohammed M; Lake, Joe E
Recent events highlight the need for tools to anticipate threats posed by terrorists. Assessing these threats requires combining information from disparate data sources such as analytic models, simulations, historical data, sensor networks, and user judgments. These disparate data can be combined in a coherent, analytically defensible, and understandable manner using a Bayesian belief network (BBN). In this paper, we develop a BBN threat anticipatory model based on a deceptive reasoning algorithm using a network engineering process that treats the probability distributions of the BBN nodes within the broader context of the system development process.
A Bayesian approach to modeling diffraction profiles and application to ferroelectric materials
Iamsasri, Thanakorn; Guerrier, Jonathon; Esteves, Giovanni; ...
2017-02-01
A new statistical approach for modeling diffraction profiles is introduced, using Bayesian inference and a Markov chain Monte Carlo (MCMC) algorithm. This method is demonstrated by modeling the degenerate reflections during application of an electric field to two different ferroelectric materials: thin-film lead zirconate titanate (PZT) of composition PbZr 0.3Ti 0.7O 3and a bulk commercial PZT polycrystalline ferroelectric. Here, the new method offers a unique uncertainty quantification of the model parameters that can be readily propagated into new calculated parameters.
Data free inference with processed data products
Chowdhary, K.; Najm, H. N.
2014-07-12
Here, we consider the context of probabilistic inference of model parameters given error bars or confidence intervals on model output values, when the data is unavailable. We introduce a class of algorithms in a Bayesian framework, relying on maximum entropy arguments and approximate Bayesian computation methods, to generate consistent data with the given summary statistics. Once we obtain consistent data sets, we pool the respective posteriors, to arrive at a single, averaged density on the parameters. This approach allows us to perform accurate forward uncertainty propagation consistent with the reported statistics.
Bayesian Face Recognition and Perceptual Narrowing in Face-Space
Balas, Benjamin
2012-01-01
During the first year of life, infants’ face recognition abilities are subject to “perceptual narrowing,” the end result of which is that observers lose the ability to distinguish previously discriminable faces (e.g. other-race faces) from one another. Perceptual narrowing has been reported for faces of different species and different races, in developing humans and primates. Though the phenomenon is highly robust and replicable, there have been few efforts to model the emergence of perceptual narrowing as a function of the accumulation of experience with faces during infancy. The goal of the current study is to examine how perceptual narrowing might manifest as statistical estimation in “face space,” a geometric framework for describing face recognition that has been successfully applied to adult face perception. Here, I use a computer vision algorithm for Bayesian face recognition to study how the acquisition of experience in face space and the presence of race categories affect performance for own and other-race faces. Perceptual narrowing follows from the establishment of distinct race categories, suggesting that the acquisition of category boundaries for race is a key computational mechanism in developing face expertise. PMID:22709406
NASA Technical Reports Server (NTRS)
Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri
2004-01-01
Support Vector Machines (SVMs) are a type of supervised learning algorith,, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labled by a 'supervisor' - typically a human 'expert.'.
Prognostics Approach for Power MOSFET Under Thermal-Stress
NASA Technical Reports Server (NTRS)
Galvan, Jose Ramon Celaya; Saxena, Abhinav; Kulkarni, Chetan S.; Saha, Sankalita; Goebel, Kai
2012-01-01
The prognostic technique for a power MOSFET presented in this paper is based on accelerated aging of MOSFET IRF520Npbf in a TO-220 package. The methodology utilizes thermal and power cycling to accelerate the life of the devices. The major failure mechanism for the stress conditions is dieattachment degradation, typical for discrete devices with leadfree solder die attachment. It has been determined that dieattach degradation results in an increase in ON-state resistance due to its dependence on junction temperature. Increasing resistance, thus, can be used as a precursor of failure for the die-attach failure mechanism under thermal stress. A feature based on normalized ON-resistance is computed from in-situ measurements of the electro-thermal response. An Extended Kalman filter is used as a model-based prognostics techniques based on the Bayesian tracking framework. The proposed prognostics technique reports on preliminary work that serves as a case study on the prediction of remaining life of power MOSFETs and builds upon the work presented in [1]. The algorithm considered in this study had been used as prognostics algorithm in different applications and is regarded as suitable candidate for component level prognostics. This work attempts to further the validation of such algorithm by presenting it with real degradation data including measurements from real sensors, which include all the complications (noise, bias, etc.) that are regularly not captured on simulated degradation data. The algorithm is developed and tested on the accelerated aging test timescale. In real world operation, the timescale of the degradation process and therefore the RUL predictions will be considerable larger. It is hypothesized that even though the timescale will be larger, it remains constant through the degradation process and the algorithm and model would still apply under the slower degradation process. By using accelerated aging data with actual device measurements and real sensors (no simulated behavior), we are attempting to assess how such algorithm behaves under realistic conditions.
The center for causal discovery of biomedical knowledge from big data
Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-01-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
Signal Recovery and System Calibration from Multiple Compressive Poisson Measurements
Wang, Liming; Huang, Jiaji; Yuan, Xin; ...
2015-09-17
The measurement matrix employed in compressive sensing typically cannot be known precisely a priori and must be estimated via calibration. One may take multiple compressive measurements, from which the measurement matrix and underlying signals may be estimated jointly. This is of interest as well when the measurement matrix may change as a function of the details of what is measured. This problem has been considered recently for Gaussian measurement noise, and here we develop this idea with application to Poisson systems. A collaborative maximum likelihood algorithm and alternating proximal gradient algorithm are proposed, and associated theoretical performance guarantees are establishedmore » based on newly derived concentration-of-measure results. A Bayesian model is then introduced, to improve flexibility and generality. Connections between the maximum likelihood methods and the Bayesian model are developed, and example results are presented for a real compressive X-ray imaging system.« less
The influence of omniscient technology on cryptography
NASA Astrophysics Data System (ADS)
Huang, Weihong; Li, Jian
2009-07-01
Scholars agree that concurrent algorithms are an interesting new topic in the field of cyberinformatics, and hackers worldwide concur. In fact, few end-users would disagree with the evaluation of architecture. We propose a Bayesian tool for harnessing massive multiplayer online role-playing games (FIRER), which we use to prove that the well-known ubiquitous algorithm for the improvement of wide-area networks by Karthik Lakshminarayanan is in Co-NP.
Airport Flight Departure Delay Model on Improved BN Structure Learning
NASA Astrophysics Data System (ADS)
Cao, Weidong; Fang, Xiangnong
An high score prior genetic simulated annealing Bayesian network structure learning algorithm (HSPGSA) by combining genetic algorithm(GA) with simulated annealing algorithm(SAA) is developed. The new algorithm provides not only with strong global search capability of GA, but also with strong local hill climb search capability of SAA. The structure with the highest score is prior selected. In the mean time, structures with lower score are also could be choice. It can avoid efficiently prematurity problem by higher score individual wrong direct growing population. Algorithm is applied to flight departure delays analysis in a large hub airport. Based on the flight data a BN model is created. Experiments show that parameters learning can reflect departure delay.
NASA Astrophysics Data System (ADS)
Paloscia, S.; Pettinato, S.; Santi, E.; Pierdicca, N.; Pulvirenti, L.; Notarnicola, C.; Pace, G.; Reppucci, A.
2011-11-01
The main objective of this research is to develop, test and validate a soil moisture (SMC)) algorithm for the GMES Sentinel-1 characteristics, within the framework of an ESA project. The SMC product, to be generated from Sentinel-1 data, requires an algorithm able to process operationally in near-real-time and deliver the product to the GMES services within 3 hours from observations. Two different complementary approaches have been proposed: an Artificial Neural Network (ANN), which represented the best compromise between retrieval accuracy and processing time, thus allowing compliance with the timeliness requirements and a Bayesian Multi-temporal approach, allowing an increase of the retrieval accuracy, especially in case where little ancillary data are available, at the cost of computational efficiency, taking advantage of the frequent revisit time achieved by Sentinel-1. The algorithm was validated in several test areas in Italy, US and Australia, and finally in Spain with a 'blind' validation. The Multi-temporal Bayesian algorithm was validated in Central Italy. The validation results are in all cases very much in line with the requirements. However, the blind validation results were penalized by the availability of only VV polarization SAR images and MODIS lowresolution NDVI, although the RMS is slightly > 4%.
A dynamical approach in exploring the unknown mass in the Solar system using pulsar timing arrays
NASA Astrophysics Data System (ADS)
Guo, Y. J.; Lee, K. J.; Caballero, R. N.
2018-04-01
The error in the Solar system ephemeris will lead to dipolar correlations in the residuals of pulsar timing array for widely separated pulsars. In this paper, we utilize such correlated signals, and construct a Bayesian data-analysis framework to detect the unknown mass in the Solar system and to measure the orbital parameters. The algorithm is designed to calculate the waveform of the induced pulsar-timing residuals due to the unmodelled objects following the Keplerian orbits in the Solar system. The algorithm incorporates a Bayesian-analysis suit used to simultaneously analyse the pulsar-timing data of multiple pulsars to search for coherent waveforms, evaluate the detection significance of unknown objects, and to measure their parameters. When the object is not detectable, our algorithm can be used to place upper limits on the mass. The algorithm is verified using simulated data sets, and cross-checked with analytical calculations. We also investigate the capability of future pulsar-timing-array experiments in detecting the unknown objects. We expect that the future pulsar-timing data can limit the unknown massive objects in the Solar system to be lighter than 10-11-10-12 M⊙, or measure the mass of Jovian system to a fractional precision of 10-8-10-9.
Exploring the Energy Landscapes of Protein Folding Simulations with Bayesian Computation
Burkoff, Nikolas S.; Várnai, Csilla; Wells, Stephen A.; Wild, David L.
2012-01-01
Nested sampling is a Bayesian sampling technique developed to explore probability distributions localized in an exponentially small area of the parameter space. The algorithm provides both posterior samples and an estimate of the evidence (marginal likelihood) of the model. The nested sampling algorithm also provides an efficient way to calculate free energies and the expectation value of thermodynamic observables at any temperature, through a simple post processing of the output. Previous applications of the algorithm have yielded large efficiency gains over other sampling techniques, including parallel tempering. In this article, we describe a parallel implementation of the nested sampling algorithm and its application to the problem of protein folding in a Gō-like force field of empirical potentials that were designed to stabilize secondary structure elements in room-temperature simulations. We demonstrate the method by conducting folding simulations on a number of small proteins that are commonly used for testing protein-folding procedures. A topological analysis of the posterior samples is performed to produce energy landscape charts, which give a high-level description of the potential energy surface for the protein folding simulations. These charts provide qualitative insights into both the folding process and the nature of the model and force field used. PMID:22385859
Bayesian block-diagonal variable selection and model averaging
Papaspiliopoulos, O.; Rossell, D.
2018-01-01
Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501
Exploring the energy landscapes of protein folding simulations with Bayesian computation.
Burkoff, Nikolas S; Várnai, Csilla; Wells, Stephen A; Wild, David L
2012-02-22
Nested sampling is a Bayesian sampling technique developed to explore probability distributions localized in an exponentially small area of the parameter space. The algorithm provides both posterior samples and an estimate of the evidence (marginal likelihood) of the model. The nested sampling algorithm also provides an efficient way to calculate free energies and the expectation value of thermodynamic observables at any temperature, through a simple post processing of the output. Previous applications of the algorithm have yielded large efficiency gains over other sampling techniques, including parallel tempering. In this article, we describe a parallel implementation of the nested sampling algorithm and its application to the problem of protein folding in a Gō-like force field of empirical potentials that were designed to stabilize secondary structure elements in room-temperature simulations. We demonstrate the method by conducting folding simulations on a number of small proteins that are commonly used for testing protein-folding procedures. A topological analysis of the posterior samples is performed to produce energy landscape charts, which give a high-level description of the potential energy surface for the protein folding simulations. These charts provide qualitative insights into both the folding process and the nature of the model and force field used. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Prediction of Sybil attack on WSN using Bayesian network and swarm intelligence
NASA Astrophysics Data System (ADS)
Muraleedharan, Rajani; Ye, Xiang; Osadciw, Lisa Ann
2008-04-01
Security in wireless sensor networks is typically sacrificed or kept minimal due to limited resources such as memory and battery power. Hence, the sensor nodes are prone to Denial-of-service attacks and detecting the threats is crucial in any application. In this paper, the Sybil attack is analyzed and a novel prediction method, combining Bayesian algorithm and Swarm Intelligence (SI) is proposed. Bayesian Networks (BN) is used in representing and reasoning problems, by modeling the elements of uncertainty. The decision from the BN is applied to SI forming an Hybrid Intelligence Scheme (HIS) to re-route the information and disconnecting the malicious nodes in future routes. A performance comparison based on the prediction using HIS vs. Ant System (AS) helps in prioritizing applications where decisions are time-critical.
Bayesian Image Segmentations by Potts Prior and Loopy Belief Propagation
NASA Astrophysics Data System (ADS)
Tanaka, Kazuyuki; Kataoka, Shun; Yasuda, Muneki; Waizumi, Yuji; Hsu, Chiou-Ting
2014-12-01
This paper presents a Bayesian image segmentation model based on Potts prior and loopy belief propagation. The proposed Bayesian model involves several terms, including the pairwise interactions of Potts models, and the average vectors and covariant matrices of Gauss distributions in color image modeling. These terms are often referred to as hyperparameters in statistical machine learning theory. In order to determine these hyperparameters, we propose a new scheme for hyperparameter estimation based on conditional maximization of entropy in the Potts prior. The algorithm is given based on loopy belief propagation. In addition, we compare our conditional maximum entropy framework with the conventional maximum likelihood framework, and also clarify how the first order phase transitions in loopy belief propagations for Potts models influence our hyperparameter estimation procedures.
Abanto-Valle, C. A.; Bandyopadhyay, D.; Lachos, V. H.; Enriquez, I.
2009-01-01
A Bayesian analysis of stochastic volatility (SV) models using the class of symmetric scale mixtures of normal (SMN) distributions is considered. In the face of non-normality, this provides an appealing robust alternative to the routine use of the normal distribution. Specific distributions examined include the normal, student-t, slash and the variance gamma distributions. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo (MCMC) algorithm is introduced for parameter estimation. Moreover, the mixing parameters obtained as a by-product of the scale mixture representation can be used to identify outliers. The methods developed are applied to analyze daily stock returns data on S&P500 index. Bayesian model selection criteria as well as out-of- sample forecasting results reveal that the SV models based on heavy-tailed SMN distributions provide significant improvement in model fit as well as prediction to the S&P500 index data over the usual normal model. PMID:20730043
NASA Astrophysics Data System (ADS)
Shen, Chien-wen
2009-01-01
During the processes of TFT-LCD manufacturing, steps like visual inspection of panel surface defects still heavily rely on manual operations. As the manual inspection time of TFT-LCD manufacturing could range from 4 hours to 1 day, the reliability of time forecasting is thus important for production planning, scheduling and customer response. This study would like to propose a practical and easy-to-implement prediction model through the approach of Bayesian networks for time estimation of manual operated procedures in TFT-LCD manufacturing. Given the lack of prior knowledge about manual operation time, algorithms of necessary path condition and expectation-maximization are used for structural learning and estimation of conditional probability distributions respectively. This study also applied Bayesian inference to evaluate the relationships between explanatory variables and manual operation time. With the empirical applications of this proposed forecasting model, approach of Bayesian networks demonstrates its practicability and prediction accountability.
Bayesian analysis of caustic-crossing microlensing events
NASA Astrophysics Data System (ADS)
Cassan, A.; Horne, K.; Kains, N.; Tsapras, Y.; Browne, P.
2010-06-01
Aims: Caustic-crossing binary-lens microlensing events are important anomalous events because they are capable of detecting an extrasolar planet companion orbiting the lens star. Fast and robust modelling methods are thus of prime interest in helping to decide whether a planet is detected by an event. Cassan introduced a new set of parameters to model binary-lens events, which are closely related to properties of the light curve. In this work, we explain how Bayesian priors can be added to this framework, and investigate on interesting options. Methods: We develop a mathematical formulation that allows us to compute analytically the priors on the new parameters, given some previous knowledge about other physical quantities. We explicitly compute the priors for a number of interesting cases, and show how this can be implemented in a fully Bayesian, Markov chain Monte Carlo algorithm. Results: Using Bayesian priors can accelerate microlens fitting codes by reducing the time spent considering physically implausible models, and helps us to discriminate between alternative models based on the physical plausibility of their parameters.
Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data
Zhao, Xin; Cheung, Leo Wang-Kit
2007-01-01
Background Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at genomic level. Although many machine learning methods have been developed and applied to the area of microarray gene expression data analysis, the majority of them are based on linear models, which however are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods usually also bring in false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large. This leads to problems of numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have a couple of critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is always preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potentials to achieve this goal. Results A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences. Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound not only in the case with a linear Bayesian classifier but also in the case with a very non-linear Bayesian classifier. This sheds light on its broader usability to microarray data analysis problems, especially to those that linear methods work awkwardly. The KIGP was also applied to four published microarray datasets, and the results showed that the KIGP performed better than or at least as well as any of the referred state-of-the-art methods did in all of these cases. Conclusion Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates the model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently. PMID:17328811
Wang, Tingting; Chen, Yi-Ping Phoebe; Bowman, Phil J; Goddard, Michael E; Hayes, Ben J
2016-09-21
Bayesian mixture models in which the effects of SNP are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Monte Carlo Markov Chain (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm, followed by a limited number of MCMC without the requirement for burn-in. To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breeds that were not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC, however computational time was reduced by up to 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300 K SNPs in a case/control design, from the Welcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic predictions and genetic architecture inference while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR. The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform equally well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, but with up to 17 times faster than the full MCMC implementations. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets.
Borchani, Hanen; Bielza, Concha; Toro, Carlos; Larrañaga, Pedro
2013-03-01
Our aim is to use multi-dimensional Bayesian network classifiers in order to predict the human immunodeficiency virus type 1 (HIV-1) reverse transcriptase and protease inhibitors given an input set of respective resistance mutations that an HIV patient carries. Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models especially designed to solve multi-dimensional classification problems, where each input instance in the data set has to be assigned simultaneously to multiple output class variables that are not necessarily binary. In this paper, we introduce a new method, named MB-MBC, for learning MBCs from data by determining the Markov blanket around each class variable using the HITON algorithm. Our method is applied to both reverse transcriptase and protease data sets obtained from the Stanford HIV-1 database. Regarding the prediction of antiretroviral combination therapies, the experimental study shows promising results in terms of classification accuracy compared with state-of-the-art MBC learning algorithms. For reverse transcriptase inhibitors, we get 71% and 11% in mean and global accuracy, respectively; while for protease inhibitors, we get more than 84% and 31% in mean and global accuracy, respectively. In addition, the analysis of MBC graphical structures lets us gain insight into both known and novel interactions between reverse transcriptase and protease inhibitors and their respective resistance mutations. MB-MBC algorithm is a valuable tool to analyze the HIV-1 reverse transcriptase and protease inhibitors prediction problem and to discover interactions within and between these two classes of inhibitors. Copyright © 2012 Elsevier B.V. All rights reserved.
Predicting the survival of diabetes using neural network
NASA Astrophysics Data System (ADS)
Mamuda, Mamman; Sathasivam, Saratha
2017-08-01
Data mining techniques at the present time are used in predicting diseases of health care industries. Neural Network is one among the prevailing method in data mining techniques of an intelligent field for predicting diseases in health care industries. This paper presents a study on the prediction of the survival of diabetes diseases using different learning algorithms from the supervised learning algorithms of neural network. Three learning algorithms are considered in this study: (i) The levenberg-marquardt learning algorithm (ii) The Bayesian regulation learning algorithm and (iii) The scaled conjugate gradient learning algorithm. The network is trained using the Pima Indian Diabetes Dataset with the help of MATLAB R2014(a) software. The performance of each algorithm is further discussed through regression analysis. The prediction accuracy of the best algorithm is further computed to validate the accurate prediction
Bayesian networks in overlay recipe optimization
NASA Astrophysics Data System (ADS)
Binns, Lewis A.; Reynolds, Greg; Rigden, Timothy C.; Watkins, Stephen; Soroka, Andrew
2005-05-01
Currently, overlay measurements are characterized by "recipe", which defines both physical parameters such as focus, illumination et cetera, and also the software parameters such as algorithm to be used and regions of interest. Setting up these recipes requires both engineering time and wafer availability on an overlay tool, so reducing these requirements will result in higher tool productivity. One of the significant challenges to automating this process is that the parameters are highly and complexly correlated. At the same time, a high level of traceability and transparency is required in the recipe creation process, so a technique that maintains its decisions in terms of well defined physical parameters is desirable. Running time should be short, given the system (automatic recipe creation) is being implemented to reduce overheads. Finally, a failure of the system to determine acceptable parameters should be obvious, so a certainty metric is also desirable. The complex, nonlinear interactions make solution by an expert system difficult at best, especially in the verification of the resulting decision network. The transparency requirements tend to preclude classical neural networks and similar techniques. Genetic algorithms and other "global minimization" techniques require too much computational power (given system footprint and cost requirements). A Bayesian network, however, provides a solution to these requirements. Such a network, with appropriate priors, can be used during recipe creation / optimization not just to select a good set of parameters, but also to guide the direction of search, by evaluating the network state while only incomplete information is available. As a Bayesian network maintains an estimate of the probability distribution of nodal values, a maximum-entropy approach can be utilized to obtain a working recipe in a minimum or near-minimum number of steps. In this paper we discuss the potential use of a Bayesian network in such a capacity, reducing the amount of engineering intervention. We discuss the benefits of this approach, especially improved repeatability and traceability of the learning process, and quantification of uncertainty in decisions made. We also consider the problems associated with this approach, especially in detailed construction of network topology, validation of the Bayesian network and the recipes it generates, and issues arising from the integration of a Bayesian network with a complex multithreaded application; these primarily relate to maintaining Bayesian network and system architecture integrity.
Bayesian seismic tomography by parallel interacting Markov chains
NASA Astrophysics Data System (ADS)
Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas
2014-05-01
The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret quantitatively the results, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of velocity model using a global sampling algorithm. Markov chains Monte-Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is highly related to the computational cost of the forward model. Although fast algorithms have been recently developed for computing first arrival traveltimes of seismic waves, the complete browsing of the posterior distribution of velocity model is hardly performed, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of classical single MCMC, we propose to make interact several Markov chains at different temperatures. This method can make efficient use of large CPU clusters, without increasing the global computational cost with respect to classical MCMC and is therefore particularly suited for Bayesian inversion. The exchanges between the chains allow a precise sampling of the high probability zones of the model space while avoiding the chains to end stuck in a probability maximum. This approach supplies thus a robust way to analyze the tomography imaging uncertainties. The interacting MCMC approach is illustrated on two synthetic examples of tomography of calibration shots such as encountered in induced microseismic studies. On the second application, a wavelet based model parameterization is presented that allows to significantly reduce the dimension of the problem, making thus the algorithm efficient even for a complex velocity model.
MapReduce Based Parallel Bayesian Network for Manufacturing Quality Control
NASA Astrophysics Data System (ADS)
Zheng, Mao-Kuan; Ming, Xin-Guo; Zhang, Xian-Yu; Li, Guo-Ming
2017-09-01
Increasing complexity of industrial products and manufacturing processes have challenged conventional statistics based quality management approaches in the circumstances of dynamic production. A Bayesian network and big data analytics integrated approach for manufacturing process quality analysis and control is proposed. Based on Hadoop distributed architecture and MapReduce parallel computing model, big volume and variety quality related data generated during the manufacturing process could be dealt with. Artificial intelligent algorithms, including Bayesian network learning, classification and reasoning, are embedded into the Reduce process. Relying on the ability of the Bayesian network in dealing with dynamic and uncertain problem and the parallel computing power of MapReduce, Bayesian network of impact factors on quality are built based on prior probability distribution and modified with posterior probability distribution. A case study on hull segment manufacturing precision management for ship and offshore platform building shows that computing speed accelerates almost directly proportionally to the increase of computing nodes. It is also proved that the proposed model is feasible for locating and reasoning of root causes, forecasting of manufacturing outcome, and intelligent decision for precision problem solving. The integration of bigdata analytics and BN method offers a whole new perspective in manufacturing quality control.
A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.
Yu, Qingzhao; Zhu, Lin; Zhu, Han
2017-11-01
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potentials to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently attribute newly recruited patients to different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence on design by changing the prior distributions. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when total sample size is fixed, the proposed design can obtain greater power and/or cost smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce required sample size. Copyright © 2017 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Li, D.
2016-12-01
Sudden water pollution accidents are unavoidable risk events that we must learn to co-exist with. In China's Taihu River Basin, the river flow conditions are complicated with frequently artificial interference. Sudden water pollution accident occurs mainly in the form of a large number of abnormal discharge of wastewater, and has the characteristics with the sudden occurrence, the uncontrollable scope, the uncertainty object and the concentrated distribution of many risk sources. Effective prevention of pollution accidents that may occur is of great significance for the water quality safety management. Bayesian networks can be applied to represent the relationship between pollution sources and river water quality intuitively. Using the time sequential Monte Carlo algorithm, the pollution sources state switching model, water quality model for river network and Bayesian reasoning is integrated together, and the sudden water pollution risk assessment model for river network is developed to quantify the water quality risk under the collective influence of multiple pollution sources. Based on the isotope water transport mechanism, a dynamic tracing model of multiple pollution sources is established, which can describe the relationship between the excessive risk of the system and the multiple risk sources. Finally, the diagnostic reasoning algorithm based on Bayesian network is coupled with the multi-source tracing model, which can identify the contribution of each risk source to the system risk under the complex flow conditions. Taking Taihu Lake water system as the research object, the model is applied to obtain the reasonable results under the three typical years. Studies have shown that the water quality risk at critical sections are influenced by the pollution risk source, the boundary water quality, the hydrological conditions and self -purification capacity, and the multiple pollution sources have obvious effect on water quality risk of the receiving water body. The water quality risk assessment approach developed in this study offers a effective tool for systematically quantifying the random uncertainty in plain river network system, and it also provides the technical support for the decision-making of controlling the sudden water pollution through identification of critical pollution sources.
NASA Astrophysics Data System (ADS)
Perera, B. B. P.; Stappers, B. W.; Babak, S.; Keith, M. J.; Antoniadis, J.; Bassa, C. G.; Caballero, R. N.; Champion, D. J.; Cognard, I.; Desvignes, G.; Graikou, E.; Guillemot, L.; Janssen, G. H.; Karuppusamy, R.; Kramer, M.; Lazarus, P.; Lentati, L.; Liu, K.; Lyne, A. G.; McKee, J. W.; Osłowski, S.; Perrodin, D.; Sanidas, S. A.; Sesana, A.; Shaifullah, G.; Theureau, G.; Verbiest, J. P. W.; Taylor, S. R.
2018-07-01
We search for continuous gravitational waves (CGWs) produced by individual supermassive black hole binaries in circular orbits using high-cadence timing observations of PSR J1713+0747. We observe this millisecond pulsar using the telescopes in the European Pulsar Timing Array with an average cadence of approximately 1.6 d over the period between 2011 April and 2015 July, including an approximately daily average between 2013 February and 2014 April. The high-cadence observations are used to improve the pulsar timing sensitivity across the gravitational wave frequency range of 0.008-5μHz. We use two algorithms in the analysis, including a spectral fitting method and a Bayesian approach. For an independent comparison, we also use a previously published Bayesian algorithm. We find that the Bayesian approaches provide optimal results and the timing observations of the pulsar place a 95 per cent upper limit on the sky-averaged strain amplitude of CGWs to be ≲3.5 × 10-13 at a reference frequency of 1 μHz. We also find a 95 per cent upper limit on the sky-averaged strain amplitude of low-frequency CGWs to be ≲1.4 × 10-14 at a reference frequency of 20 nHz.
NASA Astrophysics Data System (ADS)
Rajaona, Harizo; Septier, François; Armand, Patrick; Delignon, Yves; Olry, Christophe; Albergel, Armand; Moussafir, Jacques
2015-12-01
In the eventuality of an accidental or intentional atmospheric release, the reconstruction of the source term using measurements from a set of sensors is an important and challenging inverse problem. A rapid and accurate estimation of the source allows faster and more efficient action for first-response teams, in addition to providing better damage assessment. This paper presents a Bayesian probabilistic approach to estimate the location and the temporal emission profile of a pointwise source. The release rate is evaluated analytically by using a Gaussian assumption on its prior distribution, and is enhanced with a positivity constraint to improve the estimation. The source location is obtained by the means of an advanced iterative Monte-Carlo technique called Adaptive Multiple Importance Sampling (AMIS), which uses a recycling process at each iteration to accelerate its convergence. The proposed methodology is tested using synthetic and real concentration data in the framework of the Fusion Field Trials 2007 (FFT-07) experiment. The quality of the obtained results is comparable to those coming from the Markov Chain Monte Carlo (MCMC) algorithm, a popular Bayesian method used for source estimation. Moreover, the adaptive processing of the AMIS provides a better sampling efficiency by reusing all the generated samples.
Bayesian automated cortical segmentation for neonatal MRI
NASA Astrophysics Data System (ADS)
Chou, Zane; Paquette, Natacha; Ganesh, Bhavana; Wang, Yalin; Ceschin, Rafael; Nelson, Marvin D.; Macyszyn, Luke; Gaonkar, Bilwaj; Panigrahy, Ashok; Lepore, Natasha
2017-11-01
Several attempts have been made in the past few years to develop and implement an automated segmentation of neonatal brain structural MRI. However, accurate automated MRI segmentation remains challenging in this population because of the low signal-to-noise ratio, large partial volume effects and inter-individual anatomical variability of the neonatal brain. In this paper, we propose a learning method for segmenting the whole brain cortical grey matter on neonatal T2-weighted images. We trained our algorithm using a neonatal dataset composed of 3 fullterm and 4 preterm infants scanned at term equivalent age. Our segmentation pipeline combines the FAST algorithm from the FSL library software and a Bayesian segmentation approach to create a threshold matrix that minimizes the error of mislabeling brain tissue types. Our method shows promising results with our pilot training set. In both preterm and full-term neonates, automated Bayesian segmentation generates a smoother and more consistent parcellation compared to FAST, while successfully removing the subcortical structure and cleaning the edges of the cortical grey matter. This method show promising refinement of the FAST segmentation by considerably reducing manual input and editing required from the user, and further improving reliability and processing time of neonatal MR images. Further improvement will include a larger dataset of training images acquired from different manufacturers.
Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.
Chatzis, Sotirios P; Andreou, Andreas S
2015-11-01
Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows of better handling uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using the publicly available benchmark data sets.
Nonlinear inversion of electrical resistivity imaging using pruning Bayesian neural networks
NASA Astrophysics Data System (ADS)
Jiang, Fei-Bo; Dai, Qian-Wei; Dong, Li
2016-06-01
Conventional artificial neural networks used to solve electrical resistivity imaging (ERI) inversion problem suffer from overfitting and local minima. To solve these problems, we propose to use a pruning Bayesian neural network (PBNN) nonlinear inversion method and a sample design method based on the K-medoids clustering algorithm. In the sample design method, the training samples of the neural network are designed according to the prior information provided by the K-medoids clustering results; thus, the training process of the neural network is well guided. The proposed PBNN, based on Bayesian regularization, is used to select the hidden layer structure by assessing the effect of each hidden neuron to the inversion results. Then, the hyperparameter α k , which is based on the generalized mean, is chosen to guide the pruning process according to the prior distribution of the training samples under the small-sample condition. The proposed algorithm is more efficient than other common adaptive regularization methods in geophysics. The inversion of synthetic data and field data suggests that the proposed method suppresses the noise in the neural network training stage and enhances the generalization. The inversion results with the proposed method are better than those of the BPNN, RBFNN, and RRBFNN inversion methods as well as the conventional least squares inversion.
Bayesian analysis of energy and count rate data for detection of low count rate radioactive sources
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klumpp, John
We propose a radiation detection system which generates its own discrete sampling distribution based on past measurements of background. The advantage to this approach is that it can take into account variations in background with respect to time, location, energy spectra, detector-specific characteristics (i.e. different efficiencies at different count rates and energies), etc. This would therefore be a 'machine learning' approach, in which the algorithm updates and improves its characterization of background over time. The system would have a 'learning mode,' in which it measures and analyzes background count rates, and a 'detection mode,' in which it compares measurements frommore » an unknown source against its unique background distribution. By characterizing and accounting for variations in the background, general purpose radiation detectors can be improved with little or no increase in cost. The statistical and computational techniques to perform this kind of analysis have already been developed. The necessary signal analysis can be accomplished using existing Bayesian algorithms which account for multiple channels, multiple detectors, and multiple time intervals. Furthermore, Bayesian machine-learning techniques have already been developed which, with trivial modifications, can generate appropriate decision thresholds based on the comparison of new measurements against a nonparametric sampling distribution. (authors)« less
NASA Astrophysics Data System (ADS)
Perera, B. B. P.; Stappers, B. W.; Babak, S.; Keith, M. J.; Antoniadis, J.; Bassa, C. G.; Caballero, R. N.; Champion, D. J.; Cognard, I.; Desvignes, G.; Graikou, E.; Guillemot, L.; Janssen, G. H.; Karuppusamy, R.; Kramer, M.; Lazarus, P.; Lentati, L.; Liu, K.; Lyne, A. G.; McKee, J. W.; Osłowski, S.; Perrodin, D.; Sanidas, S. A.; Sesana, A.; Shaifullah, G.; Theureau, G.; Verbiest, J. P. W.; Taylor, S. R.
2018-05-01
We search for continuous gravitational waves (CGWs) produced by individual super-massive black-hole binaries (SMBHBs) in circular orbits using high-cadence timing observations of PSR J1713+0747. We observe this millisecond pulsar using the telescopes in the European Pulsar Timing Array (EPTA) with an average cadence of approximately 1.6 days over the period between April 2011 and July 2015, including an approximately daily average between February 2013 and April 2014. The high-cadence observations are used to improve the pulsar timing sensitivity across the GW frequency range of 0.008 - 5 μHz. We use two algorithms in the analysis, including a spectral fitting method and a Bayesian approach. For an independent comparison, we also use a previously published Bayesian algorithm. We find that the Bayesian approaches provide optimal results and the timing observations of the pulsar place a 95 per cent upper limit on the sky-averaged strain amplitude of CGWs to be ≲ 3.5 × 10-13 at a reference frequency of 1 μHz. We also find a 95 per cent upper limit on the sky-averaged strain amplitude of low-frequency CGWs to be ≲ 1.4 × 10-14 at a reference frequency of 20 nHz.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Farrell, Kathryn, E-mail: kfarrell@ices.utexas.edu; Oden, J. Tinsley, E-mail: oden@ices.utexas.edu; Faghihi, Danial, E-mail: danial@ices.utexas.edu
A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.
Hu, Junguo; Zhou, Jian; Zhou, Guomo; Luo, Yiqi; Xu, Xiaojun; Li, Pingheng; Liang, Junyi
2016-01-01
Soil respiration inherently shows strong spatial variability. It is difficult to obtain an accurate characterization of soil respiration with an insufficient number of monitoring points. However, it is expensive and cumbersome to deploy many sensors. To solve this problem, we proposed employing the Bayesian Maximum Entropy (BME) algorithm, using soil temperature as auxiliary information, to study the spatial distribution of soil respiration. The BME algorithm used the soft data (auxiliary information) effectively to improve the estimation accuracy of the spatiotemporal distribution of soil respiration. Based on the functional relationship between soil temperature and soil respiration, the BME algorithm satisfactorily integrated soil temperature data into said spatial distribution. As a means of comparison, we also applied the Ordinary Kriging (OK) and Co-Kriging (Co-OK) methods. The results indicated that the root mean squared errors (RMSEs) and absolute values of bias for both Day 1 and Day 2 were the lowest for the BME method, thus demonstrating its higher estimation accuracy. Further, we compared the performance of the BME algorithm coupled with auxiliary information, namely soil temperature data, and the OK method without auxiliary information in the same study area for 9, 21, and 37 sampled points. The results showed that the RMSEs for the BME algorithm (0.972 and 1.193) were less than those for the OK method (1.146 and 1.539) when the number of sampled points was 9 and 37, respectively. This indicates that the former method using auxiliary information could reduce the required number of sampling points for studying spatial distribution of soil respiration. Thus, the BME algorithm, coupled with soil temperature data, can not only improve the accuracy of soil respiration spatial interpolation but can also reduce the number of sampling points.
Hu, Junguo; Zhou, Jian; Zhou, Guomo; Luo, Yiqi; Xu, Xiaojun; Li, Pingheng; Liang, Junyi
2016-01-01
Soil respiration inherently shows strong spatial variability. It is difficult to obtain an accurate characterization of soil respiration with an insufficient number of monitoring points. However, it is expensive and cumbersome to deploy many sensors. To solve this problem, we proposed employing the Bayesian Maximum Entropy (BME) algorithm, using soil temperature as auxiliary information, to study the spatial distribution of soil respiration. The BME algorithm used the soft data (auxiliary information) effectively to improve the estimation accuracy of the spatiotemporal distribution of soil respiration. Based on the functional relationship between soil temperature and soil respiration, the BME algorithm satisfactorily integrated soil temperature data into said spatial distribution. As a means of comparison, we also applied the Ordinary Kriging (OK) and Co-Kriging (Co-OK) methods. The results indicated that the root mean squared errors (RMSEs) and absolute values of bias for both Day 1 and Day 2 were the lowest for the BME method, thus demonstrating its higher estimation accuracy. Further, we compared the performance of the BME algorithm coupled with auxiliary information, namely soil temperature data, and the OK method without auxiliary information in the same study area for 9, 21, and 37 sampled points. The results showed that the RMSEs for the BME algorithm (0.972 and 1.193) were less than those for the OK method (1.146 and 1.539) when the number of sampled points was 9 and 37, respectively. This indicates that the former method using auxiliary information could reduce the required number of sampling points for studying spatial distribution of soil respiration. Thus, the BME algorithm, coupled with soil temperature data, can not only improve the accuracy of soil respiration spatial interpolation but can also reduce the number of sampling points. PMID:26807579
Bayesian microsaccade detection
Mihali, Andra; van Opheusden, Bas; Ma, Wei Ji
2017-01-01
Microsaccades are high-velocity fixational eye movements, with special roles in perception and cognition. The default microsaccade detection method is to determine when the smoothed eye velocity exceeds a threshold. We have developed a new method, Bayesian microsaccade detection (BMD), which performs inference based on a simple statistical model of eye positions. In this model, a hidden state variable changes between drift and microsaccade states at random times. The eye position is a biased random walk with different velocity distributions for each state. BMD generates samples from the posterior probability distribution over the eye state time series given the eye position time series. Applied to simulated data, BMD recovers the “true” microsaccades with fewer errors than alternative algorithms, especially at high noise. Applied to EyeLink eye tracker data, BMD detects almost all the microsaccades detected by the default method, but also apparent microsaccades embedded in high noise—although these can also be interpreted as false positives. Next we apply the algorithms to data collected with a Dual Purkinje Image eye tracker, whose higher precision justifies defining the inferred microsaccades as ground truth. When we add artificial measurement noise, the inferences of all algorithms degrade; however, at noise levels comparable to EyeLink data, BMD recovers the “true” microsaccades with 54% fewer errors than the default algorithm. Though unsuitable for online detection, BMD has other advantages: It returns probabilities rather than binary judgments, and it can be straightforwardly adapted as the generative model is refined. We make our algorithm available as a software package. PMID:28114483
Kim, D; Burge, J; Lane, T; Pearlson, G D; Kiehl, K A; Calhoun, V D
2008-10-01
We utilized a discrete dynamic Bayesian network (dDBN) approach (Burge, J., Lane, T., Link, H., Qiu, S., Clark, V.P., 2007. Discrete dynamic Bayesian network analysis of fMRI data. Hum Brain Mapp.) to determine differences in brain regions between patients with schizophrenia and healthy controls on a measure of effective connectivity, termed the approximate conditional likelihood score (ACL) (Burge, J., Lane, T., 2005. Learning Class-Discriminative Dynamic Bayesian Networks. Proceedings of the International Conference on Machine Learning, Bonn, Germany, pp. 97-104.). The ACL score represents a class-discriminative measure of effective connectivity by measuring the relative likelihood of the correlation between brain regions in one group versus another. The algorithm is capable of finding non-linear relationships between brain regions because it uses discrete rather than continuous values and attempts to model temporal relationships with a first-order Markov and stationary assumption constraint (Papoulis, A., 1991. Probability, random variables, and stochastic processes. McGraw-Hill, New York.). Since Bayesian networks are overly sensitive to noisy data, we introduced an independent component analysis (ICA) filtering approach that attempted to reduce the noise found in fMRI data by unmixing the raw datasets into a set of independent spatial component maps. Components that represented noise were removed and the remaining components reconstructed into the dimensions of the original fMRI datasets. We applied the dDBN algorithm to a group of 35 patients with schizophrenia and 35 matched healthy controls using an ICA filtered and unfiltered approach. We determined that filtering the data significantly improved the magnitude of the ACL score. Patients showed the greatest ACL scores in several regions, most markedly the cerebellar vermis and hemispheres. Our findings suggest that schizophrenia patients exhibit weaker connectivity than healthy controls in multiple regions, including bilateral temporal, frontal, and cerebellar regions during an auditory paradigm.
Dong, Linsong; Wang, Zhiyong
2018-06-11
Genomic prediction is feasible for estimating genomic breeding values because of dense genome-wide markers and credible statistical methods, such as Genomic Best Linear Unbiased Prediction (GBLUP) and various Bayesian methods. Compared with GBLUP, Bayesian methods propose more flexible assumptions for the distributions of SNP effects. However, most Bayesian methods are performed based on Markov chain Monte Carlo (MCMC) algorithms, leading to computational efficiency challenges. Hence, some fast Bayesian approaches, such as fast BayesB (fBayesB), were proposed to speed up the calculation. This study proposed another fast Bayesian method termed fast BayesC (fBayesC). The prior distribution of fBayesC assumes that a SNP with probability γ has a non-zero effect which comes from a normal density with a common variance. The simulated data from QTLMAS XII workshop and actual data on large yellow croaker were used to compare the predictive results of fBayesB, fBayesC and (MCMC-based) BayesC. The results showed that when γ was set as a small value, such as 0.01 in the simulated data or 0.001 in the actual data, fBayesB and fBayesC yielded lower prediction accuracies (abilities) than BayesC. In the actual data, fBayesC could yield very similar predictive abilities as BayesC when γ ≥ 0.01. When γ = 0.01, fBayesB could also yield similar results as fBayesC and BayesC. However, fBayesB could not yield an explicit result when γ ≥ 0.1, but a similar situation was not observed for fBayesC. Moreover, the computational speed of fBayesC was significantly faster than that of BayesC, making fBayesC a promising method for genomic prediction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weaver, Brian Phillip
The purpose of this document is to describe the statistical modeling effort for gas concentrations in WIPP storage containers. The concentration (in ppm) of CO 2 in the headspace volume of standard waste box (SWB) 68685 is shown. A Bayesian approach and an adaptive Metropolis-Hastings algorithm were used.
Using the Detectability Index to Predict P300 Speller Performance
Mainsah, B.O.; Collins, L.M.; Throckmorton, C.S.
2017-01-01
Objective The P300 speller is a popular brain-computer interface (BCI) system that has been investigated as a potential communication alternative for individuals with severe neuromuscular limitations. To achieve acceptable accuracy levels for communication, the system requires repeated data measurements in a given signal condition to enhance the signal-to-noise ratio of elicited brain responses. These elicited brain responses, which are used as control signals, are embedded in noisy electroencephalography (EEG) data. The discriminability between target and non-target EEG responses defines a user’s performance with the system. A previous P300 speller model has been proposed to estimate system accuracy given a certain amount of data collection. However, the approach was limited to a static stopping algorithm, i.e. averaging over a fixed number of measurements, and the row-column paradigm. A generalized method that is also applicable to dynamic stopping algorithms and other stimulus paradigms is desirable. Approach We developed a new probabilistic model-based approach to predicting BCI performance, where performance functions can be derived analytically or via Monte Carlo methods. Within this framework, we introduce a new model for the P300 speller with the Bayesian dynamic stopping (DS) algorithm, by simplifying a multi-hypothesis to a binary hypothesis problem using the likelihood ratio test. Under a normality assumption, the performance functions for the Bayesian algorithm can be parameterized with the detectability index, a measure which quantifies the discriminability between target and non-target EEG responses. Main results Simulations with synthetic and empirical data provided initial verification of the proposed method of estimating performance with Bayesian DS using the detectability index. Analysis of results from previous online studies validated the proposed method. Significance The proposed method could serve as a useful tool to initially asses BCI performance without extensive online testing, in order to estimate the amount of data required to achieve a desired accuracy level. PMID:27705956
Using the detectability index to predict P300 speller performance
NASA Astrophysics Data System (ADS)
Mainsah, B. O.; Collins, L. M.; Throckmorton, C. S.
2016-12-01
Objective. The P300 speller is a popular brain-computer interface (BCI) system that has been investigated as a potential communication alternative for individuals with severe neuromuscular limitations. To achieve acceptable accuracy levels for communication, the system requires repeated data measurements in a given signal condition to enhance the signal-to-noise ratio of elicited brain responses. These elicited brain responses, which are used as control signals, are embedded in noisy electroencephalography (EEG) data. The discriminability between target and non-target EEG responses defines a user’s performance with the system. A previous P300 speller model has been proposed to estimate system accuracy given a certain amount of data collection. However, the approach was limited to a static stopping algorithm, i.e. averaging over a fixed number of measurements, and the row-column paradigm. A generalized method that is also applicable to dynamic stopping (DS) algorithms and other stimulus paradigms is desirable. Approach. We developed a new probabilistic model-based approach to predicting BCI performance, where performance functions can be derived analytically or via Monte Carlo methods. Within this framework, we introduce a new model for the P300 speller with the Bayesian DS algorithm, by simplifying a multi-hypothesis to a binary hypothesis problem using the likelihood ratio test. Under a normality assumption, the performance functions for the Bayesian algorithm can be parameterized with the detectability index, a measure which quantifies the discriminability between target and non-target EEG responses. Main results. Simulations with synthetic and empirical data provided initial verification of the proposed method of estimating performance with Bayesian DS using the detectability index. Analysis of results from previous online studies validated the proposed method. Significance. The proposed method could serve as a useful tool to initially assess BCI performance without extensive online testing, in order to estimate the amount of data required to achieve a desired accuracy level.
A Bayesian Multinomial Probit MODEL FOR THE ANALYSIS OF PANEL CHOICE DATA.
Fong, Duncan K H; Kim, Sunghoon; Chen, Zhe; DeSarbo, Wayne S
2016-03-01
A new Bayesian multinomial probit model is proposed for the analysis of panel choice data. Using a parameter expansion technique, we are able to devise a Markov Chain Monte Carlo algorithm to compute our Bayesian estimates efficiently. We also show that the proposed procedure enables the estimation of individual level coefficients for the single-period multinomial probit model even when the available prior information is vague. We apply our new procedure to consumer purchase data and reanalyze a well-known scanner panel dataset that reveals new substantive insights. In addition, we delineate a number of advantageous features of our proposed procedure over several benchmark models. Finally, through a simulation analysis employing a fractional factorial design, we demonstrate that the results from our proposed model are quite robust with respect to differing factors across various conditions.
Evolution in Mind: Evolutionary Dynamics, Cognitive Processes, and Bayesian Inference.
Suchow, Jordan W; Bourgin, David D; Griffiths, Thomas L
2017-07-01
Evolutionary theory describes the dynamics of population change in settings affected by reproduction, selection, mutation, and drift. In the context of human cognition, evolutionary theory is most often invoked to explain the origins of capacities such as language, metacognition, and spatial reasoning, framing them as functional adaptations to an ancestral environment. However, evolutionary theory is useful for understanding the mind in a second way: as a mathematical framework for describing evolving populations of thoughts, ideas, and memories within a single mind. In fact, deep correspondences exist between the mathematics of evolution and of learning, with perhaps the deepest being an equivalence between certain evolutionary dynamics and Bayesian inference. This equivalence permits reinterpretation of evolutionary processes as algorithms for Bayesian inference and has relevance for understanding diverse cognitive capacities, including memory and creativity. Copyright © 2017 Elsevier Ltd. All rights reserved.
[Bayesian approach for the cost-effectiveness evaluation of healthcare technologies].
Berchialla, Paola; Gregori, Dario; Brunello, Franco; Veltri, Andrea; Petrinco, Michele; Pagano, Eva
2009-01-01
The development of Bayesian statistical methods for the assessment of the cost-effectiveness of health care technologies is reviewed. Although many studies adopt a frequentist approach, several authors have advocated the use of Bayesian methods in health economics. Emphasis has been placed on the advantages of the Bayesian approach, which include: (i) the ability to make more intuitive and meaningful inferences; (ii) the ability to tackle complex problems, such as allowing for the inclusion of patients who generate no cost, thanks to the availability of powerful computational algorithms; (iii) the importance of a full use of quantitative and structural prior information to produce realistic inferences. Much literature comparing the cost-effectiveness of two treatments is based on the incremental cost-effectiveness ratio. However, new methods are arising with the purpose of decision making. These methods are based on a net benefits approach. In the present context, the cost-effectiveness acceptability curves have been pointed out to be intrinsically Bayesian in their formulation. They plot the probability of a positive net benefit against the threshold cost of a unit increase in efficacy.A case study is presented in order to illustrate the Bayesian statistics in the cost-effectiveness analysis. Emphasis is placed on the cost-effectiveness acceptability curves. Advantages and disadvantages of the method described in this paper have been compared to frequentist methods and discussed.
Probability, statistics, and computational science.
Beerenwinkel, Niko; Siebourg, Juliane
2012-01-01
In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.
Bayesian denoising in digital radiography: a comparison in the dental field.
Frosio, I; Olivieri, C; Lucchese, M; Borghese, N A; Boccacci, P
2013-01-01
We compared two Bayesian denoising algorithms for digital radiographs, based on Total Variation regularization and wavelet decomposition. The comparison was performed on simulated radiographs with different photon counts and frequency content and on real dental radiographs. Four different quality indices were considered to quantify the quality of the filtered radiographs. The experimental results suggested that Total Variation is more suited to preserve fine anatomical details, whereas wavelets produce images of higher quality at global scale; they also highlighted the need for more reliable image quality indices. Copyright © 2012 Elsevier Ltd. All rights reserved.
Variational Gaussian approximation for Poisson data
NASA Astrophysics Data System (ADS)
Arridge, Simon R.; Ito, Kazufumi; Jin, Bangti; Zhang, Chen
2018-02-01
The Poisson model is frequently employed to describe count data, but in a Bayesian context it leads to an analytically intractable posterior probability distribution. In this work, we analyze a variational Gaussian approximation to the posterior distribution arising from the Poisson model with a Gaussian prior. This is achieved by seeking an optimal Gaussian distribution minimizing the Kullback-Leibler divergence from the posterior distribution to the approximation, or equivalently maximizing the lower bound for the model evidence. We derive an explicit expression for the lower bound, and show the existence and uniqueness of the optimal Gaussian approximation. The lower bound functional can be viewed as a variant of classical Tikhonov regularization that penalizes also the covariance. Then we develop an efficient alternating direction maximization algorithm for solving the optimization problem, and analyze its convergence. We discuss strategies for reducing the computational complexity via low rank structure of the forward operator and the sparsity of the covariance. Further, as an application of the lower bound, we discuss hierarchical Bayesian modeling for selecting the hyperparameter in the prior distribution, and propose a monotonically convergent algorithm for determining the hyperparameter. We present extensive numerical experiments to illustrate the Gaussian approximation and the algorithms.
Fuzzy CMAC With incremental Bayesian Ying-Yang learning and dynamic rule construction.
Nguyen, M N
2010-04-01
Inspired by the philosophy of ancient Chinese Taoism, Xu's Bayesian ying-yang (BYY) learning technique performs clustering by harmonizing the training data (yang) with the solution (ying). In our previous work, the BYY learning technique was applied to a fuzzy cerebellar model articulation controller (FCMAC) to find the optimal fuzzy sets; however, this is not suitable for time series data analysis. To address this problem, we propose an incremental BYY learning technique in this paper, with the idea of sliding window and rule structure dynamic algorithms. Three contributions are made as a result of this research. First, an online expectation-maximization algorithm incorporated with the sliding window is proposed for the fuzzification phase. Second, the memory requirement is greatly reduced since the entire data set no longer needs to be obtained during the prediction process. Third, the rule structure dynamic algorithm with dynamically initializing, recruiting, and pruning rules relieves the "curse of dimensionality" problem that is inherent in the FCMAC. Because of these features, the experimental results of the benchmark data sets of currency exchange rates and Mackey-Glass show that the proposed model is more suitable for real-time streaming data analysis.
Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM.
López, J D; Litvak, V; Espinosa, J J; Friston, K; Barnes, G R
2014-01-01
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (Minimum Norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy-an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardise the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational Free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm. © 2013. Published by Elsevier Inc. All rights reserved.
Semiparametric modeling: Correcting low-dimensional model error in parametric models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berry, Tyrus, E-mail: thb11@psu.edu; Harlim, John, E-mail: jharlim@psu.edu; Department of Meteorology, the Pennsylvania State University, 503 Walker Building, University Park, PA 16802-5013
2016-03-01
In this paper, a semiparametric modeling approach is introduced as a paradigm for addressing model error arising from unresolved physical phenomena. Our approach compensates for model error by learning an auxiliary dynamical model for the unknown parameters. Practically, the proposed approach consists of the following steps. Given a physics-based model and a noisy data set of historical observations, a Bayesian filtering algorithm is used to extract a time-series of the parameter values. Subsequently, the diffusion forecast algorithm is applied to the retrieved time-series in order to construct the auxiliary model for the time evolving parameters. The semiparametric forecasting algorithm consistsmore » of integrating the existing physics-based model with an ensemble of parameters sampled from the probability density function of the diffusion forecast. To specify initial conditions for the diffusion forecast, a Bayesian semiparametric filtering method that extends the Kalman-based filtering framework is introduced. In difficult test examples, which introduce chaotically and stochastically evolving hidden parameters into the Lorenz-96 model, we show that our approach can effectively compensate for model error, with forecasting skill comparable to that of the perfect model.« less
Pisharady, Pramod Kumar; Duarte-Carvajalino, Julio M; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe
2017-01-01
The RubiX [1] algorithm combines high SNR characteristics of low resolution data with high spacial specificity of high resolution data, to extract microstructural tissue parameters from diffusion MRI. In this paper we focus on estimating crossing fiber orientations and introduce sparsity to the RubiX algorithm, making it suitable for reconstruction from compressed (under-sampled) data. We propose a sparse Bayesian algorithm for estimation of fiber orientations and volume fractions from compressed diffusion MRI. The data at high resolution is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible directions. Volume fractions of fibers along these orientations define the dictionary weights. The data at low resolution is modeled using a spatial partial volume representation. The proposed dictionary representation and sparsity priors consider the dependence between fiber orientations and the spatial redundancy in data representation. Our method exploits the sparsity of fiber orientations, therefore facilitating inference from under-sampled data. Experimental results show improved accuracy and decreased uncertainty in fiber orientation estimates. For under-sampled data, the proposed method is also shown to produce more robust estimates of fiber orientations. PMID:28845484
Ramanujam, Nedunchelian; Kaliappan, Manivannan
2016-01-01
Nowadays, automatic multidocument text summarization systems can successfully retrieve the summary sentences from the input documents. But, it has many limitations such as inaccurate extraction to essential sentences, low coverage, poor coherence among the sentences, and redundancy. This paper introduces a new concept of timestamp approach with Naïve Bayesian Classification approach for multidocument text summarization. The timestamp provides the summary an ordered look, which achieves the coherent looking summary. It extracts the more relevant information from the multiple documents. Here, scoring strategy is also used to calculate the score for the words to obtain the word frequency. The higher linguistic quality is estimated in terms of readability and comprehensibility. In order to show the efficiency of the proposed method, this paper presents the comparison between the proposed methods with the existing MEAD algorithm. The timestamp procedure is also applied on the MEAD algorithm and the results are examined with the proposed method. The results show that the proposed method results in lesser time than the existing MEAD algorithm to execute the summarization process. Moreover, the proposed method results in better precision, recall, and F-score than the existing clustering with lexical chaining approach. PMID:27034971
MULTINEST: an efficient and robust Bayesian inference tool for cosmology and particle physics
NASA Astrophysics Data System (ADS)
Feroz, F.; Hobson, M. P.; Bridges, M.
2009-10-01
We present further development and the first public release of our multimodal nested sampling algorithm, called MULTINEST. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson, which itself significantly outperformed existing Markov chain Monte Carlo techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MULTINEST algorithm are demonstrated by application to two toy problems and to a cosmological inference problem focusing on the extension of the vanilla Λ cold dark matter model to include spatial curvature and a varying equation of state for dark energy. The MULTINEST software, which is fully parallelized using MPI and includes an interface to COSMOMC, is available at http://www.mrao.cam.ac.uk/software/multinest/. It will also be released as part of the SUPERBAYES package, for the analysis of supersymmetric theories of particle physics, at http://www.superbayes.org.
Optimization of the resources management in fighting wildfires.
Martin-Fernández, Susana; Martínez-Falero, Eugenio; Pérez-González, J Manuel
2002-09-01
Wildfires lead to important economic, social, and environmental losses, especially in areas of Mediterranean climate where they are of a high intensity and frequency. Over the past 30 years there has been a dramatic surge in the development and use of fire spread models. However, given the chaotic nature of environmental systems, it is very difficult to develop real-time fire-extinguishing models. This article proposes a method of optimizing the performance of wildfire fighting resources such that losses are kept to a minimum. The optimization procedure includes discrete simulation algorithms and Bayesian optimization methods for discrete and continuous problems (simulated annealing and Bayesian global optimization). Fast calculus algorithms are applied to provide optimization outcomes in short periods of time such that the predictions of the model and the real behavior of the fire, combat resources, and meteorological conditions are similar. In addition, adaptive algorithms take into account the chaotic behavior of wildfire so that the system can be updated with data corresponding to the real situation to obtain a new optimum solution. The application of this method to the Northwest Forest of Madrid (Spain) is also described. This application allowed us to check that it is a helpful tool in the decision-making process.
Bayesian inference on EMRI signals using low frequency approximations
NASA Astrophysics Data System (ADS)
Ali, Asad; Christensen, Nelson; Meyer, Renate; Röver, Christian
2012-07-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions.
Xiao, Hu; Cui, Rongxin; Xu, Demin
2018-06-01
This paper presents a cooperative multiagent search algorithm to solve the problem of searching for a target on a 2-D plane under multiple constraints. A Bayesian framework is used to update the local probability density functions (PDFs) of the target when the agents obtain observation information. To obtain the global PDF used for decision making, a sampling-based logarithmic opinion pool algorithm is proposed to fuse the local PDFs, and a particle sampling approach is used to represent the continuous PDF. Then the Gaussian mixture model (GMM) is applied to reconstitute the global PDF from the particles, and a weighted expectation maximization algorithm is presented to estimate the parameters of the GMM. Furthermore, we propose an optimization objective which aims to guide agents to find the target with less resource consumptions, and to keep the resource consumption of each agent balanced simultaneously. To this end, a utility function-based optimization problem is put forward, and it is solved by a gradient-based approach. Several contrastive simulations demonstrate that compared with other existing approaches, the proposed one uses less overall resources and shows a better performance of balancing the resource consumption.
Pisharady, Pramod Kumar; Duarte-Carvajalino, Julio M; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe
2015-10-01
The RubiX [1] algorithm combines high SNR characteristics of low resolution data with high spacial specificity of high resolution data, to extract microstructural tissue parameters from diffusion MRI. In this paper we focus on estimating crossing fiber orientations and introduce sparsity to the RubiX algorithm, making it suitable for reconstruction from compressed (under-sampled) data. We propose a sparse Bayesian algorithm for estimation of fiber orientations and volume fractions from compressed diffusion MRI. The data at high resolution is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible directions. Volume fractions of fibers along these orientations define the dictionary weights. The data at low resolution is modeled using a spatial partial volume representation. The proposed dictionary representation and sparsity priors consider the dependence between fiber orientations and the spatial redundancy in data representation. Our method exploits the sparsity of fiber orientations, therefore facilitating inference from under-sampled data. Experimental results show improved accuracy and decreased uncertainty in fiber orientation estimates. For under-sampled data, the proposed method is also shown to produce more robust estimates of fiber orientations.
Optimization of the Resources Management in Fighting Wildfires
NASA Astrophysics Data System (ADS)
Martin-Fernández, Susana; Martínez-Falero, Eugenio; Pérez-González, J. Manuel
2002-09-01
Wildfires lead to important economic, social, and environmental losses, especially in areas of Mediterranean climate where they are of a high intensity and frequency. Over the past 30 years there has been a dramatic surge in the development and use of fire spread models. However, given the chaotic nature of environmental systems, it is very difficult to develop real-time fire-extinguishing models. This article proposes a method of optimizing the performance of wildfire fighting resources such that losses are kept to a minimum. The optimization procedure includes discrete simulation algorithms and Bayesian optimization methods for discrete and continuous problems (simulated annealing and Bayesian global optimization). Fast calculus algorithms are applied to provide optimization outcomes in short periods of time such that the predictions of the model and the real behavior of the fire, combat resources, and meteorological conditions are similar. In addition, adaptive algorithms take into account the chaotic behavior of wildfire so that the system can be updated with data corresponding to the real situation to obtain a new optimum solution. The application of this method to the Northwest Forest of Madrid (Spain) is also described. This application allowed us to check that it is a helpful tool in the decision-making process.
NASA Technical Reports Server (NTRS)
Eckstein, Miguel P.; Abbey, Craig K.; Pham, Binh T.; Shimozaki, Steven S.
2004-01-01
Human performance in visual detection, discrimination, identification, and search tasks typically improves with practice. Psychophysical studies suggest that perceptual learning is mediated by an enhancement in the coding of the signal, and physiological studies suggest that it might be related to the plasticity in the weighting or selection of sensory units coding task relevant information (learning through attention optimization). We propose an experimental paradigm (optimal perceptual learning paradigm) to systematically study the dynamics of perceptual learning in humans by allowing comparisons to that of an optimal Bayesian algorithm and a number of suboptimal learning models. We measured improvement in human localization (eight-alternative forced-choice with feedback) performance of a target randomly sampled from four elongated Gaussian targets with different orientations and polarities and kept as a target for a block of four trials. The results suggest that the human perceptual learning can occur within a lapse of four trials (<1 min) but that human learning is slower and incomplete with respect to the optimal algorithm (23.3% reduction in human efficiency from the 1st-to-4th learning trials). The greatest improvement in human performance, occurring from the 1st-to-2nd learning trial, was also present in the optimal observer, and, thus reflects a property inherent to the visual task and not a property particular to the human perceptual learning mechanism. One notable source of human inefficiency is that, unlike the ideal observer, human learning relies more heavily on previous decisions than on the provided feedback, resulting in no human learning on trials following a previous incorrect localization decision. Finally, the proposed theory and paradigm provide a flexible framework for future studies to evaluate the optimality of human learning of other visual cues and/or sensory modalities.
Bayesian Nonlinear Assimilation of Eulerian and Lagrangian Coastal Flow Data
2015-09-30
Lagrangian Coastal Flow Data Dr. Pierre F.J. Lermusiaux Department of Mechanical Engineering Center for Ocean Science and Engineering Massachusetts...Develop and apply theory, schemes and computational systems for rigorous Bayesian nonlinear assimilation of Eulerian and Lagrangian coastal flow data...coastal ocean fields, both in Eulerian and Lagrangian forms. - Further develop and implement our GMM-DO schemes for robust Bayesian nonlinear estimation
Analytic continuation of quantum Monte Carlo data by stochastic analytical inference.
Fuchs, Sebastian; Pruschke, Thomas; Jarrell, Mark
2010-05-01
We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data which is strictly based on principles of Bayesian statistical inference. Within this framework we are able to obtain an explicit expression for the calculation of a weighted average over possible energy spectra, which can be evaluated by standard Monte Carlo simulations, yielding as by-product also the distribution function as function of the regularization parameter. Our algorithm thus avoids the usual ad hoc assumptions introduced in similar algorithms to fix the regularization parameter. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum-entropy calculation.
A Parallel and Incremental Approach for Data-Intensive Learning of Bayesian Networks.
Yue, Kun; Fang, Qiyu; Wang, Xiaoling; Li, Jin; Liu, Weiyi
2015-12-01
Bayesian network (BN) has been adopted as the underlying model for representing and inferring uncertain knowledge. As the basis of realistic applications centered on probabilistic inferences, learning a BN from data is a critical subject of machine learning, artificial intelligence, and big data paradigms. Currently, it is necessary to extend the classical methods for learning BNs with respect to data-intensive computing or in cloud environments. In this paper, we propose a parallel and incremental approach for data-intensive learning of BNs from massive, distributed, and dynamically changing data by extending the classical scoring and search algorithm and using MapReduce. First, we adopt the minimum description length as the scoring metric and give the two-pass MapReduce-based algorithms for computing the required marginal probabilities and scoring the candidate graphical model from sample data. Then, we give the corresponding strategy for extending the classical hill-climbing algorithm to obtain the optimal structure, as well as that for storing a BN by
Gunji, Yukio-Pegio; Shinohara, Shuji; Haruna, Taichi; Basios, Vasileios
2017-02-01
To overcome the dualism between mind and matter and to implement consciousness in science, a physical entity has to be embedded with a measurement process. Although quantum mechanics have been regarded as a candidate for implementing consciousness, nature at its macroscopic level is inconsistent with quantum mechanics. We propose a measurement-oriented inference system comprising Bayesian and inverse Bayesian inferences. While Bayesian inference contracts probability space, the newly defined inverse one relaxes the space. These two inferences allow an agent to make a decision corresponding to an immediate change in their environment. They generate a particular pattern of joint probability for data and hypotheses, comprising multiple diagonal and noisy matrices. This is expressed as a nondistributive orthomodular lattice equivalent to quantum logic. We also show that an orthomodular lattice can reveal information generated by inverse syllogism as well as the solutions to the frame and symbol-grounding problems. Our model is the first to connect macroscopic cognitive processes with the mathematical structure of quantum mechanics with no additional assumptions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Yu, Fang; Chen, Ming-Hui; Kuo, Lynn; Talbott, Heather; Davis, John S
2015-08-07
Recently, the Bayesian method becomes more popular for analyzing high dimensional gene expression data as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels. It is crucial to develop a simple but efficient gene selection algorithm for detecting differentially expressed (DE) genes based on the Bayesian estimators. In this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387-404), we propose two new gene selection algorithms for general Bayesian models and name these new methods as the confident difference criterion methods. One is based on the standardized differences between two mean expression values among genes; the other adds the differences between two variances to it. The proposed confident difference criterion methods first evaluate the posterior probability of a gene having different gene expressions between competitive samples and then declare a gene to be DE if the posterior probability is large. The theoretical connection between the proposed first method based on the means and the Bayes factor approach proposed by Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using alibrated Bayes factors. Statistica Sinica. 2008;18:783-802) is established under the normal-normal-model with equal variances between two samples. The empirical performance of the proposed methods is examined and compared to those of several existing methods via several simulations. The results from these simulation studies show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology. In the real data application, the confident difference criterion methods successfully identified more clinically important DE genes than the other methods. The confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.
Theory-based Bayesian Models of Inductive Inference
2010-07-19
Subjective randomness and natural scene statistics. Psychonomic Bulletin & Review . http://cocosci.berkeley.edu/tom/papers/randscenes.pdf Page 1...in press). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review . http://cocosci.berkeley.edu/tom
Bayesian learning of visual chunks by human observers
Orbán, Gergő; Fiser, József; Aslin, Richard N.; Lengyel, Máté
2008-01-01
Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input. PMID:18268353
Depaoli, Sarah
2013-06-01
Growth mixture modeling (GMM) represents a technique that is designed to capture change over time for unobserved subgroups (or latent classes) that exhibit qualitatively different patterns of growth. The aim of the current article was to explore the impact of latent class separation (i.e., how similar growth trajectories are across latent classes) on GMM performance. Several estimation conditions were compared: maximum likelihood via the expectation maximization (EM) algorithm and the Bayesian framework implementing diffuse priors, "accurate" informative priors, weakly informative priors, data-driven informative priors, priors reflecting partial-knowledge of parameters, and "inaccurate" (but informative) priors. The main goal was to provide insight about the optimal estimation condition under different degrees of latent class separation for GMM. Results indicated that optimal parameter recovery was obtained though the Bayesian approach using "accurate" informative priors, and partial-knowledge priors showed promise for the recovery of the growth trajectory parameters. Maximum likelihood and the remaining Bayesian estimation conditions yielded poor parameter recovery for the latent class proportions and the growth trajectories. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
Bayesian analysis of the flutter margin method in aeroelasticity
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-08-27
A Bayesian statistical framework is presented for Zimmerman and Weissenburger flutter margin method which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-square based estimation technique which relies on the Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussain posterior pdf of the modal parameters is sampled using the Metropolis–Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the fluttermore » speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. In conclusion, it will be shown that the probabilistic (Bayesian) approach reduces the number of test points required in providing a flutter speed estimate for a given accuracy and precision.« less
NASA Astrophysics Data System (ADS)
Smith, T.; Marshall, L.
2007-12-01
In many mountainous regions, the single most important parameter in forecasting the controls on regional water resources is snowpack (Williams et al., 1999). In an effort to bridge the gap between theoretical understanding and functional modeling of snow-driven watersheds, a flexible hydrologic modeling framework is being developed. The aim is to create a suite of models that move from parsimonious structures, concentrated on aggregated watershed response, to those focused on representing finer scale processes and distributed response. This framework will operate as a tool to investigate the link between hydrologic model predictive performance, uncertainty, model complexity, and observable hydrologic processes. Bayesian methods, and particularly Markov chain Monte Carlo (MCMC) techniques, are extremely useful in uncertainty assessment and parameter estimation of hydrologic models. However, these methods have some difficulties in implementation. In a traditional Bayesian setting, it can be difficult to reconcile multiple data types, particularly those offering different spatial and temporal coverage, depending on the model type. These difficulties are also exacerbated by sensitivity of MCMC algorithms to model initialization and complex parameter interdependencies. As a way of circumnavigating some of the computational complications, adaptive MCMC algorithms have been developed to take advantage of the information gained from each successive iteration. Two adaptive algorithms are compared is this study, the Adaptive Metropolis (AM) algorithm, developed by Haario et al (2001), and the Delayed Rejection Adaptive Metropolis (DRAM) algorithm, developed by Haario et al (2006). While neither algorithm is truly Markovian, it has been proven that each satisfies the desired ergodicity and stationarity properties of Markov chains. Both algorithms were implemented as the uncertainty and parameter estimation framework for a conceptual rainfall-runoff model based on the Probability Distributed Model (PDM), developed by Moore (1985). We implement the modeling framework in Stringer Creek watershed in the Tenderfoot Creek Experimental Forest (TCEF), Montana. The snowmelt-driven watershed offers that additional challenge of modeling snow accumulation and melt and current efforts are aimed at developing a temperature- and radiation-index snowmelt model. Auxiliary data available from within TCEF's watersheds are used to support in the understanding of information value as it relates to predictive performance. Because the model is based on lumped parameters, auxiliary data are hard to incorporate directly. However, these additional data offer benefits through the ability to inform prior distributions of the lumped, model parameters. By incorporating data offering different information into the uncertainty assessment process, a cross-validation technique is engaged to better ensure that modeled results reflect real process complexity.
Computer-aided diagnosis of lung nodule using gradient tree boosting and Bayesian optimization.
Nishio, Mizuho; Nishizawa, Mitsuo; Sugiyama, Osamu; Kojima, Ryosuke; Yakami, Masahiro; Kuroda, Tomohiro; Togashi, Kaori
2018-01-01
We aimed to evaluate a computer-aided diagnosis (CADx) system for lung nodule classification focussing on (i) usefulness of the conventional CADx system (hand-crafted imaging feature + machine learning algorithm), (ii) comparison between support vector machine (SVM) and gradient tree boosting (XGBoost) as machine learning algorithms, and (iii) effectiveness of parameter optimization using Bayesian optimization and random search. Data on 99 lung nodules (62 lung cancers and 37 benign lung nodules) were included from public databases of CT images. A variant of the local binary pattern was used for calculating a feature vector. SVM or XGBoost was trained using the feature vector and its corresponding label. Tree Parzen Estimator (TPE) was used as Bayesian optimization for parameters of SVM and XGBoost. Random search was done for comparison with TPE. Leave-one-out cross-validation was used for optimizing and evaluating the performance of our CADx system. Performance was evaluated using area under the curve (AUC) of receiver operating characteristic analysis. AUC was calculated 10 times, and its average was obtained. The best averaged AUC of SVM and XGBoost was 0.850 and 0.896, respectively; both were obtained using TPE. XGBoost was generally superior to SVM. Optimal parameters for achieving high AUC were obtained with fewer numbers of trials when using TPE, compared with random search. Bayesian optimization of SVM and XGBoost parameters was more efficient than random search. Based on observer study, AUC values of two board-certified radiologists were 0.898 and 0.822. The results show that diagnostic accuracy of our CADx system was comparable to that of radiologists with respect to classifying lung nodules.
A program for the Bayesian Neural Network in the ROOT framework
NASA Astrophysics Data System (ADS)
Zhong, Jiahang; Huang, Run-Sheng; Lee, Shih-Chang
2011-12-01
We present a Bayesian Neural Network algorithm implemented in the TMVA package (Hoecker et al., 2007 [1]), within the ROOT framework (Brun and Rademakers, 1997 [2]). Comparing to the conventional utilization of Neural Network as discriminator, this new implementation has more advantages as a non-parametric regression tool, particularly for fitting probabilities. It provides functionalities including cost function selection, complexity control and uncertainty estimation. An example of such application in High Energy Physics is shown. The algorithm is available with ROOT release later than 5.29. Program summaryProgram title: TMVA-BNN Catalogue identifier: AEJX_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEJX_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: BSD license No. of lines in distributed program, including test data, etc.: 5094 No. of bytes in distributed program, including test data, etc.: 1,320,987 Distribution format: tar.gz Programming language: C++ Computer: Any computer system or cluster with C++ compiler and UNIX-like operating system Operating system: Most UNIX/Linux systems. The application programs were thoroughly tested under Fedora and Scientific Linux CERN. Classification: 11.9 External routines: ROOT package version 5.29 or higher ( http://root.cern.ch) Nature of problem: Non-parametric fitting of multivariate distributions Solution method: An implementation of Neural Network following the Bayesian statistical interpretation. Uses Laplace approximation for the Bayesian marginalizations. Provides the functionalities of automatic complexity control and uncertainty estimation. Running time: Time consumption for the training depends substantially on the size of input sample, the NN topology, the number of training iterations, etc. For the example in this manuscript, about 7 min was used on a PC/Linux with 2.0 GHz processors.
Selectionist and Evolutionary Approaches to Brain Function: A Critical Appraisal
Fernando, Chrisantha; Szathmáry, Eörs; Husbands, Phil
2012-01-01
We consider approaches to brain dynamics and function that have been claimed to be Darwinian. These include Edelman’s theory of neuronal group selection, Changeux’s theory of synaptic selection and selective stabilization of pre-representations, Seung’s Darwinian synapse, Loewenstein’s synaptic melioration, Adam’s selfish synapse, and Calvin’s replicating activity patterns. Except for the last two, the proposed mechanisms are selectionist but not truly Darwinian, because no replicators with information transfer to copies and hereditary variation can be identified in them. All of them fit, however, a generalized selectionist framework conforming to the picture of Price’s covariance formulation, which deliberately was not specific even to selection in biology, and therefore does not imply an algorithmic picture of biological evolution. Bayesian models and reinforcement learning are formally in agreement with selection dynamics. A classification of search algorithms is shown to include Darwinian replicators (evolutionary units with multiplication, heredity, and variability) as the most powerful mechanism for search in a sparsely occupied search space. Examples are given of cases where parallel competitive search with information transfer among the units is more efficient than search without information transfer between units. Finally, we review our recent attempts to construct and analyze simple models of true Darwinian evolutionary units in the brain in terms of connectivity and activity copying of neuronal groups. Although none of the proposed neuronal replicators include miraculous mechanisms, their identification remains a challenge but also a great promise. PMID:22557963
van de Schoot, Rens; Broere, Joris J.; Perryck, Koen H.; Zondervan-Zwijnenburg, Mariëlle; van Loey, Nancy E.
2015-01-01
Background The analysis of small data sets in longitudinal studies can lead to power issues and often suffers from biased parameter values. These issues can be solved by using Bayesian estimation in conjunction with informative prior distributions. By means of a simulation study and an empirical example concerning posttraumatic stress symptoms (PTSS) following mechanical ventilation in burn survivors, we demonstrate the advantages and potential pitfalls of using Bayesian estimation. Methods First, we show how to specify prior distributions and by means of a sensitivity analysis we demonstrate how to check the exact influence of the prior (mis-) specification. Thereafter, we show by means of a simulation the situations in which the Bayesian approach outperforms the default, maximum likelihood and approach. Finally, we re-analyze empirical data on burn survivors which provided preliminary evidence of an aversive influence of a period of mechanical ventilation on the course of PTSS following burns. Results Not suprisingly, maximum likelihood estimation showed insufficient coverage as well as power with very small samples. Only when Bayesian analysis, in conjunction with informative priors, was used power increased to acceptable levels. As expected, we showed that the smaller the sample size the more the results rely on the prior specification. Conclusion We show that two issues often encountered during analysis of small samples, power and biased parameters, can be solved by including prior information into Bayesian analysis. We argue that the use of informative priors should always be reported together with a sensitivity analysis. PMID:25765534
van de Schoot, Rens; Broere, Joris J; Perryck, Koen H; Zondervan-Zwijnenburg, Mariëlle; van Loey, Nancy E
2015-01-01
Background : The analysis of small data sets in longitudinal studies can lead to power issues and often suffers from biased parameter values. These issues can be solved by using Bayesian estimation in conjunction with informative prior distributions. By means of a simulation study and an empirical example concerning posttraumatic stress symptoms (PTSS) following mechanical ventilation in burn survivors, we demonstrate the advantages and potential pitfalls of using Bayesian estimation. Methods : First, we show how to specify prior distributions and by means of a sensitivity analysis we demonstrate how to check the exact influence of the prior (mis-) specification. Thereafter, we show by means of a simulation the situations in which the Bayesian approach outperforms the default, maximum likelihood and approach. Finally, we re-analyze empirical data on burn survivors which provided preliminary evidence of an aversive influence of a period of mechanical ventilation on the course of PTSS following burns. Results : Not suprisingly, maximum likelihood estimation showed insufficient coverage as well as power with very small samples. Only when Bayesian analysis, in conjunction with informative priors, was used power increased to acceptable levels. As expected, we showed that the smaller the sample size the more the results rely on the prior specification. Conclusion : We show that two issues often encountered during analysis of small samples, power and biased parameters, can be solved by including prior information into Bayesian analysis. We argue that the use of informative priors should always be reported together with a sensitivity analysis.
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
A Bayesian sequential design using alpha spending function to control type I error.
Zhu, Han; Yu, Qingzhao
2017-10-01
We propose in this article a Bayesian sequential design using alpha spending functions to control the overall type I error in phase III clinical trials. We provide algorithms to calculate critical values, power, and sample sizes for the proposed design. Sensitivity analysis is implemented to check the effects from different prior distributions, and conservative priors are recommended. We compare the power and actual sample sizes of the proposed Bayesian sequential design with different alpha spending functions through simulations. We also compare the power of the proposed method with frequentist sequential design using the same alpha spending function. Simulations show that, at the same sample size, the proposed method provides larger power than the corresponding frequentist sequential design. It also has larger power than traditional Bayesian sequential design which sets equal critical values for all interim analyses. When compared with other alpha spending functions, O'Brien-Fleming alpha spending function has the largest power and is the most conservative in terms that at the same sample size, the null hypothesis is the least likely to be rejected at early stage of clinical trials. And finally, we show that adding a step of stop for futility in the Bayesian sequential design can reduce the overall type I error and reduce the actual sample sizes.
Distributed Estimation using Bayesian Consensus Filtering
2014-06-06
Convergence rate analysis of distributed gossip (linear parameter) estimation: Fundamental limits and tradeoffs,” IEEE J. Sel. Topics Signal Process...Dimakis, S. Kar, J. Moura, M. Rabbat, and A. Scaglione, “ Gossip algorithms for distributed signal processing,” Proc. of the IEEE, vol. 98, no. 11, pp
Using Stan for Item Response Theory Models
ERIC Educational Resources Information Center
Ames, Allison J.; Au, Chi Hang
2018-01-01
Stan is a flexible probabilistic programming language providing full Bayesian inference through Hamiltonian Monte Carlo algorithms. The benefits of Hamiltonian Monte Carlo include improved efficiency and faster inference, when compared to other MCMC software implementations. Users can interface with Stan through a variety of computing…
Application of Monte Carlo algorithms to the Bayesian analysis of the Cosmic Microwave Background
NASA Technical Reports Server (NTRS)
Jewell, J.; Levin, S.; Anderson, C. H.
2004-01-01
Power spectrum estimation and evaluation of associated errors in the presence of incomplete sky coverage; nonhomogeneous, correlated instrumental noise; and foreground emission are problems of central importance for the extraction of cosmological information from the cosmic microwave background (CMB).
Bayesian inversion analysis of nonlinear dynamics in surface heterogeneous reactions.
Omori, Toshiaki; Kuwatani, Tatsu; Okamoto, Atsushi; Hukushima, Koji
2016-09-01
It is essential to extract nonlinear dynamics from time-series data as an inverse problem in natural sciences. We propose a Bayesian statistical framework for extracting nonlinear dynamics of surface heterogeneous reactions from sparse and noisy observable data. Surface heterogeneous reactions are chemical reactions with conjugation of multiple phases, and they have the intrinsic nonlinearity of their dynamics caused by the effect of surface-area between different phases. We adapt a belief propagation method and an expectation-maximization (EM) algorithm to partial observation problem, in order to simultaneously estimate the time course of hidden variables and the kinetic parameters underlying dynamics. The proposed belief propagation method is performed by using sequential Monte Carlo algorithm in order to estimate nonlinear dynamical system. Using our proposed method, we show that the rate constants of dissolution and precipitation reactions, which are typical examples of surface heterogeneous reactions, as well as the temporal changes of solid reactants and products, were successfully estimated only from the observable temporal changes in the concentration of the dissolved intermediate product.
NASA Astrophysics Data System (ADS)
Yuksel, Kivanc; Chang, Xin; Skarbek, Władysław
2017-08-01
The novel smile recognition algorithm is presented based on extraction of 68 facial salient points (fp68) using the ensemble of regression trees. The smile detector exploits the Support Vector Machine linear model. It is trained with few hundreds exemplar images by SVM algorithm working in 136 dimensional space. It is shown by the strict statistical data analysis that such geometric detector strongly depends on the geometry of mouth opening area, measured by triangulation of outer lip contour. To this goal two Bayesian detectors were developed and compared with SVM detector. The first uses the mouth area in 2D image, while the second refers to the mouth area in 3D animated face model. The 3D modeling is based on Candide-3 model and it is performed in real time along with three smile detectors and statistics estimators. The mouth area/Bayesian detectors exhibit high correlation with fp68/SVM detector in a range [0:8; 1:0], depending mainly on light conditions and individual features with advantage of 3D technique, especially in hard light conditions.
Data analysis using scale-space filtering and Bayesian probabilistic reasoning
NASA Technical Reports Server (NTRS)
Kulkarni, Deepak; Kutulakos, Kiriakos; Robinson, Peter
1991-01-01
This paper describes a program for analysis of output curves from Differential Thermal Analyzer (DTA). The program first extracts probabilistic qualitative features from a DTA curve of a soil sample, and then uses Bayesian probabilistic reasoning to infer the mineral in the soil. The qualifier module employs a simple and efficient extension of scale-space filtering suitable for handling DTA data. We have observed that points can vanish from contours in the scale-space image when filtering operations are not highly accurate. To handle the problem of vanishing points, perceptual organizations heuristics are used to group the points into lines. Next, these lines are grouped into contours by using additional heuristics. Probabilities are associated with these contours using domain-specific correlations. A Bayes tree classifier processes probabilistic features to infer the presence of different minerals in the soil. Experiments show that the algorithm that uses domain-specific correlation to infer qualitative features outperforms a domain-independent algorithm that does not.
Time-reversal and Bayesian inversion
NASA Astrophysics Data System (ADS)
Debski, Wojciech
2017-04-01
Probabilistic inversion technique is superior to the classical optimization-based approach in all but one aspects. It requires quite exhaustive computations which prohibit its use in huge size inverse problems like global seismic tomography or waveform inversion to name a few. The advantages of the approach are, however, so appealing that there is an ongoing continuous afford to make the large inverse task as mentioned above manageable with the probabilistic inverse approach. One of the perspective possibility to achieve this goal relays on exploring the internal symmetry of the seismological modeling problems in hand - a time reversal and reciprocity invariance. This two basic properties of the elastic wave equation when incorporating into the probabilistic inversion schemata open a new horizons for Bayesian inversion. In this presentation we discuss the time reversal symmetry property, its mathematical aspects and propose how to combine it with the probabilistic inverse theory into a compact, fast inversion algorithm. We illustrate the proposed idea with the newly developed location algorithm TRMLOC and discuss its efficiency when applied to mining induced seismic data.
Active sensing in the categorization of visual patterns
Yang, Scott Cheng-Hsin; Lengyel, Máté; Wolpert, Daniel M
2016-01-01
Interpreting visual scenes typically requires us to accumulate information from multiple locations in a scene. Using a novel gaze-contingent paradigm in a visual categorization task, we show that participants' scan paths follow an active sensing strategy that incorporates information already acquired about the scene and knowledge of the statistical structure of patterns. Intriguingly, categorization performance was markedly improved when locations were revealed to participants by an optimal Bayesian active sensor algorithm. By using a combination of a Bayesian ideal observer and the active sensor algorithm, we estimate that a major portion of this apparent suboptimality of fixation locations arises from prior biases, perceptual noise and inaccuracies in eye movements, and the central process of selecting fixation locations is around 70% efficient in our task. Our results suggest that participants select eye movements with the goal of maximizing information about abstract categories that require the integration of information from multiple locations. DOI: http://dx.doi.org/10.7554/eLife.12215.001 PMID:26880546
Denoising, deconvolving, and decomposing photon observations. Derivation of the D3PO algorithm
NASA Astrophysics Data System (ADS)
Selig, Marco; Enßlin, Torsten A.
2015-02-01
The analysis of astronomical images is a non-trivial task. The D3PO algorithm addresses the inference problem of denoising, deconvolving, and decomposing photon observations. Its primary goal is the simultaneous but individual reconstruction of the diffuse and point-like photon flux given a single photon count image, where the fluxes are superimposed. In order to discriminate between these morphologically different signal components, a probabilistic algorithm is derived in the language of information field theory based on a hierarchical Bayesian parameter model. The signal inference exploits prior information on the spatial correlation structure of the diffuse component and the brightness distribution of the spatially uncorrelated point-like sources. A maximum a posteriori solution and a solution minimizing the Gibbs free energy of the inference problem using variational Bayesian methods are discussed. Since the derivation of the solution is not dependent on the underlying position space, the implementation of the D3PO algorithm uses the nifty package to ensure applicability to various spatial grids and at any resolution. The fidelity of the algorithm is validated by the analysis of simulated data, including a realistic high energy photon count image showing a 32 × 32 arcmin2 observation with a spatial resolution of 0.1 arcmin. In all tests the D3PO algorithm successfully denoised, deconvolved, and decomposed the data into a diffuse and a point-like signal estimate for the respective photon flux components. A copy of the code is available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/574/A74
Estimating the hatchery fraction of a natural population: a Bayesian approach
Barber, Jarrett J.; Gerow, Kenneth G.; Connolly, Patrick J.; Singh, Sarabdeep
2011-01-01
There is strong and growing interest in estimating the proportion of hatchery fish that are in a natural population (the hatchery fraction). In a sample of fish from the relevant population, some are observed to be marked, indicating their origin as hatchery fish. The observed proportion of marked fish is usually less than the actual hatchery fraction, since the observed proportion is determined by the proportion originally marked, differential survival (usually lower) of marked fish relative to unmarked hatchery fish, and rates of mark retention and detection. Bayesian methods can work well in a setting such as this, in which empirical data are limited but for which there may be considerable expert judgment regarding these values. We explored a Bayesian estimation of the hatchery fraction using Monte Carlo–Markov chain methods. Based on our findings, we created an interactive Excel tool to implement the algorithm, which we have made available for free.
NASA Astrophysics Data System (ADS)
Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.
2017-12-01
The basic model for the recognition of natural and anthropogenic objects using their spectral and textural features is described in the problem of hyperspectral air-borne and space-borne imagery processing. The model is based on improvements of the Bayesian classifier that is a computational procedure of statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on the basis of empirical orthogonal functions. Application examples are shown of various modifications of the Bayesian classifier and Support Vector Machine method. Examples are provided of comparing these classifiers and a metrical classifier that operates on finding the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the " K-weighted neighbors" method that is close to the nonparametric Bayesian classifier.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Ray -Bing; Wang, Weichung; Jeff Wu, C. F.
A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction ofmore » sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. As a result, numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.« less
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
Bayesian Peak Picking for NMR Spectra
Cheng, Yichen; Gao, Xin; Liang, Faming
2013-01-01
Protein structure determination is a very important topic in structural genomics, which helps people to understand varieties of biological functions such as protein-protein interactions, protein–DNA interactions and so on. Nowadays, nuclear magnetic resonance (NMR) has often been used to determine the three-dimensional structures of protein in vivo. This study aims to automate the peak picking step, the most important and tricky step in NMR structure determination. We propose to model the NMR spectrum by a mixture of bivariate Gaussian densities and use the stochastic approximation Monte Carlo algorithm as the computational tool to solve the problem. Under the Bayesian framework, the peak picking problem is casted as a variable selection problem. The proposed method can automatically distinguish true peaks from false ones without preprocessing the data. To the best of our knowledge, this is the first effort in the literature that tackles the peak picking problem for NMR spectrum data using Bayesian method. PMID:24184964
Bayesian Cue Integration as a Developmental Outcome of Reward Mediated Learning
Weisswange, Thomas H.; Rothkopf, Constantin A.; Rodemann, Tobias; Triesch, Jochen
2011-01-01
Average human behavior in cue combination tasks is well predicted by Bayesian inference models. As this capability is acquired over developmental timescales, the question arises, how it is learned. Here we investigated whether reward dependent learning, that is well established at the computational, behavioral, and neuronal levels, could contribute to this development. It is shown that a model free reinforcement learning algorithm can indeed learn to do cue integration, i.e. weight uncertain cues according to their respective reliabilities and even do so if reliabilities are changing. We also consider the case of causal inference where multimodal signals can originate from one or multiple separate objects and should not always be integrated. In this case, the learner is shown to develop a behavior that is closest to Bayesian model averaging. We conclude that reward mediated learning could be a driving force for the development of cue integration and causal inference. PMID:21750717
Sparse Bayesian learning for DOA estimation with mutual coupling.
Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi
2015-10-16
Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
A new method for E-government procurement using collaborative filtering and Bayesian approach.
Zhang, Shuai; Xi, Chengyu; Wang, Yan; Zhang, Wenyu; Chen, Yanhong
2013-01-01
Nowadays, as the Internet services increase faster than ever before, government systems are reinvented as E-government services. Therefore, government procurement sectors have to face challenges brought by the explosion of service information. This paper presents a novel method for E-government procurement (eGP) to search for the optimal procurement scheme (OPS). Item-based collaborative filtering and Bayesian approach are used to evaluate and select the candidate services to get the top-M recommendations such that the involved computation load can be alleviated. A trapezoidal fuzzy number similarity algorithm is applied to support the item-based collaborative filtering and Bayesian approach, since some of the services' attributes can be hardly expressed as certain and static values but only be easily represented as fuzzy values. A prototype system is built and validated with an illustrative example from eGP to confirm the feasibility of our approach.
A New Method for E-Government Procurement Using Collaborative Filtering and Bayesian Approach
Wang, Yan
2013-01-01
Nowadays, as the Internet services increase faster than ever before, government systems are reinvented as E-government services. Therefore, government procurement sectors have to face challenges brought by the explosion of service information. This paper presents a novel method for E-government procurement (eGP) to search for the optimal procurement scheme (OPS). Item-based collaborative filtering and Bayesian approach are used to evaluate and select the candidate services to get the top-M recommendations such that the involved computation load can be alleviated. A trapezoidal fuzzy number similarity algorithm is applied to support the item-based collaborative filtering and Bayesian approach, since some of the services' attributes can be hardly expressed as certain and static values but only be easily represented as fuzzy values. A prototype system is built and validated with an illustrative example from eGP to confirm the feasibility of our approach. PMID:24385869
Modular analysis of the probabilistic genetic interaction network.
Hou, Lin; Wang, Lin; Qian, Minping; Li, Dong; Tang, Chao; Zhu, Yunping; Deng, Minghua; Li, Fangting
2011-03-15
Epistatic Miniarray Profiles (EMAP) has enabled the mapping of large-scale genetic interaction networks; however, the quantitative information gained from EMAP cannot be fully exploited since the data are usually interpreted as a discrete network based on an arbitrary hard threshold. To address such limitations, we adopted a mixture modeling procedure to construct a probabilistic genetic interaction network and then implemented a Bayesian approach to identify densely interacting modules in the probabilistic network. Mixture modeling has been demonstrated as an effective soft-threshold technique of EMAP measures. The Bayesian approach was applied to an EMAP dataset studying the early secretory pathway in Saccharomyces cerevisiae. Twenty-seven modules were identified, and 14 of those were enriched by gold standard functional gene sets. We also conducted a detailed comparison with state-of-the-art algorithms, hierarchical cluster and Markov clustering. The experimental results show that the Bayesian approach outperforms others in efficiently recovering biologically significant modules.
Chen, Ray -Bing; Wang, Weichung; Jeff Wu, C. F.
2017-04-12
A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction ofmore » sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. As a result, numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.« less
Bayesian Analysis of the Power Spectrum of the Cosmic Microwave Background
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey B.; Eriksen, H. K.; O'Dwyer, I. J.; Wandelt, B. D.
2005-01-01
There is a wealth of cosmological information encoded in the spatial power spectrum of temperature anisotropies of the cosmic microwave background. The sky, when viewed in the microwave, is very uniform, with a nearly perfect blackbody spectrum at 2.7 degrees. Very small amplitude brightness fluctuations (to one part in a million!!) trace small density perturbations in the early universe (roughly 300,000 years after the Big Bang), which later grow through gravitational instability to the large-scale structure seen in redshift surveys... In this talk, I will discuss a Bayesian formulation of this problem; discuss a Gibbs sampling approach to numerically sampling from the Bayesian posterior, and the application of this approach to the first-year data from the Wilkinson Microwave Anisotropy Probe. I will also comment on recent algorithmic developments for this approach to be tractable for the even more massive data set to be returned from the Planck satellite.
NASA Astrophysics Data System (ADS)
Christley, Scott; Emr, Bryanna; Ghosh, Auyon; Satalin, Josh; Gatto, Louis; Vodovotz, Yoram; Nieman, Gary F.; An, Gary
2013-06-01
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the alveolar ducts. These two model solutions correspond to significantly different mechanical properties of the tissue, and we discuss the implications of these different properties and the requirements for new experimental data to discriminate between the hypotheses.
Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination.
Zhao, Qibin; Zhang, Liqing; Cichocki, Andrzej
2015-09-01
CANDECOMP/PARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified, however, the determination of tensor rank remains a challenging problem especially for CP rank . In addition, existing approaches do not take into account uncertainty information of latent factors, as well as missing entries. To address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. To learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth of CP rank and prevent the overfitting problem, even when a large amount of entries are missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance.
NASA Astrophysics Data System (ADS)
Lu, Dan; Ricciuto, Daniel; Walker, Anthony; Safta, Cosmin; Munger, William
2017-09-01
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. The result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
Adaptive selection and validation of models of complex systems in the presence of uncertainty
DOE Office of Scientific and Technical Information (OSTI.GOV)
Farrell-Maupin, Kathryn; Oden, J. T.
This study describes versions of OPAL, the Occam-Plausibility Algorithm in which the use of Bayesian model plausibilities is replaced with information theoretic methods, such as the Akaike Information Criterion and the Bayes Information Criterion. Applications to complex systems of coarse-grained molecular models approximating atomistic models of polyethylene materials are described. All of these model selection methods take into account uncertainties in the model, the observational data, the model parameters, and the predicted quantities of interest. A comparison of the models chosen by Bayesian model selection criteria and those chosen by the information-theoretic criteria is given.
Bayesian identification of acoustic impedance in treated ducts.
Buot de l'Épine, Y; Chazot, J-D; Ville, J-M
2015-07-01
The noise reduction of a liner placed in the nacelle of a turbofan engine is still difficult to predict due to the lack of knowledge of its acoustic impedance that depends on grazing flow profile, mode order, and sound pressure level. An eduction method, based on a Bayesian approach, is presented here to adjust an impedance model of the liner from sound pressures measured in a rectangular treated duct under multimodal propagation and flow. The cost function is regularized with prior information provided by Guess's [J. Sound Vib. 40, 119-137 (1975)] impedance of a perforated plate. The multi-parameter optimization is achieved with an Evolutionary-Markov-Chain-Monte-Carlo algorithm.
Adaptive selection and validation of models of complex systems in the presence of uncertainty
Farrell-Maupin, Kathryn; Oden, J. T.
2017-08-01
This study describes versions of OPAL, the Occam-Plausibility Algorithm in which the use of Bayesian model plausibilities is replaced with information theoretic methods, such as the Akaike Information Criterion and the Bayes Information Criterion. Applications to complex systems of coarse-grained molecular models approximating atomistic models of polyethylene materials are described. All of these model selection methods take into account uncertainties in the model, the observational data, the model parameters, and the predicted quantities of interest. A comparison of the models chosen by Bayesian model selection criteria and those chosen by the information-theoretic criteria is given.
Partitioning Ocean Wave Spectra Obtained from Radar Observations
NASA Astrophysics Data System (ADS)
Delaye, Lauriane; Vergely, Jean-Luc; Hauser, Daniele; Guitton, Gilles; Mouche, Alexis; Tison, Celine
2016-08-01
2D wave spectra of ocean waves can be partitioned into several wave components to better characterize the scene. We present here two methods of component detection: one based on watershed algorithm and the other based on a Bayesian approach. We tested both methods on a set of simulated SWIM data, the Ku-band real aperture radar embarked on the CFOSAT (China- France Oceanography Satellite) mission which launch is planned mid-2018. We present the results and the limits of both approaches and show that Bayesian method can also be applied to other kind of wave spectra observations as those obtained with the radar KuROS, an airborne radar wave spectrometer.
Bayesian analogy with relational transformations.
Lu, Hongjing; Chen, Dawn; Holyoak, Keith J
2012-07-01
How can humans acquire relational representations that enable analogical inference and other forms of high-level reasoning? Using comparative relations as a model domain, we explore the possibility that bottom-up learning mechanisms applied to objects coded as feature vectors can yield representations of relations sufficient to solve analogy problems. We introduce Bayesian analogy with relational transformations (BART) and apply the model to the task of learning first-order comparative relations (e.g., larger, smaller, fiercer, meeker) from a set of animal pairs. Inputs are coded by vectors of continuous-valued features, based either on human magnitude ratings, normed feature ratings (De Deyne et al., 2008), or outputs of the topics model (Griffiths, Steyvers, & Tenenbaum, 2007). Bootstrapping from empirical priors, the model is able to induce first-order relations represented as probabilistic weight distributions, even when given positive examples only. These learned representations allow classification of novel instantiations of the relations and yield a symbolic distance effect of the sort obtained with both humans and other primates. BART then transforms its learned weight distributions by importance-guided mapping, thereby placing distinct dimensions into correspondence. These transformed representations allow BART to reliably solve 4-term analogies (e.g., larger:smaller::fiercer:meeker), a type of reasoning that is arguably specific to humans. Our results provide a proof-of-concept that structured analogies can be solved with representations induced from unstructured feature vectors by mechanisms that operate in a largely bottom-up fashion. We discuss potential implications for algorithmic and neural models of relational thinking, as well as for the evolution of abstract thought. Copyright 2012 APA, all rights reserved.
Detection of low-contrast images in film-grain noise.
Naderi, F; Sawchuk, A A
1978-09-15
When low contrast photographic images are digitized by a very small aperture, extreme film-grain noise almost completely obliterates the image information. Using a large aperture to average out the noise destroys the fine details of the image. In these situations conventional statistical restoration techniques have little effect, and well chosen heuristic algorithms have yielded better results. In this paper we analyze the noisecheating algorithm of Zweig et al. [J. Opt. Soc. Am. 65, 1347 (1975)] and show that it can be justified by classical maximum-likelihood detection theory. A more general algorithm applicable to a broader class of images is then developed by considering the signal-dependent nature of film-grain noise. Finally, a Bayesian detection algorithm with improved performance is presented.
Multiple Signal Classification for Gravitational Wave Burst Search
NASA Astrophysics Data System (ADS)
Cao, Junwei; He, Zhengqi
2013-01-01
This work is mainly focused on the application of the multiple signal classification (MUSIC) algorithm for gravitational wave burst search. This algorithm extracts important gravitational wave characteristics from signals coming from detectors with arbitrary position, orientation and noise covariance. In this paper, the MUSIC algorithm is described in detail along with the necessary adjustments required for gravitational wave burst search. The algorithm's performance is measured using simulated signals and noise. MUSIC is compared with the Q-transform for signal triggering and with Bayesian analysis for direction of arrival (DOA) estimation, using the Ω-pipeline. Experimental results show that MUSIC has a lower resolution but is faster. MUSIC is a promising tool for real-time gravitational wave search for multi-messenger astronomy.
A Primer on Bayesian Analysis for Experimental Psychopathologists
Krypotos, Angelos-Miltiadis; Blanken, Tessa F.; Arnaudova, Inna; Matzke, Dora; Beckers, Tom
2016-01-01
The principal goals of experimental psychopathology (EPP) research are to offer insights into the pathogenic mechanisms of mental disorders and to provide a stable ground for the development of clinical interventions. The main message of the present article is that those goals are better served by the adoption of Bayesian statistics than by the continued use of null-hypothesis significance testing (NHST). In the first part of the article we list the main disadvantages of NHST and explain why those disadvantages limit the conclusions that can be drawn from EPP research. Next, we highlight the advantages of Bayesian statistics. To illustrate, we then pit NHST and Bayesian analysis against each other using an experimental data set from our lab. Finally, we discuss some challenges when adopting Bayesian statistics. We hope that the present article will encourage experimental psychopathologists to embrace Bayesian statistics, which could strengthen the conclusions drawn from EPP research. PMID:28748068
RESOLVE: A new algorithm for aperture synthesis imaging of extended emission in radio astronomy
NASA Astrophysics Data System (ADS)
Junklewitz, H.; Bell, M. R.; Selig, M.; Enßlin, T. A.
2016-02-01
We present resolve, a new algorithm for radio aperture synthesis imaging of extended and diffuse emission in total intensity. The algorithm is derived using Bayesian statistical inference techniques, estimating the surface brightness in the sky assuming a priori log-normal statistics. resolve estimates the measured sky brightness in total intensity, and the spatial correlation structure in the sky, which is used to guide the algorithm to an optimal reconstruction of extended and diffuse sources. During this process, the algorithm succeeds in deconvolving the effects of the radio interferometric point spread function. Additionally, resolve provides a map with an uncertainty estimate of the reconstructed surface brightness. Furthermore, with resolve we introduce a new, optimal visibility weighting scheme that can be viewed as an extension to robust weighting. In tests using simulated observations, the algorithm shows improved performance against two standard imaging approaches for extended sources, Multiscale-CLEAN and the Maximum Entropy Method.
Astrocytic tracer dynamics estimated from [1-¹¹C]-acetate PET measurements.
Arnold, Andrea; Calvetti, Daniela; Gjedde, Albert; Iversen, Peter; Somersalo, Erkki
2015-12-01
We address the problem of estimating the unknown parameters of a model of tracer kinetics from sequences of positron emission tomography (PET) scan data using a statistical sequential algorithm for the inference of magnitudes of dynamic parameters. The method, based on Bayesian statistical inference, is a modification of a recently proposed particle filtering and sequential Monte Carlo algorithm, where instead of preassigning the accuracy in the propagation of each particle, we fix the time step and account for the numerical errors in the innovation term. We apply the algorithm to PET images of [1-¹¹C]-acetate-derived tracer accumulation, estimating the transport rates in a three-compartment model of astrocytic uptake and metabolism of the tracer for a cohort of 18 volunteers from 3 groups, corresponding to healthy control individuals, cirrhotic liver and hepatic encephalopathy patients. The distribution of the parameters for the individuals and for the groups presented within the Bayesian framework support the hypothesis that the parameters for the hepatic encephalopathy group follow a significantly different distribution than the other two groups. The biological implications of the findings are also discussed. © The Authors 2014. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.
Formalizing Neurath's ship: Approximate algorithms for online causal learning.
Bramley, Neil R; Dayan, Peter; Griffiths, Thomas L; Lagnado, David A
2017-04-01
Higher-level cognition depends on the ability to learn models of the world. We can characterize this at the computational level as a structure-learning problem with the goal of best identifying the prevailing causal relationships among a set of relata. However, the computational cost of performing exact Bayesian inference over causal models grows rapidly as the number of relata increases. This implies that the cognitive processes underlying causal learning must be substantially approximate. A powerful class of approximations that focuses on the sequential absorption of successive inputs is captured by the Neurath's ship metaphor in philosophy of science, where theory change is cast as a stochastic and gradual process shaped as much by people's limited willingness to abandon their current theory when considering alternatives as by the ground truth they hope to approach. Inspired by this metaphor and by algorithms for approximating Bayesian inference in machine learning, we propose an algorithmic-level model of causal structure learning under which learners represent only a single global hypothesis that they update locally as they gather evidence. We propose a related scheme for understanding how, under these limitations, learners choose informative interventions that manipulate the causal system to help elucidate its workings. We find support for our approach in the analysis of 3 experiments. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Woldegebriel, Michael; Zomer, Paul; Mol, Hans G J; Vivó-Truyols, Gabriel
2016-08-02
In this work, we introduce an automated, efficient, and elegant model to combine all pieces of evidence (e.g., expected retention times, peak shapes, isotope distributions, fragment-to-parent ratio) obtained from liquid chromatography-tandem mass spectrometry (LC-MS/MS/MS) data for screening purposes. Combining all these pieces of evidence requires a careful assessment of the uncertainties in the analytical system as well as all possible outcomes. To-date, the majority of the existing algorithms are highly dependent on user input parameters. Additionally, the screening process is tackled as a deterministic problem. In this work we present a Bayesian framework to deal with the combination of all these pieces of evidence. Contrary to conventional algorithms, the information is treated in a probabilistic way, and a final probability assessment of the presence/absence of a compound feature is computed. Additionally, all the necessary parameters except the chromatographic band broadening for the method are learned from the data in training and learning phase of the algorithm, avoiding the introduction of a large number of user-defined parameters. The proposed method was validated with a large data set and has shown improved sensitivity and specificity in comparison to a threshold-based commercial software package.
Alemi, Farrokh; Torii, Manabu; Atherton, Martin J; Pattie, David C; Cox, Kenneth L
2012-01-01
This article aims to examine whether words listed in reasons for appointments could effectively predict laboratory-verified influenza cases in syndromic surveillance systems. Data were collected from the Armed Forces Health Longitudinal Technological Application medical record system. We used 2 algorithms to combine the impact of words within reasons for appointments: Dependent (DBSt) and Independent (IBSt) Bayesian System. We used receiver operating characteristic curves to compare the accuracy of these 2 methods of processing reasons for appointments against current and previous lists of diagnoses used in the Department of Defense's syndromic surveillance system. We examined 13,096 cases, where the results of influenza tests were available. Each reason for an appointment had an average of 3.5 words (standard deviation = 2.2 words). There was no difference in performance of the 2 algorithms. The area under the curve for IBSt was 0.58 and for DBSt was 0.56. The difference was not statistically significant (McNemar statistic = 0.0054; P = 0.07). These data suggest that reasons for appointments can improve the accuracy of lists of diagnoses in predicting laboratory-verified influenza cases. This study recommends further exploration of the DBSt algorithm and reasons for appointments in predicting likely influenza cases.
A Probabilistic Model of Social Working Memory for Information Retrieval in Social Interactions.
Li, Liyuan; Xu, Qianli; Gan, Tian; Tan, Cheston; Lim, Joo-Hwee
2018-05-01
Social working memory (SWM) plays an important role in navigating social interactions. Inspired by studies in psychology, neuroscience, cognitive science, and machine learning, we propose a probabilistic model of SWM to mimic human social intelligence for personal information retrieval (IR) in social interactions. First, we establish a semantic hierarchy as social long-term memory to encode personal information. Next, we propose a semantic Bayesian network as the SWM, which integrates the cognitive functions of accessibility and self-regulation. One subgraphical model implements the accessibility function to learn the social consensus about IR-based on social information concept, clustering, social context, and similarity between persons. Beyond accessibility, one more layer is added to simulate the function of self-regulation to perform the personal adaptation to the consensus based on human personality. Two learning algorithms are proposed to train the probabilistic SWM model on a raw dataset of high uncertainty and incompleteness. One is an efficient learning algorithm of Newton's method, and the other is a genetic algorithm. Systematic evaluations show that the proposed SWM model is able to learn human social intelligence effectively and outperforms the baseline Bayesian cognitive model. Toward real-world applications, we implement our model on Google Glass as a wearable assistant for social interaction.
Efficient Probabilistic Diagnostics for Electrical Power Systems
NASA Technical Reports Server (NTRS)
Mengshoel, Ole J.; Chavira, Mark; Cascio, Keith; Poll, Scott; Darwiche, Adnan; Uckun, Serdar
2008-01-01
We consider in this work the probabilistic approach to model-based diagnosis when applied to electrical power systems (EPSs). Our probabilistic approach is formally well-founded, as it based on Bayesian networks and arithmetic circuits. We investigate the diagnostic task known as fault isolation, and pay special attention to meeting two of the main challenges . model development and real-time reasoning . often associated with real-world application of model-based diagnosis technologies. To address the challenge of model development, we develop a systematic approach to representing electrical power systems as Bayesian networks, supported by an easy-to-use speci.cation language. To address the real-time reasoning challenge, we compile Bayesian networks into arithmetic circuits. Arithmetic circuit evaluation supports real-time diagnosis by being predictable and fast. In essence, we introduce a high-level EPS speci.cation language from which Bayesian networks that can diagnose multiple simultaneous failures are auto-generated, and we illustrate the feasibility of using arithmetic circuits, compiled from Bayesian networks, for real-time diagnosis on real-world EPSs of interest to NASA. The experimental system is a real-world EPS, namely the Advanced Diagnostic and Prognostic Testbed (ADAPT) located at the NASA Ames Research Center. In experiments with the ADAPT Bayesian network, which currently contains 503 discrete nodes and 579 edges, we .nd high diagnostic accuracy in scenarios where one to three faults, both in components and sensors, were inserted. The time taken to compute the most probable explanation using arithmetic circuits has a small mean of 0.2625 milliseconds and standard deviation of 0.2028 milliseconds. In experiments with data from ADAPT we also show that arithmetic circuit evaluation substantially outperforms joint tree propagation and variable elimination, two alternative algorithms for diagnosis using Bayesian network inference.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model evidence the more support it receives among a collection of hypothesis as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
Theory Learning as Stochastic Search in the Language of Thought
ERIC Educational Resources Information Center
Ullman, Tomer D.; Goodman, Noah D.; Tenenbaum, Joshua B.
2012-01-01
We present an algorithmic model for the development of children's intuitive theories within a hierarchical Bayesian framework, where theories are described as sets of logical laws generated by a probabilistic context-free grammar. We contrast our approach with connectionist and other emergentist approaches to modeling cognitive development. While…
An overview of the essential differences and similarities of system identification techniques
NASA Technical Reports Server (NTRS)
Mehra, Raman K.
1991-01-01
Information is given in the form of outlines, graphs, tables and charts. Topics include system identification, Bayesian statistical decision theory, Maximum Likelihood Estimation, identification methods, structural mode identification using a stochastic realization algorithm, and identification results regarding membrane simulations and X-29 flutter flight test data.
Minsley, Burke J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, ‘best’ model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequencydomain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favorably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment.
NASA Astrophysics Data System (ADS)
Martin, Nicolas F.; Ibata, Rodrigo A.; McConnachie, Alan W.; Mackey, A. Dougal; Ferguson, Annette M. N.; Irwin, Michael J.; Lewis, Geraint F.; Fardal, Mark A.
2013-10-01
We present a generic algorithm to search for dwarf galaxies in photometric catalogs and apply it to the Pan-Andromeda Archaeological Survey (PAndAS). The algorithm is developed in a Bayesian framework and, contrary to most dwarf galaxy search codes, makes use of both the spatial and color-magnitude information of sources in a probabilistic approach. Accounting for the significant contamination from the Milky Way foreground and from the structured stellar halo of the Andromeda galaxy, we recover all known dwarf galaxies in the PAndAS footprint with high significance, even for the least luminous ones. Some Andromeda globular clusters are also recovered and, in one case, discovered. We publish a list of the 143 most significant detections yielded by the algorithm. The combined properties of the 39 most significant isolated detections show hints that at least some of these trace genuine dwarf galaxies, too faint to be individually detected. Follow-up observations by the community are mandatory to establish which are real members of the Andromeda satellite system. The search technique presented here will be used in an upcoming contribution to determine the PAndAS completeness limits for dwarf galaxies. Although here tuned to the search of dwarf galaxies in the PAndAS data, the algorithm can easily be adapted to the search for any localized overdensity whose properties can be modeled reliably in the parameter space of any catalog.
Neuromusculoskeletal model self-calibration for on-line sequential bayesian moment estimation
NASA Astrophysics Data System (ADS)
Bueno, Diana R.; Montano, L.
2017-04-01
Objective. Neuromusculoskeletal models involve many subject-specific physiological parameters that need to be adjusted to adequately represent muscle properties. Traditionally, neuromusculoskeletal models have been calibrated with a forward-inverse dynamic optimization which is time-consuming and unfeasible for rehabilitation therapy. Non self-calibration algorithms have been applied to these models. To the best of our knowledge, the algorithm proposed in this work is the first on-line calibration algorithm for muscle models that allows a generic model to be adjusted to different subjects in a few steps. Approach. In this paper we propose a reformulation of the traditional muscle models that is able to sequentially estimate the kinetics (net joint moments), and also its full self-calibration (subject-specific internal parameters of the muscle from a set of arbitrary uncalibrated data), based on the unscented Kalman filter. The nonlinearity of the model as well as its calibration problem have obliged us to adopt the sum of Gaussians filter suitable for nonlinear systems. Main results. This sequential Bayesian self-calibration algorithm achieves a complete muscle model calibration using as input only a dataset of uncalibrated sEMG and kinematics data. The approach is validated experimentally using data from the upper limbs of 21 subjects. Significance. The results show the feasibility of neuromusculoskeletal model self-calibration. This study will contribute to a better understanding of the generalization of muscle models for subject-specific rehabilitation therapies. Moreover, this work is very promising for rehabilitation devices such as electromyography-driven exoskeletons or prostheses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martin, Nicolas F.; Ibata, Rodrigo A.; McConnachie, Alan W.
We present a generic algorithm to search for dwarf galaxies in photometric catalogs and apply it to the Pan-Andromeda Archaeological Survey (PAndAS). The algorithm is developed in a Bayesian framework and, contrary to most dwarf galaxy search codes, makes use of both the spatial and color-magnitude information of sources in a probabilistic approach. Accounting for the significant contamination from the Milky Way foreground and from the structured stellar halo of the Andromeda galaxy, we recover all known dwarf galaxies in the PAndAS footprint with high significance, even for the least luminous ones. Some Andromeda globular clusters are also recovered and,more » in one case, discovered. We publish a list of the 143 most significant detections yielded by the algorithm. The combined properties of the 39 most significant isolated detections show hints that at least some of these trace genuine dwarf galaxies, too faint to be individually detected. Follow-up observations by the community are mandatory to establish which are real members of the Andromeda satellite system. The search technique presented here will be used in an upcoming contribution to determine the PAndAS completeness limits for dwarf galaxies. Although here tuned to the search of dwarf galaxies in the PAndAS data, the algorithm can easily be adapted to the search for any localized overdensity whose properties can be modeled reliably in the parameter space of any catalog.« less
NASA Astrophysics Data System (ADS)
Waldmann, Ingo
2016-10-01
Radiative transfer retrievals have become the standard in modelling of exoplanetary transmission and emission spectra. Analysing currently available observations of exoplanetary atmospheres often invoke large and correlated parameter spaces that can be difficult to map or constrain.To address these issues, we have developed the Tau-REx (tau-retrieval of exoplanets) retrieval and the RobERt spectral recognition algorithms. Tau-REx is a bayesian atmospheric retrieval framework using Nested Sampling and cluster computing to fully map these large correlated parameter spaces. Nonetheless, data volumes can become prohibitively large and we must often select a subset of potential molecular/atomic absorbers in an atmosphere.In the era of open-source, automated and self-sufficient retrieval algorithms, such manual input should be avoided. User dependent input could, in worst case scenarios, lead to incomplete models and biases in the retrieval. The RobERt algorithm is build to address these issues. RobERt is a deep belief neural (DBN) networks trained to accurately recognise molecular signatures for a wide range of planets, atmospheric thermal profiles and compositions. Using these deep neural networks, we work towards retrieval algorithms that themselves understand the nature of the observed spectra, are able to learn from current and past data and make sensible qualitative preselections of atmospheric opacities to be used for the quantitative stage of the retrieval process.In this talk I will discuss how neural networks and Bayesian Nested Sampling can be used to solve highly degenerate spectral retrieval problems and what 'dreaming' neural networks can tell us about atmospheric characteristics.
Fusing Continuous-Valued Medical Labels Using a Bayesian Model.
Zhu, Tingting; Dunkley, Nic; Behar, Joachim; Clifton, David A; Clifford, Gari D
2015-12-01
With the rapid increase in volume of time series medical data available through wearable devices, there is a need to employ automated algorithms to label data. Examples of labels include interventions, changes in activity (e.g. sleep) and changes in physiology (e.g. arrhythmias). However, automated algorithms tend to be unreliable resulting in lower quality care. Expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. To address these problems, a Bayesian Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable estimation of label aggregation while accurately infer the precision and bias of each algorithm. The BCLA was applied to QT interval (pro-arrhythmic indicator) estimation from the electrocardiogram using labels from the 2006 PhysioNet/Computing in Cardiology Challenge database. It was compared to the mean, median, and a previously proposed Expectation Maximization (EM) label aggregation approaches. While accurately predicting each labelling algorithm's bias and precision, the root-mean-square error of the BCLA was 11.78 ± 0.63 ms, significantly outperforming the best Challenge entry (15.37 ± 2.13 ms) as well as the EM, mean, and median voting strategies (14.76 ± 0.52, 17.61 ± 0.55, and 14.43 ± 0.57 ms respectively with p < 0.0001). The BCLA could therefore provide accurate estimation for medical continuous-valued label tasks in an unsupervised manner even when the ground truth is not available.
A Bayesian perspective on magnitude estimation.
Petzschner, Frederike H; Glasauer, Stefan; Stephan, Klaas E
2015-05-01
Our representation of the physical world requires judgments of magnitudes, such as loudness, distance, or time. Interestingly, magnitude estimates are often not veridical but subject to characteristic biases. These biases are strikingly similar across different sensory modalities, suggesting common processing mechanisms that are shared by different sensory systems. However, the search for universal neurobiological principles of magnitude judgments requires guidance by formal theories. Here, we discuss a unifying Bayesian framework for understanding biases in magnitude estimation. This Bayesian perspective enables a re-interpretation of a range of established psychophysical findings, reconciles seemingly incompatible classical views on magnitude estimation, and can guide future investigations of magnitude estimation and its neurobiological mechanisms in health and in psychiatric diseases, such as schizophrenia. Copyright © 2015 Elsevier Ltd. All rights reserved.
A Bayesian approach to earthquake source studies
NASA Astrophysics Data System (ADS)
Minson, Sarah
Bayesian sampling has several advantages over conventional optimization approaches to solving inverse problems. It produces the distribution of all possible models sampled proportionally to how much each model is consistent with the data and the specified prior information, and thus images the entire solution space, revealing the uncertainties and trade-offs in the model. Bayesian sampling is applicable to both linear and non-linear modeling, and the values of the model parameters being sampled can be constrained based on the physics of the process being studied and do not have to be regularized. However, these methods are computationally challenging for high-dimensional problems. Until now the computational expense of Bayesian sampling has been too great for it to be practicable for most geophysical problems. I present a new parallel sampling algorithm called CATMIP for Cascading Adaptive Tempered Metropolis In Parallel. This technique, based on Transitional Markov chain Monte Carlo, makes it possible to sample distributions in many hundreds of dimensions, if the forward model is fast, or to sample computationally expensive forward models in smaller numbers of dimensions. The design of the algorithm is independent of the model being sampled, so CATMIP can be applied to many areas of research. I use CATMIP to produce a finite fault source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. Surface displacements from the earthquake were recorded by six interferograms and twelve local high-rate GPS stations. Because of the wealth of near-fault data, the source process is well-constrained. I find that the near-field high-rate GPS data have significant resolving power above and beyond the slip distribution determined from static displacements. The location and magnitude of the maximum displacement are resolved. The rupture almost certainly propagated at sub-shear velocities. The full posterior distribution can be used not only to calculate source parameters but also to determine their uncertainties. So while kinematic source modeling and the estimation of source parameters is not new, with CATMIP I am able to use Bayesian sampling to determine which parts of the source process are well-constrained and which are not.
Quantum-Like Representation of Non-Bayesian Inference
NASA Astrophysics Data System (ADS)
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that have been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like coginitive models of decision making was proposed. Our previous work represented in a natural way the classical Bayesian inference in the frame work of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Shiklomanov, Alexey N.; Dietze, Michael C.; Viskari, Toni; ...
2016-06-09
The remote monitoring of plant canopies is critically needed for understanding of terrestrial ecosystem mechanics and biodiversity as well as capturing the short- to long-term responses of vegetation to disturbance and climate change. A variety of orbital, sub-orbital, and field instruments have been used to retrieve optical spectral signals and to study different vegetation properties such as plant biochemistry, nutrient cycling, physiology, water status, and stress. Radiative transfer models (RTMs) provide a mechanistic link between vegetation properties and observed spectral features, and RTM spectral inversion is a useful framework for estimating these properties from spectral data. However, existing approaches tomore » RTM spectral inversion are typically limited by the inability to characterize uncertainty in parameter estimates. Here, we introduce a Bayesian algorithm for the spectral inversion of the PROSPECT 5 leaf RTM that is distinct from past approaches in two important ways: First, the algorithm only uses reflectance and does not require transmittance observations, which have been plagued by a variety of measurement and equipment challenges. Second, the output is not a point estimate for each parameter but rather the joint probability distribution that includes estimates of parameter uncertainties and covariance structure. We validated our inversion approach using a database of leaf spectra together with measurements of equivalent water thickness (EWT) and leaf dry mass per unit area (LMA). The parameters estimated by our inversion were able to accurately reproduce the observed reflectance (RMSE VIS = 0.0063, RMSE NIR-SWIR = 0.0098) and transmittance (RMSE VIS = 0.0404, RMSE NIR-SWIR = 0.0551) for both broadleaved and conifer species. Inversion estimates of EWT and LMA for broadleaved species agreed well with direct measurements (CV EWT = 18.8%, CV LMA = 24.5%), while estimates for conifer species were less accurate (CV EWT = 53.2%, CV LMA = 63.3%). To examine the influence of spectral resolution on parameter uncertainty, we simulated leaf reflectance as observed by ten common remote sensing platforms with varying spectral configurations and performed a Bayesian inversion on the resulting spectra. We found that full-range hyperspectral platforms were able to retrieve all parameters accurately and precisely, while the parameter estimates of multispectral platforms were much less precise and prone to bias at high and low values. We also observed that variations in the width and location of spectral bands influenced the shape of the covariance structure of parameter estimates. Lastly, our Bayesian spectral inversion provides a powerful and versatile framework for future RTM development and single- and multi-instrumental remote sensing of vegetation.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shiklomanov, Alexey N.; Dietze, Michael C.; Viskari, Toni
The remote monitoring of plant canopies is critically needed for understanding of terrestrial ecosystem mechanics and biodiversity as well as capturing the short- to long-term responses of vegetation to disturbance and climate change. A variety of orbital, sub-orbital, and field instruments have been used to retrieve optical spectral signals and to study different vegetation properties such as plant biochemistry, nutrient cycling, physiology, water status, and stress. Radiative transfer models (RTMs) provide a mechanistic link between vegetation properties and observed spectral features, and RTM spectral inversion is a useful framework for estimating these properties from spectral data. However, existing approaches tomore » RTM spectral inversion are typically limited by the inability to characterize uncertainty in parameter estimates. Here, we introduce a Bayesian algorithm for the spectral inversion of the PROSPECT 5 leaf RTM that is distinct from past approaches in two important ways: First, the algorithm only uses reflectance and does not require transmittance observations, which have been plagued by a variety of measurement and equipment challenges. Second, the output is not a point estimate for each parameter but rather the joint probability distribution that includes estimates of parameter uncertainties and covariance structure. We validated our inversion approach using a database of leaf spectra together with measurements of equivalent water thickness (EWT) and leaf dry mass per unit area (LMA). The parameters estimated by our inversion were able to accurately reproduce the observed reflectance (RMSE VIS = 0.0063, RMSE NIR-SWIR = 0.0098) and transmittance (RMSE VIS = 0.0404, RMSE NIR-SWIR = 0.0551) for both broadleaved and conifer species. Inversion estimates of EWT and LMA for broadleaved species agreed well with direct measurements (CV EWT = 18.8%, CV LMA = 24.5%), while estimates for conifer species were less accurate (CV EWT = 53.2%, CV LMA = 63.3%). To examine the influence of spectral resolution on parameter uncertainty, we simulated leaf reflectance as observed by ten common remote sensing platforms with varying spectral configurations and performed a Bayesian inversion on the resulting spectra. We found that full-range hyperspectral platforms were able to retrieve all parameters accurately and precisely, while the parameter estimates of multispectral platforms were much less precise and prone to bias at high and low values. We also observed that variations in the width and location of spectral bands influenced the shape of the covariance structure of parameter estimates. Lastly, our Bayesian spectral inversion provides a powerful and versatile framework for future RTM development and single- and multi-instrumental remote sensing of vegetation.« less
Topics in Bayesian Hierarchical Modeling and its Monte Carlo Computations
NASA Astrophysics Data System (ADS)
Tak, Hyung Suk
The first chapter addresses a Beta-Binomial-Logit model that is a Beta-Binomial conjugate hierarchical model with covariate information incorporated via a logistic regression. Various researchers in the literature have unknowingly used improper posterior distributions or have given incorrect statements about posterior propriety because checking posterior propriety can be challenging due to the complicated functional form of a Beta-Binomial-Logit model. We derive data-dependent necessary and sufficient conditions for posterior propriety within a class of hyper-prior distributions that encompass those used in previous studies. Frequency coverage properties of several hyper-prior distributions are also investigated to see when and whether Bayesian interval estimates of random effects meet their nominal confidence levels. The second chapter deals with a time delay estimation problem in astrophysics. When the gravitational field of an intervening galaxy between a quasar and the Earth is strong enough to split light into two or more images, the time delay is defined as the difference between their travel times. The time delay can be used to constrain cosmological parameters and can be inferred from the time series of brightness data of each image. To estimate the time delay, we construct a Gaussian hierarchical model based on a state-space representation for irregularly observed time series generated by a latent continuous-time Ornstein-Uhlenbeck process. Our Bayesian approach jointly infers model parameters via a Gibbs sampler. We also introduce a profile likelihood of the time delay as an approximation of its marginal posterior distribution. The last chapter specifies a repelling-attracting Metropolis algorithm, a new Markov chain Monte Carlo method to explore multi-modal distributions in a simple and fast manner. This algorithm is essentially a Metropolis-Hastings algorithm with a proposal that consists of a downhill move in density that aims to make local modes repelling, followed by an uphill move in density that aims to make local modes attracting. The downhill move is achieved via a reciprocal Metropolis ratio so that the algorithm prefers downward movement. The uphill move does the opposite using the standard Metropolis ratio which prefers upward movement. This down-up movement in density increases the probability of a proposed move to a different mode.
Bayesian-MCMC-based parameter estimation of stealth aircraft RCS models
NASA Astrophysics Data System (ADS)
Xia, Wei; Dai, Xiao-Xia; Feng, Yuan
2015-12-01
When modeling a stealth aircraft with low RCS (Radar Cross Section), conventional parameter estimation methods may cause a deviation from the actual distribution, owing to the fact that the characteristic parameters are estimated via directly calculating the statistics of RCS. The Bayesian-Markov Chain Monte Carlo (Bayesian-MCMC) method is introduced herein to estimate the parameters so as to improve the fitting accuracies of fluctuation models. The parameter estimations of the lognormal and the Legendre polynomial models are reformulated in the Bayesian framework. The MCMC algorithm is then adopted to calculate the parameter estimates. Numerical results show that the distribution curves obtained by the proposed method exhibit improved consistence with the actual ones, compared with those fitted by the conventional method. The fitting accuracy could be improved by no less than 25% for both fluctuation models, which implies that the Bayesian-MCMC method might be a good candidate among the optimal parameter estimation methods for stealth aircraft RCS models. Project supported by the National Natural Science Foundation of China (Grant No. 61101173), the National Basic Research Program of China (Grant No. 613206), the National High Technology Research and Development Program of China (Grant No. 2012AA01A308), the State Scholarship Fund by the China Scholarship Council (CSC), and the Oversea Academic Training Funds, and University of Electronic Science and Technology of China (UESTC).
NASA Astrophysics Data System (ADS)
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Bayesian methods for outliers detection in GNSS time series
NASA Astrophysics Data System (ADS)
Qianqian, Zhang; Qingming, Gui
2013-07-01
This article is concerned with the problem of detecting outliers in GNSS time series based on Bayesian statistical theory. Firstly, a new model is proposed to simultaneously detect different types of outliers based on the conception of introducing different types of classification variables corresponding to the different types of outliers; the problem of outlier detection is converted into the computation of the corresponding posterior probabilities, and the algorithm for computing the posterior probabilities based on standard Gibbs sampler is designed. Secondly, we analyze the reasons of masking and swamping about detecting patches of additive outliers intensively; an unmasking Bayesian method for detecting additive outlier patches is proposed based on an adaptive Gibbs sampler. Thirdly, the correctness of the theories and methods proposed above is illustrated by simulated data and then by analyzing real GNSS observations, such as cycle slips detection in carrier phase data. Examples illustrate that the Bayesian methods for outliers detection in GNSS time series proposed by this paper are not only capable of detecting isolated outliers but also capable of detecting additive outlier patches. Furthermore, it can be successfully used to process cycle slips in phase data, which solves the problem of small cycle slips.
NASA Astrophysics Data System (ADS)
Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.
2018-05-01
Bayesian method is a method that can be used to estimate the parameters of multivariate multiple regression model. Bayesian method has two distributions, there are prior and posterior distributions. Posterior distribution is influenced by the selection of prior distribution. Jeffreys’ prior distribution is a kind of Non-informative prior distribution. This prior is used when the information about parameter not available. Non-informative Jeffreys’ prior distribution is combined with the sample information resulting the posterior distribution. Posterior distribution is used to estimate the parameter. The purposes of this research is to estimate the parameters of multivariate regression model using Bayesian method with Non-informative Jeffreys’ prior distribution. Based on the results and discussion, parameter estimation of β and Σ which were obtained from expected value of random variable of marginal posterior distribution function. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, in calculation of the expected value involving integral of a function which difficult to determine the value. Therefore, approach is needed by generating of random samples according to the posterior distribution characteristics of each parameter using Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
Quantum Inference on Bayesian Networks
NASA Astrophysics Data System (ADS)
Yoder, Theodore; Low, Guang Hao; Chuang, Isaac
2014-03-01
Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speedup sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time (nmP(e) - 1 / 2) , depending critically on P(e) , the probability the evidence might occur in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking (n2m P(e) -1/2) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
Sparse-grid, reduced-basis Bayesian inversion: Nonaffine-parametric nonlinear equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Peng, E-mail: peng@ices.utexas.edu; Schwab, Christoph, E-mail: christoph.schwab@sam.math.ethz.ch
2016-07-01
We extend the reduced basis (RB) accelerated Bayesian inversion methods for affine-parametric, linear operator equations which are considered in [16,17] to non-affine, nonlinear parametric operator equations. We generalize the analysis of sparsity of parametric forward solution maps in [20] and of Bayesian inversion in [48,49] to the fully discrete setting, including Petrov–Galerkin high-fidelity (“HiFi”) discretization of the forward maps. We develop adaptive, stochastic collocation based reduction methods for the efficient computation of reduced bases on the parametric solution manifold. The nonaffinity and nonlinearity with respect to (w.r.t.) the distributed, uncertain parameters and the unknown solution is collocated; specifically, by themore » so-called Empirical Interpolation Method (EIM). For the corresponding Bayesian inversion problems, computational efficiency is enhanced in two ways: first, expectations w.r.t. the posterior are computed by adaptive quadratures with dimension-independent convergence rates proposed in [49]; the present work generalizes [49] to account for the impact of the PG discretization in the forward maps on the convergence rates of the Quantities of Interest (QoI for short). Second, we propose to perform the Bayesian estimation only w.r.t. a parsimonious, RB approximation of the posterior density. Based on the approximation results in [49], the infinite-dimensional parametric, deterministic forward map and operator admit N-term RB and EIM approximations which converge at rates which depend only on the sparsity of the parametric forward map. In several numerical experiments, the proposed algorithms exhibit dimension-independent convergence rates which equal, at least, the currently known rate estimates for N-term approximation. We propose to accelerate Bayesian estimation by first offline construction of reduced basis surrogates of the Bayesian posterior density. The parsimonious surrogates can then be employed for online data assimilation and for Bayesian estimation. They also open a perspective for optimal experimental design.« less
A Fault Diagnosis Methodology for Gear Pump Based on EEMD and Bayesian Network
Liu, Zengkai; Liu, Yonghong; Shan, Hongkai; Cai, Baoping; Huang, Qing
2015-01-01
This paper proposes a fault diagnosis methodology for a gear pump based on the ensemble empirical mode decomposition (EEMD) method and the Bayesian network. Essentially, the presented scheme is a multi-source information fusion based methodology. Compared with the conventional fault diagnosis with only EEMD, the proposed method is able to take advantage of all useful information besides sensor signals. The presented diagnostic Bayesian network consists of a fault layer, a fault feature layer and a multi-source information layer. Vibration signals from sensor measurement are decomposed by the EEMD method and the energy of intrinsic mode functions (IMFs) are calculated as fault features. These features are added into the fault feature layer in the Bayesian network. The other sources of useful information are added to the information layer. The generalized three-layer Bayesian network can be developed by fully incorporating faults and fault symptoms as well as other useful information such as naked eye inspection and maintenance records. Therefore, diagnostic accuracy and capacity can be improved. The proposed methodology is applied to the fault diagnosis of a gear pump and the structure and parameters of the Bayesian network is established. Compared with artificial neural network and support vector machine classification algorithms, the proposed model has the best diagnostic performance when sensor data is used only. A case study has demonstrated that some information from human observation or system repair records is very helpful to the fault diagnosis. It is effective and efficient in diagnosing faults based on uncertain, incomplete information. PMID:25938760
A Fault Diagnosis Methodology for Gear Pump Based on EEMD and Bayesian Network.
Liu, Zengkai; Liu, Yonghong; Shan, Hongkai; Cai, Baoping; Huang, Qing
2015-01-01
This paper proposes a fault diagnosis methodology for a gear pump based on the ensemble empirical mode decomposition (EEMD) method and the Bayesian network. Essentially, the presented scheme is a multi-source information fusion based methodology. Compared with the conventional fault diagnosis with only EEMD, the proposed method is able to take advantage of all useful information besides sensor signals. The presented diagnostic Bayesian network consists of a fault layer, a fault feature layer and a multi-source information layer. Vibration signals from sensor measurement are decomposed by the EEMD method and the energy of intrinsic mode functions (IMFs) are calculated as fault features. These features are added into the fault feature layer in the Bayesian network. The other sources of useful information are added to the information layer. The generalized three-layer Bayesian network can be developed by fully incorporating faults and fault symptoms as well as other useful information such as naked eye inspection and maintenance records. Therefore, diagnostic accuracy and capacity can be improved. The proposed methodology is applied to the fault diagnosis of a gear pump and the structure and parameters of the Bayesian network is established. Compared with artificial neural network and support vector machine classification algorithms, the proposed model has the best diagnostic performance when sensor data is used only. A case study has demonstrated that some information from human observation or system repair records is very helpful to the fault diagnosis. It is effective and efficient in diagnosing faults based on uncertain, incomplete information.
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference to Receiver Operating Characteristics (ROC) data. The model is a simple version of general non-linear regression model. Different from Dorfman model, it uses a probit link function with a covariate variable having zero-one two values to express binormal distributions in a single formula. Model also includes a scale parameter. Bayesian inference is implemented by Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). Contrast to the classical statistical theory, Bayesian approach considers model parameters as random variables characterized by prior distributions. With substantial amount of simulated samples generated by sampling algorithm, posterior distributions of parameters as well as parameters themselves can be accurately estimated. MCMC-based BUGS adopts Adaptive Rejection Sampling (ARS) protocol which requires the probability density function (pdf) which samples are drawing from be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior which is conjugate and possesses many analytic and computational advantages is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for MCMC method are assessed by CODA package. Models and methods by using continuous Gaussian prior and discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using MCMC method.
Gogoshin, Grigoriy; Boerwinkle, Eric
2017-01-01
Abstract Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology—type data exploration, including both generating new biological hypothesis and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on the less than exotic computer hardware are a priority, as well as the applicability of the algorithm and software to the heterogeneous datasets containing many data types—single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, and phenotypes, etc. PMID:27681505
Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S
2017-04-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypothesis and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on the less than exotic computer hardware are a priority, as well as the applicability of the algorithm and software to the heterogeneous datasets containing many data types-single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, and phenotypes, etc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results inmore » a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.« less
Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.; ...
2017-09-27
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results inmore » a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.« less
Hardcastle, Thomas J
2016-01-15
High-throughput data are now commonplace in biological research. Rapidly changing technologies and application mean that novel methods for detecting differential behaviour that account for a 'large P, small n' setting are required at an increasing rate. The development of such methods is, in general, being done on an ad hoc basis, requiring further development cycles and a lack of standardization between analyses. We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. tjh48@cam.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Adaptive sequential Bayesian classification using Page's test
NASA Astrophysics Data System (ADS)
Lynch, Robert S., Jr.; Willett, Peter K.
2002-03-01
In this paper, the previously introduced Mean-Field Bayesian Data Reduction Algorithm is extended for adaptive sequential hypothesis testing utilizing Page's test. In general, Page's test is well understood as a method of detecting a permanent change in distribution associated with a sequence of observations. However, the relationship between detecting a change in distribution utilizing Page's test with that of classification and feature fusion is not well understood. Thus, the contribution of this work is based on developing a method of classifying an unlabeled vector of fused features (i.e., detect a change to an active statistical state) as quickly as possible given an acceptable mean time between false alerts. In this case, the developed classification test can be thought of as equivalent to performing a sequential probability ratio test repeatedly until a class is decided, with the lower log-threshold of each test being set to zero and the upper log-threshold being determined by the expected distance between false alerts. It is of interest to estimate the delay (or, related stopping time) to a classification decision (the number of time samples it takes to classify the target), and the mean time between false alerts, as a function of feature selection and fusion by the Mean-Field Bayesian Data Reduction Algorithm. Results are demonstrated by plotting the delay to declaring the target class versus the mean time between false alerts, and are shown using both different numbers of simulated training data and different numbers of relevant features for each class.
Approximate Bayesian Computation in the estimation of the parameters of the Forbush decrease model
NASA Astrophysics Data System (ADS)
Wawrzynczak, A.; Kopka, P.
2017-12-01
Realistic modeling of the complicated phenomena as Forbush decrease of the galactic cosmic ray intensity is a quite challenging task. One aspect is a numerical solution of the Fokker-Planck equation in five-dimensional space (three spatial variables, the time and particles energy). The second difficulty arises from a lack of detailed knowledge about the spatial and time profiles of the parameters responsible for the creation of the Forbush decrease. Among these parameters, the central role plays a diffusion coefficient. Assessment of the correctness of the proposed model can be done only by comparison of the model output with the experimental observations of the galactic cosmic ray intensity. We apply the Approximate Bayesian Computation (ABC) methodology to match the Forbush decrease model to experimental data. The ABC method is becoming increasing exploited for dynamic complex problems in which the likelihood function is costly to compute. The main idea of all ABC methods is to accept samples as an approximate posterior draw if its associated modeled data are close enough to the observed one. In this paper, we present application of the Sequential Monte Carlo Approximate Bayesian Computation algorithm scanning the space of the diffusion coefficient parameters. The proposed algorithm is adopted to create the model of the Forbush decrease observed by the neutron monitors at the Earth in March 2002. The model of the Forbush decrease is based on the stochastic approach to the solution of the Fokker-Planck equation.
GLASS 2.0: An Operational, Multimodal, Bayesian Earthquake Data Association Engine
NASA Astrophysics Data System (ADS)
Benz, H.; Johnson, C. E.; Patton, J. M.; McMahon, N. D.; Earle, P. S.
2015-12-01
The legacy approach to automated detection and determination of hypocenters is arrival time stacking algorithms. Examples of such algorithms are the associator, Binder, which has been in continuous use in many USGS-supported regional seismic networks since the 1980s and the spherical earth successor, GLASS 1.0, currently in service at the USGS National Earthquake Information Center for over 10 years. The principle short-comings of the legacy approach are 1) it can only use phase arrival times, 2) it does not adequately address the problems of extreme variations in station density worldwide, 3) it cannot incorporate multiple phase models or statistical attributes of phases with distance, and 4) it cannot incorporate noise model attributes of individual stations. Previously we introduced a theoretical framework of a new associator using a Bayesian kernel stacking approach to approximate a joint probability density function for hypocenter localization. More recently we added station- and phase-specific Bayesian constraints to the association process. GLASS 2.0 incorporates a multiplicity of earthquake related data including phase arrival times, back-azimuth and slowness information from array beamforming, arrival times from waveform cross correlation processing, and geographic constraints from real-time social media reports of ground shaking. We demonstrate its application by modeling an aftershock sequence using dozens of stations that recorded tens of thousands of earthquakes over a period of one month. We also demonstrate Glass 2.0 performance regionally and teleseismically using the globally distributed real-time monitoring system at NEIC.
2012-03-01
algorithms. Alternatively, features invariant to aspect and de - pression angles must be found. This research uses simple segmentation algorithms and...FN5 Out of Library SA-8 TZM SA-8 Reload vehicle N 6 N OOL1 BMP-1 Tank w/small turret Y 0 Y OOL2 BTR-70 8-wheeled transport N 8 N OOL3 SA-13 Turret SAMs...the NDEC threshold is established by selecting a user de - fined quantile of class’ entropy distriubtion. During testing, any test exemplar with an
Data Mining Methods for Recommender Systems
NASA Astrophysics Data System (ADS)
Amatriain, Xavier; Jaimes*, Alejandro; Oliver, Nuria; Pujol, Josep M.
In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.
Bayesian classification theory
NASA Technical Reports Server (NTRS)
Hanson, Robin; Stutz, John; Cheeseman, Peter
1991-01-01
The task of inferring a set of classes and class descriptions most likely to explain a given data set can be placed on a firm theoretical foundation using Bayesian statistics. Within this framework and using various mathematical and algorithmic approximations, the AutoClass system searches for the most probable classifications, automatically choosing the number of classes and complexity of class descriptions. A simpler version of AutoClass has been applied to many large real data sets, has discovered new independently-verified phenomena, and has been released as a robust software package. Recent extensions allow attributes to be selectively correlated within particular classes, and allow classes to inherit or share model parameters though a class hierarchy. We summarize the mathematical foundations of AutoClass.
cosmoabc: Likelihood-free inference for cosmology
NASA Astrophysics Data System (ADS)
Ishida, Emille E. O.; Vitenti, Sandro D. P.; Penna-Lima, Mariana; Trindade, Arlindo M.; Cisewski, Jessi; M.; de Souza, Rafael; Cameron, Ewan; Busti, Vinicius C.
2015-05-01
Approximate Bayesian Computation (ABC) enables parameter inference for complex physical systems in cases where the true likelihood function is unknown, unavailable, or computationally too expensive. It relies on the forward simulation of mock data and comparison between observed and synthetic catalogs. cosmoabc is a Python Approximate Bayesian Computation (ABC) sampler featuring a Population Monte Carlo variation of the original ABC algorithm, which uses an adaptive importance sampling scheme. The code can be coupled to an external simulator to allow incorporation of arbitrary distance and prior functions. When coupled with the numcosmo library, it has been used to estimate posterior probability distributions over cosmological parameters based on measurements of galaxy clusters number counts without computing the likelihood function.
Automated global structure extraction for effective local building block processing in XCS.
Butz, Martin V; Pelikan, Martin; Llorà, Xavier; Goldberg, David E
2006-01-01
Learning Classifier Systems (LCSs), such as the accuracy-based XCS, evolve distributed problem solutions represented by a population of rules. During evolution, features are specialized, propagated, and recombined to provide increasingly accurate subsolutions. Recently, it was shown that, as in conventional genetic algorithms (GAs), some problems require efficient processing of subsets of features to find problem solutions efficiently. In such problems, standard variation operators of genetic and evolutionary algorithms used in LCSs suffer from potential disruption of groups of interacting features, resulting in poor performance. This paper introduces efficient crossover operators to XCS by incorporating techniques derived from competent GAs: the extended compact GA (ECGA) and the Bayesian optimization algorithm (BOA). Instead of simple crossover operators such as uniform crossover or one-point crossover, ECGA or BOA-derived mechanisms are used to build a probabilistic model of the global population and to generate offspring classifiers locally using the model. Several offspring generation variations are introduced and evaluated. The results show that it is possible to achieve performance similar to runs with an informed crossover operator that is specifically designed to yield ideal problem-dependent exploration, exploiting provided problem structure information. Thus, we create the first competent LCSs, XCS/ECGA and XCS/BOA, that detect dependency structures online and propagate corresponding lower-level dependency structures effectively without any information about these structures given in advance.
Decentralized cooperative TOA/AOA target tracking for hierarchical wireless sensor networks.
Chen, Ying-Chih; Wen, Chih-Yu
2012-11-08
This paper proposes a distributed method for cooperative target tracking in hierarchical wireless sensor networks. The concept of leader-based information processing is conducted to achieve object positioning, considering a cluster-based network topology. Random timers and local information are applied to adaptively select a sub-cluster for the localization task. The proposed energy-efficient tracking algorithm allows each sub-cluster member to locally estimate the target position with a Bayesian filtering framework and a neural networking model, and further performs estimation fusion in the leader node with the covariance intersection algorithm. This paper evaluates the merits and trade-offs of the protocol design towards developing more efficient and practical algorithms for object position estimation.
Using Computational Text Classification for Qualitative Research and Evaluation in Extension
ERIC Educational Resources Information Center
Smith, Justin G.; Tissing, Reid
2018-01-01
This article introduces a process for computational text classification that can be used in a variety of qualitative research and evaluation settings. The process leverages supervised machine learning based on an implementation of a multinomial Bayesian classifier. Applied to a community of inquiry framework, the algorithm was used to identify…
2008-07-01
consider a proof as a composition relative to some system of music or as a painting. From the Bayesian perspective, any sufciently complex problem has...these types algorithms based on maximum entropy analysis. An example is the Bar-Shalom- Campo Fusion Rule: Xf (kjk) = X2(kjk) + (P22 P21)U1[X1(kjk
ERIC Educational Resources Information Center
Kim, Deok-Hwan; Chung, Chin-Wan
2003-01-01
Discusses the collection fusion problem of image databases, concerned with retrieving relevant images by content based retrieval from image databases distributed on the Web. Focuses on a metaserver which selects image databases supporting similarity measures and proposes a new algorithm which exploits a probabilistic technique using Bayesian…
Using Latent Class Analysis to Model Temperament Types
ERIC Educational Resources Information Center
Loken, Eric
2004-01-01
Mixture models are appropriate for data that arise from a set of qualitatively different subpopulations. In this study, latent class analysis was applied to observational data from a laboratory assessment of infant temperament at four months of age. The EM algorithm was used to fit the models, and the Bayesian method of posterior predictive checks…
Exact posterior computation in non-conjugate Gaussian location-scale parameters models
NASA Astrophysics Data System (ADS)
Andrade, J. A. A.; Rathie, P. N.
2017-12-01
In Bayesian analysis the class of conjugate models allows to obtain exact posterior distributions, however this class quite restrictive in the sense that it involves only a few distributions. In fact, most of the practical applications involves non-conjugate models, thus approximate methods, such as the MCMC algorithms, are required. Although these methods can deal with quite complex structures, some practical problems can make their applications quite time demanding, for example, when we use heavy-tailed distributions, convergence may be difficult, also the Metropolis-Hastings algorithm can become very slow, in addition to the extra work inevitably required on choosing efficient candidate generator distributions. In this work, we draw attention to the special functions as a tools for Bayesian computation, we propose an alternative method for obtaining the posterior distribution in Gaussian non-conjugate models in an exact form. We use complex integration methods based on the H-function in order to obtain the posterior distribution and some of its posterior quantities in an explicit computable form. Two examples are provided in order to illustrate the theory.
Bayesian calibration of coarse-grained forces: Efficiently addressing transferability
NASA Astrophysics Data System (ADS)
Patrone, Paul N.; Rosch, Thomas W.; Phelan, Frederick R.
2016-04-01
Generating and calibrating forces that are transferable across a range of state-points remains a challenging task in coarse-grained (CG) molecular dynamics. In this work, we present a coarse-graining workflow, inspired by ideas from uncertainty quantification and numerical analysis, to address this problem. The key idea behind our approach is to introduce a Bayesian correction algorithm that uses functional derivatives of CG simulations to rapidly and inexpensively recalibrate initial estimates f0 of forces anchored by standard methods such as force-matching. Taking density-temperature relationships as a running example, we demonstrate that this algorithm, in concert with various interpolation schemes, can be used to efficiently compute physically reasonable force curves on a fine grid of state-points. Importantly, we show that our workflow is robust to several choices available to the modeler, including the interpolation schemes and tools used to construct f0. In a related vein, we also demonstrate that our approach can speed up coarse-graining by reducing the number of atomistic simulations needed as inputs to standard methods for generating CG forces.
NASA Astrophysics Data System (ADS)
Ogorodnikov, Yuri; Khachay, Michael; Pljonkin, Anton
2018-04-01
We describe the possibility of employing the special case of the 3-SAT problem stemming from the well known integer factorization problem for the quantum cryptography. It is known, that for every instance of our 3-SAT setting the given 3-CNF is satisfiable by a unique truth assignment, and the goal is to find this assignment. Since the complexity status of the factorization problem is still undefined, development of approximation algorithms and heuristics adopts interest of numerous researchers. One of promising approaches to construction of approximation techniques is based on real-valued relaxation of the given 3-CNF followed by minimizing of the appropriate differentiable loss function, and subsequent rounding of the fractional minimizer obtained. Actually, algorithms developed this way differ by the rounding scheme applied on their final stage. We propose a new rounding scheme based on Bayesian learning. The article shows that the proposed method can be used to determine the security in quantum key distribution systems. In the quantum distribution the Shannon rules is applied and the factorization problem is paramount when decrypting secret keys.
The center for causal discovery of biomedical knowledge from big data.
Cooper, Gregory F; Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-11-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Hamiltonian Monte Carlo acceleration using surrogate functions with random bases.
Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai
2017-11-01
For big data analysis, high computational cost for Bayesian methods often limits their applications in practice. In recent years, there have been many attempts to improve computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo methods, namely, Hamiltonian Monte Carlo. The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.
Novelty and Inductive Generalization in Human Reinforcement Learning.
Gershman, Samuel J; Niv, Yael
2015-07-01
In reinforcement learning (RL), a decision maker searching for the most rewarding option is often faced with the question: What is the value of an option that has never been tried before? One way to frame this question is as an inductive problem: How can I generalize my previous experience with one set of options to a novel option? We show how hierarchical Bayesian inference can be used to solve this problem, and we describe an equivalence between the Bayesian model and temporal difference learning algorithms that have been proposed as models of RL in humans and animals. According to our view, the search for the best option is guided by abstract knowledge about the relationships between different options in an environment, resulting in greater search efficiency compared to traditional RL algorithms previously applied to human cognition. In two behavioral experiments, we test several predictions of our model, providing evidence that humans learn and exploit structured inductive knowledge to make predictions about novel options. In light of this model, we suggest a new interpretation of dopaminergic responses to novelty. Copyright © 2015 Cognitive Science Society, Inc.
Novelty and Inductive Generalization in Human Reinforcement Learning
Gershman, Samuel J.; Niv, Yael
2015-01-01
In reinforcement learning, a decision maker searching for the most rewarding option is often faced with the question: what is the value of an option that has never been tried before? One way to frame this question is as an inductive problem: how can I generalize my previous experience with one set of options to a novel option? We show how hierarchical Bayesian inference can be used to solve this problem, and describe an equivalence between the Bayesian model and temporal difference learning algorithms that have been proposed as models of reinforcement learning in humans and animals. According to our view, the search for the best option is guided by abstract knowledge about the relationships between different options in an environment, resulting in greater search efficiency compared to traditional reinforcement learning algorithms previously applied to human cognition. In two behavioral experiments, we test several predictions of our model, providing evidence that humans learn and exploit structured inductive knowledge to make predictions about novel options. In light of this model, we suggest a new interpretation of dopaminergic responses to novelty. PMID:25808176
NASA Astrophysics Data System (ADS)
Morse, Llewellyn; Sharif Khodaei, Zahra; Aliabadi, M. H.
2018-01-01
In this work, a reliability based impact detection strategy for a sensorized composite structure is proposed. Impacts are localized using Artificial Neural Networks (ANNs) with recorded guided waves due to impacts used as inputs. To account for variability in the recorded data under operational conditions, Bayesian updating and Kalman filter techniques are applied to improve the reliability of the detection algorithm. The possibility of having one or more faulty sensors is considered, and a decision fusion algorithm based on sub-networks of sensors is proposed to improve the application of the methodology to real structures. A strategy for reliably categorizing impacts into high energy impacts, which are probable to cause damage in the structure (true impacts), and low energy non-damaging impacts (false impacts), has also been proposed to reduce the false alarm rate. The proposed strategy involves employing classification ANNs with different features extracted from captured signals used as inputs. The proposed methodologies are validated by experimental results on a quasi-isotropic composite coupon impacted with a range of impact energies.
Bayesian-based localization of wireless capsule endoscope using received signal strength.
Nadimi, Esmaeil S; Blanes-Vidal, Victoria; Tarokh, Vahid; Johansen, Per Michael
2014-01-01
In wireless body area sensor networking (WBASN) applications such as gastrointestinal (GI) tract monitoring using wireless video capsule endoscopy (WCE), the performance of out-of-body wireless link propagating through different body media (i.e. blood, fat, muscle and bone) is still under investigation. Most of the localization algorithms are vulnerable to the variations of path-loss coefficient resulting in unreliable location estimation. In this paper, we propose a novel robust probabilistic Bayesian-based approach using received-signal-strength (RSS) measurements that accounts for Rayleigh fading, variable path-loss exponent and uncertainty in location information received from the neighboring nodes and anchors. The results of this study showed that the localization root mean square error of our Bayesian-based method was 1.6 mm which was very close to the optimum Cramer-Rao lower bound (CRLB) and significantly smaller than that of other existing localization approaches (i.e. classical MDS (64.2mm), dwMDS (32.2mm), MLE (36.3mm) and POCS (2.3mm)).
BELM: Bayesian extreme learning machine.
Soria-Olivas, Emilio; Gómez-Sanchis, Juan; Martín, José D; Vila-Francés, Joan; Martínez, Marcelino; Magdalena, José R; Serrano, Antonio J
2011-03-01
The theory of extreme learning machine (ELM) has become very popular on the last few years. ELM is a new approach for learning the parameters of the hidden layers of a multilayer neural network (as the multilayer perceptron or the radial basis function neural network). Its main advantage is the lower computational cost, which is especially relevant when dealing with many patterns defined in a high-dimensional space. This brief proposes a bayesian approach to ELM, which presents some advantages over other approaches: it allows the introduction of a priori knowledge; obtains the confidence intervals (CIs) without the need of applying methods that are computationally intensive, e.g., bootstrap; and presents high generalization capabilities. Bayesian ELM is benchmarked against classical ELM in several artificial and real datasets that are widely used for the evaluation of machine learning algorithms. Achieved results show that the proposed approach produces a competitive accuracy with some additional advantages, namely, automatic production of CIs, reduction of probability of model overfitting, and use of a priori knowledge.
Real-time prediction of acute cardiovascular events using hardware-implemented Bayesian networks.
Tylman, Wojciech; Waszyrowski, Tomasz; Napieralski, Andrzej; Kamiński, Marek; Trafidło, Tamara; Kulesza, Zbigniew; Kotas, Rafał; Marciniak, Paweł; Tomala, Radosław; Wenerski, Maciej
2016-02-01
This paper presents a decision support system that aims to estimate a patient׳s general condition and detect situations which pose an immediate danger to the patient׳s health or life. The use of this system might be especially important in places such as accident and emergency departments or admission wards, where a small medical team has to take care of many patients in various general conditions. Particular stress is laid on cardiovascular and pulmonary conditions, including those leading to sudden cardiac arrest. The proposed system is a stand-alone microprocessor-based device that works in conjunction with a standard vital signs monitor, which provides input signals such as temperature, blood pressure, pulseoxymetry, ECG, and ICG. The signals are preprocessed and analysed by a set of artificial intelligence algorithms, the core of which is based on Bayesian networks. The paper focuses on the construction and evaluation of the Bayesian network, both its structure and numerical specification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Model selection and Bayesian inference for high-resolution seabed reflection inversion.
Dettmer, Jan; Dosso, Stan E; Holland, Charles W
2009-02-01
This paper applies Bayesian inference, including model selection and posterior parameter inference, to inversion of seabed reflection data to resolve sediment structure at a spatial scale below the pulse length of the acoustic source. A practical approach to model selection is used, employing the Bayesian information criterion to decide on the number of sediment layers needed to sufficiently fit the data while satisfying parsimony to avoid overparametrization. Posterior parameter inference is carried out using an efficient Metropolis-Hastings algorithm for high-dimensional models, and results are presented as marginal-probability depth distributions for sound velocity, density, and attenuation. The approach is applied to plane-wave reflection-coefficient inversion of single-bounce data collected on the Malta Plateau, Mediterranean Sea, which indicate complex fine structure close to the water-sediment interface. This fine structure is resolved in the geoacoustic inversion results in terms of four layers within the upper meter of sediments. The inversion results are in good agreement with parameter estimates from a gravity core taken at the experiment site.
Bayesian peak picking for NMR spectra.
Cheng, Yichen; Gao, Xin; Liang, Faming
2014-02-01
Protein structure determination is a very important topic in structural genomics, which helps people to understand varieties of biological functions such as protein-protein interactions, protein-DNA interactions and so on. Nowadays, nuclear magnetic resonance (NMR) has often been used to determine the three-dimensional structures of protein in vivo. This study aims to automate the peak picking step, the most important and tricky step in NMR structure determination. We propose to model the NMR spectrum by a mixture of bivariate Gaussian densities and use the stochastic approximation Monte Carlo algorithm as the computational tool to solve the problem. Under the Bayesian framework, the peak picking problem is casted as a variable selection problem. The proposed method can automatically distinguish true peaks from false ones without preprocessing the data. To the best of our knowledge, this is the first effort in the literature that tackles the peak picking problem for NMR spectrum data using Bayesian method. Copyright © 2013. Production and hosting by Elsevier Ltd.
An algorithm that improves speech intelligibility in noise for normal-hearing listeners.
Kim, Gibak; Lu, Yang; Hu, Yi; Loizou, Philipos C
2009-09-01
Traditional noise-suppression algorithms have been shown to improve speech quality, but not speech intelligibility. Motivated by prior intelligibility studies of speech synthesized using the ideal binary mask, an algorithm is proposed that decomposes the input signal into time-frequency (T-F) units and makes binary decisions, based on a Bayesian classifier, as to whether each T-F unit is dominated by the target or the masker. Speech corrupted at low signal-to-noise ratio (SNR) levels (-5 and 0 dB) using different types of maskers is synthesized by this algorithm and presented to normal-hearing listeners for identification. Results indicated substantial improvements in intelligibility (over 60% points in -5 dB babble) over that attained by human listeners with unprocessed stimuli. The findings from this study suggest that algorithms that can estimate reliably the SNR in each T-F unit can improve speech intelligibility.
Novel trace chemical detection algorithms: a comparative study
NASA Astrophysics Data System (ADS)
Raz, Gil; Murphy, Cara; Georgan, Chelsea; Greenwood, Ross; Prasanth, R. K.; Myers, Travis; Goyal, Anish; Kelley, David; Wood, Derek; Kotidis, Petros
2017-05-01
Algorithms for standoff detection and estimation of trace chemicals in hyperspectral images in the IR band are a key component for a variety of applications relevant to law-enforcement and the intelligence communities. Performance of these methods is impacted by the spectral signature variability due to presence of contaminants, surface roughness, nonlinear dependence on abundances as well as operational limitations on the compute platforms. In this work we provide a comparative performance and complexity analysis of several classes of algorithms as a function of noise levels, error distribution, scene complexity, and spatial degrees of freedom. The algorithm classes we analyze and test include adaptive cosine estimator (ACE and modifications to it), compressive/sparse methods, Bayesian estimation, and machine learning. We explicitly call out the conditions under which each algorithm class is optimal or near optimal as well as their built-in limitations and failure modes.
NASA Astrophysics Data System (ADS)
Gagné, Jonathan; Mamajek, Eric E.; Malo, Lison; Riedel, Adric; Rodriguez, David; Lafrenière, David; Faherty, Jacqueline K.; Roy-Loubier, Olivier; Pueyo, Laurent; Robin, Annie C.; Doyon, René
2018-03-01
BANYAN Σ is a new Bayesian algorithm to identify members of young stellar associations within 150 pc of the Sun. It includes 27 young associations with ages in the range ∼1–800 Myr, modeled with multivariate Gaussians in six-dimensional (6D) XYZUVW space. It is the first such multi-association classification tool to include the nearest sub-groups of the Sco-Cen OB star-forming region, the IC 2602, IC 2391, Pleiades and Platais 8 clusters, and the ρ Ophiuchi, Corona Australis, and Taurus star formation regions. A model of field stars is built from a mixture of multivariate Gaussians based on the Besançon Galactic model. The algorithm can derive membership probabilities for objects with only sky coordinates and proper motion, but can also include parallax and radial velocity measurements, as well as spectrophotometric distance constraints from sequences in color–magnitude or spectral type–magnitude diagrams. BANYAN Σ benefits from an analytical solution to the Bayesian marginalization integrals over unknown radial velocities and distances that makes it more accurate and significantly faster than its predecessor BANYAN II. A contamination versus hit rate analysis is presented and demonstrates that BANYAN Σ achieves a better classification performance than other moving group tools available in the literature, especially in terms of cross-contamination between young associations. An updated list of bona fide members in the 27 young associations, augmented by the Gaia-DR1 release, as well as all parameters for the 6D multivariate Gaussian models for each association and the Galactic field neighborhood within 300 pc are presented. This new tool will make it possible to analyze large data sets such as the upcoming Gaia-DR2 to identify new young stars. IDL and Python versions of BANYAN Σ are made available with this publication, and a more limited online web tool is available at http://www.exoplanetes.umontreal.ca/banyan/banyansigma.php.
NASA Astrophysics Data System (ADS)
Izquierdo, K.; Lekic, V.; Montesi, L.
2017-12-01
Gravity inversions are especially important for planetary applications since measurements of the variations in gravitational acceleration are often the only constraint available to map out lateral density variations in the interiors of planets and other Solar system objects. Currently, global gravity data is available for the terrestrial planets and the Moon. Although several methods for inverting these data have been developed and applied, the non-uniqueness of global density models that fit the data has not yet been fully characterized. We make use of Bayesian inference and a Reversible Jump Markov Chain Monte Carlo (RJMCMC) approach to develop a Trans-dimensional Hierarchical Bayesian (THB) inversion algorithm that yields a large sample of models that fit a gravity field. From this group of models, we can determine the most likely value of parameters of a global density model and a measure of the non-uniqueness of each parameter when the number of anomalies describing the gravity field is not fixed a priori. We explore the use of a parallel tempering algorithm and fast multipole method to reduce the number of iterations and computing time needed. We applied this method to a synthetic gravity field of the Moon and a long wavelength synthetic model of density anomalies in the Earth's lower mantle. We obtained a good match between the given gravity field and the gravity field produced by the most likely model in each inversion. The number of anomalies of the models showed parsimony of the algorithm, the value of the noise variance of the input data was retrieved, and the non-uniqueness of the models was quantified. Our results show that the ability to constrain the latitude and longitude of density anomalies, which is excellent at shallow locations (<200 km), decreases with increasing depth. With higher computational resources, this THB method for gravity inversion could give new information about the overall density distribution of celestial bodies even when there is no other geophysical data available.
Modelling Trial-by-Trial Changes in the Mismatch Negativity
Lieder, Falk; Daunizeau, Jean; Garrido, Marta I.; Friston, Karl J.; Stephan, Klaas E.
2013-01-01
The mismatch negativity (MMN) is a differential brain response to violations of learned regularities. It has been used to demonstrate that the brain learns the statistical structure of its environment and predicts future sensory inputs. However, the algorithmic nature of these computations and the underlying neurobiological implementation remain controversial. This article introduces a mathematical framework with which competing ideas about the computational quantities indexed by MMN responses can be formalized and tested against single-trial EEG data. This framework was applied to five major theories of the MMN, comparing their ability to explain trial-by-trial changes in MMN amplitude. Three of these theories (predictive coding, model adjustment, and novelty detection) were formalized by linking the MMN to different manifestations of the same computational mechanism: approximate Bayesian inference according to the free-energy principle. We thereby propose a unifying view on three distinct theories of the MMN. The relative plausibility of each theory was assessed against empirical single-trial MMN amplitudes acquired from eight healthy volunteers in a roving oddball experiment. Models based on the free-energy principle provided more plausible explanations of trial-by-trial changes in MMN amplitude than models representing the two more traditional theories (change detection and adaptation). Our results suggest that the MMN reflects approximate Bayesian learning of sensory regularities, and that the MMN-generating process adjusts a probabilistic model of the environment according to prediction errors. PMID:23436989
Bayesian Classification Models for Premature Ventricular Contraction Detection on ECG Traces.
Casas, Manuel M; Avitia, Roberto L; Gonzalez-Navarro, Felix F; Cardenas-Haro, Jose A; Reyna, Marco A
2018-01-01
According to the American Heart Association, in its latest commission about Ventricular Arrhythmias and Sudden Death 2006, the epidemiology of the ventricular arrhythmias ranges from a series of risk descriptors and clinical markers that go from ventricular premature complexes and nonsustained ventricular tachycardia to sudden cardiac death due to ventricular tachycardia in patients with or without clinical history. The premature ventricular complexes (PVCs) are known to be associated with malignant ventricular arrhythmias and sudden cardiac death (SCD) cases. Detecting this kind of arrhythmia has been crucial in clinical applications. The electrocardiogram (ECG) is a clinical test used to measure the heart electrical activity for inferences and diagnosis. Analyzing large ECG traces from several thousands of beats has brought the necessity to develop mathematical models that can automatically make assumptions about the heart condition. In this work, 80 different features from 108,653 ECG classified beats of the gold-standard MIT-BIH database were extracted in order to classify the Normal, PVC, and other kind of ECG beats. Three well-known Bayesian classification algorithms were trained and tested using these extracted features. Experimental results show that the F1 scores for each class were above 0.95, giving almost the perfect value for the PVC class. This gave us a promising path in the development of automated mechanisms for the detection of PVC complexes.
Unsupervised Bayesian linear unmixing of gene expression microarrays.
Bazot, Cécile; Dobigeon, Nicolas; Tourneret, Jean-Yves; Zaas, Aimee K; Ginsburg, Geoffrey S; Hero, Alfred O
2013-03-19
This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Firstly, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Secondly, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals have been inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real data sets considered here. The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
Prediction of road accidents: A Bayesian hierarchical approach.
Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H
2013-03-01
In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any road network provided that the required data are available. Copyright © 2012 Elsevier Ltd. All rights reserved.
Minsley, B.J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, 'best' model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favourably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment. ?? 2011. Geophysical Journal International ?? 2011 RAS.
Reconstructing Constructivism: Causal Models, Bayesian Learning Mechanisms, and the Theory Theory
ERIC Educational Resources Information Center
Gopnik, Alison; Wellman, Henry M.
2012-01-01
We propose a new version of the "theory theory" grounded in the computational framework of probabilistic causal models and Bayesian learning. Probabilistic models allow a constructivist but rigorous and detailed approach to cognitive development. They also explain the learning of both more specific causal hypotheses and more abstract framework…
Renard, Bernhard Y.; Xu, Buote; Kirchner, Marc; Zickmann, Franziska; Winter, Dominic; Korten, Simone; Brattig, Norbert W.; Tzur, Amit; Hamprecht, Fred A.; Steen, Hanno
2012-01-01
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis. PMID:22493179
The Chandra Source Catalog 2.0: Estimating Source Fluxes
NASA Astrophysics Data System (ADS)
Primini, Francis Anthony; Allen, Christopher E.; Miller, Joseph; Anderson, Craig S.; Budynkiewicz, Jamie A.; Burke, Douglas; Chen, Judy C.; Civano, Francesca Maria; D'Abrusco, Raffaele; Doe, Stephen M.; Evans, Ian N.; Evans, Janet D.; Fabbiano, Giuseppina; Gibbs, Danny G., II; Glotfelty, Kenny J.; Graessle, Dale E.; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; Houck, John C.; Lauer, Jennifer L.; Laurino, Omar; Lee, Nicholas P.; Martínez-Galarza, Juan Rafael; McCollough, Michael L.; McDowell, Jonathan C.; McLaughlin, Warren; Morgan, Douglas L.; Mossman, Amy E.; Nguyen, Dan T.; Nichols, Joy S.; Nowak, Michael A.; Paxson, Charles; Plummer, David A.; Rots, Arnold H.; Siemiginowska, Aneta; Sundheim, Beth A.; Tibbetts, Michael; Van Stone, David W.; Zografou, Panagoula
2018-01-01
The Second Chandra Source Catalog (CSC2.0) will provide information on approximately 316,000 point or compact extended x-ray sources, derived from over 10,000 ACIS and HRC-I imaging observations available in the public archive at the end of 2014. As in the previous catalog release (CSC1.1), fluxes for these sources will be determined separately from source detection, using a Bayesian formalism that accounts for background, spatial resolution effects, and contamination from nearby sources. However, the CSC2.0 procedure differs from that used in CSC1.1 in three important aspects. First, for sources in crowded regions in which photometric apertures overlap, fluxes are determined jointly, using an extension of the CSC1.1 algorithm, as discussed in Primini & Kashyap (2014ApJ...796…24P). Second, an MCMC procedure is used to estimate marginalized posterior probability distributions for source fluxes. Finally, for sources observed in multiple observations, a Bayesian Blocks algorithm (Scargle, et al. 2013ApJ...764..167S) is used to group observations into blocks of constant source flux.In this poster we present details of the CSC2.0 photometry algorithms and illustrate their performance in actual CSC2.0 datasets.This work has been supported by NASA under contract NAS 8-03060 to the Smithsonian Astrophysical Observatory for operation of the Chandra X-ray Center.
Approximate, computationally efficient online learning in Bayesian spiking neurons.
Kuhlmann, Levin; Hauser-Raspe, Michael; Manton, Jonathan H; Grayden, David B; Tapson, Jonathan; van Schaik, André
2014-03-01
Bayesian spiking neurons (BSNs) provide a probabilistic interpretation of how neurons perform inference and learning. Online learning in BSNs typically involves parameter estimation based on maximum-likelihood expectation-maximization (ML-EM) which is computationally slow and limits the potential of studying networks of BSNs. An online learning algorithm, fast learning (FL), is presented that is more computationally efficient than the benchmark ML-EM for a fixed number of time steps as the number of inputs to a BSN increases (e.g., 16.5 times faster run times for 20 inputs). Although ML-EM appears to converge 2.0 to 3.6 times faster than FL, the computational cost of ML-EM means that ML-EM takes longer to simulate to convergence than FL. FL also provides reasonable convergence performance that is robust to initialization of parameter estimates that are far from the true parameter values. However, parameter estimation depends on the range of true parameter values. Nevertheless, for a physiologically meaningful range of parameter values, FL gives very good average estimation accuracy, despite its approximate nature. The FL algorithm therefore provides an efficient tool, complementary to ML-EM, for exploring BSN networks in more detail in order to better understand their biological relevance. Moreover, the simplicity of the FL algorithm means it can be easily implemented in neuromorphic VLSI such that one can take advantage of the energy-efficient spike coding of BSNs.
Sparse representation and Bayesian detection of genome copy number alterations from microarray data.
Pique-Regi, Roger; Monso-Varona, Jordi; Ortega, Antonio; Seeger, Robert C; Triche, Timothy J; Asgharzadeh, Shahab
2008-02-01
Genomic instability in cancer leads to abnormal genome copy number alterations (CNA) that are associated with the development and behavior of tumors. Advances in microarray technology have allowed for greater resolution in detection of DNA copy number changes (amplifications or deletions) across the genome. However, the increase in number of measured signals and accompanying noise from the array probes present a challenge in accurate and fast identification of breakpoints that define CNA. This article proposes a novel detection technique that exploits the use of piece wise constant (PWC) vectors to represent genome copy number and sparse Bayesian learning (SBL) to detect CNA breakpoints. First, a compact linear algebra representation for the genome copy number is developed from normalized probe intensities. Second, SBL is applied and optimized to infer locations where copy number changes occur. Third, a backward elimination (BE) procedure is used to rank the inferred breakpoints; and a cut-off point can be efficiently adjusted in this procedure to control for the false discovery rate (FDR). The performance of our algorithm is evaluated using simulated and real genome datasets and compared to other existing techniques. Our approach achieves the highest accuracy and lowest FDR while improving computational speed by several orders of magnitude. The proposed algorithm has been developed into a free standing software application (GADA, Genome Alteration Detection Algorithm). http://biron.usc.edu/~piquereg/GADA
Algorithm for ion beam figuring of low-gradient mirrors.
Jiao, Changjun; Li, Shengyi; Xie, Xuhui
2009-07-20
Ion beam figuring technology for low-gradient mirrors is discussed. Ion beam figuring is a noncontact machining technique in which a beam of high-energy ions is directed toward a target workpiece to remove material in a predetermined and controlled fashion. Owing to this noncontact mode of material removal, problems associated with tool wear and edge effects, which are common in conventional contact polishing processes, are avoided. Based on the Bayesian principle, an iterative dwell time algorithm for planar mirrors is deduced from the computer-controlled optical surfacing (CCOS) principle. With the properties of the removal function, the shaping process of low-gradient mirrors can be approximated by the linear model for planar mirrors. With these discussions, the error surface figuring technology for low-gradient mirrors with a linear path is set up. With the near-Gaussian property of the removal function, the figuring process with a spiral path can be described by the conventional linear CCOS principle, and a Bayesian-based iterative algorithm can be used to deconvolute the dwell time. Moreover, the selection criterion of the spiral parameter is given. Ion beam figuring technology with a spiral scan path based on these methods can be used to figure mirrors with non-axis-symmetrical errors. Experiments on SiC chemical vapor deposition planar and Zerodur paraboloid samples are made, and the final surface errors are all below 1/100 lambda.
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Huang, Lei; Liao, Li; Wu, Cathy H.
2016-01-01
Revealing the underlying evolutionary mechanism plays an important role in understanding protein interaction networks in the cell. While many evolutionary models have been proposed, the problem about applying these models to real network data, especially for differentiating which model can better describe evolutionary process for the observed network urgently remains as a challenge. The traditional way is to use a model with presumed parameters to generate a network, and then evaluate the fitness by summary statistics, which however cannot capture the complete network structures information and estimate parameter distribution. In this work we developed a novel method based on Approximate Bayesian Computation and modified Differential Evolution (ABC-DEP) that is capable of conducting model selection and parameter estimation simultaneously and detecting the underlying evolutionary mechanisms more accurately. We tested our method for its power in differentiating models and estimating parameters on the simulated data and found significant improvement in performance benchmark, as compared with a previous method. We further applied our method to real data of protein interaction networks in human and yeast. Our results show Duplication Attachment model as the predominant evolutionary mechanism for human PPI networks and Scale-Free model as the predominant mechanism for yeast PPI networks. PMID:26357273
Heuristics as Bayesian inference under extreme priors.
Parpart, Paula; Jones, Matt; Love, Bradley C
2018-05-01
Simple heuristics are often regarded as tractable decision strategies because they ignore a great deal of information in the input data. One puzzle is why heuristics can outperform full-information models, such as linear regression, which make full use of the available information. These "less-is-more" effects, in which a relatively simpler model outperforms a more complex model, are prevalent throughout cognitive science, and are frequently argued to demonstrate an inherent advantage of simplifying computation or ignoring information. In contrast, we show at the computational level (where algorithmic restrictions are set aside) that it is never optimal to discard information. Through a formal Bayesian analysis, we prove that popular heuristics, such as tallying and take-the-best, are formally equivalent to Bayesian inference under the limit of infinitely strong priors. Varying the strength of the prior yields a continuum of Bayesian models with the heuristics at one end and ordinary regression at the other. Critically, intermediate models perform better across all our simulations, suggesting that down-weighting information with the appropriate prior is preferable to entirely ignoring it. Rather than because of their simplicity, our analyses suggest heuristics perform well because they implement strong priors that approximate the actual structure of the environment. We end by considering how new heuristics could be derived by infinitely strengthening the priors of other Bayesian models. These formal results have implications for work in psychology, machine learning and economics. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Zhijun; Feng, Maria Q.; Luo, Longxi; Feng, Dongming; Xu, Xiuli
2018-01-01
Uncertainty of modal parameters estimation appear in structural health monitoring (SHM) practice of civil engineering to quite some significant extent due to environmental influences and modeling errors. Reasonable methodologies are needed for processing the uncertainty. Bayesian inference can provide a promising and feasible identification solution for the purpose of SHM. However, there are relatively few researches on the application of Bayesian spectral method in the modal identification using SHM data sets. To extract modal parameters from large data sets collected by SHM system, the Bayesian spectral density algorithm was applied to address the uncertainty of mode extraction from output-only response of a long-span suspension bridge. The posterior most possible values of modal parameters and their uncertainties were estimated through Bayesian inference. A long-term variation and statistical analysis was performed using the sensor data sets collected from the SHM system of the suspension bridge over a one-year period. The t location-scale distribution was shown to be a better candidate function for frequencies of lower modes. On the other hand, the burr distribution provided the best fitting to the higher modes which are sensitive to the temperature. In addition, wind-induced variation of modal parameters was also investigated. It was observed that both the damping ratios and modal forces increased during the period of typhoon excitations. Meanwhile, the modal damping ratios exhibit significant correlation with the spectral intensities of the corresponding modal forces.
Pruning Neural Networks with Distribution Estimation Algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cantu-Paz, E
2003-01-15
This paper describes the application of four evolutionary algorithms to the pruning of neural networks used in classification problems. Besides of a simple genetic algorithm (GA), the paper considers three distribution estimation algorithms (DEAs): a compact GA, an extended compact GA, and the Bayesian Optimization Algorithm. The objective is to determine if the DEAs present advantages over the simple GA in terms of accuracy or speed in this problem. The experiments used a feed forward neural network trained with standard back propagation and public-domain and artificial data sets. The pruned networks seemed to have better or equal accuracy than themore » original fully-connected networks. Only in a few cases, pruning resulted in less accurate networks. We found few differences in the accuracy of the networks pruned by the four EAs, but found important differences in the execution time. The results suggest that a simple GA with a small population might be the best algorithm for pruning networks on the data sets we tested.« less
Assessment of parametric uncertainty for groundwater reactive transport modeling,
Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun
2014-01-01
The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.
Bayesian state space models for dynamic genetic network construction across multiple tissues.
Liang, Yulan; Kelemen, Arpad
2016-08-01
Construction of gene-gene interaction networks and potential pathways is a challenging and important problem in genomic research for complex diseases while estimating the dynamic changes of the temporal correlations and non-stationarity are the keys in this process. In this paper, we develop dynamic state space models with hierarchical Bayesian settings to tackle this challenge for inferring the dynamic profiles and genetic networks associated with disease treatments. We treat both the stochastic transition matrix and the observation matrix time-variant and include temporal correlation structures in the covariance matrix estimations in the multivariate Bayesian state space models. The unevenly spaced short time courses with unseen time points are treated as hidden state variables. Hierarchical Bayesian approaches with various prior and hyper-prior models with Monte Carlo Markov Chain and Gibbs sampling algorithms are used to estimate the model parameters and the hidden state variables. We apply the proposed Hierarchical Bayesian state space models to multiple tissues (liver, skeletal muscle, and kidney) Affymetrix time course data sets following corticosteroid (CS) drug administration. Both simulation and real data analysis results show that the genomic changes over time and gene-gene interaction in response to CS treatment can be well captured by the proposed models. The proposed dynamic Hierarchical Bayesian state space modeling approaches could be expanded and applied to other large scale genomic data, such as next generation sequence (NGS) combined with real time and time varying electronic health record (EHR) for more comprehensive and robust systematic and network based analysis in order to transform big biomedical data into predictions and diagnostics for precision medicine and personalized healthcare with better decision making and patient outcomes.
Liu, Zhihong; Zheng, Minghao; Yan, Xin; Gu, Qiong; Gasteiger, Johann; Tijhuis, Johan; Maas, Peter; Li, Jiabo; Xu, Jun
2014-09-01
Predicting compound chemical stability is important because unstable compounds can lead to either false positive or to false negative conclusions in bioassays. Experimental data (COMDECOM) measured from DMSO/H2O solutions stored at 50 °C for 105 days were used to predicted stability by applying rule-embedded naïve Bayesian learning, based upon atom center fragment (ACF) features. To build the naïve Bayesian classifier, we derived ACF features from 9,746 compounds in the COMDECOM dataset. By recursively applying naïve Bayesian learning from the data set, each ACF is assigned with an expected stable probability (p(s)) and an unstable probability (p(uns)). 13,340 ACFs, together with their p(s) and p(uns) data, were stored in a knowledge base for use by the Bayesian classifier. For a given compound, its ACFs were derived from its structure connection table with the same protocol used to drive ACFs from the training data. Then, the Bayesian classifier assigned p(s) and p(uns) values to the compound ACFs by a structural pattern recognition algorithm, which was implemented in-house. Compound instability is calculated, with Bayes' theorem, based upon the p(s) and p(uns) values of the compound ACFs. We were able to achieve performance with an AUC value of 84% and a tenfold cross validation accuracy of 76.5%. To reduce false negatives, a rule-based approach has been embedded in the classifier. The rule-based module allows the program to improve its predictivity by expanding its compound instability knowledge base, thus further reducing the possibility of false negatives. To our knowledge, this is the first in silico prediction service for the prediction of the stabilities of organic compounds.
On the uncertainty in single molecule fluorescent lifetime and energy emission measurements
NASA Technical Reports Server (NTRS)
Brown, Emery N.; Zhang, Zhenhua; Mccollom, Alex D.
1995-01-01
Time-correlated single photon counting has recently been combined with mode-locked picosecond pulsed excitation to measure the fluorescent lifetimes and energy emissions of single molecules in a flow stream. Maximum likelihood (ML) and least square methods agree and are optimal when the number of detected photons is large however, in single molecule fluorescence experiments the number of detected photons can be less than 20, 67% of those can be noise and the detection time is restricted to 10 nanoseconds. Under the assumption that the photon signal and background noise are two independent inhomogeneous poisson processes, we derive the exact joint arrival time probably density of the photons collected in a single counting experiment performed in the presence of background noise. The model obviates the need to bin experimental data for analysis, and makes it possible to analyze formally the effect of background noise on the photon detection experiment using both ML or Bayesian methods. For both methods we derive the joint and marginal probability densities of the fluorescent lifetime and fluorescent emission. the ML and Bayesian methods are compared in an analysis of simulated single molecule fluorescence experiments of Rhodamine 110 using different combinations of expected background nose and expected fluorescence emission. While both the ML or Bayesian procedures perform well for analyzing fluorescence emissions, the Bayesian methods provide more realistic measures of uncertainty in the fluorescent lifetimes. The Bayesian methods would be especially useful for measuring uncertainty in fluorescent lifetime estimates in current single molecule flow stream experiments where the expected fluorescence emission is low. Both the ML and Bayesian algorithms can be automated for applications in molecular biology.
On the Uncertainty in Single Molecule Fluorescent Lifetime and Energy Emission Measurements
NASA Technical Reports Server (NTRS)
Brown, Emery N.; Zhang, Zhenhua; McCollom, Alex D.
1996-01-01
Time-correlated single photon counting has recently been combined with mode-locked picosecond pulsed excitation to measure the fluorescent lifetimes and energy emissions of single molecules in a flow stream. Maximum likelihood (ML) and least squares methods agree and are optimal when the number of detected photons is large, however, in single molecule fluorescence experiments the number of detected photons can be less than 20, 67 percent of those can be noise, and the detection time is restricted to 10 nanoseconds. Under the assumption that the photon signal and background noise are two independent inhomogeneous Poisson processes, we derive the exact joint arrival time probability density of the photons collected in a single counting experiment performed in the presence of background noise. The model obviates the need to bin experimental data for analysis, and makes it possible to analyze formally the effect of background noise on the photon detection experiment using both ML or Bayesian methods. For both methods we derive the joint and marginal probability densities of the fluorescent lifetime and fluorescent emission. The ML and Bayesian methods are compared in an analysis of simulated single molecule fluorescence experiments of Rhodamine 110 using different combinations of expected background noise and expected fluorescence emission. While both the ML or Bayesian procedures perform well for analyzing fluorescence emissions, the Bayesian methods provide more realistic measures of uncertainty in the fluorescent lifetimes. The Bayesian methods would be especially useful for measuring uncertainty in fluorescent lifetime estimates in current single molecule flow stream experiments where the expected fluorescence emission is low. Both the ML and Bayesian algorithms can be automated for applications in molecular biology.
Searching Dynamic Agents with a Team of Mobile Robots
Juliá, Miguel; Gil, Arturo; Reinoso, Oscar
2012-01-01
This paper presents a new algorithm that allows a team of robots to cooperatively search for a set of moving targets. An estimation of the areas of the environment that are more likely to hold a target agent is obtained using a grid-based Bayesian filter. The robot sensor readings and the maximum speed of the moving targets are used in order to update the grid. This representation is used in a search algorithm that commands the robots to those areas that are more likely to present target agents. This algorithm splits the environment in a tree of connected regions using dynamic programming. This tree is used in order to decide the destination for each robot in a coordinated manner. The algorithm has been successfully tested in known and unknown environments showing the validity of the approach. PMID:23012519
Searching dynamic agents with a team of mobile robots.
Juliá, Miguel; Gil, Arturo; Reinoso, Oscar
2012-01-01
This paper presents a new algorithm that allows a team of robots to cooperatively search for a set of moving targets. An estimation of the areas of the environment that are more likely to hold a target agent is obtained using a grid-based Bayesian filter. The robot sensor readings and the maximum speed of the moving targets are used in order to update the grid. This representation is used in a search algorithm that commands the robots to those areas that are more likely to present target agents. This algorithm splits the environment in a tree of connected regions using dynamic programming. This tree is used in order to decide the destination for each robot in a coordinated manner. The algorithm has been successfully tested in known and unknown environments showing the validity of the approach.
Adults with autism overestimate the volatility of the sensory environment.
Lawson, Rebecca P; Mathys, Christoph; Rees, Geraint
2017-09-01
Insistence on sameness and intolerance of change are among the diagnostic criteria for autism spectrum disorder (ASD), but little research has addressed how people with ASD represent and respond to environmental change. Here, behavioral and pupillometric measurements indicated that adults with ASD are less surprised than neurotypical adults when their expectations are violated, and decreased surprise is predictive of greater symptom severity. A hierarchical Bayesian model of learning suggested that in ASD, a tendency to overlearn about volatility in the face of environmental change drives a corresponding reduction in learning about probabilistically aberrant events, thus putatively rendering these events less surprising. Participant-specific modeled estimates of surprise about environmental conditions were linked to pupil size in the ASD group, thus suggesting heightened noradrenergic responsivity in line with compromised neural gain. This study offers insights into the behavioral, algorithmic and physiological mechanisms underlying responses to environmental volatility in ASD.
Adults with autism over-estimate the volatility of the sensory environment
Mathys, Christoph; Rees, Geraint
2017-01-01
Insistence on sameness and intolerance of change are part of the diagnostic criteria for Autism Spectrum Disorder (ASD) but there is little research addressing how people with ASD represent and respond to environmental change. Here, we find that behavioural and pupillometric measurements show adults with ASD are less surprised than neurotypical adults when expectations are violated, with reduced surprise predicting greater symptom severity. A hierarchical Bayesian model of learning suggests that in ASD a tendency to over-learn about volatility in the face of environmental change drives a corresponding reduction in learning about probabilistically aberrant events – putatively rendering them less surprising. Participant-specific modelled estimates of surprise about environmental conditions are linked to pupil size in the ASD group, suggesting heightened phasic noradrenergic responsivity in line with neural gain impairments. This study offers novel insight into the behavioural, algorithmic and physiological mechanisms that underlie responses to environmental volatility in ASD. PMID:28758996
Bayesian source term determination with unknown covariance of measurements
NASA Astrophysics Data System (ADS)
Belal, Alkomiet; Tichý, Ondřej; Šmídl, Václav
2017-04-01
Determination of a source term of release of a hazardous material into the atmosphere is a very important task for emergency response. We are concerned with the problem of estimation of the source term in the conventional linear inverse problem, y = Mx, where the relationship between the vector of observations y is described using the source-receptor-sensitivity (SRS) matrix M and the unknown source term x. Since the system is typically ill-conditioned, the problem is recast as an optimization problem minR,B(y - Mx)TR-1(y - Mx) + xTB-1x. The first term minimizes the error of the measurements with covariance matrix R, and the second term is a regularization of the source term. There are different types of regularization arising for different choices of matrices R and B, for example, Tikhonov regularization assumes covariance matrix B as the identity matrix multiplied by scalar parameter. In this contribution, we adopt a Bayesian approach to make inference on the unknown source term x as well as unknown R and B. We assume prior on x to be a Gaussian with zero mean and unknown diagonal covariance matrix B. The covariance matrix of the likelihood R is also unknown. We consider two potential choices of the structure of the matrix R. First is the diagonal matrix and the second is a locally correlated structure using information on topology of the measuring network. Since the inference of the model is intractable, iterative variational Bayes algorithm is used for simultaneous estimation of all model parameters. The practical usefulness of our contribution is demonstrated on an application of the resulting algorithm to real data from the European Tracer Experiment (ETEX). This research is supported by EEA/Norwegian Financial Mechanism under project MSMT-28477/2014 Source-Term Determination of Radionuclide Releases by Inverse Atmospheric Dispersion Modelling (STRADI).
Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer
2018-01-01
This contribution presents a novel methodology for myolectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.
Heuristic Bayesian segmentation for discovery of coexpressed genes within genomic regions.
Pehkonen, Petri; Wong, Garry; Törönen, Petri
2010-01-01
Segmentation aims to separate homogeneous areas from the sequential data, and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation in locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requirement for user-given parameters. In order to perform the segmentation within a reasonable time, we use heuristics. Most of the heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished by using asymptotic model selection methods like the Bayesian information criterion. Such methods are based on some simplification, which can limit their usage. In this paper, we propose a Bayesian model selection to choose the most proper result from heuristic segmentation. Our Bayesian model presents a simple prior for the segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method in yeast cell-cycle gene expression data reveals potential active and passive regions of the genome.
Receptive Field Inference with Localized Priors
Park, Mijung; Pillow, Jonathan W.
2011-01-01
The linear receptive field describes a mapping from sensory stimuli to a one-dimensional variable governing a neuron's spike response. However, traditional receptive field estimators such as the spike-triggered average converge slowly and often require large amounts of data. Bayesian methods seek to overcome this problem by biasing estimates towards solutions that are more likely a priori, typically those with small, smooth, or sparse coefficients. Here we introduce a novel Bayesian receptive field estimator designed to incorporate locality, a powerful form of prior information about receptive field structure. The key to our approach is a hierarchical receptive field model that flexibly adapts to localized structure in both spacetime and spatiotemporal frequency, using an inference method known as empirical Bayes. We refer to our method as automatic locality determination (ALD), and show that it can accurately recover various types of smooth, sparse, and localized receptive fields. We apply ALD to neural data from retinal ganglion cells and V1 simple cells, and find it achieves error rates several times lower than standard estimators. Thus, estimates of comparable accuracy can be achieved with substantially less data. Finally, we introduce a computationally efficient Markov Chain Monte Carlo (MCMC) algorithm for fully Bayesian inference under the ALD prior, yielding accurate Bayesian confidence intervals for small or noisy datasets. PMID:22046110
Pant, Jeevan K; Krishnan, Sridhar
2014-04-01
A new algorithm for the reconstruction of electrocardiogram (ECG) signals and a dictionary learning algorithm for the enhancement of its reconstruction performance for a class of signals are proposed. The signal reconstruction algorithm is based on minimizing the lp pseudo-norm of the second-order difference, called as the lp(2d) pseudo-norm, of the signal. The optimization involved is carried out using a sequential conjugate-gradient algorithm. The dictionary learning algorithm uses an iterative procedure wherein a signal reconstruction and a dictionary update steps are repeated until a convergence criterion is satisfied. The signal reconstruction step is implemented by using the proposed signal reconstruction algorithm and the dictionary update step is implemented by using the linear least-squares method. Extensive simulation results demonstrate that the proposed algorithm yields improved reconstruction performance for temporally correlated ECG signals relative to the state-of-the-art lp(1d)-regularized least-squares and Bayesian learning based algorithms. Also for a known class of signals, the reconstruction performance of the proposed algorithm can be improved by applying it in conjunction with a dictionary obtained using the proposed dictionary learning algorithm.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters.
Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Serrano, Alejandro; Godoy, Jorge; Martínez-Álvarez, Antonio; Villagra, Jorge
2017-11-11
Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle.
A Bayesian Scoring Technique for Mining Predictive and Non-Spurious Rules
Batal, Iyad; Cooper, Gregory; Hauskrecht, Milos
2015-01-01
Rule mining is an important class of data mining methods for discovering interesting patterns in data. The success of a rule mining method heavily depends on the evaluation function that is used to assess the quality of the rules. In this work, we propose a new rule evaluation score - the Predictive and Non-Spurious Rules (PNSR) score. This score relies on Bayesian inference to evaluate the quality of the rules and considers the structure of the rules to filter out spurious rules. We present an efficient algorithm for finding rules with high PNSR scores. The experiments demonstrate that our method is able to cover and explain the data with a much smaller rule set than existing methods. PMID:25938136
A Bayesian Scoring Technique for Mining Predictive and Non-Spurious Rules.
Batal, Iyad; Cooper, Gregory; Hauskrecht, Milos
Rule mining is an important class of data mining methods for discovering interesting patterns in data. The success of a rule mining method heavily depends on the evaluation function that is used to assess the quality of the rules. In this work, we propose a new rule evaluation score - the Predictive and Non-Spurious Rules (PNSR) score. This score relies on Bayesian inference to evaluate the quality of the rules and considers the structure of the rules to filter out spurious rules. We present an efficient algorithm for finding rules with high PNSR scores. The experiments demonstrate that our method is able to cover and explain the data with a much smaller rule set than existing methods.
Inferring Gene Regulatory Networks by Singular Value Decomposition and Gravitation Field Algorithm
Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenge computational problem in system biology. However, every existing inference algorithm from gene expression profiles has its own advantages and disadvantages. In particular, the effectiveness and efficiency of every previous algorithm is not high enough. In this work, we proposed a novel inference algorithm from gene expression data based on differential equation model. In this algorithm, two methods were included for inferring GRNs. Before reconstructing GRNs, singular value decomposition method was used to decompose gene expression data, determine the algorithm solution space, and get all candidate solutions of GRNs. In these generated family of candidate solutions, gravitation field algorithm was modified to infer GRNs, used to optimize the criteria of differential equation model, and search the best network structure result. The proposed algorithm is validated on both the simulated scale-free network and real benchmark gene regulatory network in networks database. Both the Bayesian method and the traditional differential equation model were also used to infer GRNs, and the results were used to compare with the proposed algorithm in our work. And genetic algorithm and simulated annealing were also used to evaluate gravitation field algorithm. The cross-validation results confirmed the effectiveness of our algorithm, which outperforms significantly other previous algorithms. PMID:23226565
NASA Astrophysics Data System (ADS)
Vrugt, Jasper A.; Beven, Keith J.
2018-04-01
This essay illustrates some recent developments to the DiffeRential Evolution Adaptive Metropolis (DREAM) MATLAB toolbox of Vrugt (2016) to delineate and sample the behavioural solution space of set-theoretic likelihood functions used within the GLUE (Limits of Acceptability) framework (Beven and Binley, 1992, 2014; Beven and Freer, 2001; Beven, 2006). This work builds on the DREAM(ABC) algorithm of Sadegh and Vrugt (2014) and enhances significantly the accuracy and CPU-efficiency of Bayesian inference with GLUE. In particular it is shown how lack of adequate sampling in the model space might lead to unjustified model rejection.
Bayesian inference in an item response theory model with a generalized student t link function
NASA Astrophysics Data System (ADS)
Azevedo, Caio L. N.; Migon, Helio S.
2012-10-01
In this paper we introduce a new item response theory (IRT) model with a generalized Student t-link function with unknown degrees of freedom (df), named generalized t-link (GtL) IRT model. In this model we consider only the difficulty parameter in the item response function. GtL is an alternative to the two parameter logit and probit models, since the degrees of freedom (df) play a similar role to the discrimination parameter. However, the behavior of the curves of the GtL is different from those of the two parameter models and the usual Student t link, since in GtL the curve obtained from different df's can cross the probit curves in more than one latent trait level. The GtL model has similar proprieties to the generalized linear mixed models, such as the existence of sufficient statistics and easy parameter interpretation. Also, many techniques of parameter estimation, model fit assessment and residual analysis developed for that models can be used for the GtL model. We develop fully Bayesian estimation and model fit assessment tools through a Metropolis-Hastings step within Gibbs sampling algorithm. We consider a prior sensitivity choice concerning the degrees of freedom. The simulation study indicates that the algorithm recovers all parameters properly. In addition, some Bayesian model fit assessment tools are considered. Finally, a real data set is analyzed using our approach and other usual models. The results indicate that our model fits the data better than the two parameter models.
A Bayesian Approach to Real-Time Earthquake Phase Association
NASA Astrophysics Data System (ADS)
Benz, H.; Johnson, C. E.; Earle, P. S.; Patton, J. M.
2014-12-01
Real-time location of seismic events requires a robust and extremely efficient means of associating and identifying seismic phases with hypothetical sources. An association algorithm converts a series of phase arrival times into a catalog of earthquake hypocenters. The classical approach based on time-space stacking of the locus of possible hypocenters for each phase arrival using the principal of acoustic reciprocity has been in use now for many years. One of the most significant problems that has emerged over time with this approach is related to the extreme variations in seismic station density throughout the global seismic network. To address this problem we have developed a novel, Bayesian association algorithm, which looks at the association problem as a dynamically evolving complex system of "many to many relationships". While the end result must be an array of one to many relations (one earthquake, many phases), during the association process the situation is quite different. Both the evolving possible hypocenters and the relationships between phases and all nascent hypocenters is many to many (many earthquakes, many phases). The computational framework we are using to address this is a responsive, NoSQL graph database where the earthquake-phase associations are represented as intersecting Bayesian Learning Networks. The approach directly addresses the network inhomogeneity issue while at the same time allowing the inclusion of other kinds of data (e.g., seismic beams, station noise characteristics, priors on estimated location of the seismic source) by representing the locus of intersecting hypothetical loci for a given datum as joint probability density functions.
Kim, D.; Burge, J.; Lane, T.; Pearlson, G. D; Kiehl, K. A; Calhoun, V. D.
2008-01-01
We utilized a discrete dynamic Bayesian network (dDBN) approach (Burge et al., 2007) to determine differences in brain regions between patients with schizophrenia and healthy controls on a measure of effective connectivity, termed the approximate conditional likelihood score (ACL) (Burge and Lane, 2005). The ACL score represents a class-discriminative measure of effective connectivity by measuring the relative likelihood of the correlation between brain regions in one group versus another. The algorithm is capable of finding non-linear relationships between brain regions because it uses discrete rather than continuous values and attempts to model temporal relationships with a first-order Markov and stationary assumption constraint (Papoulis, 1991). Since Bayesian networks are overly sensitive to noisy data, we introduced an independent component analysis (ICA) filtering approach that attempted to reduce the noise found in fMRI data by unmixing the raw datasets into a set of independent spatial component maps. Components that represented noise were removed and the remaining components reconstructed into the dimensions of the original fMRI datasets. We applied the dDBN algorithm to a group of 35 patients with schizophrenia and 35 matched healthy controls using an ICA filtered and unfiltered approach. We determined that filtering the data significantly improved the magnitude of the ACL score. Patients showed the greatest ACL scores in several regions, most markedly the cerebellar vermis and hemispheres. Our findings suggest that schizophrenia patients exhibit weaker connectivity than healthy controls in multiple regions, including bilateral temporal and frontal cortices, plus cerebellum during an auditory paradigm. PMID:18602482
NASA Technical Reports Server (NTRS)
Mengshoel, Ole J.; Roth, Dan; Wilkins, David C.
2001-01-01
Portfolio methods support the combination of different algorithms and heuristics, including stochastic local search (SLS) heuristics, and have been identified as a promising approach to solve computationally hard problems. While successful in experiments, theoretical foundations and analytical results for portfolio-based SLS heuristics are less developed. This article aims to improve the understanding of the role of portfolios of heuristics in SLS. We emphasize the problem of computing most probable explanations (MPEs) in Bayesian networks (BNs). Algorithmically, we discuss a portfolio-based SLS algorithm for MPE computation, Stochastic Greedy Search (SGS). SGS supports the integration of different initialization operators (or initialization heuristics) and different search operators (greedy and noisy heuristics), thereby enabling new analytical and experimental results. Analytically, we introduce a novel Markov chain model tailored to portfolio-based SLS algorithms including SGS, thereby enabling us to analytically form expected hitting time results that explain empirical run time results. For a specific BN, we show the benefit of using a homogenous initialization portfolio. To further illustrate the portfolio approach, we consider novel additive search heuristics for handling determinism in the form of zero entries in conditional probability tables in BNs. Our additive approach adds rather than multiplies probabilities when computing the utility of an explanation. We motivate the additive measure by studying the dramatic impact of zero entries in conditional probability tables on the number of zero-probability explanations, which again complicates the search process. We consider the relationship between MAXSAT and MPE, and show that additive utility (or gain) is a generalization, to the probabilistic setting, of MAXSAT utility (or gain) used in the celebrated GSAT and WalkSAT algorithms and their descendants. Utilizing our Markov chain framework, we show that expected hitting time is a rational function - i.e. a ratio of two polynomials - of the probability of applying an additive search operator. Experimentally, we report on synthetically generated BNs as well as BNs from applications, and compare SGSs performance to that of Hugin, which performs BN inference by compilation to and propagation in clique trees. On synthetic networks, SGS speeds up computation by approximately two orders of magnitude compared to Hugin. In application networks, our approach is highly competitive in Bayesian networks with a high degree of determinism. In addition to showing that stochastic local search can be competitive with clique tree clustering, our empirical results provide an improved understanding of the circumstances under which portfolio-based SLS outperforms clique tree clustering and vice versa.
Can Bayesian Theories of Autism Spectrum Disorder Help Improve Clinical Practice?
Haker, Helene; Schneebeli, Maya; Stephan, Klaas Enno
2016-01-01
Diagnosis and individualized treatment of autism spectrum disorder (ASD) represent major problems for contemporary psychiatry. Tackling these problems requires guidance by a pathophysiological theory. In this paper, we consider recent theories that re-conceptualize ASD from a "Bayesian brain" perspective, which posit that the core abnormality of ASD resides in perceptual aberrations due to a disbalance in the precision of prediction errors (sensory noise) relative to the precision of predictions (prior beliefs). This results in percepts that are dominated by sensory inputs and less guided by top-down regularization and shifts the perceptual focus to detailed aspects of the environment with difficulties in extracting meaning. While these Bayesian theories have inspired ongoing empirical studies, their clinical implications have not yet been carved out. Here, we consider how this Bayesian perspective on disease mechanisms in ASD might contribute to improving clinical care for affected individuals. Specifically, we describe a computational strategy, based on generative (e.g., hierarchical Bayesian) models of behavioral and functional neuroimaging data, for establishing diagnostic tests. These tests could provide estimates of specific cognitive processes underlying ASD and delineate pathophysiological mechanisms with concrete treatment targets. Written with a clinical audience in mind, this article outlines how the development of computational diagnostics applicable to behavioral and functional neuroimaging data in routine clinical practice could not only fundamentally alter our concept of ASD but eventually also transform the clinical management of this disorder.
Can Bayesian Theories of Autism Spectrum Disorder Help Improve Clinical Practice?
Haker, Helene; Schneebeli, Maya; Stephan, Klaas Enno
2016-01-01
Diagnosis and individualized treatment of autism spectrum disorder (ASD) represent major problems for contemporary psychiatry. Tackling these problems requires guidance by a pathophysiological theory. In this paper, we consider recent theories that re-conceptualize ASD from a “Bayesian brain” perspective, which posit that the core abnormality of ASD resides in perceptual aberrations due to a disbalance in the precision of prediction errors (sensory noise) relative to the precision of predictions (prior beliefs). This results in percepts that are dominated by sensory inputs and less guided by top-down regularization and shifts the perceptual focus to detailed aspects of the environment with difficulties in extracting meaning. While these Bayesian theories have inspired ongoing empirical studies, their clinical implications have not yet been carved out. Here, we consider how this Bayesian perspective on disease mechanisms in ASD might contribute to improving clinical care for affected individuals. Specifically, we describe a computational strategy, based on generative (e.g., hierarchical Bayesian) models of behavioral and functional neuroimaging data, for establishing diagnostic tests. These tests could provide estimates of specific cognitive processes underlying ASD and delineate pathophysiological mechanisms with concrete treatment targets. Written with a clinical audience in mind, this article outlines how the development of computational diagnostics applicable to behavioral and functional neuroimaging data in routine clinical practice could not only fundamentally alter our concept of ASD but eventually also transform the clinical management of this disorder. PMID:27378955
NASA Astrophysics Data System (ADS)
Gu, Chen; Marzouk, Youssef M.; Toksöz, M. Nafi
2018-03-01
Small earthquakes occur due to natural tectonic motions and are induced by oil and gas production processes. In many oil/gas fields and hydrofracking processes, induced earthquakes result from fluid extraction or injection. The locations and source mechanisms of these earthquakes provide valuable information about the reservoirs. Analysis of induced seismic events has mostly assumed a double-couple source mechanism. However, recent studies have shown a non-negligible percentage of non-double-couple components of source moment tensors in hydraulic fracturing events, assuming a full moment tensor source mechanism. Without uncertainty quantification of the moment tensor solution, it is difficult to determine the reliability of these source models. This study develops a Bayesian method to perform waveform-based full moment tensor inversion and uncertainty quantification for induced seismic events, accounting for both location and velocity model uncertainties. We conduct tests with synthetic events to validate the method, and then apply our newly developed Bayesian inversion approach to real induced seismicity in an oil/gas field in the sultanate of Oman—determining the uncertainties in the source mechanism and in the location of that event.
Hesar, Hamed Danandeh; Mohebbi, Maryam
2017-05-01
In this paper, a model-based Bayesian filtering framework called the "marginalized particle-extended Kalman filter (MP-EKF) algorithm" is proposed for electrocardiogram (ECG) denoising. This algorithm does not have the extended Kalman filter (EKF) shortcoming in handling non-Gaussian nonstationary situations because of its nonlinear framework. In addition, it has less computational complexity compared with particle filter. This filter improves ECG denoising performance by implementing marginalized particle filter framework while reducing its computational complexity using EKF framework. An automatic particle weighting strategy is also proposed here that controls the reliance of our framework to the acquired measurements. We evaluated the proposed filter on several normal ECGs selected from MIT-BIH normal sinus rhythm database. To do so, artificial white Gaussian and colored noises as well as nonstationary real muscle artifact (MA) noise over a range of low SNRs from 10 to -5 dB were added to these normal ECG segments. The benchmark methods were the EKF and extended Kalman smoother (EKS) algorithms which are the first model-based Bayesian algorithms introduced in the field of ECG denoising. From SNR viewpoint, the experiments showed that in the presence of Gaussian white noise, the proposed framework outperforms the EKF and EKS algorithms in lower input SNRs where the measurements and state model are not reliable. Owing to its nonlinear framework and particle weighting strategy, the proposed algorithm attained better results at all input SNRs in non-Gaussian nonstationary situations (such as presence of pink noise, brown noise, and real MA). In addition, the impact of the proposed filtering method on the distortion of diagnostic features of the ECG was investigated and compared with EKF/EKS methods using an ECG diagnostic distortion measure called the "Multi-Scale Entropy Based Weighted Distortion Measure" or MSEWPRD. The results revealed that our proposed algorithm had the lowest MSEPWRD for all noise types at low input SNRs. Therefore, the morphology and diagnostic information of ECG signals were much better conserved compared with EKF/EKS frameworks, especially in non-Gaussian nonstationary situations.
Probabilistic Common Spatial Patterns for Multichannel EEG Analysis
Chen, Zhe; Gao, Xiaorong; Li, Yuanqing; Brown, Emery N.; Gao, Shangkai
2015-01-01
Common spatial patterns (CSP) is a well-known spatial filtering algorithm for multichannel electroencephalogram (EEG) analysis. In this paper, we cast the CSP algorithm in a probabilistic modeling setting. Specifically, probabilistic CSP (P-CSP) is proposed as a generic EEG spatio-temporal modeling framework that subsumes the CSP and regularized CSP algorithms. The proposed framework enables us to resolve the overfitting issue of CSP in a principled manner. We derive statistical inference algorithms that can alleviate the issue of local optima. In particular, an efficient algorithm based on eigendecomposition is developed for maximum a posteriori (MAP) estimation in the case of isotropic noise. For more general cases, a variational algorithm is developed for group-wise sparse Bayesian learning for the P-CSP model and for automatically determining the model size. The two proposed algorithms are validated on a simulated data set. Their practical efficacy is also demonstrated by successful applications to single-trial classifications of three motor imagery EEG data sets and by the spatio-temporal pattern analysis of one EEG data set recorded in a Stroop color naming task. PMID:26005228