Informed Source Separation: A Bayesian Tutorial
NASA Technical Reports Server (NTRS)
Knuth, Kevin H.
2005-01-01
Source separation problems are ubiquitous in the physical sciences; any situation where signals are superimposed calls for source separation to estimate the original signals. In this tutorial I will discuss the Bayesian approach to the source separation problem. This approach has a specific advantage in that it requires the designer to explicitly describe the signal model in addition to any other information or assumptions that go into the problem description. This leads naturally to the idea of informed source separation, where the algorithm design incorporates relevant information about the specific problem. This approach promises to enable researchers to design their own high-quality algorithms that are specifically tailored to the problem at hand.
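The role of an explicit signal model can be illustrated with a minimal sketch. It assumes a linear mixture x = A s + noise with a known 2-column mixing matrix, Gaussian noise of standard deviation sigma, and a zero-mean Gaussian prior of standard deviation tau on each source. These modeling choices and the function name `posterior_mean_sources` are illustrative assumptions, not taken from the tutorial itself.

```python
def posterior_mean_sources(A, x, sigma, tau):
    """Posterior mean of two sources s for x = A s + noise.

    Assumptions (illustrative only): noise ~ N(0, sigma^2 I),
    prior s ~ N(0, tau^2 I), and A has exactly two columns.
    """
    n_obs = len(x)
    # Posterior precision matrix P = A^T A / sigma^2 + I / tau^2
    p = [[sum(A[k][i] * A[k][j] for k in range(n_obs)) / sigma ** 2
          + (1.0 / tau ** 2 if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    # Right-hand side b = A^T x / sigma^2
    b = [sum(A[k][i] * x[k] for k in range(n_obs)) / sigma ** 2
         for i in range(2)]
    # Solve the 2x2 system P s = b by Cramer's rule
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    s0 = (p[1][1] * b[0] - p[0][1] * b[1]) / det
    s1 = (p[0][0] * b[1] - p[1][0] * b[0]) / det
    return [s0, s1]


# Hypothetical example: two detectors observing two superimposed sources
A = [[1.0, 0.5], [0.3, 1.0]]
true_s = [2.0, -1.0]
x = [A[0][0] * true_s[0] + A[0][1] * true_s[1],
     A[1][0] * true_s[0] + A[1][1] * true_s[1]]
estimate = posterior_mean_sources(A, x, sigma=0.1, tau=1000.0)
```

A narrower prior (smaller tau) pulls the estimate toward zero, which is one simple way prior information about the sources enters the solution.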
Information Theoretic Studies and Assessment of Space Object Identification
2014-03-24
localization are contained in Ref. [5]. 1.7.1 A Bayesian MPE Based Analysis of 2D Point-Source-Pair Superresolution In a second recently submitted paper [6], a...related problem of the optical superresolution (OSR) of a pair of equal-brightness point sources separated spatially by a distance (or angle) smaller...1403.4897 [physics.optics] (19 March 2014). 6. S. Prasad, “Asymptotics of Bayesian error probability and 2D pair superresolution ,” submitted to Opt. Express
Variational Bayesian Learning for Wavelet Independent Component Analysis
NASA Astrophysics Data System (ADS)
Roussos, E.; Roberts, S.; Daubechies, I.
2005-11-01
In an exploratory approach to data analysis, it is often useful to consider the observations as generated from a set of latent generators or "sources" via a generally unknown mapping. For the noisy overcomplete case, where we have more sources than observations, the problem becomes extremely ill-posed. Solutions to such inverse problems can, in many cases, be achieved by incorporating prior knowledge about the problem, captured in the form of constraints. This setting is a natural candidate for the application of the Bayesian methodology, allowing us to incorporate "soft" constraints in a natural manner. The work described in this paper is mainly driven by problems in functional magnetic resonance imaging of the brain, for the neuro-scientific goal of extracting relevant "maps" from the data. This can be stated as a `blind' source separation problem. Recent experiments in the field of neuroscience show that these maps are sparse, in some appropriate sense. The separation problem can be solved by independent component analysis (ICA), viewed as a technique for seeking sparse components, assuming appropriate distributions for the sources. We derive a hybrid wavelet-ICA model, transforming the signals into a domain where the modeling assumption of sparsity of the coefficients with respect to a dictionary is natural. We follow a graphical modeling formalism, viewing ICA as a probabilistic generative model. We use hierarchical source and mixing models and apply Bayesian inference to the problem. This allows us to perform model selection in order to infer the complexity of the representation, as well as automatic denoising. Since exact inference and learning in such a model is intractable, we follow a variational Bayesian mean-field approach in the conjugate-exponential family of distributions, for efficient unsupervised learning in multi-dimensional settings. The performance of the proposed algorithm is demonstrated on some representative experiments.
STARBLADE: STar and Artefact Removal with a Bayesian Lightweight Algorithm from Diffuse Emission
NASA Astrophysics Data System (ADS)
Knollmüller, Jakob; Frank, Philipp; Ensslin, Torsten A.
2018-05-01
STARBLADE (STar and Artefact Removal with a Bayesian Lightweight Algorithm from Diffuse Emission) separates superimposed point-like sources from a diffuse background by imposing physically motivated models as prior knowledge. The algorithm can also be used on noisy and convolved data, though performing a proper reconstruction including a deconvolution prior to the application of the algorithm is advised; the algorithm could also be used within a denoising imaging method. STARBLADE learns the correlation structure of the diffuse emission and takes it into account to determine the occurrence and strength of a superimposed point source.
Zou, Yonghong; Wang, Lixia; Christensen, Erik R
2015-10-01
This work intended to explain the challenges of the fingerprints based source apportionment method for polycyclic aromatic hydrocarbons (PAH) in the aquatic environment, and to illustrate a practical and robust solution. The PAH data detected in the sediment cores from the Illinois River provide the basis of this study. Principal component analysis (PCA) separates PAH compounds into two groups reflecting their possible airborne transport patterns; but it is not able to suggest specific sources. Not all positive matrix factorization (PMF) determined sources are distinguishable due to the variability of source fingerprints. However, they constitute useful suggestions for inputs for a Bayesian chemical mass balance (CMB) analysis. The Bayesian CMB analysis takes into account the measurement errors as well as the variations of source fingerprints, and provides a credible source apportionment. Major PAH sources for Illinois River sediments are traffic (35%), coke oven (24%), coal combustion (18%), and wood combustion (14%). Copyright © 2015. Published by Elsevier Ltd.
Chai, Rifai; Naik, Ganesh R; Nguyen, Tuan Nghia; Ling, Sai Ho; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-05-01
This paper presents a two-class electroencephalography-based classification of driver fatigue (fatigue state versus alert state) in 43 healthy participants. The system uses independent component analysis by entropy rate bound minimization (ERBM-ICA) for source separation, autoregressive (AR) modeling for feature extraction, and a Bayesian neural network as the classification algorithm. The classification results demonstrate a sensitivity of 89.7%, a specificity of 86.8%, and an accuracy of 88.2%. The combination of ERBM-ICA (source separator), AR (feature extractor), and Bayesian neural network (classifier) provides the best outcome with a p-value < 0.05 and the highest area under the receiver operating curve (AUC-ROC = 0.93) against other methods such as power spectral density as feature extractor (AUC-ROC = 0.81). The results of this study suggest the method could be utilized effectively for a countermeasure device for driver fatigue identification and other adverse event applications.
A Bayesian Framework of Uncertainties Integration in 3D Geological Model
NASA Astrophysics Data System (ADS)
Liang, D.; Liu, X.
2017-12-01
A 3D geological model can describe complicated geological phenomena in an intuitive way, but its application may be limited by uncertain factors. Great progress has been made over the years, yet many studies decompose the uncertainties of a geological model and analyze them separately, source by source, while ignoring the comprehensive impact of multi-source uncertainties. To evaluate the combined uncertainty, we use probability distributions to quantify uncertainty and propose a Bayesian framework for uncertainty integration. With this framework, we integrate data errors, spatial randomness, and cognitive information into a posterior distribution to evaluate the combined uncertainty of the geological model. Uncertainties propagate and accumulate during the modeling process, so the gradual integration of multi-source uncertainty acts as a simulation of uncertainty propagation. Bayesian inference accomplishes the uncertainty updating in the modeling process. The maximum entropy principle is effective for estimating the prior probability distribution: it ensures that the prior satisfies the constraints supplied by the given information with minimum prejudice. In the end, we obtain a posterior distribution to evaluate the combined uncertainty of the geological model. This posterior distribution represents the combined impact of all the uncertain factors on the spatial structure of the geological model. The framework provides a way to evaluate the combined impact of multi-source uncertainties on a geological model and a basis for studying the uncertainty propagation mechanism in geological modeling.
A Markov model for blind image separation by a mean-field EM algorithm.
Tonazzini, Anna; Bedini, Luigi; Salerno, Emanuele
2006-02-01
This paper deals with blind separation of images from noisy linear mixtures with unknown coefficients, formulated as a Bayesian estimation problem. This is a flexible framework, where any kind of prior knowledge about the source images and the mixing matrix can be accounted for. In particular, we describe local correlation within the individual images through the use of Markov random field (MRF) image models. These are naturally suited to express the joint pdf of the sources in a factorized form, so that the statistical independence requirements of most independent component analysis approaches to blind source separation are retained. Our model also includes edge variables to preserve intensity discontinuities. MRF models have been proved to be very efficient in many visual reconstruction problems, such as blind image restoration, and allow separation and edge detection to be performed simultaneously. We propose an expectation-maximization algorithm with the mean field approximation to derive a procedure for estimating the mixing matrix, the sources, and their edge maps. We tested this procedure on both synthetic and real images, in the fully blind case (i.e., no prior information on mixing is exploited) and found that a source model accounting for local autocorrelation is able to increase robustness against noise, even space variant. Furthermore, when the model closely fits the source characteristics, independence is no longer a strict requirement, and cross-correlated sources can be separated, as well.
Evaristo, Jaivime; McDonnell, Jeffrey J.; Scholl, Martha A.; Bruijnzeel, L. Adrian; Chun, Kwok P.
2016-01-01
Water transpired by trees has long been assumed to be sourced from the same subsurface water stocks that contribute to groundwater recharge and streamflow. However, recent investigations using dual water stable isotopes have shown an apparent ecohydrological separation between tree-transpired water and stream water. Here we present evidence for such ecohydrological separation in two tropical environments in Puerto Rico where precipitation seasonality is relatively low and where precipitation is positively correlated with primary productivity. We determined the stable isotope signature of xylem water of 30 mahogany (Swietenia spp.) trees sampled during two periods with contrasting moisture status. Our results suggest that the separation between transpiration water and groundwater recharge/streamflow water might be related less to the temporal phasing of hydrologic inputs and primary productivity, and more to the fundamental processes that drive evaporative isotopic enrichment of residual soil water within the soil matrix. The lack of an evaporative signature of both groundwater and streams in the study area suggests that these water balance components have a water source that is transported quickly to deeper subsurface storage compared to waters that trees use. A Bayesian mixing model used to partition source water proportions of xylem water showed that groundwater contribution was greater for valley-bottom, riparian trees than for ridge-top trees. Groundwater contribution was also greater at the xeric site than at the mesic–hydric site. These model results (1) underline the utility of a simple linear mixing model, implemented in a Bayesian inference framework, in quantifying source water contributions at sites with contrasting physiographic characteristics, and (2) highlight the informed judgement that should be made in interpreting mixing model results, of import particularly in surveying groundwater use patterns by vegetation from regional to global scales.
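A simple linear mixing model in a Bayesian framework can be sketched for the two-source, one-tracer case. The sketch assumes a uniform prior on the groundwater fraction f, a Gaussian measurement error, and a grid approximation of the posterior. The isotope values, the error size, and the function name are hypothetical and are not taken from the study.

```python
import math


def groundwater_fraction_posterior(delta_mix, delta_gw, delta_soil,
                                   sigma, n_grid=1000):
    """Grid posterior for the groundwater fraction f in xylem water.

    Assumed model (illustrative only):
        delta_mix ~ N(f * delta_gw + (1 - f) * delta_soil, sigma^2)
    with a uniform prior on f in [0, 1].
    Returns (grid of f values, normalized posterior weights).
    """
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]
    weights = []
    for f in grid:
        predicted = f * delta_gw + (1.0 - f) * delta_soil
        weights.append(math.exp(-0.5 * ((delta_mix - predicted) / sigma) ** 2))
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_mean(grid, weights):
    """Posterior mean of f on the grid."""
    return sum(f * w for f, w in zip(grid, weights))


# Hypothetical isotope values (per mil), chosen only for illustration
grid, weights = groundwater_fraction_posterior(
    delta_mix=-6.0, delta_gw=-10.0, delta_soil=-2.0, sigma=0.5)
mean_fraction = posterior_mean(grid, weights)
```

Keeping the full posterior rather than only the point estimate (delta_mix - delta_soil) / (delta_gw - delta_soil) shows how uncertain the partition is, which matters when the source signatures are close together.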
Shashilov, Victor A; Sikirzhytski, Vitali; Popova, Ludmila A; Lednev, Igor K
2010-09-01
Here we report on novel quantitative approaches for protein structural characterization using deep UV resonance Raman (DUVRR) spectroscopy. Specifically, we propose a new method combining hydrogen-deuterium (HD) exchange and Bayesian source separation for extracting the DUVRR signatures of various structural elements of aggregated proteins including the cross-beta core and unordered parts of amyloid fibrils. The proposed method is demonstrated using the set of DUVRR spectra of hen egg white lysozyme acquired at various stages of HD exchange. Prior information about the concentration matrix and the spectral features of the individual components was incorporated into the Bayesian equation to eliminate the ill-conditioning of the problem caused by 100% correlation of the concentration profiles of protonated and deuterated species. Secondary structure fractions obtained by partial least squares (PLS) and least squares support vector machines (LS-SVMs) were used as the initial guess for the Bayesian source separation. Advantages of the PLS and LS-SVMs methods over the classical least squares calibration (CLSC) are discussed and illustrated using the DUVRR data of the prion protein in its native and aggregated forms. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Enhancements to the Bayesian Infrasound Source Location Method
2012-09-01
ENHANCEMENTS TO THE BAYESIAN INFRASOUND SOURCE LOCATION METHOD Omar E. Marcillo, Stephen J. Arrowsmith, Rod W. Whitaker, and Dale N. Anderson Los...ABSTRACT We report on R&D that is enabling enhancements to the Bayesian Infrasound Source Location (BISL) method for infrasound event location...the Bayesian Infrasound Source Location Method 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT NUMBER
The Chandra Source Catalog 2.0: Estimating Source Fluxes
NASA Astrophysics Data System (ADS)
Primini, Francis Anthony; Allen, Christopher E.; Miller, Joseph; Anderson, Craig S.; Budynkiewicz, Jamie A.; Burke, Douglas; Chen, Judy C.; Civano, Francesca Maria; D'Abrusco, Raffaele; Doe, Stephen M.; Evans, Ian N.; Evans, Janet D.; Fabbiano, Giuseppina; Gibbs, Danny G., II; Glotfelty, Kenny J.; Graessle, Dale E.; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; Houck, John C.; Lauer, Jennifer L.; Laurino, Omar; Lee, Nicholas P.; Martínez-Galarza, Juan Rafael; McCollough, Michael L.; McDowell, Jonathan C.; McLaughlin, Warren; Morgan, Douglas L.; Mossman, Amy E.; Nguyen, Dan T.; Nichols, Joy S.; Nowak, Michael A.; Paxson, Charles; Plummer, David A.; Rots, Arnold H.; Siemiginowska, Aneta; Sundheim, Beth A.; Tibbetts, Michael; Van Stone, David W.; Zografou, Panagoula
2018-01-01
The Second Chandra Source Catalog (CSC2.0) will provide information on approximately 316,000 point or compact extended x-ray sources, derived from over 10,000 ACIS and HRC-I imaging observations available in the public archive at the end of 2014. As in the previous catalog release (CSC1.1), fluxes for these sources will be determined separately from source detection, using a Bayesian formalism that accounts for background, spatial resolution effects, and contamination from nearby sources. However, the CSC2.0 procedure differs from that used in CSC1.1 in three important aspects. First, for sources in crowded regions in which photometric apertures overlap, fluxes are determined jointly, using an extension of the CSC1.1 algorithm, as discussed in Primini & Kashyap (2014ApJ...796…24P). Second, an MCMC procedure is used to estimate marginalized posterior probability distributions for source fluxes. Finally, for sources observed in multiple observations, a Bayesian Blocks algorithm (Scargle, et al. 2013ApJ...764..167S) is used to group observations into blocks of constant source flux. In this poster we present details of the CSC2.0 photometry algorithms and illustrate their performance in actual CSC2.0 datasets. This work has been supported by NASA under contract NAS 8-03060 to the Smithsonian Astrophysical Observatory for operation of the Chandra X-ray Center.
NASA Astrophysics Data System (ADS)
Mohammad-Djafari, Ali
2015-01-01
The main object of this tutorial article is first to review the main inference tools using Bayesian approach, Entropy, Information theory and their corresponding geometries. This review is focused mainly on the ways these tools have been used in data, signal and image processing. After a short introduction of the different quantities related to the Bayes rule, the entropy and the Maximum Entropy Principle (MEP), relative entropy and the Kullback-Leibler divergence, Fisher information, we will study their use in different fields of data and signal processing such as: entropy in source separation, Fisher information in model order selection, different Maximum Entropy based methods in time series spectral estimation and finally, general linear inverse problems.
NASA Astrophysics Data System (ADS)
Eriçok, Ozan Burak; Ertürk, Hakan
2018-07-01
Optical characterization of nanoparticle aggregates is a complex inverse problem that can be solved by deterministic or statistical methods. Previous studies showed that the lower size limit of reliable characterization differs with the wavelength of the light source used. In this study, these characterization limits are determined for light source wavelengths ranging from ultraviolet to near infrared (266-1064 nm), relying on numerical light scattering experiments. Two different measurement ensembles are considered: a collection of well-separated aggregates made up of same-sized particles, and a collection with a particle size distribution. Filippov's cluster-cluster algorithm is used to generate the aggregates, and the light scattering behavior is calculated by the discrete dipole approximation. A likelihood-free Approximate Bayesian Computation, relying on the Adaptive Population Monte Carlo method, is used for characterization. It is found that when the wavelength range of 266-1064 nm is used, the limit of successful characterization ranges over 21-62 nm effective radius for monodisperse and polydisperse soot aggregates.
Coincident Detection Significance in Multimessenger Astronomy
NASA Astrophysics Data System (ADS)
Ashton, G.; Burns, E.; Dal Canton, T.; Dent, T.; Eggenstein, H.-B.; Nielsen, A. B.; Prix, R.; Was, M.; Zhu, S. J.
2018-06-01
We derive a Bayesian criterion for assessing whether signals observed in two separate data sets originate from a common source. The Bayes factor for a common versus unrelated origin of signals includes an overlap integral of the posterior distributions over the common-source parameters. Focusing on multimessenger gravitational-wave astronomy, we apply the method to the spatial and temporal association of independent gravitational-wave and electromagnetic (or neutrino) observations. As an example, we consider the coincidence between the recently discovered gravitational-wave signal GW170817 from a binary neutron star merger and the gamma-ray burst GRB 170817A: we find that the common-source model is enormously favored over a model describing them as unrelated signals.
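The overlap integral can be illustrated numerically in one dimension. The sketch assumes that both analyses used the same prior π(θ) on the shared parameter, in which case the ratio of evidences reduces to ∫ p(θ|d_A) p(θ|d_B) / π(θ) dθ. It further assumes Gaussian posteriors lying well inside a uniform prior range. The function names and all numbers are hypothetical and do not reproduce the paper's analysis.

```python
import math


def gaussian_pdf(theta, mean, std):
    """Density of a normal distribution at theta."""
    z = (theta - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def overlap_bayes_factor(mean_a, std_a, mean_b, std_b, lo, hi, n=20000):
    """Midpoint-rule estimate of the integral of
    p(theta|d_A) * p(theta|d_B) / prior(theta) over [lo, hi].

    Assumptions (illustrative only): a uniform prior on [lo, hi] shared
    by both analyses, and Gaussian posteriors well inside that range.
    """
    prior = 1.0 / (hi - lo)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        theta = lo + (i + 0.5) * h
        total += (gaussian_pdf(theta, mean_a, std_a)
                  * gaussian_pdf(theta, mean_b, std_b) / prior)
    return total * h


# Overlapping posteriors favor a common source; disjoint ones do not.
bf_consistent = overlap_bayes_factor(0.0, 1.0, 0.0, 1.0, -10.0, 10.0)
bf_inconsistent = overlap_bayes_factor(0.0, 1.0, 8.0, 1.0, -10.0, 10.0)
```

The value grows when the two posteriors agree and when the prior range is wide, since a chance coincidence is then less likely under the unrelated-signals hypothesis.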
Mixture Modeling for Background and Sources Separation in x-ray Astronomical Images
NASA Astrophysics Data System (ADS)
Guglielmetti, Fabrizia; Fischer, Rainer; Dose, Volker
2004-11-01
A probabilistic technique for the joint estimation of background and sources in high-energy astrophysics is described. Bayesian probability theory is applied to gain insight into the coexistence of background and sources through a probabilistic two-component mixture model, which provides consistent uncertainties of background and sources. The present analysis is applied to ROSAT PSPC data (0.1-2.4 keV) in Survey Mode. A background map is modelled using a Thin-Plate spline. Source probability maps are obtained for each pixel (45 arcsec) independently and for larger correlation lengths, revealing faint and extended sources. We will demonstrate that the described probabilistic method allows for detection improvement of faint extended celestial sources compared to the Standard Analysis Software System (SASS) used for the production of the ROSAT All-Sky Survey (RASS) catalogues.
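A two-component mixture model for a single pixel can be sketched as follows. This is a simplified illustration that assumes Poisson counts, a known background rate, a fixed source intensity, and a fixed prior probability that a pixel contains a source. The method described in the abstract estimates the background with a spline and treats source intensities probabilistically, which this sketch does not attempt.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of observing n counts given mean lam > 0."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def source_probability(counts, background, source, prior_source):
    """Posterior probability that a pixel contains a source.

    Two-component mixture (illustrative assumptions):
      background only:   counts ~ Poisson(background)
      background+source: counts ~ Poisson(background + source)
    prior_source is the prior probability of the second component.
    """
    like_bkg = poisson_pmf(counts, background)
    like_src = poisson_pmf(counts, background + source)
    numerator = prior_source * like_src
    return numerator / (numerator + (1.0 - prior_source) * like_bkg)


# Hypothetical pixel values: expected background of 2 counts
p_faint = source_probability(counts=2, background=2.0, source=5.0,
                             prior_source=0.1)
p_bright = source_probability(counts=12, background=2.0, source=5.0,
                              prior_source=0.1)
```

Evaluating this probability for every pixel gives a source probability map; summing counts over neighboring pixels before the evaluation is one way to probe larger correlation lengths.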
Meng, Qing-Hao; Yang, Wei-Xing; Wang, Yang; Zeng, Ming
2011-01-01
This paper addresses the collective odor source localization (OSL) problem in a time-varying airflow environment using mobile robots. A novel OSL methodology which combines odor-source probability estimation and multiple robots’ search is proposed. The estimation phase consists of two steps: firstly, the separate probability-distribution map of odor source is estimated via Bayesian rules and fuzzy inference based on a single robot’s detection events; secondly, the separate maps estimated by different robots at different times are fused into a combined map by way of distance based superposition. The multi-robot search behaviors are coordinated via a particle swarm optimization algorithm, where the estimated odor-source probability distribution is used to express the fitness functions. In the process of OSL, the estimation phase provides the prior knowledge for the searching while the searching verifies the estimation results, and both phases are implemented iteratively. The results of simulations for large-scale advection–diffusion plume environments and experiments using real robots in an indoor airflow environment validate the feasibility and robustness of the proposed OSL method. PMID:22346650
NASA Astrophysics Data System (ADS)
Liu, Luyao; Feng, Minquan
2018-03-01
[Objective] This study quantitatively evaluated risk probabilities of sudden water pollution accidents under the influence of risk sources, thus providing an important guarantee for risk source identification during water diversion from the Hanjiang River to the Weihe River. [Methods] The research used Bayesian networks to represent the correlation between accidental risk sources. It also adopted the sequential Monte Carlo algorithm to combine water quality simulation with state simulation of risk sources, thereby determining standard-exceeding probabilities of sudden water pollution accidents. [Results] When the upstream inflow was 138.15 m³/s and the average accident duration was 48 h, the probabilities were 0.0416 and 0.0056, respectively. When the upstream inflow was 55.29 m³/s and the average accident duration was 48 h, the probabilities were 0.0225 and 0.0028, respectively. [Conclusions] The research conducted a risk assessment on sudden water pollution accidents, thereby providing an important guarantee for the smooth implementation, operation, and water quality of the Hanjiang-to-Weihe River Diversion Project.
Varughese, Eunice A.; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer S; Fout, G. Shay; Furlong, Edward T.; Kolpin, Dana W.; Glassmeyer, Susan T.; Keely, Scott P
2017-01-01
incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters.
Advanced obstacle avoidance for a laser based wheelchair using optimised Bayesian neural networks.
Trieu, Hoang T; Nguyen, Hung T; Willey, Keith
2008-01-01
In this paper we present an advanced method of obstacle avoidance for a laser based intelligent wheelchair using optimized Bayesian neural networks. Three neural networks are designed for three separate sub-tasks: passing through a doorway, corridor and wall following, and general obstacle avoidance. The accurate usable accessible space is determined by including the actual wheelchair dimensions in a real-time map used as input to each network. Data acquisitions are performed separately to collect the patterns required for the specified sub-tasks. A Bayesian framework is used to determine the optimal neural network structure in each case. These networks are then trained under the supervision of the Bayesian rule. Experimental results showed that, compared to the VFH algorithm, our neural networks navigated a smoother path following a near-optimum trajectory.
NASA Astrophysics Data System (ADS)
Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.
2015-12-01
Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gencaga, Deniz; Knuth, Kevin H.; Carbon, Duane F.
Understanding the origins of life has been one of the greatest dreams throughout history. It is now known that star-forming regions contain complex organic molecules, known as Polycyclic Aromatic Hydrocarbons (PAHs), each of which has particular infrared spectral characteristics. By understanding which PAH species are found in specific star-forming regions, we can better understand the biochemistry that takes place in interstellar clouds. Identifying and classifying PAHs is not an easy task: we can only observe a single superposition of PAH spectra at any given astrophysical site, with the PAH species perhaps numbering in the hundreds or even thousands. This is a challenging source separation problem since we have only one observation composed of numerous mixed sources. However, it is made easier with the help of a library of hundreds of PAH spectra. In order to separate PAH molecules from their mixture, we need to identify the specific species and their unique concentrations that would provide the given mixture. We develop a Bayesian approach for this problem where sources are separated from their mixture by the Metropolis-Hastings algorithm. Separated PAH concentrations are provided with their error bars, illustrating the uncertainties involved in the estimation process. The approach is demonstrated on synthetic spectral mixtures using spectral resolutions from the Infrared Space Observatory (ISO). Performance of the method is tested for different noise levels.
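Estimating concentrations from a single mixed spectrum with Metropolis-Hastings can be sketched on a toy problem. The sketch assumes a library of only two hypothetical spectra, Gaussian noise of known standard deviation, a flat prior on non-negative concentrations, and a random-walk proposal. None of these choices, nor the function names, are taken from the paper.

```python
import math
import random


def log_posterior(conc, library, observed, sigma):
    """Log posterior (up to a constant) of concentrations.

    Flat prior on non-negative concentrations and Gaussian noise
    with standard deviation sigma (illustrative assumptions).
    """
    if any(c < 0.0 for c in conc):
        return -math.inf
    sq_err = 0.0
    for ch in range(len(observed)):
        model = sum(c * spec[ch] for c, spec in zip(conc, library))
        sq_err += (observed[ch] - model) ** 2
    return -0.5 * sq_err / sigma ** 2


def metropolis_hastings(library, observed, sigma, n_steps=20000,
                        step=0.1, burn_in=2000, seed=0):
    """Random-walk Metropolis-Hastings sampler over concentrations."""
    rng = random.Random(seed)
    current = [1.0] * len(library)
    current_lp = log_posterior(current, library, observed, sigma)
    samples = []
    for i in range(n_steps):
        proposal = [c + rng.gauss(0.0, step) for c in current]
        lp = log_posterior(proposal, library, observed, sigma)
        # Symmetric proposal: accept with probability min(1, ratio)
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
        if i >= burn_in:
            samples.append(list(current))
    return samples


def mean_and_std(samples, index):
    """Posterior mean and standard deviation (error bar) of one component."""
    values = [s[index] for s in samples]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


# Toy library of two hypothetical spectra over five channels
library = [[1.0, 0.5, 0.0, 0.0, 0.2], [0.0, 0.3, 1.0, 0.6, 0.0]]
observed = [2.0 * a + 1.0 * b for a, b in zip(library[0], library[1])]
samples = metropolis_hastings(library, observed, sigma=0.1)
```

The spread of the retained samples gives the error bar on each concentration; a realistic library with hundreds of spectra would need a more careful proposal and prior than this toy version.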
Bayesian analyses of time-interval data for environmental radiation monitoring.
Luo, Peng; Sharp, Julia L; DeVol, Timothy A
2013-01-01
Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and a conventional frequentist analysis of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performances of the three methods were compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode. Simulated data were obtained using Monte Carlo techniques to obtain a random sampling of the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a similar detection probability as Bayesian analysis of count information, but the authors were able to make a decision with fewer pulses at relatively higher radiation levels. In addition, for the cases with very short presence of the source (< count time), time-interval information is more sensitive to detect a change than count information since the source data is averaged by the background data over the entire count time. The relationships of the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
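One common way to set up a Bayesian analysis of time-interval data is a conjugate update of the pulse rate. The sketch assumes exponentially distributed intervals between pulses (a Poisson process) and a Gamma prior on the rate. This is a textbook model offered for illustration, the function names are hypothetical, and it may differ from the procedure the authors implemented in R.

```python
def update_rate_posterior(shape, rate, intervals):
    """Conjugate Gamma update for a Poisson-process pulse rate.

    Assumed model (illustrative only): each interval ~ Exponential(lam),
    prior lam ~ Gamma(shape, rate). The posterior is
    Gamma(shape + n, rate + sum of intervals).
    """
    return shape + len(intervals), rate + sum(intervals)


def posterior_mean_rate(shape, rate):
    """Mean of a Gamma(shape, rate) distribution."""
    return shape / rate


# Hypothetical intervals in seconds between consecutive pulses
post_shape, post_rate = update_rate_posterior(1.0, 1.0, [0.5, 0.5, 1.0])
mean_rate = posterior_mean_rate(post_shape, post_rate)
```

Because the posterior is updated after every pulse rather than at the end of a fixed count time, a rise in the rate can show up after only a few short intervals.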
Depaoli, Sarah
2013-06-01
Growth mixture modeling (GMM) represents a technique that is designed to capture change over time for unobserved subgroups (or latent classes) that exhibit qualitatively different patterns of growth. The aim of the current article was to explore the impact of latent class separation (i.e., how similar growth trajectories are across latent classes) on GMM performance. Several estimation conditions were compared: maximum likelihood via the expectation maximization (EM) algorithm and the Bayesian framework implementing diffuse priors, "accurate" informative priors, weakly informative priors, data-driven informative priors, priors reflecting partial-knowledge of parameters, and "inaccurate" (but informative) priors. The main goal was to provide insight about the optimal estimation condition under different degrees of latent class separation for GMM. Results indicated that optimal parameter recovery was obtained through the Bayesian approach using "accurate" informative priors, and partial-knowledge priors showed promise for the recovery of the growth trajectory parameters. Maximum likelihood and the remaining Bayesian estimation conditions yielded poor parameter recovery for the latent class proportions and the growth trajectories. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
Spatial variation in anthropogenic mortality induces a source-sink system in a hunted mesopredator.
Minnie, Liaan; Zalewski, Andrzej; Zalewska, Hanna; Kerley, Graham I H
2018-04-01
Lethal carnivore management is a prevailing strategy to reduce livestock predation. Intensity of lethal management varies according to land-use, where carnivores are more intensively hunted on farms relative to reserves. Variations in hunting intensity may result in the formation of a source-sink system where carnivores disperse from high-density to low-density areas. Few studies quantify dispersal between supposed sources and sinks, a fundamental requirement for source-sink systems. We used the black-backed jackal (Canis mesomelas) as a model to determine if heterogeneous anthropogenic mortality induces a source-sink system. We analysed 12 microsatellite loci from 554 individuals from lightly hunted and previously unhunted reserves, as well as heavily hunted livestock- and game farms. Bayesian genotype assignment showed that jackal populations displayed a hierarchical population structure. We identified two genetically distinct populations at the regional level and nine distinct subpopulations at the local level, with each cluster corresponding to distinct land-use types separated by various dispersal barriers. Migration, estimated using Bayesian multilocus genotyping, between reserves and farms was asymmetric, and heterogeneous anthropogenic mortality induced source-sink dynamics via compensatory immigration. Additionally, some heavily hunted populations also acted as source populations, exporting individuals to other heavily hunted populations. This indicates that heterogeneous anthropogenic mortality results in the formation of a complex series of interconnected sources and sinks. Thus, lethal management of mesopredators may not be an effective long-term strategy in reducing livestock predation, as dispersal and, more importantly, compensatory immigration may continue to affect population reduction efforts as long as dispersal from other areas persists.
Bayesian component separation: The Planck experience
NASA Astrophysics Data System (ADS)
Wehus, Ingunn Kathrine; Eriksen, Hans Kristian
2018-05-01
Bayesian component separation techniques have played a central role in the data reduction process of Planck. The most important strength of this approach is its global nature, in which a parametric and physical model is fitted to the data. Such physical modeling allows the user to constrain very general data models, and jointly probe cosmological, astrophysical and instrumental parameters. This approach also supports statistically robust goodness-of-fit tests in terms of data-minus-model residual maps, which are essential for identifying residual systematic effects in the data. The main challenges are high code complexity and computational cost. Whether or not these costs are justified for a given experiment depends on its final uncertainty budget. We therefore predict that the importance of Bayesian component separation techniques is likely to increase with time for intensity mapping experiments, similar to what has happened in the CMB field, as observational techniques mature, and their overall sensitivity improves.
Performance assessment of a Bayesian Forecasting System (BFS) for real-time flood forecasting
NASA Astrophysics Data System (ADS)
Biondi, D.; De Luca, D. L.
2013-02-01
The paper evaluates, for a number of flood events, the performance of a Bayesian Forecasting System (BFS), with the aim of assessing total uncertainty in real-time flood forecasting. The predictive uncertainty of future streamflow is estimated through the Bayesian integration of two separate processors. The former evaluates the propagation of input uncertainty on simulated river discharge, the latter computes the hydrological uncertainty of actual river discharge associated with all other possible sources of error. A stochastic model and a distributed rainfall-runoff model were assumed, respectively, for rainfall and hydrological response simulations. A case study was carried out for a small basin in the Calabria region (southern Italy). The performance assessment of the BFS was performed with adequate verification tools suited for probabilistic forecasts of continuous variables such as streamflow. Graphical tools and scalar metrics were used to evaluate several attributes of the forecast quality of the entire time-varying predictive distributions: calibration, sharpness, accuracy, and continuous ranked probability score (CRPS). Besides the overall system, which incorporates both sources of uncertainty, other hypotheses resulting from the BFS properties were examined, corresponding to (i) a perfect hydrological model; (ii) a non-informative rainfall forecast for predicting streamflow; and (iii) a perfect input forecast. The results emphasize the importance of using different diagnostic approaches to perform comprehensive analyses of predictive distributions, to arrive at a multifaceted view of the attributes of the prediction.
For the case study, the selected criteria revealed the interaction of the different sources of error, in particular the crucial role of the hydrological uncertainty processor when compensating, at the cost of wider forecast intervals, for the unreliable and biased predictive distribution resulting from the Precipitation Uncertainty Processor.
Calculating shock arrival in expansion tubes and shock tunnels using Bayesian changepoint analysis
NASA Astrophysics Data System (ADS)
James, Christopher M.; Bourke, Emily J.; Gildfind, David E.
2018-06-01
To understand the flow conditions generated in expansion tubes and shock tunnels, shock speeds are generally calculated based on shock arrival times at high-frequency wall-mounted pressure transducers. These calculations require that the shock arrival times are obtained accurately. This can be non-trivial for expansion tubes especially because pressure rises may be small and shock speeds high. Inaccurate shock arrival times can be a significant source of uncertainty. To help address this problem, this paper investigates two separate but complementary techniques. Principally, it proposes using a Bayesian changepoint detection method to automatically calculate shock arrival, potentially reducing error and simplifying the shock arrival finding process. To complement this, a technique for filtering the raw data without losing the shock arrival time is also presented and investigated. To test the validity of the proposed techniques, tests are performed using both a theoretical step change with different levels of noise and real experimental data. It was found that with conditions added to ensure that a real shock arrival time was found, the Bayesian changepoint analysis method was able to automatically find the shock arrival time, even for noisy signals.
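As an illustration of the step-change test case mentioned in the abstract above, the following is a minimal sketch of a single-changepoint posterior. It is not the method of the paper: it assumes exactly one step, Gaussian noise with a known standard deviation `sigma`, flat priors on the two segment means (which are integrated out analytically), and a uniform prior on the changepoint index. The function name is hypothetical.

```python
import math


def changepoint_posterior(signal, sigma):
    """Posterior over the changepoint index k for one step in Gaussian noise.

    k is the number of samples before the step. With flat priors on the
    two segment means, the marginal log-likelihood of each segment is
    -SS / (2 sigma^2) - 0.5 * log(n), up to a constant, where SS is the
    sum of squared deviations from the segment mean and n its length.
    """
    n = len(signal)
    candidates = list(range(1, n))  # each segment needs at least one sample
    log_post = []
    for k in candidates:
        log_p = 0.0
        for segment in (signal[:k], signal[k:]):
            mean = sum(segment) / len(segment)
            ss = sum((x - mean) ** 2 for x in segment)
            log_p += -ss / (2.0 * sigma ** 2) - 0.5 * math.log(len(segment))
        log_post.append(log_p)
    peak = max(log_post)  # subtract the peak before exponentiating for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return {k: w / total for k, w in zip(candidates, weights)}
```

For a pressure trace sampled at a fixed rate, the most probable index multiplied by the sample period would give an arrival-time estimate, and the width of the posterior gives a rough uncertainty. The loop is quadratic in the trace length, which is acceptable for a sketch but not for long records.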
NASA Astrophysics Data System (ADS)
Yee, Eugene
2007-04-01
Although a great deal of research effort has been focused on the forward prediction of the dispersion of contaminants (e.g., chemical and biological warfare agents) released into the turbulent atmosphere, much less work has been directed toward the inverse prediction of agent source location and strength from the measured concentration, even though the importance of this problem for a number of practical applications is obvious. In general, the inverse problem of source reconstruction is ill-posed and unsolvable without additional information. It is demonstrated that a Bayesian probabilistic inferential framework provides a natural and logically consistent method for source reconstruction from a limited number of noisy concentration data. In particular, the Bayesian approach permits one to incorporate prior knowledge about the source as well as additional information regarding both model and data errors. The latter enables a rigorous determination of the uncertainty in the inference of the source parameters (e.g., spatial location, emission rate, release time, etc.), hence extending the potential of the methodology as a tool for quantitative source reconstruction. A model (or, source-receptor relationship) that relates the source distribution to the concentration data measured by a number of sensors is formulated, and Bayesian probability theory is used to derive the posterior probability density function of the source parameters. A computationally efficient methodology for determination of the likelihood function for the problem, based on an adjoint representation of the source-receptor relationship, is described. Furthermore, we describe the application of efficient stochastic algorithms based on Markov chain Monte Carlo (MCMC) for sampling from the posterior distribution of the source parameters, the latter of which is required to undertake the Bayesian computation. 
The Bayesian inferential methodology for source reconstruction is validated against real dispersion data for two cases involving contaminant dispersion in highly disturbed flows over urban and complex environments where the idealizations of horizontal homogeneity and/or temporal stationarity in the flow cannot be applied to simplify the problem. Furthermore, the methodology is applied to the case of reconstruction of multiple sources.
Inference of emission rates from multiple sources using Bayesian probability theory.
Yee, Eugene; Flesch, Thomas K
2010-03-01
The determination of atmospheric emission rates from multiple sources using inversion (regularized least-squares or best-fit technique) is known to be very susceptible to measurement and model errors in the problem, rendering the solution unusable. In this paper, a new perspective is offered for this problem: namely, it is argued that the problem should be addressed as one of inference rather than inversion. Towards this objective, Bayesian probability theory is used to estimate the emission rates from multiple sources. The posterior probability distribution for the emission rates is derived, accounting fully for the measurement errors in the concentration data and the model errors in the dispersion model used to interpret the data. The Bayesian inferential methodology for emission rate recovery is validated against real dispersion data, obtained from a field experiment involving various source-sensor geometries (scenarios) consisting of four synthetic area sources and eight concentration sensors. The recovery of discrete emission rates from three different scenarios obtained using Bayesian inference and singular value decomposition inversion are compared and contrasted.
How Much Can We Learn from a Single Chromatographic Experiment? A Bayesian Perspective.
Wiczling, Paweł; Kaliszan, Roman
2016-01-05
In this work, we proposed and investigated a Bayesian inference procedure to find the desired chromatographic conditions based on known analyte properties (lipophilicity, pKa, and polar surface area) using one preliminary experiment. A previously developed nonlinear mixed effect model was used to specify the prior information about a new analyte with known physicochemical properties. Further, the prior (no preliminary data) and posterior predictive distribution (prior + one experiment) were determined sequentially to search towards the desired separation. The following isocratic high-performance reversed-phase liquid chromatographic conditions were sought: (1) retention time of a single analyte within the range of 4-6 min and (2) baseline separation of two analytes with retention times within the range of 4-10 min. The empirical posterior Bayesian distribution of parameters was estimated using the "slice sampling" Markov Chain Monte Carlo (MCMC) algorithm implemented in Matlab. The simulations with artificial analytes and experimental data of ketoprofen and papaverine were used to test the proposed methodology. The simulation experiment showed that for a single and two randomly selected analytes, there is 97% and 74% probability of obtaining a successful chromatogram using none or one preliminary experiment. The desired separation for ketoprofen and papaverine was established based on a single experiment. It was confirmed that the search for a desired separation rarely requires a large number of chromatographic analyses at least for a simple optimization problem. The proposed Bayesian-based optimization scheme is a powerful method of finding a desired chromatographic separation based on a small number of preliminary experiments.
2014-10-01
of accuracy and precision), compared with the simpler measurement model that does not use multipliers. Importance for defence...3) Bayesian experimental design for receptor placement in order to maximize the expected information in the measured concentration data for...applications of the Bayesian inferential methodology for source reconstruction have used high-quality concentration data from well-designed atmospheric
Cortical Hierarchies Perform Bayesian Causal Inference in Multisensory Perception
Rohe, Tim; Noppeney, Uta
2015-01-01
To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the “causal inference problem.” Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world. PMID:25710328
Gifford, Matthew E; Larson, Allan
2008-10-01
A previous phylogeographic study of mitochondrial haplotypes for the Hispaniolan lizard Ameiva chrysolaema revealed deep genetic structure associated with seawater inundation during the late Pliocene/early Pleistocene and evidence of subsequent population expansion into formerly inundated areas. We revisit hypotheses generated by our previous study using increased geographic sampling of populations and analysis of three nuclear markers (alpha-enolase intron 8, alpha-cardiac-actin intron 4, and beta-actin intron 3) in addition to mitochondrial haplotypes (ND2). Large genetic discontinuities correspond spatially and temporally with historical barriers to gene flow (sea inundations). NCPA cross-validation analysis and Bayesian multilocus analyses of divergence times (IMa and MCMCcoal) reveal two separate episodes of fragmentation associated with Pliocene and Pleistocene sea inundations, separating the species into historically separate Northern, East-Central, West-Central, and Southern population lineages. Multilocus Bayesian analysis using IMa indicates asymmetrical migration from the East-Central to the West-Central populations following secondary contact, consistent with expectations from the more pervasive sea inundation in the western region. The West-Central lineage has a genetic signature of population growth consistent with the expectation of geographic expansion into formerly inundated areas. Within each lineage, significant spatial genetic structure indicates isolation by distance at comparable temporal scales. This study adds to the growing body of evidence that vicariant speciation may be the prevailing source of lineage accumulation on oceanic islands. Thus, prior theories of island biogeography generally underestimate the role and temporal scale of intra-island vicariant processes.
Source Detection with Bayesian Inference on ROSAT All-Sky Survey Data Sample
NASA Astrophysics Data System (ADS)
Guglielmetti, F.; Voges, W.; Fischer, R.; Boese, G.; Dose, V.
2004-07-01
We employ Bayesian inference for the joint estimation of sources and background on ROSAT All-Sky Survey (RASS) data. The probabilistic method allows for detection improvement of faint extended celestial sources compared to the Standard Analysis Software System (SASS). Background maps were estimated in a single step together with the detection of sources without pixel censoring. Consistent uncertainties of background and sources are provided. The source probability is evaluated for single pixels as well as for pixel domains to enhance source detection of weak and extended sources.
[Determination of wine original regions using information fusion of NIR and MIR spectroscopy].
Xiang, Ling-Li; Li, Meng-Hua; Li, Jing-Mingz; Li, Jun-Hui; Zhang, Lu-Da; Zhao, Long-Lian
2014-10-01
Geographical origins of wine grapes are significant factors affecting wine quality and wine prices. Tasters' evaluation is a good method but has some limitations. It is important to discriminate different wine regions of origin quickly and accurately. The present paper proposed a method to determine wine regions of origin based on Bayesian information fusion that fused near-infrared (NIR) transmission spectra information and mid-infrared (MIR) ATR spectra information of wines. This method improved the determination results by expanding the sources of analysis information. NIR spectra and MIR spectra of 153 wine samples from four different grape-growing regions were collected by near-infrared and mid-infrared Fourier transform spectrometers, respectively. These four regions are Huailai, Yantai, Gansu and Changli, which are all typical geographical origins for Chinese wines. NIR and MIR discriminant models for wine regions were established using partial least squares discriminant analysis (PLS-DA) based on NIR spectra and MIR spectra separately. In PLS-DA, the regions of wine samples are represented as groups of binary codes. There are four wine regions in this paper, so four nodes were used to represent the categorical variables. The output node values for each sample in the NIR and MIR models were normalized first. These values stand for the probabilities of each sample belonging to each category. They served as the input to the Bayesian discriminant formula as a priori probability values. The probabilities were substituted into the Bayesian formula to get posterior probabilities, by which the class of each sample can be judged. Considering the stability of PLS-DA models, all the wine samples were randomly divided into calibration sets and validation sets ten times.
The results of the NIR and MIR discriminant models of four wine regions were as follows: the average accuracy rates of calibration sets were 78.21% (NIR) and 82.57% (MIR), and the average accuracy rates of validation sets were 82.50% (NIR) and 81.98% (MIR). After using the method proposed in this paper, the accuracy rates of calibration and validation changed to 87.11% and 90.87%, respectively, both better than the results obtained with either spectroscopy alone. These results suggest that Bayesian information fusion of NIR and MIR spectra is feasible for fast identification of wine regions of origin.
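The fusion step described above can be sketched as follows. The abstract does not give the exact discriminant formula, so this is an assumption-laden illustration: the normalized outputs of one model are treated as the prior, the normalized outputs of the other as a likelihood, and the two spectral sources are assumed conditionally independent given the region. The function name and the example numbers are hypothetical.

```python
def fuse_class_probabilities(p_nir, p_mir):
    """Combine two per-class probability vectors with Bayes' rule.

    p_nir and p_mir are normalized model outputs for the same sample,
    one entry per wine region. The posterior is proportional to the
    element-wise product (conditional-independence assumption).
    """
    if len(p_nir) != len(p_mir):
        raise ValueError("both models must score the same set of classes")
    joint = [a * b for a, b in zip(p_nir, p_mir)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("the two models assign zero joint probability to every class")
    return [j / total for j in joint]


def predict_region(p_nir, p_mir, regions):
    """Return the region with the highest fused posterior probability."""
    posterior = fuse_class_probabilities(p_nir, p_mir)
    best = max(range(len(regions)), key=lambda i: posterior[i])
    return regions[best]
```

For example, with regions `["Huailai", "Yantai", "Gansu", "Changli"]`, NIR outputs `[0.5, 0.3, 0.1, 0.1]` and MIR outputs `[0.4, 0.4, 0.1, 0.1]`, the fused posterior favours the first region more strongly than either model does alone, which is the intuition behind the reported accuracy gain.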
NASA Astrophysics Data System (ADS)
Li, L.; Xu, C.-Y.; Engeland, K.
2012-04-01
With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. The Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models; it incorporates different sources of information into a single analysis through Bayes' theorem. However, none of these applications treats well the uncertainty in extreme flows of hydrological model simulations. This study proposes a Bayesian modularization approach to uncertainty assessment of conceptual hydrological models that takes the extreme flows into account. It includes a comprehensive comparison and evaluation of uncertainty assessments by the new Bayesian modularization approach and by traditional Bayesian models using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with the traditional Bayesian method: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from the Bayesian modularization method are more accurate, with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of application and computational efficiency. The study thus introduces a new approach for reducing the effect of extreme flows on the discharge uncertainty assessment of hydrological models via Bayesian modularization. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
Metadynamic metainference: Enhanced sampling of the metainference ensemble using metadynamics
Bonomi, Massimiliano; Camilloni, Carlo; Vendruscolo, Michele
2016-01-01
Accurate and precise structural ensembles of proteins and macromolecular complexes can be obtained with metainference, a recently proposed Bayesian inference method that integrates experimental information with prior knowledge and deals with all sources of errors in the data as well as with sample heterogeneity. The study of complex macromolecular systems, however, requires an extensive conformational sampling, which represents a separate challenge. To address this challenge and to exhaustively and efficiently generate structural ensembles, we combine metainference with metadynamics and illustrate its application to the calculation of the free energy landscape of the alanine dipeptide. PMID:27561930
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimations for seismic sources and structures in Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.
NASA Astrophysics Data System (ADS)
Ram Upadhayay, Hari; Bodé, Samuel; Griepentrog, Marco; Bajracharya, Roshan Man; Blake, Will; Cornelis, Wim; Boeckx, Pascal
2017-04-01
The implementation of compound-specific stable isotope (CSSI) analyses of biotracers (e.g. fatty acids, FAs) as constraints on sediment-source contributions has become increasingly relevant to understanding the origin of sediments in catchments. CSSI fingerprinting of sediment uses the CSSI signature of a biotracer as input to an isotopic mixing model (IMM) to apportion source soil contributions. So far, source studies have relied on the assumption of linear mixing of the CSSI signatures of sources in the sediment, without accounting for potential effects of source biotracer concentration. Here we evaluated the effect of FA concentration in sources on the accuracy of source contribution estimates in artificial soil mixtures of three well-separated land use sources. Soil samples from land use sources were mixed to create three groups of artificial mixtures with known source contributions. Sources and artificial mixtures were analysed for δ13C of FAs using gas chromatography-combustion-isotope ratio mass spectrometry. The source contributions to the mixtures were estimated using MixSIAR, a Bayesian isotopic mixing model, with and without concentration dependence. The concentration-dependent MixSIAR provided the closest estimates to the known artificial mixture source contributions (mean absolute error, MAE = 10.9%, and standard error, SE = 1.4%). In contrast, the concentration-independent MixSIAR with post-mixing correction of tracer proportions based on the aggregated FA concentration of sources biased the source contributions (MAE = 22.0%, SE = 3.4%). This study highlights the importance of accounting for the potential effect of source FA concentration on isotopic mixing in sediments, which adds realism to the mixing model and allows more accurate estimates of the contributions of sources to the mixture.
The potential influence of FA concentration on CSSI signature of sediments is an important underlying factor that determines whether the isotopic signature of a given source is observable even after equilibrium. Therefore inclusion of FA concentrations of the sources in the IMM formulation is standard procedure for accurate estimation of source contributions. The post model correction approach that dominates the CSSI fingerprinting causes bias, especially if the FAs concentration of sources differs substantially.
NASA Astrophysics Data System (ADS)
Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.
2014-12-01
A critical point in the analysis of ground displacement time series is the development of data-driven methods that make it possible to discern and characterize the different sources that generate the observed displacements. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the variance of the dataset. It reproduces the original data using a limited number of Principal Components, but it also shows some deficiencies. Indeed, PCA does not perform well in finding the solution to the so-called Blind Source Separation (BSS) problem, i.e. in recovering and separating the original sources that generated the observed data. This is mainly due to the assumptions on which PCA relies: it looks for a new Euclidean space where the projected data are uncorrelated. Usually, the uncorrelation condition is not strong enough, and it has been proven that the BSS problem can be tackled by requiring the components to be independent. Independent Component Analysis (ICA) is, in fact, another popular technique adopted to approach this problem, and it can be used in all those fields where PCA is also applied. An ICA approach enables us to explain the time series while imposing fewer constraints on the model, and to reveal anomalies in the data such as transient signals. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here we present the application of the vbICA technique to GPS position time series.
First, we use vbICA on synthetic data that simulate a seismic cycle (interseismic + coseismic + postseismic + seasonal + noise), and study the ability of the algorithm to recover the original (known) sources of deformation. Secondly, we apply vbICA to different tectonically active scenarios, such as earthquakes in central and northern Italy, as well as the study of slow slip events in Cascadia.
Frank, Scott D; Ferris, Aaron N
2011-08-01
During the Woodlark Basin seismic experiment in eastern Papua New Guinea (1999-2000), an ocean-bottom seismic array recorded marine mammal vocalizations along with target earthquake signals. The array consisted of 14 instruments, 7 of which were three-component seismometers with a fourth component hydrophone. They were deployed at 2.0-3.2 km water depth and operated from September 1999 through February 2000. While whale vocalizations were recorded throughout the deployment, this study focuses on 3 h from December 21, 1999 during which the signals are particularly clear. The recordings show a blue whale song composed of a three-unit phrase. That song does not match vocalization characteristics of other known Pacific subpopulations and may represent a previously undocumented blue whale song. Animal tracking and source level estimates are obtained with a Bayesian inversion method that generates probabilistic source locations. The Bayesian method is augmented to include travel time estimates from seismometers and hydrophones and acoustic signal amplitude. Tracking results show the whale traveled northeasterly over the course of 3 h, covering approximately 27 km. The path followed the edge of the Woodlark Basin along a shelf that separates the shallow waters of the Trobriand platform from the deep waters of the basin.
Bayesian Estimation of Fugitive Methane Point Source Emission Rates from a Single Downwind High-Frequency Gas Sensor
With the tremendous advances in onshore oil and gas exploration and production (E&P) capability comes the realization that new tools are needed to support env...
Bayesian demography 250 years after Bayes
Bijak, Jakub; Bryant, John
2016-01-01
Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889
OGLE-2013-BLG-1761Lb: A Massive Planet around an M/K Dwarf
NASA Astrophysics Data System (ADS)
Hirao, Y.; Udalski, A.; Sumi, T.; Bennett, D. P.; Koshimoto, N.; Bond, I. A.; Rattenbury, N. J.; Suzuki, D.; Abe, F.; Asakura, Y.; Barry, R. K.; Bhattacharya, A.; Donachie, M.; Evans, P.; Fukui, A.; Itow, Y.; Li, M. C. A.; Ling, C. H.; Masuda, K.; Matsubara, Y.; Matsuo, T.; Muraki, Y.; Nagakane, M.; Ohnishi, K.; Ranc, C.; Saito, To.; Sharan, A.; Shibai, H.; Sullivan, D. J.; Tristram, P. J.; Yamada, T.; Yamada, T.; Yonehara, A.; MOA Collaboration; Poleski, R.; Skowron, J.; Mróz, P.; Szymański, M. K.; Kozłowski, S.; Pietrukowicz, P.; Soszyński, I.; Wyrzykowski, Ł.; Ulaczyk, K.; OGLE Collaboration
2017-07-01
We report the discovery and the analysis of the planetary microlensing event, OGLE-2013-BLG-1761. There are some degenerate solutions in this event because the planetary anomaly is only sparsely sampled. However, the detailed light-curve analysis ruled out all stellar binary models and shows the lens to be a planetary system. There is the so-called close/wide degeneracy in the solutions with the planet/host mass ratio of $q \sim (7.0 \pm 2.0) \times 10^{-3}$ and $q \sim (8.1 \pm 2.6) \times 10^{-3}$ with the projected separation in Einstein radius units of $s = 0.95$ (close) and $s = 1.18$ (wide), respectively. The microlens parallax effect is not detected, but the finite source effect is detected. Our Bayesian analysis indicates that the lens system is located $D_L = 6.9^{+1.0}_{-1.2}$ kpc away from us and the host star is an M/K dwarf with a mass of $M_L = 0.33^{+0.32}_{-0.19}\,M_\odot$ orbited by a super-Jupiter mass planet with a mass of $m_P = 2.7^{+2.5}_{-1.5}\,M_{\rm Jup}$ at the projected separation of $a_\perp = 1.8^{+0.5}_{-0.5}$ au. The preference of the large lens distance in the Bayesian analysis is due to the relatively large observed source star radius. The distance and other physical parameters may be constrained by the future high-resolution imaging by large ground telescopes or HST. If the estimated lens distance is correct, then this planet provides another sample for testing the claimed deficit of planets in the Galactic bulge.
OGLE-2013-BLG-1761Lb: A Massive Planet around an M/K Dwarf
NASA Technical Reports Server (NTRS)
Hirao, Y.; Udalski, A.; Sumi, T.; Bennett, D. P.; Koshimoto, N.; Bond, I. A.; Rattenbury, N. J.; Suzuki, D.; Abe, F.; Asakura, Y.;
2017-01-01
We report the discovery and the analysis of the planetary microlensing event, OGLE-2013-BLG-1761. There are some degenerate solutions in this event because the planetary anomaly is only sparsely sampled. However, the detailed light curve analysis ruled out all stellar binary models and shows the lens to be a planetary system. There is the so-called close/wide degeneracy in the solutions with the planet/host mass ratio of $q \approx (7.0 \pm 2.0) \times 10^{-3}$ and $q \approx (8.1 \pm 2.6) \times 10^{-3}$ with the projected separation in Einstein radius units of $s = 0.95$ (close) and $s = 1.18$ (wide), respectively. The microlens parallax effect is not detected, but the finite source effect is detected. Our Bayesian analysis indicates that the lens system is located $D_L = 6.9^{+1.0}_{-1.2}$ kpc away from us and the host star is an M/K dwarf with a mass of $M_L = 0.33^{+0.32}_{-0.19}\,M_\odot$ orbited by a super-Jupiter mass planet with a mass of $m_P = 2.7^{+2.5}_{-1.5}\,M_{\rm Jup}$ at the projected separation of $a_\perp = 1.8^{+0.5}_{-0.5}$ au. The preference of the large lens distance in the Bayesian analysis is due to the relatively large observed source star radius. The distance and other physical parameters may be constrained by the future high-resolution imaging by large ground telescopes or HST. If the estimated lens distance is correct, then this planet provides another sample for testing the claimed deficit of planets in the Galactic bulge.
NASA Astrophysics Data System (ADS)
Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.
2017-12-01
Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge, which has been used previously as the default breakpoint in segmented regression models to discern differences between pre- and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. A SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at pre-threshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.
Bayesian statistics in radionuclide metrology: measurement of a decaying source
NASA Astrophysics Data System (ADS)
Bochud, François O.; Bailat, Claude J.; Laedermann, Jean-Pascal
2007-08-01
The most intuitive way of defining a probability is perhaps through the frequency at which an outcome appears when a large number of trials are realized in identical conditions. The probability derived from the obtained histogram characterizes the so-called frequentist or conventional statistical approach. In this sense, probability is defined as a physical property of the observed system. By contrast, in Bayesian statistics, a probability is not a physical property or a directly observable quantity, but a degree of belief or an element of inference. The goal of this paper is to show how Bayesian statistics can be used in radionuclide metrology and what its advantages and disadvantages are compared with conventional statistics. This is performed through the example of an yttrium-90 source typically encountered in environmental surveillance measurement. Because of the very low activity of this kind of source and the short half-life of the radionuclide, this measurement takes several days, during which the source decays significantly. Several methods are proposed to compute simultaneously the number of unstable nuclei at a given reference time, the decay constant and the background. Asymptotically, all approaches give the same result. However, Bayesian statistics produces coherent estimates and confidence intervals in a much smaller number of measurements. Apart from the conceptual understanding of statistics, the main difficulty that could deter radionuclide metrologists from using Bayesian statistics is the complexity of the computation.
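The joint estimation described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes Poisson-distributed counts in consecutive counting windows, flat priors on all three unknowns, a detection efficiency of 1, and an approximate yttrium-90 half-life of 64 hours. All function and variable names are hypothetical.

```python
import math

def expected_counts(n0, decay_const, background_rate, t_start, t_stop):
    """Expected counts in [t_start, t_stop]: decays of the source plus background."""
    decays = n0 * (math.exp(-decay_const * t_start) - math.exp(-decay_const * t_stop))
    return decays + background_rate * (t_stop - t_start)

def log_posterior(n0, decay_const, background_rate, intervals, counts):
    """Unnormalized log-posterior under flat priors (assumption) and Poisson counts."""
    if n0 < 0 or decay_const <= 0 or background_rate < 0:
        return -math.inf
    total = 0.0
    for (t_start, t_stop), k in zip(intervals, counts):
        mu = expected_counts(n0, decay_const, background_rate, t_start, t_stop)
        if mu <= 0:
            return -math.inf
        # Poisson log-probability of observing k counts when mu are expected
        total += k * math.log(mu) - mu - math.lgamma(k + 1)
    return total

# Synthetic, noise-free example: six consecutive 24-hour counting windows.
true_n0 = 5000.0                         # nuclei at the reference time (made up)
true_decay = math.log(2.0) / 64.0        # per hour, from an approximate 64 h half-life
true_background = 2.0                    # counts per hour (made up)
intervals = [(24.0 * i, 24.0 * (i + 1)) for i in range(6)]
counts = [round(expected_counts(true_n0, true_decay, true_background, a, b))
          for a, b in intervals]

lp_true = log_posterior(true_n0, true_decay, true_background, intervals, counts)
lp_wrong = log_posterior(2.0 * true_n0, true_decay, true_background, intervals, counts)
```

Evaluating `log_posterior` on a grid, or passing it to a sampler, would give the joint posterior of the three quantities; informative priors would replace the flat ones where prior knowledge exists.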
XID+: Next generation XID development
NASA Astrophysics Data System (ADS)
Hurley, Peter
2017-04-01
XID+ is a prior-based source extraction tool which carries out photometry in the Herschel SPIRE (Spectral and Photometric Imaging Receiver) maps at the positions of known sources. It uses a probabilistic Bayesian framework, which makes it natural to include prior information, and uses the Bayesian inference tool Stan to obtain the full posterior probability distribution on flux estimates.
NASA Astrophysics Data System (ADS)
Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn
2013-04-01
Summary: With respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches treats the impact of high flows in hydrological modeling well. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with the standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates as measured by the Nash-Sutcliffe efficiency and the best uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian methods.
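As a rough illustration of the standard Bayesian setup mentioned in this abstract, the sketch below runs a random-walk Metropolis-Hastings sampler with an AR(1)-plus-Normal error model. It does not reproduce WASMOD or the modularization method: the one-parameter "hydrological model" (`theta * forcing`), the flat prior, and the fixed error parameters are all assumptions made for the example.

```python
import math
import random

def ar1_log_likelihood(observed, simulated, rho, sigma):
    """AR(1) residuals with Normal innovations, conditioned on the first residual."""
    residuals = [o - s for o, s in zip(observed, simulated)]
    log_lik = 0.0
    for t in range(1, len(residuals)):
        innovation = residuals[t] - rho * residuals[t - 1]
        log_lik += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * (innovation / sigma) ** 2
    return log_lik

def metropolis_hastings(log_post, start, step, n_iter, seed=0):
    """Random-walk Metropolis-Hastings for a single scalar parameter."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)      # symmetric proposal
        proposal_lp = log_post(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

# Synthetic data from a toy model: flow = theta * forcing + AR(1) noise.
data_rng = random.Random(1)
forcing = [1.0 + 0.5 * math.sin(0.3 * t) for t in range(100)]
true_theta, true_rho, true_sigma = 2.0, 0.5, 0.1
noise, observed = 0.0, []
for x in forcing:
    noise = true_rho * noise + data_rng.gauss(0.0, true_sigma)
    observed.append(true_theta * x + noise)

def log_posterior(theta):
    """Flat prior on (0, 10) (assumption) plus the AR(1) log-likelihood."""
    if not 0.0 < theta < 10.0:
        return -math.inf
    simulated = [theta * x for x in forcing]
    return ar1_log_likelihood(observed, simulated, true_rho, true_sigma)

chain = metropolis_hastings(log_posterior, start=1.0, step=0.05, n_iter=5000)
kept = chain[1000:]                                   # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

In a real calibration the error parameters `rho` and `sigma` would normally be sampled along with the model parameters rather than held fixed.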
NASA Astrophysics Data System (ADS)
Wellen, Christopher; Arhonditsis, George B.; Long, Tanya; Boyd, Duncan
2014-11-01
Spatially distributed nonpoint source watershed models are essential tools to estimate the magnitude and sources of diffuse pollution. However, little work has been undertaken to understand the sources and ramifications of the uncertainty involved in their use. In this study we conduct the first Bayesian uncertainty analysis of the water quality components of the SWAT model, one of the most commonly used distributed nonpoint source models. Working in Southern Ontario, we apply three Bayesian configurations for calibrating SWAT to Redhill Creek, an urban catchment, and Grindstone Creek, an agricultural one. We answer four interrelated questions: can SWAT determine suspended sediment sources with confidence when end of basin data is used for calibration? How does uncertainty propagate from the discharge submodel to the suspended sediment submodels? Do the estimated sediment sources vary when different calibration approaches are used? Can we combine the knowledge gained from different calibration approaches? We show that: (i) despite reasonable fit at the basin outlet, the simulated sediment sources are subject to uncertainty sufficient to undermine the typical approach of reliance on a single, best fit simulation; (ii) more than a third of the uncertainty of sediment load predictions may stem from the discharge submodel; (iii) estimated sediment sources do vary significantly across the three statistical configurations of model calibration despite end-of-basin predictions being virtually identical; and (iv) Bayesian model averaging is an approach that can synthesize predictions when a number of adequate distributed models make divergent source apportionments. We conclude with recommendations for future research to reduce the uncertainty encountered when using distributed nonpoint source models for source apportionment.
Semi-blind Bayesian inference of CMB map and power spectrum
NASA Astrophysics Data System (ADS)
Vansyngel, Flavien; Wandelt, Benjamin D.; Cardoso, Jean-François; Benabed, Karim
2016-04-01
We present a new blind formulation of the cosmic microwave background (CMB) inference problem. The approach relies on a phenomenological model of the multifrequency microwave sky without the need for physical models of the individual components. For all-sky and high resolution data, it unifies parts of the analysis that had previously been treated separately such as component separation and power spectrum inference. We describe an efficient sampling scheme that fully explores the component separation uncertainties on the inferred CMB products such as maps and/or power spectra. External information about individual components can be incorporated as a prior giving a flexible way to progressively and continuously introduce physical component separation from a maximally blind approach. We connect our Bayesian formalism to existing approaches such as Commander, spectral mismatch independent component analysis (SMICA), and internal linear combination (ILC), and discuss possible future extensions.
Evaluation of Bayesian approaches to identify DDT source contributions to soils in Southeast China.
Zeng, Faming; Yang, Dan; Xing, Xinli; Qi, Shihua
2017-06-01
Dicofol application may be an important source of elevated dichlorodiphenyltrichloroethane (DDT) residues in soils in Fujian, Southeast China, after technical DDT was banned; the historical application of technical DDT also left residues. The DDT residues varied geographically, corresponding to the varied potential sources of DDT. In this study, a novel approach based on the Bayesian method (BM) was developed to identify the source contributions of DDT to soils, comprising both historical DDT and dicofol. The Naive Bayesian classifier was applied based on a subset of the samples whose sources were determined by chemical analysis independent of the Bayesian approach. The results show that the BM performed better (95%) than the ratio of o, p'-/p, p'-DDT (84%) at identifying DDT source contributions. A high detection rate (97%) of dicofol (p, p'-OH-DDT) was observed in the subset, showing that dicofol application influenced the DDX levels in soils in Fujian. However, the contribution from the historical technical DDT source was greater than that from dicofol in Fujian, indicating that historical technical DDT was still an important pollution source to soils. In addition, both the DDX (DDT isomers and derivatives) level and the dicofol contribution in non-agricultural soils were higher than in agricultural land uses, especially in hilly regions; the potential cause may be the atmospheric transport of dicofol-type DDT after spraying during daytime, or regional differences in production and application. Copyright © 2017 Elsevier Ltd. All rights reserved.
Climatic Models Ensemble-based Mid-21st Century Runoff Projections: A Bayesian Framework
NASA Astrophysics Data System (ADS)
Achieng, K. O.; Zhu, J.
2017-12-01
There are a number of North American Regional Climate Change Assessment Program (NARCCAP) climatic models that have been used to project surface runoff in the mid-21st century. Statistical model selection techniques are often used to select the model that best fits data. However, model selection techniques often lead to different conclusions. In this study, ten models are averaged in a Bayesian paradigm to project runoff. Bayesian Model Averaging (BMA) is used to project runoff and to identify the effect of model uncertainty on future runoff projections. Baseflow separation with a two-parameter digital filter, also called the Eckhardt filter, is used to separate USGS streamflow (total runoff) into two components: baseflow and surface runoff. We use this surface runoff as the a priori runoff when conducting BMA of runoff simulated from the ten RCM models. The primary objective of this study is to evaluate how well RCM multi-model ensembles simulate surface runoff, in a Bayesian framework. Specifically, we investigate and discuss the following questions: How well does an ensemble of ten RCM models jointly simulate surface runoff by averaging over all the models using BMA, given a priori surface runoff? What are the effects of model uncertainty on surface runoff simulation?
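The baseflow separation step can be sketched as follows, assuming the commonly cited recursive form of the Eckhardt filter, b_k = ((1 - BFImax) * a * b_(k-1) + (1 - a) * BFImax * Q_k) / (1 - a * BFImax), with baseflow capped at total flow. The parameter values, the initialization of the first baseflow value, and the function name are illustrative assumptions, not taken from the study.

```python
def eckhardt_baseflow(streamflow, alpha=0.98, bfi_max=0.80):
    """Split total streamflow into baseflow and surface runoff.

    alpha   -- recession constant (illustrative default)
    bfi_max -- maximum baseflow index (illustrative default)
    """
    if not streamflow:
        return [], []
    # Assumption: start the recursion at bfi_max times the first flow value.
    baseflow = [bfi_max * streamflow[0]]
    for q in streamflow[1:]:
        b_prev = baseflow[-1]
        b = ((1.0 - bfi_max) * alpha * b_prev + (1.0 - alpha) * bfi_max * q) / (1.0 - alpha * bfi_max)
        baseflow.append(min(b, q))            # baseflow cannot exceed total flow
    surface = [q - b for q, b in zip(streamflow, baseflow)]
    return baseflow, surface

# Made-up daily flows with one storm peak.
flows = [10.0, 10.0, 50.0, 30.0, 12.0, 10.0]
base, surf = eckhardt_baseflow(flows)
```

The `surf` series plays the role of the surface runoff that the abstract uses as the a priori runoff for model averaging.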
Bayesian flood forecasting methods: A review
NASA Astrophysics Data System (ADS)
Han, Shasha; Coulibaly, Paulin
2017-08-01
Over the past few decades, floods have been seen as one of the most common and widely distributed natural disasters in the world. If floods could be accurately forecasted in advance, then their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of uncertainty associated with the hydrologic forecast is of great importance for flood estimation and rational decision making. The Bayesian forecasting system (BFS) offers an ideal theoretical framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides a suitable theoretical structure, empirically validated models and a reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review of Bayesian forecasting approaches applied in flood forecasting from 1999 until now. The review starts with an overview of the fundamentals of BFS and recent advances in BFS, followed by BFS applications in river stage forecasting and real-time flood forecasting, then moves to a critical analysis evaluating the advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses future research directions in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way for flood estimation: it considers all sources of uncertainty and produces a predictive distribution of the river stage, river discharge or runoff, and thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. ensemble Bayesian forecasting system, Bayesian multi-model combination) were shown to overcome the limitations of a single model or fixed model weights and to effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvement. Future research in the context of Bayesian flood forecasting should focus on assimilation of various sources of newly available information and improvement of predictive performance assessment methods.
A local approach for focussed Bayesian fusion
NASA Astrophysics Data System (ADS)
Sander, Jennifer; Heizmann, Michael; Goussev, Igor; Beyerer, Jürgen
2009-04-01
Local Bayesian fusion approaches aim to reduce the high storage and computational costs of Bayesian fusion which is separated from fixed modeling assumptions. Using the small world formalism, we argue why this procedure conforms to Bayesian theory. Then, we concentrate on the realization of local Bayesian fusion by focussing the fusion process solely on local regions that are task relevant with a high probability. The resulting local models then correspond to restricted versions of the original one. In a previous publication, we used bounds for the probability of misleading evidence to show the validity of the pre-evaluation of task specific knowledge and prior information which we perform to build local models. In this paper, we prove the validity of this procedure using information theoretic arguments. For additional efficiency, local Bayesian fusion can be realized in a distributed manner. Here, several local Bayesian fusion tasks are evaluated and unified after the actual fusion process. For the practical realization of distributed local Bayesian fusion, software agents are well suited. There is a natural analogy between the resulting agent based architecture and criminal investigations in real life. We show how this analogy can be used to further improve the efficiency of distributed local Bayesian fusion. Using a landscape model, we present an experimental study of distributed local Bayesian fusion in the field of reconnaissance, which highlights its high potential.
NASA Astrophysics Data System (ADS)
Kopka, Piotr; Wawrzynczak, Anna; Borysiewicz, Mieczyslaw
2016-11-01
In this paper the Bayesian methodology known as Approximate Bayesian Computation (ABC) is applied to the problem of atmospheric contamination source identification. The algorithm input data are concentrations of the released substance, arriving online, registered by a distributed sensor network. This paper presents the Sequential ABC algorithm in detail and tests its efficiency in estimating the probability distributions of atmospheric release parameters of a mobile contamination source. The developed algorithms are tested using the data from the Over-Land Atmospheric Diffusion (OLAD) field tracer experiment. The paper demonstrates estimation of seven parameters characterizing the contamination source, i.e.: contamination source starting position (x,y), the direction of the motion of the source (d), its velocity (v), release rate (q), start time of release (ts) and its duration (td). The newly arriving concentrations dynamically update the probability distributions of the searched parameters. The atmospheric dispersion Second-order Closure Integrated PUFF (SCIPUFF) model is used as the forward model to predict the concentrations at the sensor locations.
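The core idea of ABC, accepting parameter draws whose simulated data fall close to the observations, can be shown with a plain rejection sampler. This is a simplified stand-in, not the Sequential ABC algorithm of the paper: it estimates only a release rate, and the linear forward model with fixed dilution coefficients is a hypothetical placeholder for a dispersion model such as SCIPUFF.

```python
import math
import random

def forward_model(release_rate, dilution):
    """Toy forward model (assumption): sensor concentration is linear in release rate."""
    return [release_rate * k for k in dilution]

def abc_rejection(observed, dilution, n_draws=20000, tolerance=0.5, seed=0):
    """Keep prior draws whose simulated concentrations lie within `tolerance` of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        q = rng.uniform(0.0, 10.0)                     # uniform prior on the release rate
        simulated = [c + rng.gauss(0.0, 0.1)           # add sensor noise to the simulation
                     for c in forward_model(q, dilution)]
        distance = math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)))
        if distance < tolerance:
            accepted.append(q)
    return accepted

# Three hypothetical sensors and noise-free synthetic observations for q = 4.
dilution = [1.0, 0.5, 0.25]
observed = forward_model(4.0, dilution)
posterior_sample = abc_rejection(observed, dilution)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

A sequential variant would reuse the accepted draws as the starting population when new concentrations arrive, usually with a shrinking tolerance, instead of sampling from the prior each time.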
BATSE gamma-ray burst line search. 2: Bayesian consistency methodology
NASA Technical Reports Server (NTRS)
Band, D. L.; Ford, L. A.; Matteson, J. L.; Briggs, M.; Paciesas, W.; Pendleton, G.; Preece, R.; Palmer, D.; Teegarden, B.; Schaefer, B.
1994-01-01
We describe a Bayesian methodology to evaluate the consistency between the reported Ginga and Burst and Transient Source Experiment (BATSE) detections of absorption features in gamma-ray burst spectra. Currently no features have been detected by BATSE, but this methodology will still be applicable if and when such features are discovered. The Bayesian methodology permits the comparison of hypotheses regarding the two detectors' observations and makes explicit the subjective aspects of our analysis (e.g., the quantification of our confidence in detector performance). We also present non-Bayesian consistency statistics. Based on preliminary calculations of line detectability, we find that both the Bayesian and non-Bayesian techniques show that the BATSE and Ginga observations are consistent given our understanding of these detectors.
Wei Wu; James Clark; James Vose
2010-01-01
Hierarchical Bayesian (HB) modeling allows for multiple sources of uncertainty by factoring complex relationships into conditional distributions that can be used to draw inference and make predictions. We applied an HB model to estimate the parameters and state variables of a parsimonious hydrological model, GR4J, by coherently assimilating the uncertainties from the...
Bayesian Networks Improve Causal Environmental Assessments for Evidence-Based Policy.
Carriger, John F; Barron, Mace G; Newman, Michael C
2016-12-20
Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on valued ecological resources. These aspects are demonstrated through hypothetical problem scenarios that explore some major benefits of using Bayesian networks for reasoning and making inferences in evidence-based policy.
Bayesian network modelling of upper gastrointestinal bleeding
NASA Astrophysics Data System (ADS)
Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri
2013-09-01
Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes Network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding into upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
bnstruct: an R package for Bayesian Network structure learning in the presence of missing data.
Franzin, Alberto; Sambo, Francesco; Di Camillo, Barbara
2017-04-15
A Bayesian Network is a probabilistic graphical model that encodes probabilistic dependencies between a set of random variables. We introduce bnstruct, an open source R package to (i) learn the structure and the parameters of a Bayesian Network from data in the presence of missing values and (ii) perform reasoning and inference on the learned Bayesian Networks. To the best of our knowledge, there is no other open source software that provides methods for all of these tasks, particularly the manipulation of missing data, which is a common situation in practice. The software is implemented in R and C and is available on CRAN under a GPL licence. Contact: francesco.sambo@unipd.it. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
A Fault Diagnosis Methodology for Gear Pump Based on EEMD and Bayesian Network
Liu, Zengkai; Liu, Yonghong; Shan, Hongkai; Cai, Baoping; Huang, Qing
2015-01-01
This paper proposes a fault diagnosis methodology for a gear pump based on the ensemble empirical mode decomposition (EEMD) method and the Bayesian network. Essentially, the presented scheme is a multi-source information fusion based methodology. Compared with conventional fault diagnosis using only EEMD, the proposed method is able to take advantage of all useful information besides sensor signals. The presented diagnostic Bayesian network consists of a fault layer, a fault feature layer and a multi-source information layer. Vibration signals from sensor measurement are decomposed by the EEMD method and the energies of the intrinsic mode functions (IMFs) are calculated as fault features. These features are added into the fault feature layer in the Bayesian network. The other sources of useful information are added to the information layer. The generalized three-layer Bayesian network can be developed by fully incorporating faults and fault symptoms as well as other useful information such as naked eye inspection and maintenance records. Therefore, diagnostic accuracy and capacity can be improved. The proposed methodology is applied to the fault diagnosis of a gear pump and the structure and parameters of the Bayesian network are established. Compared with artificial neural network and support vector machine classification algorithms, the proposed model has the best diagnostic performance when only sensor data are used. A case study has demonstrated that some information from human observation or system repair records is very helpful to the fault diagnosis. The method is effective and efficient in diagnosing faults based on uncertain, incomplete information. PMID:25938760
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
A fully Bayesian method for jointly fitting instrumental calibration and X-ray spectral models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Jin; Yu, Yaming; Van Dyk, David A.
2014-10-20
Owing to a lack of robust principled methods, systematic instrumental uncertainties have generally been ignored in astrophysical data analysis despite wide recognition of the importance of including them. Ignoring calibration uncertainty can cause bias in the estimation of source model parameters and can lead to underestimation of the variance of these estimates. We previously introduced a pragmatic Bayesian method to address this problem. The method is 'pragmatic' in that it introduced an ad hoc technique that simplified computation by neglecting the potential information in the data for narrowing the uncertainty for the calibration product. Following that work, we use a principal component analysis to efficiently represent the uncertainty of the effective area of an X-ray (or γ-ray) telescope. Here, however, we leverage this representation to enable a principled, fully Bayesian method that coherently accounts for the calibration uncertainty in high-energy spectral analysis. In this setting, the method is compared with standard analysis techniques and the pragmatic Bayesian method. The advantage of the fully Bayesian method is that it allows the data to provide information not only for estimation of the source parameters but also for the calibration product (here the effective area), conditional on the adopted spectral model. In this way, it can yield more accurate and efficient estimates of the source parameters along with valid estimates of their uncertainty. Provided that the source spectrum can be accurately described by a parameterized model, this method allows rigorous inference about the effective area by quantifying which possible curves are most consistent with the data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gastelum, Zoe N.; White, Amanda M.; Whitney, Paul D.
2013-06-04
The Multi-Source Signatures for Nuclear Programs project, part of Pacific Northwest National Laboratory’s (PNNL) Signature Discovery Initiative, seeks to computationally capture expert assessment of multi-type information such as text, sensor output, imagery, or audio/video files, to assess nuclear activities through a series of Bayesian network (BN) models. These models incorporate knowledge from a diverse range of information sources in order to help assess a country’s nuclear activities. The models span engineering topic areas, state-level indicators, and facility-specific characteristics. To illustrate the development, calibration, and use of BN models for multi-source assessment, we present a model that predicts a country’s likelihood to participate in the international nuclear nonproliferation regime. We validate this model by examining the extent to which the model helps non-experts arrive at conclusions similar to those provided by nuclear proliferation experts. We also describe the PNNL-developed software used throughout the lifecycle of the Bayesian network model development.
Abdul-Latiff, Muhammad Abu Bakar; Ruslin, Farhani; Fui, Vun Vui; Abu, Mohd-Hashim; Rovie-Ryan, Jeffrine Japning; Abdul-Patah, Pazil; Lakim, Maklarin; Roos, Christian; Yaakop, Salmah; Md-Zain, Badrul Munir
2014-01-01
Phylogenetic relationships among Malaysia’s long-tailed macaques have yet to be established, despite abundant genetic studies of the species worldwide. The aims of this study are to examine the phylogenetic relationships of Macaca fascicularis in Malaysia and to test its classification as a morphological subspecies. A total of 25 genetic samples of M. fascicularis yielding 383 bp of Cytochrome b (Cyt b) sequences were used in phylogenetic analysis along with one sample each of M. nemestrina and M. arctoides used as outgroups. Sequence character analysis reveals that the Cyt b locus is a highly conserved region, with only 23% parsimony-informative characters detected among ingroups. Further analysis indicates a clear separation between populations originating from different regions: the Malay Peninsula versus Borneo Insular, the East Coast versus West Coast of the Malay Peninsula, and the island versus mainland Malay Peninsula populations. Phylogenetic trees (NJ, MP and Bayesian) portray a consistent clustering paradigm as Borneo’s population was distinguished from Peninsula’s population (99% and 100% bootstrap value in NJ and MP respectively and 1.00 posterior probability in Bayesian trees). The East coast population was separated from other Peninsula populations (64% in NJ, 66% in MP and 0.53 posterior probability in Bayesian). West coast populations were divided into 2 clades: the North-South (47%/54% in NJ, 26/26% in MP and 1.00/0.80 posterior probability in Bayesian) and Island-Mainland (93% in NJ, 90% in MP and 1.00 posterior probability in Bayesian). The results confirm the previous morphological assignment of 2 subspecies, M. f. fascicularis and M. f. argentimembris, in the Malay Peninsula. These populations should be treated as separate genetic entities in order to conserve the genetic diversity of Malaysia’s M. fascicularis. These findings are crucial in aiding the conservation management and translocation process of M. fascicularis populations in Malaysia. PMID:24899832
Bayesian Inference for Time Trends in Parameter Values using Weighted Evidence Sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
D. L. Kelly; A. Malkhasyan
2010-09-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performed for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates an approach to incorporating multiple sources of data via applicability weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dana L. Kelly; Albert Malkhasyan
2010-06-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performed for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates the development of a generic prior distribution, which incorporates multiple sources of generic data via weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.
Discovery of a Gas Giant Planet in Microlensing Event Ogle-2014-BLG-1760
NASA Technical Reports Server (NTRS)
Bhattacharya, A.; Bennett, D. P.; Bond, I. A.; Sumi, T.; Udalski, A.; Street, R.; Tsapras, Y.; Abe, F.; Freeman, M.; Fukui, A.
2016-01-01
We present the analysis of the planetary microlensing event OGLE-2014-BLG-1760, which shows a strong light-curve signal due to the presence of a Jupiter mass ratio planet. One unusual feature of this event is that the source star is quite blue, with $V - I = 1.48 \pm 0.08$. This is marginally consistent with a source star in the Galactic bulge, but it could possibly indicate a young source star on the far side of the disk. Assuming a bulge source, we perform a Bayesian analysis with a standard Galactic model, and this indicates that the planetary system resides in or near the Galactic bulge at $D_L = 6.9 \pm 1.1$ kpc. It also indicates a host-star mass of $M_* = 0.51^{+0.44}_{-0.28}\,M_\odot$, a planet mass of $m_p = 0.56^{+0.34}_{-0.26}\,M_J$, and a projected star-planet separation of $a_\perp = 1.75^{+0.33}_{-0.34}$ au. The lens-source relative proper motion is $\mu_{\rm rel} = 6.5 \pm 1.1$ mas yr$^{-1}$. The lens star (the planetary host) is estimated to be very faint compared to the source star, so it is most likely that it can be detected only when the lens and source stars start to separate. Due to the relatively high relative proper motion, the lens and source will be resolved, at a separation of approximately 46 mas, 6-8 yr after the peak magnification. So, by 2020-2022, we can hope to detect the lens star with deep, high-resolution images.
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
Cai, C; Rodet, T; Legoupil, S; Mohammad-Djafari, A
2013-11-01
Dual-energy computed tomography (DECT) makes it possible to get two fractions of basis materials without segmentation. One is the soft-tissue equivalent water fraction and the other is the hard-matter equivalent bone fraction. Practical DECT measurements are usually obtained with polychromatic x-ray beams. Existing reconstruction approaches based on linear forward models without counting the beam polychromaticity fail to estimate the correct decomposition fractions and result in beam-hardening artifacts (BHA). The existing BHA correction approaches either need to refer to calibration measurements or suffer from the noise amplification caused by the negative-log preprocessing and the ill-conditioned water and bone separation problem. To overcome these problems, statistical DECT reconstruction approaches based on nonlinear forward models counting the beam polychromaticity show great potential for giving accurate fraction images. This work proposes a full-spectral Bayesian reconstruction approach which allows the reconstruction of high quality fraction images from ordinary polychromatic measurements. This approach is based on a Gaussian noise model with unknown variance assigned directly to the projections without taking negative-log. Referring to Bayesian inferences, the decomposition fractions and observation variance are estimated by using the joint maximum a posteriori (MAP) estimation method. Subject to an adaptive prior model assigned to the variance, the joint estimation problem is then simplified into a single estimation problem. It transforms the joint MAP estimation problem into a minimization problem with a nonquadratic cost function. To solve it, the use of a monotone conjugate gradient algorithm with suboptimal descent steps is proposed. The performance of the proposed approach is analyzed with both simulated and experimental data. The results show that the proposed Bayesian approach is robust to noise and materials. 
It is also necessary to have the accurate spectrum information about the source-detector system. When dealing with experimental data, the spectrum can be predicted by a Monte Carlo simulator. For the materials between water and bone, less than 5% separation errors are observed on the estimated decomposition fractions. The proposed approach is a statistical reconstruction approach based on a nonlinear forward model counting the full beam polychromaticity and applied directly to the projections without taking negative-log. Compared to the approaches based on linear forward models and the BHA correction approaches, it has advantages in noise robustness and reconstruction accuracy.
NASA Astrophysics Data System (ADS)
Gu, Chen; Marzouk, Youssef M.; Toksöz, M. Nafi
2018-03-01
Small earthquakes occur due to natural tectonic motions and are induced by oil and gas production processes. In many oil/gas fields and hydrofracking processes, induced earthquakes result from fluid extraction or injection. The locations and source mechanisms of these earthquakes provide valuable information about the reservoirs. Analysis of induced seismic events has mostly assumed a double-couple source mechanism. However, recent studies that assume a full moment tensor source mechanism have shown a non-negligible percentage of non-double-couple components of source moment tensors in hydraulic fracturing events. Without uncertainty quantification of the moment tensor solution, it is difficult to determine the reliability of these source models. This study develops a Bayesian method to perform waveform-based full moment tensor inversion and uncertainty quantification for induced seismic events, accounting for both location and velocity model uncertainties. We conduct tests with synthetic events to validate the method, and then apply our newly developed Bayesian inversion approach to a real induced seismic event in an oil/gas field in the Sultanate of Oman, determining the uncertainties in the source mechanism and in the location of that event.
Stewart, Heather; Massoudieh, Arash; Gellis, Allen C.
2015-01-01
A Bayesian chemical mass balance (CMB) approach was used to assess the contribution of potential sources for fluvial samples from Laurel Hill Creek in southwest Pennsylvania. The Bayesian approach provides joint probability density functions of the sources' contributions considering the uncertainties due to source and fluvial sample heterogeneity and measurement error. Both elemental profiles of sources and fluvial samples and 13C and 15N isotopes were used for source apportionment. The sources considered include stream bank erosion, forest, roads and agriculture (pasture and cropland). Agriculture was found to have the largest contribution, followed by stream bank erosion. Also, road erosion was found to have a significant contribution in three of the samples collected during lower-intensity rain events. The source apportionment was performed with and without isotopes. The results were largely consistent; however, the use of isotopes was found to slightly increase the uncertainty in most of the cases. The correlation analysis between the contributions of sources shows strong correlations between stream bank and agriculture, whereas roads and forest seem to be less correlated to other sources. Thus, the method was better able to estimate road and forest contributions independently. The hypothesis that the contributions of sources are not seasonally changing was tested by assuming that all ten fluvial samples had the same source contributions. This hypothesis was rejected, demonstrating a significant seasonal variation in the sources of sediments in the stream.
NASA Astrophysics Data System (ADS)
Kopka, P.; Wawrzynczak, A.; Borysiewicz, M.
2015-09-01
In many areas of application, a central problem is the solution of an inverse problem, especially the estimation of unknown model parameters so that the underlying dynamics of a physical system can be modeled precisely. In this situation, Bayesian inference is a powerful tool for combining observed data with prior knowledge to obtain the probability distribution of the sought parameters. We have applied a modern methodology named Sequential Approximate Bayesian Computation (S-ABC) to the problem of tracing an atmospheric contaminant source. ABC is a technique commonly used in the Bayesian analysis of complex models and dynamic systems. Sequential methods can significantly increase the efficiency of ABC. In the presented algorithm, the input data are the on-line arriving concentrations of the released substance registered by the distributed sensor network of the OVER-LAND ATMOSPHERIC DISPERSION (OLAD) experiment. The algorithm outputs are the probability distributions of the contamination source parameters, i.e. its location, release rate, speed and direction of movement, start time and duration. The stochastic approach presented in this paper is completely general and can be used in other fields where the model parameters best fitted to the observable data must be found.
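The core ABC idea described above (draw parameters from the prior, simulate data, keep the draws whose simulated data lie close to the observations) can be illustrated with a minimal sketch. This is plain rejection ABC, not the sequential S-ABC scheme of the paper, and everything in it is an assumption made for illustration: the toy `forward_model` (an inverse-square fall-off, not a real dispersion model), the uniform prior on the release rate, the noise level, and the tolerance.

```python
import math
import random


def forward_model(release_rate, sensor_distances):
    """Hypothetical toy dispersion law: concentration falls off as 1/d^2."""
    return [release_rate / (d * d) for d in sensor_distances]


def abc_rejection(observed, sensor_distances, n_draws=20000,
                  tolerance=0.1, noise_sd=0.02, seed=0):
    """Return prior draws whose simulated sensor readings match the observations."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 10.0)  # assumed uniform prior on the release rate
        simulated = [c + rng.gauss(0.0, noise_sd)
                     for c in forward_model(rate, sensor_distances)]
        # Root-mean-square mismatch between simulated and observed concentrations.
        rms = math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed))
                        / len(observed))
        if rms < tolerance:
            accepted.append(rate)
    return accepted


sensor_distances = [1.0, 2.0, 3.0]
observed = forward_model(4.0, sensor_distances)  # synthetic data, true rate 4.0
posterior_sample = abc_rejection(observed, sensor_distances)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The accepted draws approximate the posterior of the release rate; a sequential variant would shrink the tolerance over several rounds and propose new draws from the previous round's accepted set instead of from the prior.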
NASA Technical Reports Server (NTRS)
Kraft, Ralph P.; Burrows, David N.; Nousek, John A.
1991-01-01
Two different methods, classical and Bayesian, for determining confidence intervals involving Poisson-distributed data are compared. Particular consideration is given to cases where the number of counts observed is small and is comparable to the mean number of background counts. Reasons for preferring the Bayesian over the classical method are given. Tables of confidence limits calculated by the Bayesian method are provided for quick reference.
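As a rough illustration of the Bayesian side of this comparison, the sketch below computes the shortest credible interval for a source rate when the observed counts are Poisson-distributed with mean equal to source plus background. It assumes a flat prior on the non-negative source rate and an exactly known background; the grid resolution, the grid upper edge, and the function names are choices of this sketch, not taken from the paper.

```python
import math


def poisson_likelihood(n_obs, mean):
    """Poisson probability of n_obs counts given an expected mean."""
    if mean <= 0.0:
        return 1.0 if n_obs == 0 else 0.0
    return math.exp(n_obs * math.log(mean) - mean - math.lgamma(n_obs + 1))


def credible_interval(n_obs, background, level=0.90, steps=20000):
    """Shortest credible interval for the source rate s >= 0 under a flat prior."""
    # Upper edge of the grid: an assumption, chosen far into the posterior tail.
    s_max = n_obs + 10.0 * math.sqrt(n_obs + 1.0) + 20.0
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]
    density = [poisson_likelihood(n_obs, s + background) for s in grid]
    total = sum(density)
    # Add grid points from most to least probable until the mass is enclosed.
    order = sorted(range(len(grid)), key=lambda i: density[i], reverse=True)
    mass, chosen = 0.0, []
    for i in order:
        chosen.append(grid[i])
        mass += density[i] / total
        if mass >= level:
            break
    return min(chosen), max(chosen)


lower, upper = credible_interval(n_obs=0, background=2.0)
```

With zero observed counts the posterior is a pure exponential in the source rate, so the 90% interval starts at zero and ends near ln(10) ≈ 2.30 regardless of the background value.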
NASA Astrophysics Data System (ADS)
Meillier, Céline; Chatelain, Florent; Michel, Olivier; Bacon, Roland; Piqueras, Laure; Bacher, Raphael; Ayasso, Hacheme
2016-04-01
We present SELFI, the Source Emission Line FInder, a new Bayesian method optimized for detection of faint galaxies in Multi Unit Spectroscopic Explorer (MUSE) deep fields. MUSE is the new panoramic integral field spectrograph at the Very Large Telescope (VLT) that has unique capabilities for spectroscopic investigation of the deep sky. It has provided data cubes with 324 million voxels over a single 1 arcmin$^2$ field of view. To address the challenge of faint-galaxy detection in these large data cubes, we developed a new method that processes 3D data either for modeling or for estimation and extraction of source configurations. This object-based approach yields a natural sparse representation of the sources in massive data fields, such as MUSE data cubes. In the Bayesian framework, the parameters that describe the observed sources are considered random variables. The Bayesian model leads to a general and robust algorithm where the parameters are estimated in a fully data-driven way. This detection algorithm was applied to the MUSE observation of Hubble Deep Field-South. With 27 h total integration time, these observations provide a catalog of 189 sources of various categories and with secured redshift. The algorithm retrieved 91% of the galaxies with only 9% false detection. This method also allowed the discovery of three new Lyα emitters and one [OII] emitter, all without any Hubble Space Telescope counterpart. We analyzed the reasons for failure for some targets, and found that the most important limitation of the method is when faint sources are located in the vicinity of bright spatially resolved galaxies that cannot be approximated by the Sérsic elliptical profile. The software and its documentation are available on the MUSE science web service (muse-vlt.eu/science).
Bayesian Modeling of the Assimilative Capacity Component of Stream Nutrient Export
Implementing stream restoration techniques and best management practices to reduce nonpoint source nutrients implies enhancement of the assimilative capacity for the stream system. In this paper, a Bayesian method for evaluating this component of a TMDL load capacity is developed...
Estimating the Earthquake Source Time Function by Markov Chain Monte Carlo Sampling
NASA Astrophysics Data System (ADS)
Dȩbski, Wojciech
2008-07-01
Many aspects of earthquake source dynamics like dynamic stress drop, rupture velocity and directivity, etc. are currently inferred from the source time functions obtained by a deconvolution of the propagation and recording effects from seismograms. The question of the accuracy of obtained results remains open. In this paper we address this issue by considering two aspects of the source time function deconvolution. First, we propose a new pseudo-spectral parameterization of the sought function which explicitly takes into account the physical constraints imposed on the sought functions. Such parameterization automatically excludes non-physical solutions and so improves the stability and uniqueness of the deconvolution. Secondly, we demonstrate that the Bayesian approach to the inverse problem at hand, combined with an efficient Markov Chain Monte Carlo sampling technique, is a method which allows efficient estimation of the source time function uncertainties. The key point of the approach is the description of the solution of the inverse problem by the a posteriori probability density function constructed according to the Bayesian (probabilistic) theory. Next, the Markov Chain Monte Carlo sampling technique is used to sample this function so the statistical estimator of a posteriori errors can be easily obtained with minimal additional computational effort with respect to modern inversion (optimization) algorithms. The methodological considerations are illustrated by a case study of the mining-induced seismic event of magnitude $M_L \approx 3.1$ that occurred at the Rudna (Poland) copper mine. The seismic P-wave records were inverted for the source time functions, using the proposed algorithm and the empirical Green function technique to approximate Green functions. The obtained solutions seem to suggest some complexity of the rupture process with double pulses of energy release. 
However, the error analysis shows that the hypothesis of source complexity is not justified at the 95% confidence level. On the basis of the analyzed event we also show that the separation of the source inversion into two steps introduces limitations on the completeness of the a posteriori error analysis.
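The sampling step described in this abstract, drawing from the a posteriori density so that error estimates come directly from the sample, can be sketched with a generic random-walk Metropolis sampler. The example below is not the pseudo-spectral source-time-function parameterization of the paper: it assumes a single unknown (the mean of Gaussian data with known noise level) and a flat prior, purely to show how posterior uncertainties fall out of the chain. The data values, step size, and burn-in length are illustrative assumptions.

```python
import math
import random


def log_posterior(mu, data, sigma):
    """Log posterior (up to a constant) for a Gaussian mean with a flat prior."""
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(data, sigma, n_steps=20000, step=0.2, start=0.0, seed=1):
    """Random-walk Metropolis chain targeting the posterior of mu."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data, sigma)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


data = [2.9, 3.1, 3.0, 3.2, 2.8]      # hypothetical measurements
chain = metropolis(data, sigma=0.2)
samples = chain[2000:]                 # discard burn-in (assumed length)
post_mean = sum(samples) / len(samples)
post_sd = math.sqrt(sum((s - post_mean) ** 2 for s in samples) / len(samples))
```

For this toy problem the exact posterior is Gaussian with mean 3.0 and standard deviation 0.2/√5 ≈ 0.089, which gives a direct check on the sampled estimates.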
A Bayesian approach to earthquake source studies
NASA Astrophysics Data System (ADS)
Minson, Sarah
Bayesian sampling has several advantages over conventional optimization approaches to solving inverse problems. It produces the distribution of all possible models sampled proportionally to how much each model is consistent with the data and the specified prior information, and thus images the entire solution space, revealing the uncertainties and trade-offs in the model. Bayesian sampling is applicable to both linear and non-linear modeling, and the values of the model parameters being sampled can be constrained based on the physics of the process being studied and do not have to be regularized. However, these methods are computationally challenging for high-dimensional problems. Until now the computational expense of Bayesian sampling has been too great for it to be practicable for most geophysical problems. I present a new parallel sampling algorithm called CATMIP for Cascading Adaptive Tempered Metropolis In Parallel. This technique, based on Transitional Markov chain Monte Carlo, makes it possible to sample distributions in many hundreds of dimensions, if the forward model is fast, or to sample computationally expensive forward models in smaller numbers of dimensions. The design of the algorithm is independent of the model being sampled, so CATMIP can be applied to many areas of research. I use CATMIP to produce a finite fault source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. Surface displacements from the earthquake were recorded by six interferograms and twelve local high-rate GPS stations. Because of the wealth of near-fault data, the source process is well-constrained. I find that the near-field high-rate GPS data have significant resolving power above and beyond the slip distribution determined from static displacements. The location and magnitude of the maximum displacement are resolved. The rupture almost certainly propagated at sub-shear velocities. 
The full posterior distribution can be used not only to calculate source parameters but also to determine their uncertainties. So while kinematic source modeling and the estimation of source parameters is not new, with CATMIP I am able to use Bayesian sampling to determine which parts of the source process are well-constrained and which are not.
Drake, Brandon Lee; Wills, Wirt H.; Hamilton, Marian I.; Dorshow, Wetherbee
2014-01-01
Strontium isotope sourcing has become a common and useful method for assigning sources to archaeological artifacts. In Chaco Canyon, an Ancestral Pueblo regional center in New Mexico, previous studies using these methods have suggested that a significant portion of maize and wood originated in the Chuska Mountains region, 75 km to the East. In the present manuscript, these results were tested using both frequentist methods (to determine if geochemical sources can truly be differentiated) and Bayesian methods (to address uncertainty in geochemical source attribution). It was found that Chaco Canyon and the Chuska Mountain region are not easily distinguishable based on radiogenic strontium isotope values. The strontium profiles of many geochemical sources in the region overlap, making it difficult to definitively identify any one particular geochemical source for the canyon's pre-historic maize. Bayesian mixing models support the argument that some spruce and fir wood originated in the San Mateo Mountains, but that this cannot explain all $^{87}$Sr/$^{86}$Sr values in Chaco timber. Overall, radiogenic strontium isotope data do not clearly identify a single major geochemical source for maize, ponderosa, and most spruce/fir timber. As such, the degree to which Chaco Canyon relied upon outside support for both food and construction material is still ambiguous. PMID:24854352
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
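To make the notion of a Bayes factor concrete, here is a small hand-rolled sketch for a binomial test. It does not use JASP or the BayesFactor package; it assumes a point null hypothesis on the success rate and, as the alternative, a uniform prior on the rate, for which the marginal likelihood of any outcome is exactly 1/(n + 1). The function name and example counts are illustrative.

```python
import math


def bayes_factor_01(successes, trials, null_rate=0.5):
    """Bayes factor in favor of the point null over a uniform-prior alternative."""
    # Likelihood of the data under the point null hypothesis.
    likelihood_null = (math.comb(trials, successes)
                       * null_rate ** successes
                       * (1.0 - null_rate) ** (trials - successes))
    # Marginal likelihood under a uniform prior on the rate: 1 / (n + 1).
    marginal_alt = 1.0 / (trials + 1)
    return likelihood_null / marginal_alt


bf_balanced = bayes_factor_01(5, 10)    # data evenly split
bf_extreme = bayes_factor_01(10, 10)    # data far from the null rate
```

Values above 1 favor the null and values below 1 favor the alternative; because the quantity is a ratio of how well each hypothesis predicted the data, it can be recomputed as observations accumulate.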
OGLE-2017-BLG-1522: A Giant Planet around a Brown Dwarf Located in the Galactic Bulge
NASA Astrophysics Data System (ADS)
Jung, Y. K.; Udalski, A.; Gould, A.; Ryu, Y.-H.; Yee, J. C.; Han, C.; Albrow, M. D.; Lee, C.-U.; Kim, S.-L.; Hwang, K.-H.; Chung, S.-J.; Shin, I.-G.; Zhu, W.; Cha, S.-M.; Kim, D.-J.; Lee, Y.; Park, B.-G.; Lee, D.-J.; Kim, H.-W.; Pogge, R. W.; The KMTNet Collaboration; Szymański, M. K.; Mróz, P.; Poleski, R.; Skowron, J.; Pietrukowicz, P.; Soszyński, I.; Kozłowski, S.; Ulaczyk, K.; Pawlak, M.; Rybicki, K.; The OGLE Collaboration
2018-05-01
We report the discovery of a giant planet in the OGLE-2017-BLG-1522 microlensing event. The planetary perturbations were clearly identified by high-cadence survey experiments despite the relatively short event timescale of $t_{\rm E} \sim 7.5$ days. The Einstein radius is unusually small, $\theta_{\rm E} = 0.065$ mas, implying that the lens system either has very low mass or lies much closer to the microlensed source than the Sun, or both. A Bayesian analysis yields component masses $(M_{\rm host}, M_{\rm planet}) = (46^{+79}_{-25}, 0.75^{+1.26}_{-0.40})\,M_{\rm J}$ and source-lens distance $D_{\rm LS} = 0.99^{+0.91}_{-0.54}$ kpc, implying that this is a brown-dwarf/Jupiter system that probably lies in the Galactic bulge, a location that is also consistent with the relatively low lens-source relative proper motion $\mu = 3.2 \pm 0.5$ mas yr$^{-1}$. The projected companion-host separation is $0.59^{+0.12}_{-0.11}$ au, indicating that the planet is placed beyond the snow line of the host, i.e., $a_{\rm sl} \sim 0.12$ au. Planet formation scenarios combined with the small companion-host mass ratio $q \sim 0.016$ and separation suggest that the companion could be the first discovery of a giant planet that formed in a protoplanetary disk around a brown-dwarf host.
Variational dynamic background model for keyword spotting in handwritten documents
NASA Astrophysics Data System (ADS)
Kumar, Gaurav; Wshah, Safwan; Govindaraju, Venu
2013-12-01
We propose a Bayesian framework for keyword spotting in handwritten documents. This work extends our previous work, in which we proposed the dynamic background model (DBM) for keyword spotting, which takes into account local character-level scores and global word-level scores to learn a logistic regression classifier that separates keywords from non-keywords. In this work, we add a Bayesian layer on top of the DBM, called the variational dynamic background model (VDBM). The logistic regression classifier uses the sigmoid function to separate keywords from non-keywords. Because the sigmoid function is neither convex nor concave, exact inference in the VDBM is intractable, so an expectation maximization step is proposed for approximate inference. The VDBM has several advantages over the DBM. First, being Bayesian, it prevents over-fitting of data. Second, it provides better modeling of data and improved prediction of unseen data. The VDBM is evaluated on the IAM dataset, and the results show that it outperforms our prior work and other state-of-the-art line-based word spotting systems.
Sparse Bayesian Learning for Nonstationary Data Sources
NASA Astrophysics Data System (ADS)
Fujimaki, Ryohei; Yairi, Takehisa; Machida, Kazuo
This paper proposes an online Sparse Bayesian Learning (SBL) algorithm for modeling nonstationary data sources. Although most learning algorithms implicitly assume that a data source does not change over time (stationary), real-world sources usually do change (nonstationary) because of factors such as dynamically changing environments, device degradation, and sudden failures. The proposed algorithm reduces to stationary online SBL when the time decay parameters are set to zero, so it can be interpreted as a single unified framework for online SBL with both stationary and nonstationary data sources. Tests on four types of benchmark problems and on actual stock price data show that it performs well.
Bayesian Inference for Source Reconstruction: A Real-World Application
Yee, Eugene; Hoffman, Ian; Ungar, Kurt
2014-01-01
This paper applies a Bayesian probabilistic inferential methodology for the reconstruction of the location and emission rate from an actual contaminant source (emission from the Chalk River Laboratories medical isotope production facility) using a small number of activity concentration measurements of a noble gas (Xenon-133) obtained from three stations that form part of the International Monitoring System radionuclide network. The sampling of the resulting posterior distribution of the source parameters is undertaken using a very efficient Markov chain Monte Carlo technique that utilizes a multiple-try differential evolution adaptive Metropolis algorithm with an archive of past states. It is shown that the principal difficulty in the reconstruction lay in the correct specification of the model errors (both scale and structure) for use in the Bayesian inferential methodology. In this context, two different measurement models for incorporation of the model error of the predicted concentrations are considered. The performance of both of these measurement models with respect to their accuracy and precision in the recovery of the source parameters is compared and contrasted. PMID:27379292
Varughese, Eunice A; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer L; Fout, G Shay; Furlong, Edward T; Kolpin, Dana W; Glassmeyer, Susan T; Keely, Scott P
2018-04-01
Drinking water treatment plants rely on purification of contaminated source waters to provide communities with potable water. One group of possible contaminants is enteric viruses. Measurement of viral quantities in environmental water systems is often performed using polymerase chain reaction (PCR) or quantitative PCR (qPCR). However, true values may be underestimated due to challenges involved in a multi-step viral concentration process and due to PCR inhibition. In this study, water samples were concentrated from 25 drinking water treatment plants (DWTPs) across the US to study the occurrence of enteric viruses in source water and removal after treatment. The five different types of viruses studied were adenovirus, norovirus GI, norovirus GII, enterovirus, and polyomavirus. Quantitative PCR was performed on all samples to determine presence or absence of these viruses in each sample. Ten DWTPs showed presence of one or more viruses in source water, with four DWTPs having treated drinking water testing positive. Furthermore, PCR inhibition was assessed for each sample using an exogenous amplification control, which indicated that all of the DWTP samples, including source and treated water samples, had some level of inhibition, confirming that inhibition plays an important role in PCR-based assessments of environmental samples. PCR inhibition measurements, viral recovery, and other assessments were incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters. Published by Elsevier B.V.
Deep Learning Neural Networks and Bayesian Neural Networks in Data Analysis
NASA Astrophysics Data System (ADS)
Chernoded, Andrey; Dudko, Lev; Myagkov, Igor; Volkov, Petr
2017-10-01
Most modern analyses in high energy physics use signal-versus-background classification techniques from machine learning, and neural networks in particular. Deep learning neural networks are the most promising modern technique for separating signal from background and nowadays can be widely and successfully implemented as part of a physics analysis. In this article we compare the application of deep learning and Bayesian neural networks as classifiers in an instance of top quark analysis.
Groopman, Amber M.; Katz, Jonathan I.; Holland, Mark R.; Fujita, Fuminori; Matsukawa, Mami; Mizuno, Katsunori; Wear, Keith A.; Miller, James G.
2015-01-01
Conventional, Bayesian, and the modified least-squares Prony's plus curve-fitting (MLSP + CF) methods were applied to data acquired using 1 MHz center frequency, broadband transducers on a single equine cancellous bone specimen that was systematically shortened from 11.8 mm down to 0.5 mm for a total of 24 sample thicknesses. Due to overlapping fast and slow waves, conventional analysis methods were restricted to data from sample thicknesses ranging from 11.8 mm to 6.0 mm. In contrast, Bayesian and MLSP + CF methods successfully separated fast and slow waves and provided reliable estimates of the ultrasonic properties of fast and slow waves for sample thicknesses ranging from 11.8 mm down to 3.5 mm. Comparisons of the three methods were carried out for phase velocity at the center frequency and the slope of the attenuation coefficient for the fast and slow waves. Good agreement among the three methods was also observed for average signal loss at the center frequency. The Bayesian and MLSP + CF approaches were able to separate the fast and slow waves and provide good estimates of the fast and slow wave properties even when the two wave modes overlapped in both time and frequency domains making conventional analysis methods unreliable. PMID:26328678
In our previous research, we showed that robust Bayesian methods can be used in environmental modeling to define a set of probability distributions for key parameters that captures the effects of expert disagreement, ambiguity, or ignorance. This entire set can then be update...
An Open-Source Bayesian Atmospheric Radiative Transfer (BART) Code, with Application to WASP-12b
NASA Astrophysics Data System (ADS)
Harrington, Joseph; Blecic, Jasmina; Cubillos, Patricio; Rojo, Patricio; Loredo, Thomas J.; Bowman, M. Oliver; Foster, Andrew S. D.; Stemm, Madison M.; Lust, Nate B.
2015-01-01
Atmospheric retrievals for solar-system planets typically fit, either with a minimizer or by eye, a synthetic spectrum to high-resolution (Δλ/λ ~ 1000-100,000) data with S/N > 100 per point. In contrast, exoplanet data often have S/N ~ 10 per point, and may have just a few points representing bandpasses larger than 1 um. To derive atmospheric constraints and robust parameter uncertainty estimates from such data requires a Bayesian approach. To date there are few investigators with the relevant codes, none of which are publicly available. We are therefore pleased to announce the open-source Bayesian Atmospheric Radiative Transfer (BART) code. BART uses a Bayesian phase-space explorer to drive a radiative-transfer model through the parameter phase space, producing the most robust estimates available for the thermal profile and chemical abundances in the atmosphere. We present an overview of the code and an initial application to Spitzer eclipse data for WASP-12b. We invite the community to use and improve BART via the open-source development site GitHub.com. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G. JB holds a NASA Earth and Space Science Fellowship.
An Open-Source Bayesian Atmospheric Radiative Transfer (BART) Code, and Application to WASP-12b
NASA Astrophysics Data System (ADS)
Harrington, Joseph; Blecic, Jasmina; Cubillos, Patricio; Rojo, Patricio M.; Loredo, Thomas J.; Bowman, Matthew O.; Foster, Andrew S.; Stemm, Madison M.; Lust, Nate B.
2014-11-01
Atmospheric retrievals for solar-system planets typically fit, either with a minimizer or by eye, a synthetic spectrum to high-resolution (Δλ/λ ~ 1000-100,000) data with S/N > 100 per point. In contrast, exoplanet data often have S/N ~ 10 per point, and may have just a few points representing bandpasses larger than 1 um. To derive atmospheric constraints and robust parameter uncertainty estimates from such data requires a Bayesian approach. To date there are few investigators with the relevant codes, none of which are publicly available. We are therefore pleased to announce the open-source Bayesian Atmospheric Radiative Transfer (BART) code. BART uses a Bayesian phase-space explorer to drive a radiative-transfer model through the parameter phase space, producing the most robust estimates available for the thermal profile and chemical abundances in the atmosphere. We present an overview of the code and an initial application to Spitzer eclipse data for WASP-12b. We invite the community to use and improve BART via the open-source development site GitHub.com. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G. JB holds a NASA Earth and Space Science Fellowship.
NASA Astrophysics Data System (ADS)
Wang, L.; Davis, J. L.; Tamisiea, M. E.
2017-12-01
The Antarctic ice sheet (AIS) holds about 60% of all fresh water on the Earth, an amount equivalent to about 58 m of sea-level rise. Observation of AIS mass change is thus essential in determining and predicting its contribution to sea level. While the ice mass loss estimates for West Antarctica (WA) and the Antarctic Peninsula (AP) are in good agreement, what the mass balance over East Antarctica (EA) is, and whether or not it compensates for the mass loss is under debate. Besides the different error sources and sensitivities of different measurement types, complex spatial and temporal variabilities would be another factor complicating the accurate estimation of the AIS mass balance. Therefore, a model that allows for variabilities in both melting rate and seasonal signals would seem appropriate in the estimation of present-day AIS melting. We present a stochastic filter technique, which enables the Bayesian separation of the systematic stripe noise and mass signal in decade-length GRACE monthly gravity series, and allows the estimation of time-variable seasonal and inter-annual components in the signals. One of the primary advantages of this Bayesian method is that it yields statistically rigorous uncertainty estimates reflecting the inherent spatial resolution of the data. By applying the stochastic filter to the decade-long GRACE observations, we present the temporal variabilities of the AIS mass balance at basin scale, particularly over East Antarctica, and decipher the EA mass variations in the past decade, and their role in affecting overall AIS mass balance and sea level.
Bayesian approach for counting experiment statistics applied to a neutrino point source analysis
NASA Astrophysics Data System (ADS)
Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.
2013-12-01
In this paper we present a model independent analysis method following Bayesian statistics to analyse data from a generic counting experiment and apply it to the search for neutrinos from point sources. We discuss a test statistic defined following a Bayesian framework that will be used in the search for a signal. In case no signal is found, we derive an upper limit without the introduction of approximations. The Bayesian approach allows us to obtain the full probability density function for both the background and the signal rate. As such, we have direct access to any signal upper limit. The upper limit derivation directly compares with a frequentist approach and is robust in the case of low-counting observations. Furthermore, it allows also to account for previous upper limits obtained by other analyses via the concept of prior information without the need of the ad hoc application of trial factors. To investigate the validity of the presented Bayesian approach, we have applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and we have obtained a flux upper limit, which is in agreement with the upper limits determined via a frequentist approach. Furthermore, the upper limit obtained compares well with the previously published result of IceCube, using the same data set.
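The upper-limit idea described above can be sketched numerically. The example below is a simplified illustration, not the authors' method: it assumes a known background rate, a flat prior on the signal rate, and a simple grid integration, and the function names `signal_posterior` and `upper_limit` are hypothetical.

```python
import math


def signal_posterior(n_obs, background, s_max=50.0, steps=20000):
    """Posterior density of the signal rate s >= 0 on a grid.

    Assumptions (illustrative only): Poisson counts with mean s + background,
    a known background rate, and a flat prior on s over [0, s_max].
    """
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]

    def log_post(s):
        mu = s + background
        if mu == 0.0:
            # A zero rate can only produce zero counts.
            return 0.0 if n_obs == 0 else float("-inf")
        return n_obs * math.log(mu) - mu

    logs = [log_post(s) for s in grid]
    peak = max(logs)
    unnorm = [math.exp(v - peak) for v in logs]
    # Trapezoidal rule for the normalising constant.
    area = sum(0.5 * (unnorm[i - 1] + unnorm[i]) * ds for i in range(1, len(grid)))
    return grid, [u / area for u in unnorm]


def upper_limit(n_obs, background, cl=0.9, s_max=50.0, steps=20000):
    """Smallest grid value of s whose cumulative posterior reaches cl."""
    grid, dens = signal_posterior(n_obs, background, s_max, steps)
    ds = grid[1] - grid[0]
    cumulative = 0.0
    for i in range(1, len(grid)):
        cumulative += 0.5 * (dens[i - 1] + dens[i]) * ds
        if cumulative >= cl:
            return grid[i]
    return grid[-1]


# With zero observed counts the posterior is exponential in s, so the 90%
# limit is close to ln(10), about 2.30, whatever the background.
limit_zero = upper_limit(0, 3.0)
limit_five = upper_limit(5, 3.0)
```

Since the full posterior density is available on the grid, a limit at any other credibility level follows by changing `cl`. An informative prior from an earlier analysis could replace the flat prior by adding its logarithm inside `log_post`.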
Broadband Processing in a Noisy Shallow Ocean Environment: A Particle Filtering Approach
Candy, J. V.
2016-04-14
Here we report that when a broadband source propagates sound in a shallow ocean the received data can become quite complicated due to temperature-related sound-speed variations and therefore a highly dispersive environment. Noise and uncertainties disrupt this already chaotic environment even further because disturbances propagate through the same inherent acoustic channel. The broadband (signal) estimation/detection problem can be decomposed into a set of narrowband solutions that are processed separately and then combined to achieve more enhancement of signal levels than that available from a single frequency, thereby allowing more information to be extracted leading to a more reliable source detection. A Bayesian solution to the broadband modal function tracking, pressure-field enhancement, and source detection problem is developed that leads to nonparametric estimates of desired posterior distributions enabling the estimation of useful statistics and an improved processor/detector. In conclusion, to investigate the processor capabilities, we synthesize an ensemble of noisy, broadband, shallow-ocean measurements to evaluate its overall performance using an information theoretical metric for the preprocessor and the receiver operating characteristic curve for the detector.
NASA Astrophysics Data System (ADS)
Gualandi, Adriano; Serpelloni, Enrico; Elina Belardinelli, Maria; Bonafede, Maurizio; Pezzo, Giuseppe; Tolomei, Cristiano
2015-04-01
A critical point in the analysis of ground displacement time series, such as those measured by modern space geodetic techniques (primarily continuous GPS/GNSS and InSAR), is the development of data-driven methods that can discern and characterize the different sources that generate the observed displacements. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the explained variance of the dataset. It reproduces the original data using a limited number of Principal Components, but it also shows some deficiencies, since PCA does not perform well in finding the solution to the so-called Blind Source Separation (BSS) problem. Recovering and separating the different sources that generate the observed ground deformation is a fundamental task in order to give a physical meaning to the possible sources. PCA fails in the BSS problem because it looks for a new Euclidean space where the projected data are uncorrelated. Usually, the uncorrelation condition is not strong enough, and it has been proven that the BSS problem can be tackled by requiring the components to be independent. Independent Component Analysis (ICA) is, in fact, another popular technique adopted to approach this problem, and it can be used in all those fields where PCA is also applied. An ICA approach enables us to explain the displacement time series while imposing fewer constraints on the model, and to reveal anomalies in the data such as transient deformation signals. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions.
This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here we introduce the vbICA technique and present its application to synthetic data that simulate a GPS network recording ground deformation in a tectonically active region, with synthetic time series containing interseismic, coseismic, and postseismic deformation, plus seasonal deformation, and white and coloured noise. We study the ability of the algorithm to recover the original (known) sources of deformation, and then apply it to a real scenario: the Emilia seismic sequence (2012, northern Italy), an example of a seismic sequence that occurred in a slowly converging tectonic setting, characterized by several local to regional anthropogenic or natural sources of deformation, mainly subsidence due to fluid withdrawal and sediment compaction. We apply both PCA and vbICA to displacement time series recorded by continuous GPS and InSAR (Pezzo et al., EGU2015-8950).
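The remark that uncorrelatedness is weaker than independence can be checked with a small deterministic example. This sketch is not part of the vbICA method; the variables `x` and `y` and the helper `covariance` are illustrative assumptions. Here `y` is completely determined by `x`, yet their covariance vanishes.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance of two equally long sequences (illustrative helper)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


# A grid symmetric about zero, and a second signal that is a deterministic
# function of the first.
x = [i / 100.0 for i in range(-100, 101)]
y = [v * v for v in x]

# Second-order statistics see no relationship: the covariance is zero up to
# floating-point rounding, so a PCA-style decorrelation would treat x and y
# as already separated.
cov_xy = covariance(x, y)

# A higher-order statistic exposes the dependence: x**2 covaries strongly with y.
x_squared = [v * v for v in x]
cov_x2_y = covariance(x_squared, y)
```

ICA-type methods rely on information beyond second-order statistics, which is why they can separate sources that PCA leaves mixed.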
Predicting Drug Safety and Communicating Risk: Benefits of a Bayesian Approach.
Lazic, Stanley E; Edmunds, Nicholas; Pollard, Christopher E
2018-03-01
Drug toxicity is a major source of attrition in drug discovery and development. Pharmaceutical companies routinely use preclinical data to predict clinical outcomes and continue to invest in new assays to improve predictions. However, there are many open questions about how to make the best use of available data, combine diverse data, quantify risk, and communicate risk and uncertainty to enable good decisions. The costs of suboptimal decisions are clear: resources are wasted and patients may be put at risk. We argue that Bayesian methods provide answers to all of these problems and use hERG-mediated QT prolongation as a case study. Benefits of Bayesian machine learning models include intuitive probabilistic statements of risk that incorporate all sources of uncertainty, the option to include diverse data and external information, and visualizations that have a clear link between the output from a statistical model and what this means for risk. Furthermore, Bayesian methods are easy to use with modern software, making their adoption for safety screening straightforward. We include R and Python code to encourage the adoption of these methods.
NASA Astrophysics Data System (ADS)
Salvato, M.; Buchner, J.; Budavári, T.; Dwelly, T.; Merloni, A.; Brusa, M.; Rau, A.; Fotopoulou, S.; Nandra, K.
2018-02-01
We release the AllWISE counterparts and Gaia matches to 106 573 and 17 665 X-ray sources detected in the ROSAT 2RXS and XMMSL2 surveys with |b| > 15°. These are the brightest X-ray sources in the sky, but their position uncertainties and the sparse multi-wavelength coverage until now rendered the identification of their counterparts a demanding task with uncertain results. New all-sky multi-wavelength surveys of sufficient depth, like AllWISE and Gaia, and a new Bayesian statistics based algorithm, NWAY, allow us, for the first time, to provide reliable counterpart associations. NWAY extends previous distance and sky density based association methods and, using one or more priors (e.g. colours, magnitudes), weights the probability that sources from two or more catalogues are simultaneously associated on the basis of their observable characteristics. Here, counterparts have been determined using a Wide-field Infrared Survey Explorer (WISE) colour-magnitude prior. A reference sample of 4524 XMM/Chandra and Swift X-ray sources demonstrates a reliability of ∼94.7 per cent (2RXS) and 97.4 per cent (XMMSL2). Combining our results with Chandra-COSMOS data, we propose a new separation between stars and AGN in the X-ray/WISE flux-magnitude plane, valid over six orders of magnitude. We also release the NWAY code and its user manual. NWAY was extensively tested with XMM-COSMOS data. Using two different sets of priors, we find an agreement of 96 per cent and 99 per cent with published Likelihood Ratio methods. Our results were achieved faster and without any follow-up visual inspection. With the advent of deep and wide area surveys in X-rays (e.g. SRG/eROSITA, Athena/WFI) and radio (ASKAP/EMU, LOFAR, APERTIF, etc.) NWAY will provide a powerful and reliable counterpart identification tool.
Global invasion network of the brown marmorated stink bug, Halyomorpha halys.
Valentin, Rafael E; Nielsen, Anne L; Wiman, Nik G; Lee, Doo-Hyung; Fonseca, Dina M
2017-08-29
Human mediated transportation into novel habitats is a prerequisite for the establishment of non-native species that become invasive, so knowledge of common sources may allow prevention. The brown marmorated stink bug (BMSB, Halyomorpha halys) is an East Asian species now established across North America and Europe, that in the Eastern United States of America (US) and Italy is causing significant economic losses to agriculture. After US populations were shown to originate from Northern China, others have tried to source BMSB populations now in Canada, Switzerland, Italy, France, Greece, and Hungary. Due to selection of different molecular markers, however, integrating all the datasets to obtain a broader picture of BMSB's expansion has been difficult. To address this limitation we focused on a single locus, the barcode region in the cytochrome oxidase I mitochondrial gene, and analyzed representative BMSB samples from across its current global range using an Approximate Bayesian Computation approach. We found that China is the likely source of most non-native populations, with at least four separate introductions in North America and three in Europe. Additionally, we found evidence of one bridgehead event: a likely Eastern US source for the central Italy populations that interestingly share enhanced pest status.
NASA Astrophysics Data System (ADS)
Rajaona, Harizo; Septier, François; Armand, Patrick; Delignon, Yves; Olry, Christophe; Albergel, Armand; Moussafir, Jacques
2015-12-01
In the event of an accidental or intentional atmospheric release, the reconstruction of the source term using measurements from a set of sensors is an important and challenging inverse problem. A rapid and accurate estimation of the source allows faster and more efficient action for first-response teams, in addition to providing better damage assessment. This paper presents a Bayesian probabilistic approach to estimate the location and the temporal emission profile of a pointwise source. The release rate is evaluated analytically by using a Gaussian assumption on its prior distribution, and is enhanced with a positivity constraint to improve the estimation. The source location is obtained by means of an advanced iterative Monte Carlo technique called Adaptive Multiple Importance Sampling (AMIS), which uses a recycling process at each iteration to accelerate its convergence. The proposed methodology is tested using synthetic and real concentration data in the framework of the Fusion Field Trials 2007 (FFT-07) experiment. The quality of the obtained results is comparable to that of results from the Markov Chain Monte Carlo (MCMC) algorithm, a popular Bayesian method used for source estimation. Moreover, the adaptive processing of the AMIS provides a better sampling efficiency by reusing all the generated samples.
Bayesian Inference in the Modern Design of Experiments
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
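As a worked illustration of revising prior knowledge with a new observation, here is a minimal sketch of the conjugate normal update. It is not taken from the paper; the function name `update_normal` and the numbers are assumptions, and the measurement standard deviation is treated as known.

```python
def update_normal(prior_mean, prior_sd, obs, obs_sd):
    """Posterior mean and sd for a normal prior and one normal measurement.

    Illustrative helper: assumes the measurement noise sd is known.
    """
    prior_precision = 1.0 / prior_sd ** 2
    obs_precision = 1.0 / obs_sd ** 2
    post_precision = prior_precision + obs_precision
    # The posterior mean is the precision-weighted average of prior and data.
    post_mean = (prior_precision * prior_mean + obs_precision * obs) / post_precision
    return post_mean, post_precision ** -0.5


# Prior belief 0 +/- 1, one measurement of 2 with unit noise:
# the posterior mean lands halfway, at 1.0, with sd sqrt(0.5).
post_mean, post_sd = update_normal(0.0, 1.0, 2.0, 1.0)

# Updating is sequential: the posterior becomes the prior for the next point.
next_mean, next_sd = update_normal(post_mean, post_sd, 2.0, 1.0)
```

The posterior standard deviation shrinks with every observation, which is the sense in which the posterior combines information from both the prior and the data.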
Paz-Linares, Deirel; Vega-Hernández, Mayrim; Rojas-López, Pedro A.; Valdés-Hernández, Pedro A.; Martínez-Montes, Eduardo; Valdés-Sosa, Pedro A.
2017-01-01
The estimation of EEG generating sources constitutes an Inverse Problem (IP) in Neuroscience. This is an ill-posed problem due to the non-uniqueness of the solution, and regularization or prior information is needed to undertake Electrophysiology Source Imaging. Structured Sparsity priors can be attained through combinations of L1 norm-based and L2 norm-based constraints such as the Elastic Net (ENET) and Elitist Lasso (ELASSO) models. The former model is used to find solutions with a small number of smooth nonzero patches, while the latter imposes different degrees of sparsity simultaneously along different dimensions of the spatio-temporal matrix solutions. Both models have been addressed within the penalized regression approach, where the regularization parameters are selected heuristically, usually leading to non-optimal and computationally expensive solutions. The existing Bayesian formulation of ENET allows hyperparameter learning, but it uses computationally intensive Monte Carlo/Expectation Maximization methods, which makes its application to the EEG IP impractical, while ELASSO has not previously been considered in a Bayesian context. In this work, we attempt to solve the EEG IP using a Bayesian framework for ENET and ELASSO models. We propose a Structured Sparse Bayesian Learning algorithm based on combining the Empirical Bayes and the iterative coordinate descent procedures to estimate both the parameters and hyperparameters. Using realistic simulations and avoiding the inverse crime, we illustrate that our methods are able to recover complicated source setups more accurately and with a more robust estimation of the hyperparameters and behavior under different sparsity scenarios than classical LORETA, ENET and LASSO Fusion solutions. We also solve the EEG IP using data from a visual attention experiment, finding more interpretable neurophysiological patterns with our methods.
The Matlab codes used in this work, including Simulations, Methods, Quality Measures and Visualization Routines, are freely available on a public website. PMID:29200994
Ransom, Katherine M; Grote, Mark N.; Deinhart, Amanda; Eppich, Gary; Kendall, Carol; Sanborn, Matthew E.; Sounders, A. Kate; Wimpenny, Joshua; Yin, Qing-zhu; Young, Megan B.; Harter, Thomas
2016-01-01
Groundwater quality is a concern in alluvial aquifers that underlie agricultural areas, such as in the San Joaquin Valley of California. Shallow domestic wells (less than 150 m deep) in agricultural areas are often contaminated by nitrate. Agricultural and rural nitrate sources include dairy manure, synthetic fertilizers, and septic waste. Knowledge of the relative proportion that each of these sources contributes to nitrate concentration in individual wells can aid future regulatory and land management decisions. We show that nitrogen and oxygen isotopes of nitrate, boron isotopes, and iodine concentrations are a useful, novel combination of groundwater tracers to differentiate between manure, fertilizers, septic waste, and natural sources of nitrate. Furthermore, in this work, we develop a new Bayesian mixing model in which these isotopic and elemental tracers were used to estimate the probability distribution of the fractional contributions of manure, fertilizers, septic waste, and natural sources to the nitrate concentration found in an individual well. The approach was applied to 56 nitrate-impacted private domestic wells located in the San Joaquin Valley. Model analysis found that some domestic wells were clearly dominated by the manure source and suggests evidence for majority contributions from either the septic or fertilizer source for other wells. But, predictions of fractional contributions for septic and fertilizer sources were often of similar magnitude, perhaps because modeled uncertainty about the fraction of each was large. For validation of the Bayesian model, fractional estimates were compared to surrounding land use and estimated source contributions were broadly consistent with nearby land use types.
NASA Astrophysics Data System (ADS)
Granade, Christopher; Combes, Joshua; Cory, D. G.
2016-03-01
In recent years, Bayesian methods have been proposed as a solution to a wide range of issues in quantum state and process tomography. State-of-the-art Bayesian tomography solutions suffer from three problems: numerical intractability, a lack of informative prior distributions, and an inability to track time-dependent processes. Here, we address all three problems. First, we use modern statistical methods, as pioneered by Huszár and Houlsby (2012 Phys. Rev. A 85 052120) and by Ferrie (2014 New J. Phys. 16 093035), to make Bayesian tomography numerically tractable. Our approach allows for practical computation of Bayesian point and region estimators for quantum states and channels. Second, we propose the first priors on quantum states and channels that allow for including useful experimental insight. Finally, we develop a method that allows tracking of time-dependent states and estimates the drift and diffusion processes affecting a state. We provide source code and animated visual examples for our methods.
A Bayesian framework for infrasound location
NASA Astrophysics Data System (ADS)
Modrak, Ryan T.; Arrowsmith, Stephen J.; Anderson, Dale N.
2010-04-01
We develop a framework for location of infrasound events using backazimuth and infrasonic arrival times from multiple arrays. Bayesian infrasonic source location (BISL) developed here estimates event location and associated credibility regions. BISL accounts for unknown source-to-array path or phase by formulating infrasonic group velocity as random. Differences between observed and predicted source-to-array traveltimes are partitioned into two additive Gaussian sources, measurement error and model error, the second of which accounts for the unknown influence of wind and temperature on path. By applying the technique to both synthetic tests and ground-truth events, we highlight the complementary nature of back azimuths and arrival times for estimating well-constrained event locations. BISL is an extension to methods developed earlier by Arrowsmith et al. that provided simple bounds on location using a grid-search technique.
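A grid-based posterior of the kind described here can be sketched in a few lines of Python. This is a simplified illustration, not BISL itself: it assumes a flat 2-D geometry, a known origin time of zero, a fixed group velocity (BISL treats it as random), and a flat prior over the grid. The travel-time variance is the sum of a measurement-error and a model-error variance, following the abstract's two additive Gaussian terms. All function and parameter names are hypothetical.

```python
import math


def back_azimuth(array_xy, source_xy):
    """Angle from array to source in radians, clockwise from north (+y)."""
    return math.atan2(source_xy[0] - array_xy[0], source_xy[1] - array_xy[1])


def wrap(angle):
    """Wrap an angle difference into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def log_likelihood(source_xy, arrays, obs_baz, obs_tt, velocity,
                   sigma_baz, sigma_meas, sigma_model):
    """Gaussian log-likelihood of back azimuths and travel times for a trial source."""
    var_t = sigma_meas ** 2 + sigma_model ** 2  # two additive Gaussian error sources
    total = 0.0
    for array_xy, baz, tt in zip(arrays, obs_baz, obs_tt):
        d_baz = wrap(baz - back_azimuth(array_xy, source_xy))
        dist = math.hypot(source_xy[0] - array_xy[0], source_xy[1] - array_xy[1])
        d_tt = tt - dist / velocity
        total += -0.5 * (d_baz / sigma_baz) ** 2 - 0.5 * d_tt ** 2 / var_t
    return total


def grid_posterior(grid, **kwargs):
    """Normalised posterior over grid nodes, assuming a flat prior."""
    log_post = {xy: log_likelihood(xy, **kwargs) for xy in grid}
    peak = max(log_post.values())
    unnorm = {xy: math.exp(lp - peak) for xy, lp in log_post.items()}
    norm = sum(unnorm.values())
    return {xy: p / norm for xy, p in unnorm.items()}
```

A credibility region could then be approximated by sorting grid nodes by posterior probability and accumulating them until the desired mass (say 95%) is reached.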
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
ERIC Educational Resources Information Center
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Non-linear Parameter Estimates from Non-stationary MEG Data
Martínez-Vargas, Juan D.; López, Jose D.; Baker, Adam; Castellanos-Dominguez, German; Woolrich, Mark W.; Barnes, Gareth
2016-01-01
We demonstrate a method to estimate key electrophysiological parameters from resting state data. In this paper, we focus on the estimation of head-position parameters. The recovery of these parameters is especially challenging as they are non-linearly related to the measured field. In order to do this we use an empirical Bayesian scheme to estimate the cortical current distribution due to a range of laterally shifted head-models. We compare different methods of approaching this problem from the division of M/EEG data into stationary sections and performing separate source inversions, to explaining all of the M/EEG data with a single inversion. We demonstrate this through estimation of head position in both simulated and empirical resting state MEG data collected using a head-cast. PMID:27597815
Embedding the results of focussed Bayesian fusion into a global context
NASA Astrophysics Data System (ADS)
Sander, Jennifer; Heizmann, Michael
2014-05-01
Bayesian statistics offers a well-founded and powerful fusion methodology, including for the fusion of heterogeneous information sources. However, except in special cases, the needed posterior distribution is not analytically derivable. As a consequence, Bayesian fusion may cause unacceptably high computational and storage costs in practice. Local Bayesian fusion approaches aim at reducing the complexity of the Bayesian fusion methodology significantly. This is done by concentrating the actual Bayesian fusion on the potentially most task-relevant parts of the domain of the Properties of Interest. Our research on these approaches is motivated by an analogy to criminal investigations, where criminalists also pursue clues only locally. This publication follows previous publications on a special local Bayesian fusion technique called focussed Bayesian fusion. Here, the actual calculation of the posterior distribution is completely restricted to a suitably chosen local context. As a result, the global posterior distribution is not completely determined, and strategies for using the results of a focussed Bayesian analysis appropriately are needed. In this publication, we primarily contrast different ways of embedding the results of focussed Bayesian fusion explicitly into a global context. To obtain a unique global posterior distribution, we analyze the application of the Maximum Entropy Principle, which has been shown to be successfully applicable in metrology and in various other areas. To address the special need for making further decisions subsequent to the actual fusion task, we further analyze criteria for decision making under partial information.
Efficient Bayesian experimental design for contaminant source identification
NASA Astrophysics Data System (ADS)
Zhang, Jiangjiang; Zeng, Lingzao; Chen, Cheng; Chen, Dingjiang; Wu, Laosheng
2015-01-01
In this study, an efficient full Bayesian approach is developed for the optimal sampling well location design and source parameters identification of groundwater contaminants. An information measure, i.e., the relative entropy, is employed to quantify the information gain from concentration measurements in identifying unknown parameters. In this approach, the sampling locations that give the maximum expected relative entropy are selected as the optimal design. After the sampling locations are determined, a Bayesian approach based on Markov Chain Monte Carlo (MCMC) is used to estimate unknown parameters. In both the design and estimation, the contaminant transport equation is required to be solved many times to evaluate the likelihood. To reduce the computational burden, an interpolation method based on the adaptive sparse grid is utilized to construct a surrogate for the contaminant transport equation. The approximated likelihood can be evaluated directly from the surrogate, which greatly accelerates the design and estimation process. The accuracy and efficiency of our approach are demonstrated through numerical case studies. It is shown that the methods can be used to assist in both single sampling location and monitoring network design for contaminant source identifications in groundwater.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brewer, Brendon J.; Foreman-Mackey, Daniel; Hogg, David W.
We present and implement a probabilistic (Bayesian) method for producing catalogs from images of stellar fields. The method is capable of inferring the number of sources N in the image and can also handle the challenges introduced by noise, overlapping sources, and an unknown point-spread function. The luminosity function of the stars can also be inferred, even when the precise luminosity of each star is uncertain, via the use of a hierarchical Bayesian model. The computational feasibility of the method is demonstrated on two simulated images with different numbers of stars. We find that our method successfully recovers the input parameter values along with principled uncertainties even when the field is crowded. We also compare our results with those obtained from the SExtractor software. While the two approaches largely agree about the fluxes of the bright stars, the Bayesian approach provides more accurate inferences about the faint stars and the number of stars, particularly in the crowded case.
Underwater passive acoustic localization of Pacific walruses in the northeastern Chukchi Sea.
Rideout, Brendan P; Dosso, Stan E; Hannay, David E
2013-09-01
This paper develops and applies a linearized Bayesian localization algorithm based on acoustic arrival times of marine mammal vocalizations at spatially-separated receivers which provides three-dimensional (3D) location estimates with rigorous uncertainty analysis. To properly account for uncertainty in receiver parameters (3D hydrophone locations and synchronization times) and environmental parameters (water depth and sound-speed correction), these quantities are treated as unknowns constrained by prior estimates and prior uncertainties. Unknown scaling factors on both the prior and arrival-time uncertainties are estimated by minimizing Akaike's Bayesian information criterion (a maximum entropy condition). Maximum a posteriori estimates for sound source locations and times, receiver parameters, and environmental parameters are calculated simultaneously using measurements of arrival times for direct and interface-reflected acoustic paths. Posterior uncertainties for all unknowns incorporate both arrival time and prior uncertainties. Monte Carlo simulation results demonstrate that, for the cases considered here, linearization errors are small and the lack of an accurate sound-speed profile does not cause significant biases in the estimated locations. A sequence of Pacific walrus vocalizations, recorded in the Chukchi Sea northwest of Alaska, is localized using this technique, yielding a track estimate and uncertainties with an estimated speed comparable to normal walrus swim speeds.
Color generalization across hue and saturation in chicks described by a simple (Bayesian) model.
Scholtyssek, Christine; Osorio, Daniel C; Baddeley, Roland J
2016-08-01
Color conveys important information for birds in tasks such as foraging and mate choice, but in the natural world color signals can vary substantially, so birds may benefit from generalizing responses to perceptually discriminable colors. Studying color generalization is therefore a way to understand how birds take account of suprathreshold stimulus variations in decision making. Former studies on color generalization have focused on hue variation, but natural colors often vary in saturation, which could be an additional, independent source of information. We combine behavioral experiments and statistical modeling to investigate whether color generalization by poultry chicks depends on the chromatic dimension in which colors vary. Chicks were trained to discriminate colors separated by equal distances on a hue or a saturation dimension, in a receptor-based color space. Generalization tests then compared the birds' responses to familiar and novel colors lying on the same chromatic dimension. To characterize generalization we introduce a Bayesian model that extracts a threshold color distance beyond which chicks treat novel colors as significantly different from the rewarded training color. These thresholds were the same for generalization along the hue and saturation dimensions, demonstrating that responses to novel colors depend on similarity and expected variation of color signals but are independent of the chromatic dimension.
Bayesian networks for maritime traffic accident prevention: benefits and challenges.
Hänninen, Maria
2014-12-01
Bayesian networks are quantitative modeling tools whose applications to the maritime traffic safety context are becoming more popular. This paper discusses the utilization of Bayesian networks in maritime safety modeling. Based on literature and the author's own experiences, the paper studies what Bayesian networks can offer to maritime accident prevention and safety modeling and discusses a few challenges in their application to this context. It is argued that the capability of representing rather complex, not necessarily causal but uncertain relationships makes Bayesian networks an attractive modeling tool for maritime safety and accidents. Furthermore, as maritime accident and safety data are still rather scarce and have some quality problems, the possibility of combining data with expert knowledge and the ease of updating the model after acquiring more evidence further enhance their feasibility. However, eliciting the probabilities from maritime experts might be challenging and the model validation can be tricky. It is concluded that with the utilization of several data sources, Bayesian updating, dynamic modeling, and hidden nodes for latent variables, Bayesian networks are rather well-suited tools for maritime safety management and decision-making. Copyright © 2014 Elsevier Ltd. All rights reserved.
Uncertainty aggregation and reduction in structure-material performance prediction
NASA Astrophysics Data System (ADS)
Hu, Zhen; Mahadevan, Sankaran; Ao, Dan
2018-02-01
An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.
NASA Astrophysics Data System (ADS)
Xia, Yongqiu; Li, Yuefei; Zhang, Xinyu; Yan, Xiaoyuan
2017-01-01
Nitrate (NO3-) pollution is a serious problem worldwide, particularly in countries with intensive agricultural and population activities. Previous studies have used δ15N-NO3- and δ18O-NO3- to determine the NO3- sources in rivers. However, this approach is subject to substantial uncertainties and limitations because of the numerous NO3- sources, the wide isotopic ranges, and the existing isotopic fractionations. In this study, we outline a combined procedure for improving the determination of NO3- sources in a paddy agriculture-urban gradient watershed in eastern China. First, the main sources of NO3- in the Qinhuai River were examined by the dual-isotope biplot approach, in which we narrowed the isotope ranges using site-specific isotopic results. Next, the bacterial groups and chemical properties of the river water were analyzed to verify these sources. Finally, we introduced a Bayesian model to apportion the spatiotemporal variations of the NO3- sources. Denitrification was first incorporated into the Bayesian model because denitrification plays an important role in the nitrogen pathway. The results showed that fertilizer contributed large amounts of NO3- to the surface water in traditional agricultural regions, whereas manure effluents were the dominant NO3- source in intensified agricultural regions, especially during the wet seasons. Sewage effluents were important in all three land uses and exhibited great differences between the dry season and the wet season. This combined analysis quantitatively delineates the proportion of NO3- sources from paddy agriculture to urban river water for both dry and wet seasons and incorporates isotopic fractionation and uncertainties in the source compositions.
Blind source separation problem in GPS time series
NASA Astrophysics Data System (ADS)
Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.
2016-04-01
A critical point in the analysis of ground displacement time series, such as those recorded by space geodetic techniques, is the development of data-driven methods that allow the different sources of deformation to be discerned and characterized in the space and time domains. Multivariate statistics includes several approaches that can be considered part of data-driven methods. A widely used technique is principal component analysis (PCA), which allows us to reduce the dimensionality of the data space while retaining most of the explained variance of the dataset. However, PCA does not perform well in finding the solution to the so-called blind source separation (BSS) problem, i.e., in recovering and separating the original sources that generate the observed data. This is mainly due to the fact that PCA minimizes the misfit calculated using an L2 norm (χ²), looking for a new Euclidean space where the projected data are uncorrelated. Independent component analysis (ICA) is a popular technique adopted to approach the BSS problem. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we test the use of a modified variational Bayesian ICA (vbICA) method to recover the multiple sources of ground deformation even in the presence of missing data. The vbICA method models the probability density function (pdf) of each source signal using a mix of Gaussian distributions, allowing for more flexibility in the description of the pdf of the sources with respect to standard ICA, and giving a more reliable estimate of them. Here we present its application to synthetic global positioning system (GPS) position time series, generated by simulating deformation near an active fault, including inter-seismic, co-seismic, and post-seismic signals, plus seasonal signals and noise, and an additional time-dependent volcanic source.
We evaluate the ability of the PCA and ICA decomposition techniques to explain the data and to recover the original (known) sources. Using the same number of components, we find that the vbICA method fits the data almost as well as a PCA method, since the χ² increase is less than 10% of the value calculated using a PCA decomposition. Unlike PCA, the vbICA algorithm is found to correctly separate the sources if the correlation of the dataset is low (<0.67) and the geodetic network is sufficiently dense (ten continuous GPS stations within a box of side equal to two times the locking depth of a fault where an earthquake of Mw >6 occurred). We also provide a cookbook for the use of the vbICA algorithm in analyses of position time series for tectonic and non-tectonic applications.
Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets.
Clark, Alex M; Dole, Krishna; Coulon-Spektor, Anna; McNutt, Andrew; Grass, George; Freundlich, Joel S; Reynolds, Robert C; Ekins, Sean
2015-06-22
On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, and the ability to share such models still remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project, as well as implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operator curve values comparable to those generated previously in prior publications using alternative tools. We have now described how the implementation of Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user's own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting these models in a format that could be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery. PMID:25994950
Kernel and divergence techniques in high energy physics separations
NASA Astrophysics Data System (ADS)
Bouř, Petr; Kůs, Václav; Franc, Jiří
2017-10-01
Binary decision trees under the Bayesian decision technique are used for supervised classification of high-dimensional data. We present the potential of adaptive kernel density estimation as the nested separation method of the supervised binary divergence decision tree. We also provide a proof of an alternative computing approach for kernel estimates utilizing the Fourier transform. Further, we apply our method to a Monte Carlo data set from the DØ experiment at the Tevatron particle accelerator at Fermilab and provide final top-antitop signal separation results. We have achieved up to 82% AUC while using the restricted feature selection entering the signal separation procedure.
Perceptual learning shapes multisensory causal inference via two distinct mechanisms
McGovern, David P.; Roudaia, Eugenie; Newell, Fiona N.; Roach, Neil W.
2016-01-01
To accurately represent the environment, our brains must integrate sensory signals from a common source while segregating those from independent sources. A reasonable strategy for performing this task is to restrict integration to cues that coincide in space and time. However, because multisensory signals are subject to differential transmission and processing delays, the brain must retain a degree of tolerance for temporal discrepancies. Recent research suggests that the width of this ‘temporal binding window’ can be reduced through perceptual learning; however, little is known about the mechanisms underlying these experience-dependent effects. Here, in separate experiments, we measure the temporal and spatial binding windows of human participants before and after training on an audiovisual temporal discrimination task. We show that training leads to two distinct effects on multisensory integration in the form of (i) a specific narrowing of the temporal binding window that does not transfer to spatial binding and (ii) a general reduction in the magnitude of crossmodal interactions across all spatiotemporal disparities. These effects arise naturally from a Bayesian model of causal inference in which learning improves the precision of audiovisual timing estimation, whilst concomitantly decreasing the prior expectation that stimuli emanate from a common source. PMID:27091411
Hackstadt, Amber J; Peng, Roger D
2014-11-01
Time series studies have suggested that air pollution can negatively impact health. These studies have typically focused on the total mass of fine particulate matter air pollution or the individual chemical constituents that contribute to it, and not source-specific contributions to air pollution. Source-specific contribution estimates are useful from a regulatory standpoint by allowing regulators to focus limited resources on reducing emissions from sources that are major contributors to air pollution and are also desired when estimating source-specific health effects. However, researchers often lack direct observations of the emissions at the source level. We propose a Bayesian multivariate receptor model to infer information about source contributions from ambient air pollution measurements. The proposed model incorporates information from national databases containing data on both the composition of source emissions and the amount of emissions from known sources of air pollution. The proposed model is used to perform source apportionment analyses for two distinct locations in the United States (Boston, Massachusetts and Phoenix, Arizona). Our results mirror previous source apportionment analyses that did not utilize the information from national databases and provide additional information about uncertainty that is relevant to the estimation of health effects.
NASA Astrophysics Data System (ADS)
Lundgren, P.; Camacho, A.; Poland, M. P.; Miklius, A.; Samsonov, S. V.; Milillo, P.
2013-12-01
The availability of synthetic aperture radar (SAR) interferometry (InSAR) data has increased our awareness of the complexity of volcano deformation sources. InSAR's spatial completeness helps identify or clarify source process mechanisms at volcanoes (e.g., Mt. Etna east flank motion; Lazufre crustal magma body; Kilauea dike complexity) and also improves potential model realism. In recent years, Bayesian inference methods have gained widespread use because of their ability to constrain not only source model parameters, but also their uncertainties. They are computationally intensive, however, which tends to limit them to a few geometrically rather simple source representations (for example, spheres). An alternative approach involves solving for irregular pressure and/or density sources from a three-dimensional (3-D) grid of source/density cells. This method has the ability to solve for arbitrarily shaped bodies of constant absolute pressure/density difference. We compare results for both Bayesian (a Markov chain Monte Carlo algorithm) and the irregular source methods for two volcanoes: Kilauea, Hawaii, and Copahue, Argentina-Chile border. Kilauea has extensive InSAR and GPS databases from which to explore the results for the irregular method with respect to the Bayesian approach, prior models, and an extensive set of ancillary data. One caveat, however, is the current restriction in the irregular model inversion to volume-pressure sources (and at a single excess pressure change), which limits its application in cases where sources such as faults or dikes are present. Preliminary results for Kilauea summit deflation during the March 2011 Kamoamoa eruption suggest a northeast-elongated magma body lying roughly 1-1.5 km below the surface. Copahue is a southern Andes volcano that has been inflating since early 2012, with intermittent summit eruptive activity since late 2012.
We have an extensive InSAR time series from RADARSAT-2 and COSMO-SkyMed data, although both are from descending tracks. Preliminary modeling suggests a very irregular magma body that extends from the volcanic edifice to less than 5 km depth and located slightly north of the summit at shallow depths but to the ENE at greater depths. In our preliminary analysis, we find that there are potential limitations and trade-offs in the Bayesian results that suggest the simplicity of the assumed analytic source may generate systematic biases in source parameters. Instead, the irregular 3-D solution appears to provide greater realism, but is limited in the number and type of sources that can be modeled.
Bayesian Integrated Microbial Forensics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jarman, Kristin H.; Kreuzer-Martin, Helen W.; Wunschel, David S.
2008-06-01
In the aftermath of the 2001 anthrax letters, researchers have been exploring ways to predict the production environment of unknown source microorganisms. Different mass spectral techniques are being developed to characterize components of a microbe’s culture medium including water, carbon and nitrogen sources, metal ions added, and the presence of agar. Individually, each technique has the potential to identify one or two ingredients in a culture medium recipe. However, by integrating data from multiple mass spectral techniques, a more complete characterization is possible. We present a Bayesian statistical approach to integrated microbial forensics and illustrate its application on spores grown in different culture media.
NASA Astrophysics Data System (ADS)
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered a bridge fuel towards clean energy because of its potentially lower greenhouse gas emissions compared with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitoring fugitive methane emissions along the natural gas production-supply chain has not yet been developed. Recently, a mobile methane measurement approach has been introduced that applies Bayesian inference to probabilistically estimate methane emission rates and to update the estimates recursively when new measurements become available. However, the likelihood function, especially the error term that determines the shape of the estimate uncertainty, has not been rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source (< 30 m) controlled methane release experiments using a specialized vehicle mounted with fast-response methane analyzers and a GPS unit. Methane concentrations were measured at two different heights along mobile traversals downwind of the sources, and concurrent wind and temperature data were recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
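The recursive updating mentioned in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' scheme: the measured concentration is taken to be a hypothetical dispersion `coefficient` times the emission rate plus Gaussian noise of standard deviation `sigma`, the prior is flat over a grid of candidate rates, and the names `bayes_update` and `infer_rate` are invented here. The form of the error term is exactly what the study sets out to determine, so the Gaussian choice is only a placeholder.

```python
import math


def bayes_update(log_posterior, rates, concentration, coefficient, sigma):
    """Fold one concentration measurement into the log-posterior over candidate rates."""
    updated = [
        lp - 0.5 * ((concentration - coefficient * q) / sigma) ** 2
        for lp, q in zip(log_posterior, rates)
    ]
    peak = max(updated)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in updated))
    return [v - log_norm for v in updated]


def infer_rate(rates, measurements, sigma):
    """Start from a flat prior and update once per (concentration, coefficient) pair."""
    log_post = [-math.log(len(rates))] * len(rates)
    for concentration, coefficient in measurements:
        log_post = bayes_update(log_post, rates, concentration, coefficient, sigma)
    return [math.exp(lp) for lp in log_post]


if __name__ == "__main__":
    rates = [i / 10 for i in range(51)]  # candidate emission rates, arbitrary units
    measurements = [(1.0, 0.5), (2.4, 1.2), (1.6, 0.8)]  # made-up traversal data
    posterior = infer_rate(rates, measurements, sigma=0.2)
    print("MAP rate:", rates[posterior.index(max(posterior))])
```

Because each pass reuses the previous posterior as the new prior, the estimate sharpens as traversals accumulate, which is the behaviour the recursive scheme relies on.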
Discriminative Bayesian Dictionary Learning for Classification.
Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal
2016-12-01
We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and for object and scene-category classification, using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.
Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C
2016-02-01
A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of Li line without doing a separate background subtraction through modulation of the Li beam.
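With a Gaussian likelihood, Gaussian priors and a model that is linear in the free parameters, the posterior is itself Gaussian and available in closed form. The sketch below shows that linear inversion for a deliberately reduced two-parameter case (line intensity and constant offset). The line shape, noise level, priors and data are hypothetical; the Bremsstrahlung term and the Gaussian-process instrument functions from the abstract are not modelled.

```python
def linear_gaussian_posterior(design, data, noise_sd, prior_mean, prior_sd):
    """Posterior mean and covariance for data = design * theta + noise.

    Two parameters only, independent Gaussian noise and priors.
    Precision = A^T A / noise_sd^2 + diag(1 / prior_sd^2).
    """
    precision = [[0.0, 0.0], [0.0, 0.0]]
    rhs = [0.0, 0.0]
    for row, d in zip(design, data):
        for i in range(2):
            rhs[i] += row[i] * d / noise_sd ** 2
            for j in range(2):
                precision[i][j] += row[i] * row[j] / noise_sd ** 2
    for i in range(2):
        precision[i][i] += 1.0 / prior_sd[i] ** 2
        rhs[i] += prior_mean[i] / prior_sd[i] ** 2

    # Invert the 2x2 precision matrix to obtain the posterior covariance.
    det = precision[0][0] * precision[1][1] - precision[0][1] * precision[1][0]
    cov = [
        [precision[1][1] / det, -precision[0][1] / det],
        [-precision[1][0] / det, precision[0][0] / det],
    ]
    mean = [
        cov[0][0] * rhs[0] + cov[0][1] * rhs[1],
        cov[1][0] * rhs[0] + cov[1][1] * rhs[1],
    ]
    return mean, cov

line_shape = [0.0, 0.5, 1.0, 0.5, 0.0]        # hypothetical normalized line profile
design = [[s, 1.0] for s in line_shape]       # columns: line intensity, offset
data = [2.0, 7.0, 12.0, 7.0, 2.0]             # synthetic: intensity 10, offset 2

mean, cov = linear_gaussian_posterior(
    design, data, noise_sd=1.0, prior_mean=[0.0, 0.0], prior_sd=[100.0, 100.0]
)
intensity_sd = cov[0][0] ** 0.5               # uncertainty on the line intensity
```

Because the offset is inferred jointly with the intensity, the background enters the uncertainty through the off-diagonal covariance rather than through a separate subtraction step.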
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software called ABrox is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Throughout the paper, we introduce ABrox using the accompanying graphical user interface.
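The core idea of ABC can be shown with the simplest variant, rejection sampling: draw a parameter from the prior, simulate data, and keep the draw only if a summary statistic of the simulation lands close to the observed one. The sketch below does not use ABrox or its API; the Gaussian simulator, uniform prior, tolerance and observed mean are all assumptions chosen for illustration, and it covers parameter estimation only, not model comparison.

```python
import random

def abc_rejection(observed_mean, n_obs, n_draws, tolerance, rng):
    """Approximate posterior samples for the mean of a unit-variance Gaussian."""
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)                        # draw from the prior
        simulated = [rng.gauss(mu, 1.0) for _ in range(n_obs)]
        simulated_mean = sum(simulated) / n_obs            # summary statistic
        if abs(simulated_mean - observed_mean) < tolerance:
            accepted.append(mu)                            # no likelihood evaluated
    return accepted

rng = random.Random(0)
samples = abc_rejection(
    observed_mean=1.2, n_obs=20, n_draws=5000, tolerance=0.1, rng=rng
)
posterior_mean = sum(samples) / len(samples)
```

Shrinking `tolerance` makes the approximation sharper but lowers the acceptance rate, which is the basic trade-off any ABC scheme has to manage.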
The Bayesian Cramér-Rao lower bound in Astrometry
NASA Astrophysics Data System (ADS)
Mendez, R. A.; Echeverria, A.; Silva, J.; Orchard, M.
2018-01-01
Determining the highest precision that can be achieved in measuring the location of a stellar-like object has been a topic of enduring interest to the astrometric community. The so-called (parametric, or non-Bayesian) Cramér-Rao (CR hereafter) bound provides a lower bound for the variance with which one could estimate the position of a point source. This has been studied recently by Mendez et al. (2013, 2014, 2015). In this work we present a different approach to the same problem (Echeverria et al. 2016), using a Bayesian CR setting which has a number of advantages over the parametric scenario.
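For orientation, the two bounds contrasted in this abstract can be written side by side. The abstract gives no formulas, so the block below states the standard textbook forms (the second is usually called the Van Trees inequality), assuming a scalar position parameter \theta, data x and a prior density \pi(\theta); the paper's own notation may differ.

```latex
% Parametric Cramér-Rao bound: unbiased estimator of a fixed, unknown \theta
\mathrm{Var}\bigl(\hat{\theta}\bigr) \;\ge\; \frac{1}{I(\theta)},
\qquad
I(\theta) = \mathbb{E}_{x \mid \theta}\!\left[
  \left(\frac{\partial \ln p(x \mid \theta)}{\partial \theta}\right)^{2}
\right]

% Bayesian bound: \theta is random with prior density \pi(\theta)
\mathbb{E}_{x,\theta}\!\left[\bigl(\hat{\theta}(x) - \theta\bigr)^{2}\right]
\;\ge\;
\frac{1}{\mathbb{E}_{\theta}\bigl[I(\theta)\bigr] + I_{\pi}},
\qquad
I_{\pi} = \mathbb{E}_{\theta}\!\left[
  \left(\frac{\partial \ln \pi(\theta)}{\partial \theta}\right)^{2}
\right]
```

The Bayesian form bounds the mean squared error averaged over the prior, does not require the estimator to be unbiased, and tightens as the prior becomes more informative through the extra term I_{\pi}.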
The Bayesian Cramér-Rao lower bound in Astrometry
NASA Astrophysics Data System (ADS)
Mendez, R. A.; Echeverria, A.; Silva, J.; Orchard, M.
2017-07-01
Determining the highest precision that can be achieved in measuring the location of a stellar-like object has been a topic of enduring interest to the astrometric community. The so-called (parametric, or non-Bayesian) Cramér-Rao (CR hereafter) bound provides a lower bound for the variance with which one could estimate the position of a point source. This has been studied recently by Mendez and collaborators (2014, 2015). In this work we present a different approach to the same problem (Echeverria et al. 2016), using a Bayesian CR setting which has a number of advantages over the parametric scenario.
Neuronal integration of dynamic sources: Bayesian learning and Bayesian inference.
Siegelmann, Hava T; Holzman, Lars E
2010-09-01
One of the brain's most basic functions is integrating sensory data from diverse sources. This ability causes us to question whether the neural system is computationally capable of intelligently integrating data, not only when sources have known, fixed relative dependencies but also when it must determine such relative weightings based on dynamic conditions, and then use these learned weightings to accurately infer information about the world. We suggest that the brain is, in fact, fully capable of computing this parallel task in a single network and describe a neural inspired circuit with this property. Our implementation suggests the possibility that evidence learning requires a more complex organization of the network than was previously assumed, where neurons have different specialties, whose emergence brings the desired adaptivity seen in human online inference.
Incorporating approximation error in surrogate based Bayesian inversion
NASA Astrophysics Data System (ADS)
Zhang, J.; Zeng, L.; Li, W.; Wu, L.
2015-12-01
There is increasing interest in applying surrogates in inverse Bayesian modeling to reduce repetitive evaluations of the original model and thereby save computational cost. However, the approximation error of the surrogate model is usually overlooked, partly because it is difficult to evaluate for many surrogates. Previous studies have shown that the direct combination of surrogates and Bayesian methods (e.g., Markov Chain Monte Carlo, MCMC) may lead to biased estimations when the surrogate cannot emulate the highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner. However, the computational cost is still high since a relatively large number of original model simulations are required. In this study, we illustrate the importance of incorporating approximation error in inverse Bayesian modeling. A Gaussian process (GP) is chosen to construct the surrogate for its convenience in approximation error evaluation. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is well incorporated into the Bayesian framework, promising results can be obtained even when the surrogate is directly used, and no further original model simulations are required.
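One common way to fold surrogate error into the Bayesian formula is to add the surrogate's predictive variance to the measurement-error variance inside a Gaussian likelihood. The abstract does not state the exact form used, so the sketch below is an assumption: independent Gaussian errors, a scalar observation, and made-up numbers standing in for a GP's predictive mean and variance.

```python
import math

def log_likelihood(observed, surrogate_mean, surrogate_var, obs_var):
    """Gaussian log-likelihood with the variance inflated by the surrogate error.

    surrogate_mean, surrogate_var : predictive mean and variance of the surrogate
    obs_var                       : measurement-error variance
    """
    total_var = obs_var + surrogate_var   # assumes the two errors are independent
    residual = observed - surrogate_mean
    return -0.5 * (math.log(2.0 * math.pi * total_var) + residual ** 2 / total_var)

observed = 3.0
surrogate_mean = 1.0                      # hypothetical surrogate prediction

ignoring_error = log_likelihood(observed, surrogate_mean, 0.0, obs_var=0.25)
with_error = log_likelihood(observed, surrogate_mean, 1.0, obs_var=0.25)
```

Where the surrogate is unsure of itself, a mismatch with the data is penalized less, so parameter values are not ruled out purely because of emulation error; that is the over-confidence the abstract warns about.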
Liu, Kai; Cui, Meng-Ying; Cao, Peng; Wang, Jiang-Bo
2016-01-01
On urban arterials, travel time estimation is challenging, especially when it draws on various data sources. In particular, fusing loop detector data and probe vehicle data to estimate travel time is difficult because the data can be uncertain, imprecise and even conflicting. In this paper, we propose an improved data fusion methodology for link travel time estimation. Link travel times are simultaneously pre-estimated using loop detector data and probe vehicle data, and Bayesian fusion is then applied to fuse the estimated travel times. Next, iterative Bayesian estimation is proposed to improve Bayesian fusion by incorporating two strategies: 1) a substitution strategy, which replaces the less accurate travel time estimate from one sensor with the current fused travel time; and 2) specially designed convergence conditions, which restrict the estimated travel time to a reasonable range. The estimation results show that the proposed method outperforms the probe vehicle data based method, the loop detector based method and single Bayesian fusion, and that the mean absolute percentage error is reduced to 4.8%. Additionally, iterative Bayesian estimation performs better for lighter traffic flows, when the variability of travel time is higher than in other periods.
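A single Bayesian fusion step for two sensor estimates can be sketched as a precision-weighted average. The abstract does not give the fusion formula, so this assumes independent Gaussian errors with known variances; the travel times and variances are invented for the example, and the iterative substitution and convergence strategies are not implemented.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same quantity."""
    precision = 1.0 / var_a + 1.0 / var_b
    fused_mean = (mean_a / var_a + mean_b / var_b) / precision
    fused_var = 1.0 / precision
    return fused_mean, fused_var

# Hypothetical link travel times in seconds with error variances in s^2.
loop_detector = (60.0, 100.0)
probe_vehicle = (50.0, 25.0)

fused_mean, fused_var = fuse(*loop_detector, *probe_vehicle)
```

The fused estimate sits closer to the more reliable sensor and has a smaller variance than either input, which is the property the iterative scheme in the abstract builds on.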
Cui, Meng-Ying; Cao, Peng; Wang, Jiang-Bo
2016-01-01
On urban arterials, travel time estimation is challenging, especially when it draws on various data sources. In particular, fusing loop detector data and probe vehicle data to estimate travel time is difficult because the data can be uncertain, imprecise and even conflicting. In this paper, we propose an improved data fusion methodology for link travel time estimation. Link travel times are simultaneously pre-estimated using loop detector data and probe vehicle data, and Bayesian fusion is then applied to fuse the estimated travel times. Next, iterative Bayesian estimation is proposed to improve Bayesian fusion by incorporating two strategies: 1) a substitution strategy, which replaces the less accurate travel time estimate from one sensor with the current fused travel time; and 2) specially designed convergence conditions, which restrict the estimated travel time to a reasonable range. The estimation results show that the proposed method outperforms the probe vehicle data based method, the loop detector based method and single Bayesian fusion, and that the mean absolute percentage error is reduced to 4.8%. Additionally, iterative Bayesian estimation performs better for lighter traffic flows, when the variability of travel time is higher than in other periods. PMID:27362654
Applications of Bayesian spectrum representation in acoustics
NASA Astrophysics Data System (ADS)
Botts, Jonathan M.
This dissertation utilizes a Bayesian inference framework to enhance the solution of inverse problems where the forward model maps to acoustic spectra. A Bayesian solution to filter design inverts acoustic spectra to pole-zero locations of a discrete-time filter model. Spatial sound field analysis with a spherical microphone array is a data analysis problem that requires inversion of spatio-temporal spectra to directions of arrival. As with many inverse problems, a probabilistic analysis results in richer solutions than can be achieved with ad-hoc methods. In the filter design problem, the Bayesian inversion results in globally optimal coefficient estimates as well as an estimate of the most concise filter capable of representing the given spectrum, within a single framework. This approach is demonstrated on synthetic spectra, head-related transfer function spectra, and measured acoustic reflection spectra. The Bayesian model-based analysis of spatial room impulse responses is presented as an analogous problem with an equally rich solution. The model selection mechanism provides an estimate of the number of arrivals, which is necessary to properly infer the directions of simultaneous arrivals. Although spectrum inversion problems are fairly ubiquitous, the scope of this dissertation has been limited to these two and derivative problems. The Bayesian approach to filter design is demonstrated on an artificial spectrum to illustrate the model comparison mechanism and then on measured head-related transfer functions to show the potential range of application. Coupled with sampling methods, the Bayesian approach is shown to outperform least-squares filter design methods commonly used in commercial software, confirming the need for a global search of the parameter space. 
The resulting designs are shown to be comparable to those that result from global optimization methods, but the Bayesian approach has the added advantage of a filter length estimate within the same unified framework. The application to reflection data is useful for representing frequency-dependent impedance boundaries in finite difference acoustic simulations. Furthermore, since the filter transfer function is a parametric model, it can be modified to incorporate arbitrary frequency weighting and account for the band-limited nature of measured reflection spectra. Finally, the model is modified to compensate for dispersive error in the finite difference simulation, from the filter design process. Stemming from the filter boundary problem, the implementation of pressure sources in finite difference simulation is addressed in order to assure that schemes properly converge. A class of parameterized source functions is proposed and shown to offer straightforward control of residual error in the simulation. Guided by the notion that the solution to be approximated affects the approximation error, sources are designed which reduce residual dispersive error to the size of round-off errors. The early part of a room impulse response can be characterized by a series of isolated plane waves. Measured with an array of microphones, plane waves map to a directional response of the array or spatial intensity map. Probabilistic inversion of this response results in estimates of the number and directions of image source arrivals. The model-based inversion is shown to avoid ambiguities associated with peak-finding or inspection of the spatial intensity map. For this problem, determining the number of arrivals in a given frame is critical for properly inferring the state of the sound field. This analysis is effectively compression of the spatial room response, which is useful for analysis or encoding of the spatial sound field. 
Parametric, model-based formulations of these problems enhance the solution in all cases, and a Bayesian interpretation provides a principled approach to model comparison and parameter estimation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, Erin A.; Robinson, Sean M.; Anderson, Kevin K.
2015-01-19
Here we present a novel technique for the localization of radiological sources in urban or rural environments from an aerial platform. The technique is based on a Bayesian approach to localization, in which measured count rates in a time series are compared with predicted count rates from a series of pre-calculated test sources to define likelihood. Furthermore, this technique is expanded by using a localized treatment with a limited field of view (FOV), coupled with a likelihood ratio reevaluation, allowing for real-time computation on commodity hardware for arbitrarily complex detector models and terrain. In particular, detectors with inherent asymmetry of response (such as those employing internal collimation or self-shielding for enhanced directional awareness) are leveraged by this approach to provide improved localization. Our results from the localization technique are shown for simulated flight data using monolithic as well as directionally-aware detector models, and the capability of the methodology to locate radioisotopes is estimated for several test cases. This localization technique is shown to facilitate urban search by allowing quick and adaptive estimates of source location, in many cases from a single flyover near a source. In particular, this method represents a significant advancement from earlier methods like full-field Bayesian likelihood, which is not generally fast enough to allow for broad-field search in real time, and highest-net-counts estimation, which has a localization error that depends strongly on flight path and cannot generally operate without exhaustive search.
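The comparison of measured count rates with predictions from pre-calculated test sources can be illustrated with a Poisson likelihood evaluated over a grid of candidate locations. Everything specific here is an assumption for the sketch: an isotropic detector, pure inverse-square falloff with no attenuation, a known source strength, and a fixed altitude and background. The limited field of view and likelihood-ratio reevaluation from the abstract are not implemented.

```python
import math

ALTITUDE = 50.0     # hypothetical flight height in metres
BACKGROUND = 5.0    # hypothetical background counts per sample

def expected_counts(detector_xy, source_xy, strength):
    """Predicted counts for one sample: background plus inverse-square signal."""
    dx = detector_xy[0] - source_xy[0]
    dy = detector_xy[1] - source_xy[1]
    return BACKGROUND + strength / (dx * dx + dy * dy + ALTITUDE ** 2)

def poisson_log_likelihood(counts, track, source_xy, strength):
    """Log-likelihood of the whole time series for one test source."""
    total = 0.0
    for c, xy in zip(counts, track):
        lam = expected_counts(xy, source_xy, strength)
        total += c * math.log(lam) - lam - math.lgamma(c + 1)
    return total

def localize(counts, track, candidates, strength):
    """Return the candidate test source with the highest likelihood."""
    return max(
        candidates,
        key=lambda s: poisson_log_likelihood(counts, track, s, strength),
    )

track = [(x, 0) for x in range(-200, 201, 20)]    # straight flyover along y = 0
true_source, strength = (50, 50), 1.0e6
# Noise-free synthetic counts so the example is deterministic.
counts = [round(expected_counts(xy, true_source, strength)) for xy in track]

candidates = [(x, y) for x in range(-100, 101, 50) for y in (0, 50, 100)]
best = localize(counts, track, candidates, strength)
```

The candidate grid is restricted to one side of the flight line because a straight pass with an isotropic detector cannot distinguish left from right; that ambiguity is one reason the directionally aware detectors mentioned in the abstract improve localization.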
A new software for deformation source optimization, the Bayesian Earthquake Analysis Tool (BEAT)
NASA Astrophysics Data System (ADS)
Vasyura-Bathke, H.; Dutta, R.; Jonsson, S.; Mai, P. M.
2017-12-01
Modern studies of crustal deformation and the related source estimation, including magmatic and tectonic sources, increasingly use non-linear optimization strategies to estimate geometric and/or kinematic source parameters and often consider geodetic and seismic data jointly. Bayesian inference is increasingly being used for estimating posterior distributions of deformation source model parameters, given measured/estimated/assumed data and model uncertainties. For instance, some studies consider uncertainties of a layered medium and propagate these into source parameter uncertainties, while others use informative priors to reduce the model parameter space. In addition, innovative sampling algorithms have been developed to efficiently explore the high-dimensional parameter spaces. Compared to earlier studies, these improvements have resulted in overall more robust source model parameter estimates that include uncertainties. However, the computational burden of these methods is high and estimation codes are rarely made available along with the published results. Even if the codes are accessible, it is usually challenging to assemble them into a single optimization framework as they are typically coded in different programming languages. Therefore, further progress and future applications of these methods/codes are hampered, while reproducibility and validation of results has become essentially impossible. In the spirit of providing open-access and modular codes to facilitate progress and reproducible research in deformation source estimations, we undertook the effort of developing BEAT, a Python package that comprises all the above-mentioned features in one single programming environment. The package builds on the pyrocko seismological toolbox (www.pyrocko.org), and uses the pymc3 module for Bayesian statistical model fitting. BEAT is an open-source package (https://github.com/hvasbath/beat), and we encourage and solicit contributions to the project. 
Here, we present our strategy for developing BEAT and show application examples, especially the effect of including the model prediction uncertainty of the velocity model in the following source optimizations: full moment tensor, Mogi source, and moderate strike-slip earthquake.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Jiangjiang; Li, Weixuan; Zeng, Lingzao
Surrogate models are commonly used in Bayesian approaches such as Markov Chain Monte Carlo (MCMC) to avoid repetitive CPU-demanding model evaluations. However, the approximation error of a surrogate may lead to biased estimations of the posterior distribution. This bias can be corrected by constructing a very accurate surrogate or implementing MCMC in a two-stage manner. Since the two-stage MCMC requires extra original model evaluations, the computational cost is still high. If the information of measurement is incorporated, a locally accurate approximation of the original model can be adaptively constructed with low computational cost. Based on this idea, we propose a Gaussian process (GP) surrogate-based Bayesian experimental design and parameter estimation approach for groundwater contaminant source identification problems. A major advantage of the GP surrogate is that it provides a convenient estimation of the approximation error, which can be incorporated in the Bayesian formula to avoid over-confident estimation of the posterior distribution. The proposed approach is tested with a numerical case study. Without sacrificing the estimation accuracy, the new approach achieves about 200 times of speed-up compared to our previous work using two-stage MCMC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gastelum, Zoe N.; Whitney, Paul D.; White, Amanda M.
2013-07-15
Pacific Northwest National Laboratory has spent several years researching, developing, and validating large Bayesian network models to support integration of open source data sets for nuclear proliferation research. Our current work focuses on generating a set of interrelated models for multi-source assessment of nuclear programs, as opposed to a single comprehensive model. By using this approach, we can break down the models to cover logical sub-problems that can utilize different expertise and data sources. This approach allows researchers to utilize the models individually or in combination to detect and characterize a nuclear program and identify data gaps. The models operate at various levels of granularity, covering a combination of state-level assessments with more detailed models of site or facility characteristics. This paper will describe the current open source-driven, nuclear nonproliferation models under development, the pros and cons of the analytical approach, and areas for additional research.
Real-time realizations of the Bayesian Infrasonic Source Localization Method
NASA Astrophysics Data System (ADS)
Pinsky, V.; Arrowsmith, S.; Hofstetter, A.; Nippress, A.
2015-12-01
The Bayesian Infrasonic Source Localization method (BISL), introduced by Modrak et al. (2010) and upgraded by Marcillo et al. (2014), is designed for accurate estimation of the origin of atmospheric events at local, regional and global scales by seismic and infrasonic networks and arrays. BISL is based on probabilistic models of the source-station infrasonic signal propagation time, picking time and azimuth estimate, merged with prior knowledge about the celerity distribution. At each hypothetical source location, it requires integration of the product of the corresponding source-station likelihood functions multiplied by a prior probability density function of celerity over the multivariate parameter space. The present BISL realization is generally a time-consuming procedure based on numerical integration. The proposed computational scheme simplifies the target function so that the integrals are taken exactly and are represented via standard functions. This makes the procedure much faster and realizable in real time without practical loss of accuracy. The procedure, executed as PYTHON-FORTRAN code, demonstrates high performance on a set of model and real data.
Bayesian models for comparative analysis integrating phylogenetic uncertainty.
de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P
2012-06-28
Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. 
We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. 
We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602
Estimate of main local sources to ambient ultrafine particle number concentrations in an urban area
NASA Astrophysics Data System (ADS)
Rahman, Md Mahmudur; Mazaheri, Mandana; Clifford, Sam; Morawska, Lidia
2017-09-01
Quantifying and apportioning the contribution of a range of sources to ultrafine particles (UFPs, D < 100 nm) is a challenge due to the complex nature of urban environments. Although vehicular emissions have long been considered one of the major sources of ultrafine particles in urban areas, the contribution of other major urban sources is not yet fully understood. This paper aims to determine and quantify the contribution of local ground traffic, nucleated particle (NP) formation and distant non-traffic (e.g. airport, oil refineries, and seaport) sources to the total ambient particle number concentration (PNC) in a busy, inner-city area in Brisbane, Australia using Bayesian statistical modelling and other exploratory tools. The Bayesian model was trained on the PNC data from days on which NP formation was known not to have occurred, hourly traffic counts, solar radiation data, and a smooth daily trend. The model was applied to apportion and quantify the contribution of NP formation and local traffic and non-traffic sources to UFPs. The data analysis incorporated long-term measured time-series of total PNC (D ≥ 6 nm), particle number size distributions (PSD, D = 8 to 400 nm), PM2.5, PM10, NOx, CO, meteorological parameters and traffic counts at a stationary monitoring site. The developed Bayesian model showed reliable predictive performance in quantifying the contribution of NP formation events to UFPs (up to 4 × 10^4 particles cm^-3), with significant day-to-day variability. The model identified potential NP formation and non-formation days based on PNC data and quantified the source contributions to UFPs. Exploratory statistical analyses show that total mean PNC during the middle of the day was up to 32% higher than during peak morning and evening traffic periods, which was associated with NP formation events. The majority of UFPs measured during the peak traffic and NP formation periods were between 30-100 nm and smaller than 30 nm, respectively. 
To date, this is the first application of a Bayesian model to apportion the contributions of different sources to UFPs, and therefore the importance of this study lies not only in its modelling outcomes but also in demonstrating the applicability and advantages of this statistical approach to air pollution studies.
Bayesian Cue Integration as a Developmental Outcome of Reward Mediated Learning
Weisswange, Thomas H.; Rothkopf, Constantin A.; Rodemann, Tobias; Triesch, Jochen
2011-01-01
Average human behavior in cue combination tasks is well predicted by Bayesian inference models. As this capability is acquired over developmental timescales, the question arises of how it is learned. Here we investigated whether reward dependent learning, which is well established at the computational, behavioral, and neuronal levels, could contribute to this development. It is shown that a model free reinforcement learning algorithm can indeed learn to do cue integration, i.e. weight uncertain cues according to their respective reliabilities, and can even do so if reliabilities are changing. We also consider the case of causal inference where multimodal signals can originate from one or multiple separate objects and should not always be integrated. In this case, the learner is shown to develop a behavior that is closest to Bayesian model averaging. We conclude that reward mediated learning could be a driving force for the development of cue integration and causal inference. PMID:21750717
Research on Bayes matting algorithm based on Gaussian mixture model
NASA Astrophysics Data System (ADS)
Quan, Wei; Jiang, Shan; Han, Cheng; Zhang, Chao; Jiang, Zhengang
2015-12-01
The digital matting problem is a classical problem in imaging. It aims at separating non-rectangular foreground objects from a background image and compositing them with a new background image. The accuracy of the matte determines the quality of the composited image. A Bayesian matting algorithm based on a Gaussian mixture model is proposed to solve this matting problem. Firstly, the traditional Bayesian framework is improved by introducing a Gaussian mixture model. Then, a weighting factor is added in order to suppress noise in the composited images. Finally, the effect is further improved by regulating the user's input. The algorithm is applied to matting tasks on classical images, and the results are compared with the traditional Bayesian method. It is shown that our algorithm performs better on fine details such as hair, suppresses noise well, and is very effective for objects of interest with intricate boundaries.
Dosso, Stan E; Wilmut, Michael J; Nielsen, Peter L
2010-07-01
This paper applies Bayesian source tracking in an uncertain environment to Mediterranean Sea data, and investigates the resulting tracks and track uncertainties as a function of data information content (number of data time-segments, number of frequencies, and signal-to-noise ratio) and of prior information (environmental uncertainties and source-velocity constraints). To track low-level sources, acoustic data recorded for multiple time segments (corresponding to multiple source positions along the track) are inverted simultaneously. Environmental uncertainty is addressed by including unknown water-column and seabed properties as nuisance parameters in an augmented inversion. Two approaches are considered: Focalization-tracking maximizes the posterior probability density (PPD) over the unknown source and environmental parameters. Marginalization-tracking integrates the PPD over environmental parameters to obtain a sequence of joint marginal probability distributions over source coordinates, from which the most-probable track and track uncertainties can be extracted. Both approaches apply track constraints on the maximum allowable vertical and radial source velocity. The two approaches are applied for towed-source acoustic data recorded at a vertical line array at a shallow-water test site in the Mediterranean Sea where previous geoacoustic studies have been carried out.
Dawson, Colin; Gerken, Louann
2011-09-01
While many constraints on learning must be relatively experience-independent, past experience provides a rich source of guidance for subsequent learning. Discovering structure in some domain can inform a learner's future hypotheses about that domain. If a general property accounts for particular sub-patterns, a rational learner should not stipulate separate explanations for each detail without additional evidence, as the general structure has "explained away" the original evidence. In a grammar-learning experiment using tone sequences, manipulating learners' prior exposure to a tone environment affects their sensitivity to the grammar-defining feature, in this case consecutive repeated tones. Grammar-learning performance is worse if context melodies are "smooth" -- when small intervals occur more than large ones -- as Smoothness is a general property accounting for a high rate of repetition. We present an idealized Bayesian model as a "best case" benchmark for learning repetition grammars. When context melodies are Smooth, the model places greater weight on the small-interval constraint, and does not learn the repetition rule as well as when context melodies are not Smooth, paralleling the human learners. These findings support an account of abstract grammar-induction in which learners rationally assess the statistical evidence for underlying structure based on a generative model of the environment. Copyright © 2010 Elsevier B.V. All rights reserved.
Using Bayesian Networks for Candidate Generation in Consistency-based Diagnosis
NASA Technical Reports Server (NTRS)
Narasimhan, Sriram; Mengshoel, Ole
2008-01-01
Consistency-based diagnosis relies heavily on the assumption that discrepancies between model predictions and sensor observations can be detected accurately. When sources of uncertainty like sensor noise and model abstraction exist, robust schemes have to be designed to make a binary decision on whether predictions are consistent with observations. This risks the occurrence of false alarms and missed alarms when an erroneous decision is made. Moreover, when multiple sensors (with differing sensing properties) are available, the degree of match between predictions and observations can be used to guide the search for fault candidates. In this paper we propose a novel approach to handle this problem using Bayesian networks. In the consistency-based diagnosis formulation, automatically generated Bayesian networks are used to encode a probabilistic measure of fit between predictions and observations. A Bayesian network inference algorithm is used to compute the most probable fault candidates.
Spatiotemporal Bayesian analysis of Lyme disease in New York state, 1990-2000.
Chen, Haiyan; Stratton, Howard H; Caraco, Thomas B; White, Dennis J
2006-07-01
Mapping ordinarily increases our understanding of nontrivial spatial and temporal heterogeneities in disease rates. However, the large number of parameters required by the corresponding statistical models often complicates detailed analysis. This study investigates the feasibility of a fully Bayesian hierarchical regression approach to the problem and identifies how it outperforms two more popular methods: crude rate estimates (CRE) and empirical Bayes standardization (EBS). In particular, we apply a fully Bayesian approach to the spatiotemporal analysis of Lyme disease incidence in New York state for the period 1990-2000. These results are compared with those obtained by CRE and EBS in Chen et al. (2005). We show that the fully Bayesian regression model not only gives more reliable estimates of disease rates than the other two approaches but also allows for tractable models that can accommodate more numerous sources of variation and unknown parameters.
Luta, George; Ford, Melissa B; Bondy, Melissa; Shields, Peter G; Stamey, James D
2013-04-01
Recent research suggests that the Bayesian paradigm may be useful for modeling biases in epidemiological studies, such as those due to misclassification and missing data. We used Bayesian methods to perform sensitivity analyses for assessing the robustness of study findings to the potential effect of these two important sources of bias. We used data from a study of the joint associations of radiotherapy and smoking with primary lung cancer among breast cancer survivors. We used Bayesian methods to provide an operational way to combine both validation data and expert opinion to account for misclassification of the two risk factors and missing data. For comparative purposes we considered a "full model" that allowed for both misclassification and missing data, along with alternative models that considered only misclassification or missing data, and the naïve model that ignored both sources of bias. We identified noticeable differences between the four models with respect to the posterior distributions of the odds ratios that described the joint associations of radiotherapy and smoking with primary lung cancer. Despite those differences we found that the general conclusions regarding the pattern of associations were the same regardless of the model used. Overall our results indicate a nonsignificantly decreased lung cancer risk due to radiotherapy among nonsmokers, and a mildly increased risk among smokers. We described easy to implement Bayesian methods to perform sensitivity analyses for assessing the robustness of study findings to misclassification and missing data. Copyright © 2012 Elsevier Ltd. All rights reserved.
The Chandra Xbootes Survey - IV: Mid-Infrared and Submillimeter Counterparts
NASA Astrophysics Data System (ADS)
Brown, Arianna; Mitchell-Wynne, Ketron; Cooray, Asantha R.; Nayyeri, Hooshang
2016-06-01
In this work, we use a Bayesian technique to identify mid-IR and submillimeter counterparts for 3,213 X-ray point sources detected in the Chandra XBoötes Survey so as to characterize the relationship between black hole activity and star formation in the XBoötes region. The Chandra XBoötes Survey is a 5-ks X-ray survey of the 9.3 square degree Boötes Field of the NOAO Deep Wide-Field Survey (NDWFS), a survey imaged from the optical to the near-IR. We use a likelihood ratio analysis on Spitzer-IRAC data taken from The Spitzer Deep, Wide-Field Survey (SDWFS) to determine mid-IR counterparts, and a similar method on Herschel-SPIRE sources detected at 250µm from The Herschel Multi-tiered Extragalactic Survey to determine the submillimeter counterparts. The likelihood ratio analysis (LRA) provides the probability that an IRAC or SPIRE point source is the true counterpart to a Chandra source. The analysis comprises three parts: the normalized magnitude distributions of counterparts and background sources, and the radial probability distribution of the separation distance between the IRAC or SPIRE source and the Chandra source. Many Chandra sources have multiple prospective counterparts in each band, so additional analysis is performed to determine the identification reliability of the candidates. Identification reliability values lie between 0 and 1, and sources with identification reliability values ≥0.8 are chosen to be the true counterparts. With these results, we will consider the statistical implications of the sample's redshifts, mid-IR and submillimeter luminosities, and star formation rates.
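The abstract names the three ingredients of the likelihood ratio and a reliability threshold but gives no formulas. The sketch below assumes the form widely used in the cross-identification literature, LR = q(m) f(r) / n(m) with reliability LR_j / (Σ LR_i + 1 − Q); the symbols `q_m`, `f_r`, `n_m`, `q_total` and all numbers are illustrative assumptions, not values from the survey.

```python
def likelihood_ratio(q_m, f_r, n_m):
    """LR for one candidate: counterpart magnitude density q(m) times the
    radial separation density f(r), divided by the background density n(m)."""
    return q_m * f_r / n_m


def reliabilities(lrs, q_total):
    """Reliability of each candidate for one X-ray source.

    q_total is the assumed fraction of true counterparts brighter than the
    survey limit; (1 - q_total) accounts for 'no counterpart detected'.
    """
    denom = sum(lrs) + (1.0 - q_total)
    return [lr / denom for lr in lrs]


# Hypothetical source with two candidate counterparts.
lrs = [likelihood_ratio(0.4, 0.5, 0.02), likelihood_ratio(0.1, 0.1, 0.05)]
rel = reliabilities(lrs, q_total=0.8)
accepted = [r >= 0.8 for r in rel]  # threshold quoted in the abstract
```

Under these assumed inputs only the first candidate passes the 0.8 threshold; the reliabilities of all candidates sum to less than 1 because some probability is reserved for the true counterpart being undetected.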
A Bayesian Model of the Memory Colour Effect.
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects.
A Bayesian Model of the Memory Colour Effect
Olkkonen, Maria; Gegenfurtner, Karl R.
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects. PMID:29760874
Quantitative assessment of Pb sources in isotopic mixtures using a Bayesian mixing model.
Longman, Jack; Veres, Daniel; Ersek, Vasile; Phillips, Donald L; Chauvel, Catherine; Tamas, Calin G
2018-04-18
Lead (Pb) isotopes provide valuable insights into the origin of Pb within a sample, typically allowing for reliable fingerprinting of its source. This is useful for a variety of applications, from tracing sources of pollution-related Pb to the origins of Pb in archaeological artefacts. However, current approaches investigate source proportions via graphical means or simple mixing models. An approach that quantitatively assesses source proportions and fingerprints the signature of analysed Pb, especially for larger numbers of sources, would therefore be valuable. Here we use an advanced Bayesian isotope mixing model for three such applications: tracing dust sources in pre-anthropogenic environmental samples, tracking changing ore exploitation during the Roman period, and identifying the source of Pb in a Roman-age mining artefact. These examples indicate that this approach can resolve changing Pb sources deposited during both pre-anthropogenic times, when natural cycling of Pb dominated, and the Roman period, one marked by significant anthropogenic pollution. Our archaeometric investigation indicates clear input of Pb from Romanian ores previously speculated, but not proven, to have been the Pb source. Our approach can be applied to a range of disciplines, providing a new method for robustly tracing sources of Pb observed within a variety of environments.
Understanding the complex relationships underlying hot flashes: a Bayesian network approach.
Smith, Rebecca L; Gallicchio, Lisa M; Flaws, Jodi A
2018-02-01
The mechanism underlying hot flashes is not well-understood, primarily because of complex relationships between and among hot flashes and their risk factors. We explored those relationships using a Bayesian network approach based on a 2006 to 2015 cohort study of hot flashes among 776 female residents, 45 to 54 years old, in the Baltimore area. Bayesian networks were fit for each outcome (current hot flashes, hot flashes before the end of the study, hot flash severity, hot flash frequency, and age at first hot flashes) separately and together with a list of risk factors (estrogen, progesterone, testosterone, body mass index and obesity, race, income level, education level, smoking history, drinking history, and activity level). Each fitting was conducted separately on all women and only perimenopausal women, at enrollment and 4 years after enrollment. Hormone levels, almost always interrelated, were the most common variable linked to hot flashes; hormone levels were sometimes related to body mass index, but were not directly related to any other risk factors. Smoking was also frequently associated with increased likelihood of severe symptoms, but not through an antiestrogenic pathway. The age at first hot flashes was related only to race. All other factors were either not related to outcomes or were mediated entirely by race, hormone levels, or smoking. These models can serve as a guide for design of studies into the causal network underlying hot flashes.
Ascribing soil erosion of hillslope components to river sediment yield.
Nosrati, Kazem
2017-06-01
In recent decades, soil erosion has increased in catchments of Iran. It is, therefore, necessary to understand soil erosion processes and sources in order to mitigate this problem. Geomorphic landforms play an important role in influencing water erosion. Therefore, ascribing soil erosion of hillslope components to river sediment yield could be useful for soil and sediment management in order to decrease the off-site effects related to downstream sedimentation areas. The main objectives of this study were to apply radionuclide tracers and soil organic carbon to determine the relative contributions of hillslope component sediment sources in two land use types (forest and crop field) by using a Bayesian mixing model, as well as to estimate the uncertainty in sediment fingerprinting in a mountainous catchment of western Iran. In this analysis, ¹³⁷Cs, ⁴⁰K, ²³⁸U, ²²⁶Ra, ²³²Th and soil organic carbon tracers were measured at 32 sampling sites from four hillslope component sediment sources (summit, shoulder, backslope, and toeslope) in forest and crop fields, along with six bed sediment samples at the downstream reach of the catchment. To quantify the sediment source proportions, the Bayesian mixing model was based on (1) primary sediment sources and (2) combined primary and secondary sediment sources. The results of both approaches indicated that erosion from the crop field shoulder dominated the sources of river sediments. The estimated contribution of the crop field shoulder for all river samples was 63.7% (32.4-79.8%) for the primary sediment sources approach, and 67% (15.3-81.7%) for the combined primary and secondary sources approach. The Bayesian mixing model, based on an optimum set of tracers, indicated that crop field land use and shoulder-component landforms made the highest contribution to soil erosion. This technique could, therefore, be a useful tool for soil and sediment control management strategies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics
NASA Astrophysics Data System (ADS)
Abe, Sumiyoshi
2014-11-01
The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.
Le Bras, Ronan J; Kuzma, Heidi; Sucic, Victor; Bokelmann, Götz
2016-05-01
A notable sequence of calls was encountered, spanning several days in January 2003, in the central part of the Indian Ocean on a hydrophone triplet recording acoustic data at a 250 Hz sampling rate. This paper presents signal processing methods applied to the waveform data to detect and group the recorded signals and to extract amplitude and bearing estimates for them. An approximate location for the source of the sequence of calls is inferred from features extracted from the waveform. As the source approaches the hydrophone triplet, the source level (SL) of the calls is estimated at 187 ± 6 dB re: 1 μPa-1 m in the 15-60 Hz frequency range. The calls are attributed to a subgroup of blue whales, Balaenoptera musculus, with a characteristic acoustic signature. A Bayesian location method using probabilistic models for bearing and amplitude is demonstrated on the call sequence. The method is applied to the case of detection at a single triad of hydrophones and results in a probability distribution map for the origin of the calls. It can be extended to detections at multiple triads and, because of the Bayesian formulation, additional modeling complexity can be built in as needed.
NASA Astrophysics Data System (ADS)
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Bao, Jie; Swiler, Laura
2017-12-01
In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo sampler is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the joint inversion of seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data provides better estimates of reservoir saturations than inversion of seismic Amplitude Versus Angle data alone, especially for the parameters in deep layers. The performance of the inversion approach was evaluated for various levels of noise in the observational data; reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
A Bayesian blind survey for cold molecular gas in the Universe
NASA Astrophysics Data System (ADS)
Lentati, L.; Carilli, C.; Alexander, P.; Walter, F.; Decarli, R.
2014-10-01
A new Bayesian method for performing an image domain search for line-emitting galaxies is presented. The method uses both spatial and spectral information to robustly determine the source properties, employing either simple Gaussian, or other physically motivated models whilst using the evidence to determine the probability that the source is real. In this paper, we describe the method, and its application to both a simulated data set, and a blind survey for cold molecular gas using observations of the Hubble Deep Field-North taken with the Plateau de Bure Interferometer. We make a total of six robust detections in the survey, five of which have counterparts in other observing bands. We identify the most secure detections found in a previous investigation, while finding one new probable line source with an optical ID not seen in the previous analysis. This study acts as a pilot application of Bayesian statistics to future searches to be carried out both for low-J CO transitions of high-redshift galaxies using the Jansky Very Large Array (JVLA), and at millimetre wavelengths with Atacama Large Millimeter/submillimeter Array (ALMA), enabling the inference of robust scientific conclusions about the history of the molecular gas properties of star-forming galaxies in the Universe through cosmic time.
NASA Astrophysics Data System (ADS)
Lockhart, K.; Harter, T.; Grote, M.; Young, M. B.; Eppich, G.; Deinhart, A.; Wimpenny, J.; Yin, Q. Z.
2014-12-01
Groundwater quality is a concern in alluvial aquifers underlying agricultural areas worldwide, an example of which is the San Joaquin Valley, California. Nitrate from land-applied fertilizers or from animal waste can leach to groundwater and contaminate drinking water resources. Dairy manure and synthetic fertilizers are the major sources of nitrate in groundwater in the San Joaquin Valley; however, septic waste can be a major source in some areas. As in other such regions around the world, the rural population in the San Joaquin Valley relies almost exclusively on shallow domestic wells (≤150 m deep), of which many have been affected by nitrate. Consumption of water containing nitrate above the drinking water limit has been linked to major health effects including low blood oxygen in infants and certain cancers. Knowledge of the proportion of each of the three main nitrate sources (manure, synthetic fertilizer, and septic waste) contributing to individual well nitrate can aid future regulatory decisions. Nitrogen, oxygen, and boron isotopes can be used as tracers to differentiate between the three main nitrate sources. Mixing models quantify the proportional contributions of sources to a mixture by using the concentration of conservative tracers within each source as a source signature. Deterministic mixing models are common, but do not allow for variability in the tracer source concentration or overlap of tracer concentrations between sources. Bayesian statistics used in conjunction with mixing models can incorporate variability in the source signature. We developed a Bayesian mixing model on a pilot network of 32 private domestic wells in the San Joaquin Valley for which nitrate as well as nitrogen, oxygen, and boron isotopes were measured. Probability distributions for nitrogen, oxygen, and boron isotope source signatures for manure, fertilizer, and septic waste were compiled from the literature and from a previous groundwater monitoring project on several dairies in the San Joaquin Valley. Median percent contributions of nitrate to wells from fertilizer, manure, and septic waste generally match the expected source based on land use patterns, with some exceptions.
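The contrast drawn in this abstract between deterministic and Bayesian mixing models can be sketched with a deliberately reduced case: two sources, one tracer, and a uniform prior on the mixing fraction. This is a minimal illustration under those assumptions, not the three-source, three-isotope model of the study; the function names and every number below are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixing_posterior(obs, mu_a, var_a, mu_b, var_b, obs_var, n_grid=1001):
    """Grid posterior over the fraction f contributed by source A.

    Source signatures are treated as Gaussian, so the mixture tracer value has
    mean f*mu_a + (1-f)*mu_b and variance f^2*var_a + (1-f)^2*var_b + obs_var.
    A uniform prior on [0, 1] is assumed for f.
    """
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    unnorm = []
    for f in grid:
        mean = f * mu_a + (1.0 - f) * mu_b
        var = f ** 2 * var_a + (1.0 - f) ** 2 * var_b + obs_var
        unnorm.append(normal_pdf(obs, mean, var))
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]


# Hypothetical tracer values: source A near 0, source B near 10, sample at 2.5.
grid, post = mixing_posterior(obs=2.5, mu_a=0.0, var_a=1.0,
                              mu_b=10.0, var_b=1.0, obs_var=0.25)
post_mean = sum(f * p for f, p in zip(grid, post))
```

A deterministic model would return the single value f = 0.75 for these inputs; the grid posterior is centred near that value but also carries the spread caused by variability in the source signatures.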
Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)
NASA Astrophysics Data System (ADS)
Kasibhatla, P.
2004-12-01
In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities, and inverse source estimates are derived for fixed values of pdf parameters. While the advantage of this approach is that closed-form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
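The idea of computing posterior moments by sampling, with an error model that is not Gaussian, can be sketched with a random-walk Metropolis sampler on a toy one-parameter inversion. This is an illustrative sketch only: the Laplace error model, the sensitivities `H`, the observations `Y`, and the prior are all hypothetical and have no connection to the TransCom3 setup.

```python
import math
import random

H = [1.0, 2.0, 3.0, 4.0]   # hypothetical transport sensitivities
Y = [2.1, 3.9, 6.2, 7.9]   # hypothetical observations, y ~ h * s + error
ERR_SCALE = 0.2            # assumed Laplace (heavy-tailed) error scale
PRIOR_MEAN, PRIOR_SD = 0.0, 10.0


def log_posterior(s):
    """Unnormalized log posterior of the source strength s."""
    log_lik = -sum(abs(y - h * s) for h, y in zip(H, Y)) / ERR_SCALE
    log_prior = -0.5 * ((s - PRIOR_MEAN) / PRIOR_SD) ** 2
    return log_lik + log_prior


def metropolis(log_target, start, step, n_samples, burn_in, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            samples.append(current)
    return samples


samples = metropolis(log_posterior, start=0.0, step=0.1,
                     n_samples=20000, burn_in=2000)
posterior_mean = sum(samples) / len(samples)
```

Because only the unnormalized log posterior is evaluated, swapping the Laplace term for another error model requires changing `log_posterior` alone, which is the flexibility the abstract points to.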
Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L
2016-02-10
Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure. Copyright © 2015 John Wiley & Sons, Ltd.
A Bayesian network approach to the database search problem in criminal proceedings
2012-01-01
Background The ‘database search problem’, that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions. The method’s graphical environment, along with its computational and probabilistic architectures, represents a rich package that offers analysts and discussants additional modes of interaction, concise representation, and coherent communication. PMID:22849390
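The three key variables listed in the Methods (database size, population size, and rarity of the characteristic) can be combined in a short calculation. The sketch below is not the paper's Bayesian network; it assumes a uniform prior over a population of N potential sources, a database of n of them in which exactly one individual matches and the rest are excluded, an independent random match probability γ for non-sources, and no laboratory error. The function name is hypothetical.

```python
def posterior_source_probability(population_size, database_size, match_probability):
    """Posterior probability that the single matching database member is the source.

    Under the stated assumptions the only remaining alternatives are the
    (N - n) unsearched individuals, each of whom would require the observed
    match to be a coincidence with probability gamma, giving
    1 / (1 + (N - n) * gamma).
    """
    if not 1 <= database_size <= population_size:
        raise ValueError("database_size must lie between 1 and population_size")
    if not 0.0 <= match_probability <= 1.0:
        raise ValueError("match_probability must lie in [0, 1]")
    unsearched = population_size - database_size
    return 1.0 / (1.0 + unsearched * match_probability)


# Hypothetical numbers: the same match found with a small and a large database.
p_small_db = posterior_source_probability(101, 1, 0.01)
p_large_db = posterior_source_probability(101, 51, 0.01)
```

Under these assumptions the posterior rises as the database grows, because every excluded database member removes one alternative source; this is the direction of the result that the abstract describes as well supported but counter-intuitive.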
NASA Astrophysics Data System (ADS)
Echeverria, Alex; Silva, Jorge F.; Mendez, Rene A.; Orchard, Marcos
2016-10-01
Context: The best precision that can be achieved to estimate the location of a stellar-like object is a topic of permanent interest in the astrometric community. Aims: We analyze bounds for the best position estimation of a stellar-like object on a CCD detector array in a Bayesian setting where the position is unknown, but where we have access to a prior distribution. In contrast to a parametric setting where we estimate a parameter from observations, the Bayesian approach estimates a random object (i.e., the position is a random variable) from observations that are statistically dependent on the position. Methods: We characterize the Bayesian Cramér-Rao (CR) bound on the minimum mean square error (MMSE) of the best estimator of the position of a point source on a linear CCD-like detector, as a function of the properties of the detector, the source, and the background. Results: We quantify and analyze the increase in astrometric performance from the use of a prior distribution of the object position, which is not available in the classical parametric setting. This gain is shown to be significant for various observational regimes, in particular in the case of faint objects or when the observations are taken under poor conditions. Furthermore, we present numerical evidence that the MMSE estimator of this problem tightly achieves the Bayesian CR bound. This is a remarkable result, demonstrating that all the performance gains presented in our analysis can be achieved with the MMSE estimator. Conclusions: The Bayesian CR bound can be used as a benchmark indicator of the expected maximum positional precision of a set of astrometric measurements in which prior information can be incorporated. This bound can be achieved through the conditional mean estimator, in contrast to the parametric case where no unbiased estimator precisely reaches the CR bound.
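For orientation, the general form of the Bayesian Cramér-Rao bound (often called the van Trees inequality) for a scalar position is written out below. The notation is an assumption made here and is not taken from the paper: $x_c$ is the source position, $\mathbf{I}$ the vector of pixel counts, $\psi$ the prior density, and $\mathcal{I}(x_c)$ the classical Fisher information.

```latex
% Bayesian Cramér-Rao (van Trees) bound, scalar case, under the usual
% regularity conditions on the likelihood and the prior \psi.
\mathbb{E}_{x_c,\mathbf{I}}\!\left[\bigl(\hat{x}_c(\mathbf{I}) - x_c\bigr)^2\right]
  \;\ge\;
  \frac{1}{\mathbb{E}_{x_c}\!\left[\mathcal{I}(x_c)\right] + \mathcal{I}_{\mathrm{prior}}},
\qquad
\mathcal{I}_{\mathrm{prior}}
  = \mathbb{E}_{x_c}\!\left[\left(\frac{\partial \ln \psi(x_c)}{\partial x_c}\right)^{2}\right].
```

The prior contributes its own information term to the denominator, which is why the bound is lower than the parametric one and why the gain is largest when the data term is small, as for faint objects. For a Gaussian prior with variance $\sigma_p^2$, the prior term reduces to $1/\sigma_p^2$.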
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com. PMID:24922310
Separating Gravitational Wave Signals from Instrument Artifacts
NASA Technical Reports Server (NTRS)
Littenberg, Tyson B.; Cornish, Neil J.
2010-01-01
Central to the gravitational wave detection problem is the challenge of separating features in the data produced by astrophysical sources from features produced by the detector. Matched filtering provides an optimal solution for Gaussian noise, but in practice, transient noise excursions or "glitches" complicate the analysis. Detector diagnostics and coincidence tests can be used to veto many glitches which may otherwise be misinterpreted as gravitational wave signals. The glitches that remain can lead to long tails in the matched filter search statistics and drive up the detection threshold. Here we describe a Bayesian approach that incorporates a more realistic model for the instrument noise allowing for fluctuating noise levels that vary independently across frequency bands, and deterministic "glitch fitting" using wavelets as "glitch templates", the number of which is determined by a trans-dimensional Markov chain Monte Carlo algorithm. We demonstrate the method's effectiveness on simulated data containing low amplitude gravitational wave signals from inspiraling binary black hole systems, and simulated non-stationary and non-Gaussian noise comprised of a Gaussian component with the standard LIGO/Virgo spectrum, and injected glitches of various amplitude, prevalence, and variety. Glitch fitting allows us to detect significantly weaker signals than standard techniques.
NASA Astrophysics Data System (ADS)
Han, C.; Udalski, A.; Gould, A.; Bond, I. A.; Albrow, M. D.; Chung, S.-J.; Jung, Y. K.; Ryu, Y.-H.; Shin, I.-G.; Yee, J. C.; Zhu, W.; Cha, S.-M.; Kim, S.-L.; Kim, D.-J.; Lee, C.-U.; Lee, Y.; Park, B.-G.; KMTNet Collaboration; Skowron, J.; Mróz, P.; Pietrukowicz, P.; Kozłowski, S.; Poleski, R.; Szymański, M. K.; Soszyński, I.; Ulaczyk, K.; Pawlak, M.; OGLE Collaboration; Abe, F.; Asakura, Y.; Barry, R.; Bennett, D. P.; Bhattacharya, A.; Donachie, M.; Evans, P.; Fukui, A.; Hirao, Y.; Itow, Y.; Koshimoto, N.; Li, M. C. A.; Ling, C. H.; Masuda, K.; Matsubara, Y.; Muraki, Y.; Nagakane, M.; Ohnishi, K.; Ranc, C.; Rattenbury, N. J.; Saito, To.; Sharan, A.; Sullivan, D. J.; Sumi, T.; Suzuki, D.; Tristram, P. J.; Yamada, T.; Yamada, T.; Yonehara, A.; The MOA Collaboration
2017-10-01
We report the discovery of a planet-mass companion to the microlens OGLE-2016-BLG-0263L. Unlike most low-mass companions that were detected through perturbations to the smooth and symmetric light curves produced by the primary, the companion was discovered through the channel of a repeating event, in which the companion itself produced its own single-mass light curve after the event produced by the primary had ended. Thanks to the continuous coverage of the second peak by high-cadence surveys, the possibility of the repeating nature due to source binarity is excluded with a 96% confidence level. The mass of the companion estimated by a Bayesian analysis is $M_{\rm p} = 4.1^{+6.5}_{-2.5}\,M_{\rm J}$. The projected primary-companion separation is $a_\perp = 6.5^{+1.3}_{-1.9}$ au. The ratio of the separation to the snow-line distance of $a_\perp / a_{\rm sl} \sim 15.4$ corresponds to the region beyond Neptune, the outermost planet of the solar system. We discuss the importance of high-cadence surveys in expanding the range of microlensing detections of low-mass companions and future space-based microlensing surveys.
De March, I; Sironi, E; Taroni, F
2016-09-01
Analysis of marks recovered from different crime scenes can be useful to detect a linkage between criminal cases, even though a putative source for the recovered traces is not available. This particular circumstance is often encountered in the early stage of investigations, and thus the evaluation of evidence association may provide useful information for the investigators. This association is evaluated here from a probabilistic point of view: a likelihood ratio based approach is suggested in order to quantify the strength of the evidence of trace association in the light of two mutually exclusive propositions, namely that the n traces come from a common source or from an unspecified number of sources. To deal with this kind of problem, probabilistic graphical models are used, in the form of Bayesian networks and object-oriented Bayesian networks, allowing users to handle the uncertainty related to the inferential problem intuitively. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
English, Sangeeta B.; Shih, Shou-Ching; Ramoni, Marco F.; Smith, Lois E.; Butte, Atul J.
2014-01-01
Though genome-wide technologies, such as microarrays, are widely used, data from these methods are considered noisy; there is still varied success in downstream biological validation. We report a method that increases the likelihood of successfully validating microarray findings using real time RT-PCR, including genes at low expression levels and with small differences. We use a Bayesian network to identify the most relevant sources of noise based on the successes and failures in validation for an initial set of selected genes, and then improve our subsequent selection of genes for validation based on eliminating these sources of noise. The network displays the significant sources of noise in an experiment, and scores the likelihood of validation for every gene. We show how the method can significantly increase validation success rates. In conclusion, in this study, we have successfully added a new automated step to determine the contributory sources of noise that determine successful or unsuccessful downstream biological validation. PMID:18790084
True versus Apparent Malaria Infection Prevalence: The Contribution of a Bayesian Approach
Claes, Filip; Van Hong, Nguyen; Torres, Kathy; Mao, Sokny; Van den Eede, Peter; Thi Thinh, Ta; Gamboa, Dioni; Sochantha, Tho; Thang, Ngo Duc; Coosemans, Marc; Büscher, Philippe; D'Alessandro, Umberto; Berkvens, Dirk; Erhart, Annette
2011-01-01
Aims: To present a new approach for estimating the “true prevalence” of malaria and apply it to datasets from Peru, Vietnam, and Cambodia. Methods: Bayesian models were developed for estimating both the malaria prevalence using different diagnostic tests (microscopy, PCR, and ELISA), without the need for a gold standard, and the tests' characteristics. Several sources of information, i.e., data, expert opinions, and other sources of knowledge, can be integrated into the model. This approach, which results in an optimal and harmonized estimate of malaria infection prevalence with no conflict between the different sources of information, was tested on data from Peru, Vietnam, and Cambodia. Results: Malaria sero-prevalence was relatively low in all sites, with ELISA showing the highest estimates. The sensitivities of microscopy and ELISA were statistically lower in Vietnam than in the other sites. Similarly, the specificities of microscopy, ELISA, and PCR were significantly lower in Vietnam than in the other sites. In Vietnam and Peru, microscopy was closer to the “true” estimate than the other two tests, while, as expected, ELISA, with its lower specificity, usually overestimated the prevalence. Conclusions: Bayesian methods are useful for analyzing prevalence results when no gold standard diagnostic test is available. Though some results are expected, e.g., PCR being more sensitive than microscopy, a standardized and context-independent quantification of the diagnostic tests' characteristics (sensitivity and specificity) and the underlying malaria prevalence may be useful for comparing different sites. Indeed, the use of a single diagnostic technique could strongly bias the prevalence estimation. This limitation can be circumvented by using a Bayesian framework taking into account the imperfect characteristics of the currently available diagnostic tests. As discussed in the paper, this approach may further support global malaria burden estimation initiatives. PMID:21364745
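To make concrete why a single imperfect test can bias a prevalence estimate, the following is a minimal sketch, not the authors' model: it computes a grid posterior for the true prevalence from one test's counts, assuming a uniform prior and (unlike the paper, which estimates them jointly across tests) fixed, known sensitivity and specificity. The function name and all numbers are hypothetical.

```python
import math

def prevalence_posterior(positives, n, sensitivity, specificity, grid_size=1001):
    """Grid posterior for true prevalence under a uniform prior.

    Assumption: sensitivity and specificity are known constants.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for pi in grid:
        # Probability that a randomly sampled individual tests positive.
        p_pos = pi * sensitivity + (1.0 - pi) * (1.0 - specificity)
        p_pos = min(max(p_pos, 1e-12), 1.0 - 1e-12)
        log_post.append(
            positives * math.log(p_pos) + (n - positives) * math.log(1.0 - p_pos)
        )
    # Normalise on the log scale to avoid underflow.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]

# Hypothetical survey: 120 positives out of 1000 with an imperfect test.
grid, post = prevalence_posterior(120, 1000, sensitivity=0.90, specificity=0.95)
posterior_mean = sum(g * p for g, p in zip(grid, post))
apparent = 120 / 1000
print(f"apparent prevalence {apparent:.3f}, posterior mean {posterior_mean:.3f}")
```

With these illustrative numbers the false positives outweigh the false negatives, so the apparent prevalence overstates the posterior estimate of the true prevalence.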
NASA Astrophysics Data System (ADS)
Li, D.
2016-12-01
Sudden water pollution accidents are unavoidable risk events that we must learn to coexist with. In China's Taihu River Basin, river flow conditions are complicated by frequent artificial interference. Sudden water pollution accidents occur mainly as large abnormal discharges of wastewater and are characterized by sudden occurrence, uncontrollable scope, uncertain affected objects, and a concentrated distribution of many risk sources. Effective prevention of potential pollution accidents is of great significance for water quality safety management. Bayesian networks can be applied to represent the relationship between pollution sources and river water quality intuitively. Using the time-sequential Monte Carlo algorithm, the pollution source state-switching model, the water quality model for the river network, and Bayesian reasoning are integrated, and a sudden water pollution risk assessment model for the river network is developed to quantify the water quality risk under the collective influence of multiple pollution sources. Based on the isotope water transport mechanism, a dynamic tracing model of multiple pollution sources is established, which can describe the relationship between the excessive risk of the system and the multiple risk sources. Finally, the diagnostic reasoning algorithm based on the Bayesian network is coupled with the multi-source tracing model, which can identify the contribution of each risk source to the system risk under complex flow conditions. Taking the Taihu Lake water system as the research object, the model is applied and yields reasonable results for three typical years. The study shows that the water quality risk at critical sections is influenced by the pollution risk sources, the boundary water quality, the hydrological conditions, and the self-purification capacity, and that multiple pollution sources have an obvious effect on the water quality risk of the receiving water body. The water quality risk assessment approach developed in this study offers an effective tool for systematically quantifying the random uncertainty in plain river network systems, and it also provides technical support for decision-making on controlling sudden water pollution through identification of critical pollution sources.
Wu, Jianyong; Gronewold, Andrew D; Rodriguez, Roberto A; Stewart, Jill R; Sobsey, Mark D
2014-02-01
Rapid quantification of viral pathogens in drinking and recreational water can help reduce waterborne disease risks. For this purpose, small-volume samples (e.g., 1 L) are favored because of the convenience of collection, transportation, and processing. However, the results of viral analysis are often subject to uncertainty. To overcome this limitation, we propose an approach that integrates Bayesian statistics, efficient concentration methods, and quantitative PCR (qPCR) to quantify viral pathogens in water. Using this approach, we quantified human adenoviruses (HAdVs) in eighteen samples of source water collected from six drinking water treatment plants. HAdVs were found in seven samples. In the other eleven samples, HAdVs were not detected by qPCR, but might have existed based on Bayesian inference. Our integrated approach that quantifies uncertainty provides a better understanding than conventional assessments of potential risks to public health, particularly in cases when pathogens may present a threat but cannot be detected by traditional methods. © 2013 Elsevier B.V. All rights reserved.
A Flexible Hierarchical Bayesian Modeling Technique for Risk Analysis of Major Accidents.
Yu, Hongyang; Khan, Faisal; Veitch, Brian
2017-09-01
Safety analysis of rare events with potentially catastrophic consequences is challenged by data scarcity and uncertainty. Traditional causation-based approaches, such as fault trees and event trees (used to model rare events), suffer from a number of weaknesses. These include the static structure of the event causation, lack of event occurrence data, and need for reliable prior information. In this study, a new hierarchical Bayesian modeling based technique is proposed to overcome these drawbacks. The proposed technique can be used as a flexible technique for risk analysis of major accidents. It enables both forward and backward analysis in quantitative reasoning and the treatment of interdependence among the model parameters. Source-to-source variability in data sources is also taken into account through a robust probabilistic safety analysis. The applicability of the proposed technique has been demonstrated through a case study in the marine and offshore industry. © 2017 Society for Risk Analysis.
Bayesian multiple-source localization in an uncertain ocean environment.
Dosso, Stan E; Wilmut, Michael J
2011-06-01
This paper considers simultaneous localization of multiple acoustic sources when properties of the ocean environment (water column and seabed) are poorly known. A Bayesian formulation is developed in which the environmental parameters, noise statistics, and locations and complex strengths (amplitudes and phases) of multiple sources are considered to be unknown random variables constrained by acoustic data and prior information. Two approaches are considered for estimating source parameters. Focalization maximizes the posterior probability density (PPD) over all parameters using adaptive hybrid optimization. Marginalization integrates the PPD using efficient Markov-chain Monte Carlo methods to produce joint marginal probability distributions for source ranges and depths, from which source locations are obtained. This approach also provides quantitative uncertainty analysis for all parameters, which can aid in understanding of the inverse problem and may be of practical interest (e.g., source-strength probability distributions). In both approaches, closed-form maximum-likelihood expressions for source strengths and noise variance at each frequency allow these parameters to be sampled implicitly, substantially reducing the dimensionality and difficulty of the inversion. Examples are presented of both approaches applied to single- and multi-frequency localization of multiple sources in an uncertain shallow-water environment, and a Monte Carlo performance evaluation study is carried out. © 2011 Acoustical Society of America
Bayesian Population Forecasting: Extending the Lee-Carter Method.
Wiśniowski, Arkadiusz; Smith, Peter W F; Bijak, Jakub; Raymer, James; Forster, Jonathan J
2015-06-01
In this article, we develop a fully integrated and dynamic Bayesian approach to forecast populations by age and sex. The approach embeds the Lee-Carter type models for forecasting the age patterns, with associated measures of uncertainty, of fertility, mortality, immigration, and emigration within a cohort projection model. The methodology may be adapted to handle different data types and sources of information. To illustrate, we analyze time series data for the United Kingdom and forecast the components of population change to the year 2024. We also compare the results obtained from different forecast models for age-specific fertility, mortality, and migration. In doing so, we demonstrate the flexibility and advantages of adopting the Bayesian approach for population forecasting and highlight areas where this work could be extended.
On a full Bayesian inference for force reconstruction problems
NASA Astrophysics Data System (ADS)
Aucejo, M.; De Smet, O.
2018-05-01
In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time-invariant structure. The proposed approach was derived from Bayesian statistics because of its ability to account mathematically for the experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model, and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
Bayesian focalization: quantifying source localization with environmental uncertainty.
Dosso, Stan E; Wilmut, Michael J
2007-05-01
This paper applies a Bayesian formulation to study ocean acoustic source localization as a function of uncertainty in environmental properties (water column and seabed) and of data information content [signal-to-noise ratio (SNR) and number of frequencies]. The approach follows that of the optimum uncertain field processor [A. M. Richardson and L. W. Nolte, J. Acoust. Soc. Am. 89, 2280-2284 (1991)], in that localization uncertainty is quantified by joint marginal probability distributions for source range and depth integrated over uncertain environmental properties. The integration is carried out here using Metropolis Gibbs' sampling for environmental parameters and heat-bath Gibbs' sampling for source location to provide efficient sampling over complicated parameter spaces. The approach is applied to acoustic data from a shallow-water site in the Mediterranean Sea where previous geoacoustic studies have been carried out. It is found that reliable localization requires a sufficient combination of prior (environmental) information and data information. For example, sources can be localized reliably for single-frequency data at low SNR (-3 dB) only with small environmental uncertainties, whereas successful localization with large environmental uncertainties requires higher SNR and/or multifrequency data.
An LUR/BME framework to estimate PM2.5 explained by on road mobile and stationary sources.
Reyes, Jeanette M; Serre, Marc L
2014-01-01
Knowledge of particulate matter concentrations <2.5 μm in diameter (PM2.5) across the United States is limited due to sparse monitoring across space and time. Epidemiological studies need accurate exposure estimates in order to properly investigate potential morbidity and mortality. Previous works have used geostatistics and land use regression (LUR) separately to quantify exposure. This work combines both methods by incorporating a large area variability LUR model that accounts for on road mobile emissions and stationary source emissions along with data that take into account incompleteness of PM2.5 monitors into the modern geostatistical Bayesian Maximum Entropy (BME) framework to estimate PM2.5 across the United States from 1999 to 2009. A cross-validation was done to determine the improvement of the estimate due to the LUR incorporation into BME. These results were applied to known diseases to determine predicted mortality coming from total PM2.5 as well as PM2.5 explained by major contributing sources. This method showed a mean squared error reduction of over 21.89% over simple kriging. PM2.5 explained by on road mobile emissions and stationary emissions contributed to nearly 568,090 and 306,316 deaths, respectively, across the United States from 1999 to 2007.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graves, Todd L; Hamada, Michael S
2008-01-01
Good estimates of the reliability of a system make use of test data and expert knowledge at all available levels. Furthermore, by integrating all these information sources, one can determine how best to allocate scarce testing resources to reduce uncertainty. Both of these goals are facilitated by modern Bayesian computational methods. We apply these tools to examples that were previously solvable only through the use of ingenious approximations, and use genetic algorithms to guide resource allocation.
Storage and Retrieval Changes that Occur in the Development and Release of PI
ERIC Educational Resources Information Center
Chechile, Richard; Butler, Keith
1975-01-01
A Bayesian statistical procedure separating storage from retrieval was used to study development and release of proactive interference in the Brown-Peterson paradigm. A theory of PI is developed stressing response competition at test time and interference in transfer between short- and long-term memory. (CHK)
Thurman, Steven M; Lu, Hongjing
2014-01-01
Visual form analysis is fundamental to shape perception and likely plays a central role in perception of more complex dynamic shapes, such as moving objects or biological motion. Two primary form-based cues serve to represent the overall shape of an object: the spatial position and the orientation of locations along the boundary of the object. However, it is unclear how the visual system integrates these two sources of information in dynamic form analysis, and in particular how the brain resolves ambiguities due to sensory uncertainty and/or cue conflict. In the current study, we created animations of sparsely-sampled dynamic objects (human walkers or rotating squares) comprised of oriented Gabor patches in which orientation could either coincide or conflict with information provided by position cues. When the cues were incongruent, we found a characteristic trade-off between position and orientation information whereby position cues increasingly dominated perception as the relative uncertainty of orientation increased and vice versa. Furthermore, we found no evidence for differences in the visual processing of biological and non-biological objects, casting doubt on the claim that biological motion may be specialized in the human brain, at least in specific terms of form analysis. To explain these behavioral results quantitatively, we adopt a probabilistic template-matching model that uses Bayesian inference within local modules to estimate object shape separately from either spatial position or orientation signals. The outputs of the two modules are integrated with weights that reflect individual estimates of subjective cue reliability, and integrated over time to produce a decision about the perceived dynamics of the input data. Results of this model provided a close fit to the behavioral data, suggesting a mechanism in the human visual system that approximates rational Bayesian inference to integrate position and orientation signals in dynamic form analysis.
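The reliability-weighted integration described above can be sketched with the textbook rule for fusing two independent Gaussian cue estimates, where each weight is proportional to the cue's inverse variance. This is a minimal illustration under that assumption, not the authors' template-matching model; the function name and numbers are hypothetical.

```python
def combine_cues(position_estimate, position_var, orientation_estimate, orientation_var):
    """Fuse two independent Gaussian cue estimates by inverse-variance weighting."""
    precision_pos = 1.0 / position_var
    precision_ori = 1.0 / orientation_var
    # The more reliable (lower-variance) cue receives the larger weight.
    w_pos = precision_pos / (precision_pos + precision_ori)
    w_ori = 1.0 - w_pos
    fused = w_pos * position_estimate + w_ori * orientation_estimate
    fused_var = 1.0 / (precision_pos + precision_ori)
    return fused, fused_var, w_pos

# Conflicting cues (0 vs 10) with equal reliability, then with noisier orientation.
fused_equal, var_equal, w_equal = combine_cues(0.0, 1.0, 10.0, 1.0)
fused_noisy, var_noisy, w_noisy = combine_cues(0.0, 1.0, 10.0, 9.0)
print(fused_equal, w_equal)
print(fused_noisy, w_noisy)
```

As the orientation variance grows, the fused estimate moves toward the position cue, which mirrors the trade-off reported in the abstract.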
Archer, S C; Mc Coy, F; Wapenaar, W; Green, M J
2014-01-01
The aim of this research was to determine budgets for specific management interventions to control heifer mastitis in Irish dairy herds as an example of evidence synthesis and 1-step Bayesian micro-simulation in a veterinary context. Budgets were determined for different decision makers based on their willingness to pay. Reducing the prevalence of heifers with a high milk somatic cell count (SCC) early in the first lactation could be achieved through herd-level management interventions for pre- and peri-partum heifers; however, the cost-effectiveness of these interventions is unknown. A synthesis of multiple sources of evidence, accounting for variability and uncertainty in the available data, is invaluable to inform decision makers about the likely economic outcomes of investing in disease control measures. One analytical approach to this is Bayesian micro-simulation, where the trajectory of different individuals undergoing specific interventions is simulated. The classic micro-simulation framework was extended to encompass synthesis of evidence from 2 separate statistical models and previous research, with the outcome for an individual cow or herd assessed in terms of changes in lifetime milk yield, disposal risk, and likely financial returns conditional on the interventions being simultaneously applied. The 3 interventions tested were storage of bedding inside, decreasing transition yard stocking density, and spreading of bedding evenly in the calving area. Budgets for the interventions were determined based on the minimum expected return on investment, and the probability of the desired outcome. Budgets for interventions to control heifer mastitis were highly dependent on the decision maker's willingness to pay, and hence minimum expected return on investment. Understanding the requirements of decision makers and their rational spending limits would be useful for the development of specific interventions for particular farms to control heifer mastitis, and other endemic diseases. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Irving, J.; Koepke, C.; Elsheikh, A. H.
2017-12-01
Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward process model linking subsurface parameters to measured data, which is typically assumed to be known perfectly in the inversion procedure. However, in order to make the stochastic solution of the inverse problem computationally tractable using, for example, Markov-chain-Monte-Carlo (MCMC) methods, fast approximations of the forward model are commonly employed. This introduces model error into the problem, which has the potential to significantly bias posterior statistics and hamper data integration efforts if not properly accounted for. Here, we present a new methodology for addressing the issue of model error in Bayesian solutions to hydrogeophysical inverse problems that is geared towards the common case where these errors cannot be effectively characterized globally through some parametric statistical distribution or locally based on interpolation between a small number of computed realizations. Rather than focusing on the construction of a global or local error model, we instead work towards identification of the model-error component of the residual through a projection-based approach. In this regard, pairs of approximate and detailed model runs are stored in a dictionary that grows at a specified rate during the MCMC inversion procedure. At each iteration, a local model-error basis is constructed for the current test set of model parameters using the K-nearest neighbour entries in the dictionary, which is then used to separate the model error from the other error sources before computing the likelihood of the proposed set of model parameters. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar traveltime data for three different subsurface parameterizations of varying complexity. The synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed in the inversion procedure. In each case, the developed model-error approach makes it possible to remove posterior bias and obtain a more realistic characterization of uncertainty.
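The projection step described above can be sketched roughly as follows. This is a hedged illustration only: the abstract does not say how neighbours are ranked or how the basis is built, so Euclidean distance in parameter space and Gram-Schmidt orthonormalisation are assumptions, and the dictionary layout and function names are hypothetical.

```python
import math

def k_nearest_errors(theta, dictionary, k):
    """Model-error vectors (detailed minus approximate) of the k entries nearest to theta."""
    ranked = sorted(dictionary, key=lambda entry: math.dist(theta, entry["theta"]))
    return [
        [d - a for d, a in zip(entry["detailed"], entry["approx"])]
        for entry in ranked[:k]
    ]

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt orthonormalisation; (near-)dependent vectors are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def remove_model_error(residual, basis):
    """Subtract the projection of the residual onto the local model-error basis."""
    cleaned = list(residual)
    for b in basis:
        coeff = sum(ri * bi for ri, bi in zip(residual, b))
        cleaned = [ci - coeff * bi for ci, bi in zip(cleaned, b)]
    return cleaned

# Toy dictionary of paired runs; all values are made up for illustration.
dictionary = [
    {"theta": [0.0], "approx": [0.0, 0.0, 0.0], "detailed": [1.0, 2.0, 0.0]},
    {"theta": [1.0], "approx": [0.0, 0.0, 0.0], "detailed": [2.0, 4.0, 0.0]},
    {"theta": [5.0], "approx": [0.0, 0.0, 0.0], "detailed": [0.0, 0.0, 3.0]},
]
errors = k_nearest_errors([0.4], dictionary, k=2)
basis = orthonormal_basis(errors)
residual = [0.3, 0.6, 0.05]  # data minus approximate forward response
cleaned = remove_model_error(residual, basis)
print(cleaned)
```

In an MCMC loop, the cleaned residual (rather than the raw one) would enter the likelihood for the proposed parameters; here the component along the local model-error direction is removed and only the third entry survives.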
NASA Technical Reports Server (NTRS)
He, Yuning
2015-01-01
The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.
Robust nonlinear system identification: Bayesian mixture of experts using the t-distribution
NASA Astrophysics Data System (ADS)
Baldacchino, Tara; Worden, Keith; Rowson, Jennifer
2017-02-01
A novel variational Bayesian mixture of experts model for robust regression of bifurcating and piece-wise continuous processes is introduced. The mixture of experts model is a powerful model which probabilistically splits the input space, allowing different models to operate in the separate regions. However, current methods have no fail-safe against outliers. In this paper, a robust mixture of experts model is proposed which consists of Student-t mixture models at the gates and Student-t distributed experts, trained via Bayesian inference. The Student-t distribution has heavier tails than the Gaussian distribution, and so it is more robust to outliers, noise, and non-normality in the data. Using both simulated data and real data obtained from the Z24 bridge, this robust mixture of experts performs better than its Gaussian counterpart when outliers are present. In particular, it provides robustness to outliers in two forms: unbiased parameter regression models, and robustness to overfitting/complex models.
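The heavier-tails claim can be checked numerically with a small sketch that compares standard log-densities at an outlying point. The densities are the usual textbook forms; the choice of 3 degrees of freedom and the outlier value are illustrative assumptions, not values from the paper.

```python
import math

def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)
    )

outlier = 6.0  # six scale units from the centre
gauss_lp = gaussian_logpdf(outlier)
t_lp = student_t_logpdf(outlier, nu=3.0)
print(f"Gaussian log-density: {gauss_lp:.2f}, Student-t log-density: {t_lp:.2f}")
```

Because the Student-t assigns far more density to the outlier, a single extreme observation pulls a t-distributed expert's fit much less than it would a Gaussian one.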
Whose statistical reasoning is facilitated by a causal structure intervention?
McNair, Simon; Feeney, Aidan
2015-02-01
People often struggle when making Bayesian probabilistic estimates on the basis of competing sources of statistical evidence. Recently, Krynski and Tenenbaum (Journal of Experimental Psychology: General, 136, 430-450, 2007) proposed that a causal Bayesian framework accounts for people's errors in Bayesian reasoning and showed that, by clarifying the causal relations among the pieces of evidence, judgments on a classic statistical reasoning problem could be significantly improved. We aimed to understand whose statistical reasoning is facilitated by the causal structure intervention. In Experiment 1, although we observed causal facilitation effects overall, the effect was confined to participants high in numeracy. We did not find an overall facilitation effect in Experiment 2 but did replicate the earlier interaction between numerical ability and the presence or absence of causal content. This effect held when we controlled for general cognitive ability and thinking disposition. Our results suggest that clarifying causal structure facilitates Bayesian judgments, but only for participants with sufficient understanding of basic concepts in probability and statistics.
Bayesian sample size calculations in phase II clinical trials using a mixture of informative priors.
Gajewski, Byron J; Mayo, Matthew S
2006-08-15
A number of researchers have discussed phase II clinical trials from a Bayesian perspective. A recent article by Mayo and Gajewski focuses on sample size calculations, which they determine by specifying an informative prior distribution and then calculating a posterior probability that the true response will exceed a prespecified target. In this article, we extend these sample size calculations to include a mixture of informative prior distributions. The mixture comes from several sources of information. For example, consider information from two (or more) clinicians. The first clinician is pessimistic about the drug and the second clinician is optimistic. We tabulate the results for sample size design using the fact that the simple mixture of Betas is a conjugate family for the Beta-Binomial model. We discuss the theoretical framework for these types of Bayesian designs and show that the Bayesian designs in this paper approximate this theoretical framework. Copyright 2006 John Wiley & Sons, Ltd.
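The conjugacy property mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the two prior components, and the trial outcome (14 responses in 20 patients) are all hypothetical values chosen for illustration. Each Beta component is updated as usual, and the mixture weights are rescaled by the marginal likelihood of the data under each component.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def update_beta_mixture(prior, successes, trials):
    """Update a mixture of Beta priors with binomial data.

    prior: list of (weight, a, b) tuples (hypothetical representation).
    Returns the posterior mixture in the same format.
    """
    failures = trials - successes
    log_weights, components = [], []
    for w, a, b in prior:
        a_post, b_post = a + successes, b + failures
        # Weight is rescaled by the component's marginal likelihood;
        # the binomial coefficient cancels in the normalisation.
        log_weights.append(log(w) + log_beta(a_post, b_post) - log_beta(a, b))
        components.append((a_post, b_post))
    m = max(log_weights)
    unnorm = [exp(lw - m) for lw in log_weights]
    total = sum(unnorm)
    return [(u / total, a, b) for u, (a, b) in zip(unnorm, components)]


def mixture_mean(mixture):
    """Mean response rate under a mixture of Betas."""
    return sum(w * a / (a + b) for w, a, b in mixture)


# Assumed example: a pessimistic and an optimistic clinician, equally weighted.
prior = [(0.5, 2.0, 8.0), (0.5, 8.0, 2.0)]
posterior = update_beta_mixture(prior, successes=14, trials=20)
posterior_mean = mixture_mean(posterior)
```

Because the posterior is again a mixture of Betas, the same function could be applied repeatedly as further data arrive.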
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
The performance of matched-field track-before-detect methods using shallow-water Pacific data.
Tantum, Stacy L; Nolte, Loren W; Krolik, Jeffrey L; Harmanci, Kerem
2002-07-01
Matched-field track-before-detect processing, which extends the concept of matched-field processing to include modeling of the source dynamics, has recently emerged as a promising approach for maintaining the track of a moving source. In this paper, optimal Bayesian and minimum variance beamforming track-before-detect algorithms which incorporate a priori knowledge of the source dynamics in addition to the underlying uncertainties in the ocean environment are presented. A Markov model is utilized for the source motion as a means of capturing the stochastic nature of the source dynamics without assuming uniform motion. In addition, the relationship between optimal Bayesian track-before-detect processing and minimum variance track-before-detect beamforming is examined, revealing how an optimal tracking philosophy may be used to guide the modification of existing beamforming techniques to incorporate track-before-detect capabilities. Further, the benefits of implementing an optimal approach over conventional methods are illustrated through application of these methods to shallow-water Pacific data collected as part of the SWellEX-1 experiment. The results show that incorporating Markovian dynamics for the source motion provides marked improvement in the ability to maintain target track without the use of a uniform velocity hypothesis.
Anezaki, Katsunori; Nakano, Takeshi; Kashiwagi, Nobuhisa
2016-01-19
Using the chemical balance method, and considering the presence of unidentified sources, we estimated the origins of PCB contamination in surface sediments of Muroran Port, Japan. It was assumed that these PCBs originated from four types of Kanechlor products (KC300, KC400, KC500, and KC600), combustion and two kinds of pigments (azo and phthalocyanine). The characteristics of these congener patterns were summarized on the basis of principal component analysis and explanatory variables determined. A Bayesian semifactor model (CMBK2) was applied to the explanatory variables to analyze the sources of PCBs in the sediments. The resulting estimates of the contribution ratio of each kind of sediment indicate that the existence of unidentified sources can be ignored and that the assumed seven sources are adequate to account for the contamination. Within the port, the contribution ratio of KC500 and KC600 (used as paints for ship hulls) was extremely high, but outside the port, the influence of azo pigments was observable to a limited degree. This indicates that environmental PCBs not derived from technical PCBs are present at levels that cannot be ignored.
Source partitioning of methane emissions and its seasonality in the U.S. Midwest
USDA-ARS's Scientific Manuscript database
The methane (CH4) budget and its source partitioning are poorly constrained in the Midwestern, United States. We used tall tower (185 m) aerodynamic flux measurements and atmospheric scale factor Bayesian inversions (SFBI) to constrain the monthly budget and to partition the total budget into natura...
NASA Astrophysics Data System (ADS)
Validi, AbdoulAhad
2014-03-01
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.
NASA Astrophysics Data System (ADS)
Bagnardi, M.; Hooper, A. J.
2017-12-01
Inversions of geodetic observational data, such as Interferometric Synthetic Aperture Radar (InSAR) and Global Navigation Satellite System (GNSS) measurements, are often performed to obtain information about the source of surface displacements. Inverse problem theory has been applied to study magmatic processes, the earthquake cycle, and other phenomena that cause deformation of the Earth's interior and of its surface. Together with increasing improvements in data resolution, both spatial and temporal, new satellite missions (e.g., European Commission's Sentinel-1 satellites) are providing the unprecedented opportunity to access space-geodetic data within hours from their acquisition. To truly take advantage of these opportunities we must become able to interpret geodetic data in a rapid and robust manner. Here we present the open-source Geodetic Bayesian Inversion Software (GBIS; available for download at http://comet.nerc.ac.uk/gbis). GBIS is written in Matlab and offers a series of user-friendly and interactive pre- and post-processing tools. For example, an interactive function has been developed to estimate the characteristics of noise in InSAR data by calculating the experimental semi-variogram. The inversion software uses a Markov-chain Monte Carlo algorithm, incorporating the Metropolis-Hastings algorithm with adaptive step size, to efficiently sample the posterior probability distribution of the different source parameters. The probabilistic Bayesian approach allows the user to retrieve estimates of the optimal (best-fitting) deformation source parameters together with the associated uncertainties produced by errors in the data (and by scaling, errors in the model). 
The current version of GBIS (V1.0) includes fast analytical forward models for magmatic sources of different geometry (e.g., point source, finite spherical source, prolate spheroid source, penny-shaped sill-like source, and dipping-dike with uniform opening) and for dipping faults with uniform slip, embedded in an isotropic elastic half-space. However, the software architecture allows the user to easily add any other analytical or numerical forward models to calculate displacements at the surface. GBIS is delivered with a detailed user manual and three synthetic datasets for testing and practical training.
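The sampling strategy named in this abstract, Metropolis-Hastings with an adaptive step size, can be sketched in a few lines. This is not GBIS code (GBIS is written in Matlab) and does not reproduce its adaptation rule. The one-dimensional Gaussian target, the function name `adaptive_metropolis`, and the 30% target acceptance rate are assumptions made for illustration only.

```python
import math
import random


def adaptive_metropolis(log_post, x0, n_samples, burn_in=2000,
                        step=1.0, adapt_every=100, target_rate=0.3, seed=0):
    """Random-walk Metropolis sampler; the step is tuned during burn-in only."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples, accepted = [], 0
    for i in range(1, burn_in + n_samples + 1):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_post(proposal)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
            accepted += 1
        if i <= burn_in:
            if i % adapt_every == 0:
                # Widen the step if too many proposals are accepted, else shrink it.
                rate = accepted / adapt_every
                step *= 1.1 if rate > target_rate else 1.0 / 1.1
                accepted = 0
        else:
            samples.append(x)
    return samples, step


def log_post(x):
    """Assumed toy posterior: Gaussian with mean 3.0 and standard deviation 0.5."""
    return -0.5 * ((x - 3.0) / 0.5) ** 2


samples, final_step = adaptive_metropolis(log_post, x0=0.0, n_samples=5000)
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
```

Freezing the step size after burn-in is a deliberate choice in this sketch: it keeps the retained samples drawn from a fixed, valid Markov chain.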
Wide Binaries in TGAS: Search Method and First Results
NASA Astrophysics Data System (ADS)
Andrews, Jeff J.; Chanamé, Julio; Agüeros, Marcel A.
2018-04-01
Half of all stars reside in binary systems, many of which have orbital separations in excess of 1000 AU. Such binaries are typically identified in astrometric catalogs by matching the proper motion vectors of close stellar pairs. We present a fully Bayesian method that properly takes into account positions, proper motions, parallaxes, and their correlated uncertainties to identify widely separated stellar binaries. After applying our method to the >2 × 10^6 stars in the Tycho-Gaia astrometric solution from Gaia DR1, we identify over 6000 candidate wide binaries. For those pairs with separations less than 40,000 AU, we determine the contamination rate to be ~5%. This sample has an orbital separation (a) distribution that is roughly flat in log space for separations less than ~5000 AU and follows a power law of a^(-1.6) at larger separations.
Model-based Bayesian signal extraction algorithm for peripheral nerves
NASA Astrophysics Data System (ADS)
Eggers, Thomas E.; Dweiri, Yazan M.; McCallum, Grant A.; Durand, Dominique M.
2017-10-01
Objective. Multi-channel cuff electrodes have recently been investigated for extracting fascicular-level motor commands from mixed neural recordings. Such signals could provide volitional, intuitive control over a robotic prosthesis for amputee patients. Recent work has demonstrated success in extracting these signals in acute and chronic preparations using spatial filtering techniques. These extracted signals, however, had low signal-to-noise ratios and thus limited their utility to binary classification. In this work a new algorithm is proposed which combines previous source localization approaches to create a model based method which operates in real time. Approach. To validate this algorithm, a saline benchtop setup was created to allow the precise placement of artificial sources within a cuff and interference sources outside the cuff. The artificial source was taken from five seconds of chronic neural activity to replicate realistic recordings. The proposed algorithm, hybrid Bayesian signal extraction (HBSE), is then compared to previous algorithms, beamforming and a Bayesian spatial filtering method, on this test data. An example chronic neural recording is also analyzed with all three algorithms. Main results. The proposed algorithm improved the signal to noise and signal to interference ratio of extracted test signals two to three fold, as well as increased the correlation coefficient between the original and recovered signals by 10-20%. These improvements translated to the chronic recording example and increased the calculated bit rate between the recovered signals and the recorded motor activity. Significance. HBSE significantly outperforms previous algorithms in extracting realistic neural signals, even in the presence of external noise sources. 
These results demonstrate the feasibility of extracting dynamic motor signals from a multi-fascicled intact nerve trunk, which in turn could extract motor command signals from an amputee for the end goal of controlling a prosthetic limb.
Efficient Bayesian experimental design for contaminant source identification
NASA Astrophysics Data System (ADS)
Zhang, J.; Zeng, L.
2013-12-01
In this study, an efficient full Bayesian approach is developed for the optimal sampling well location design and source parameter identification of groundwater contaminants. An information measure, i.e., the relative entropy, is employed to quantify the information gain from indirect concentration measurements in identifying unknown source parameters such as the release time, strength and location. In this approach, the sampling location that gives the maximum relative entropy is selected as the optimal one. Once the sampling location is determined, a Bayesian approach based on Markov Chain Monte Carlo (MCMC) is used to estimate unknown source parameters. In both the design and estimation, the contaminant transport equation is required to be solved many times to evaluate the likelihood. To reduce the computational burden, an interpolation method based on the adaptive sparse grid is utilized to construct a surrogate for the contaminant transport. The approximated likelihood can be evaluated directly from the surrogate, which greatly accelerates the design and estimation process. The accuracy and efficiency of our approach are demonstrated through numerical case studies. Compared with the traditional optimal design, which is based on the Gaussian linear assumption, the method developed in this study can cope with arbitrary nonlinearity. It can be used to assist in groundwater monitoring network design and identification of unknown contaminant sources. [Figure: Contours of the expected information gain. The optimal observing location corresponds to the maximum value.] [Figure: Posterior marginal probability densities of the unknown parameters. The thick solid black lines are for the designed location; for comparison, the other 7 lines are for randomly chosen locations. The true values are denoted by vertical lines. The unknown parameters are estimated better with the designed location.]
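The design criterion described in this abstract, choosing the sampling location with the largest expected relative entropy between posterior and prior, can be sketched on a toy problem. Everything below is an assumption made for illustration and is not taken from the study: a single unknown source strength on a discrete grid, a linear "sensitivity" standing in for the transport model, Gaussian noise, and three hypothetical wells. The study itself uses a sparse-grid surrogate of the transport equation, which is not reproduced here.

```python
import math
import random


def expected_information_gain(sensitivity, thetas, sigma=0.5, n_noise=200, seed=0):
    """Monte Carlo estimate of the expected KL divergence from prior to posterior.

    Assumed toy observation model: y = sensitivity * theta + Gaussian noise,
    with a uniform prior over the discrete grid `thetas`.
    """
    rng = random.Random(seed)  # same seed per design: common random numbers
    prior = 1.0 / len(thetas)
    total = 0.0
    for theta_true in thetas:
        for _ in range(n_noise):
            y = sensitivity * theta_true + rng.gauss(0.0, sigma)
            log_like = [-0.5 * ((y - sensitivity * t) / sigma) ** 2 for t in thetas]
            m = max(log_like)
            weights = [math.exp(ll - m) for ll in log_like]
            z = sum(weights)
            post = [w / z for w in weights]
            # Relative entropy of the posterior with respect to the uniform prior.
            total += sum(p * math.log(p / prior) for p in post if p > 0.0)
    return total / (len(thetas) * n_noise)


# Hypothetical candidate wells and their sensitivities to the source strength.
designs = {"well_A": 0.2, "well_B": 1.0, "well_C": 0.5}
thetas = [0.5 * i for i in range(11)]
gains = {name: expected_information_gain(s, thetas) for name, s in designs.items()}
best = max(gains, key=gains.get)
```

In this toy setting the most sensitive well is the most informative one; with a nonlinear transport model the ranking would generally not be readable from a single sensitivity value, which is the case the abstract addresses.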
Acoustic emission based damage localization in composites structures using Bayesian identification
NASA Astrophysics Data System (ADS)
Kundu, A.; Eaton, M. J.; Al-Jumali, S.; Sikdar, S.; Pullin, R.
2017-05-01
Acoustic emission based damage detection in composite structures is based on detection of ultra high frequency packets of acoustic waves emitted from damage sources (such as fibre breakage, fatigue fracture, amongst others) with a network of distributed sensors. This non-destructive monitoring scheme requires solving an inverse problem where the measured signals are linked back to the location of the source. This in turn enables rapid deployment of mitigative measures. The presence of significant amount of uncertainty associated with the operating conditions and measurements makes the problem of damage identification quite challenging. The uncertainties stem from the fact that the measured signals are affected by the irregular geometries, manufacturing imprecision, imperfect boundary conditions, existing damages/structural degradation, amongst others. This work aims to tackle these uncertainties within a framework of automated probabilistic damage detection. The method trains a probabilistic model of the parametrized input and output model of the acoustic emission system with experimental data to give probabilistic descriptors of damage locations. A response surface modelling the acoustic emission as a function of parametrized damage signals collected from sensors would be calibrated with a training dataset using Bayesian inference. This is used to deduce damage locations in the online monitoring phase. During online monitoring, the spatially correlated time data is utilized in conjunction with the calibrated acoustic emissions model to infer the probabilistic description of the acoustic emission source within a hierarchical Bayesian inference framework. The methodology is tested on a composite structure consisting of carbon fibre panel with stiffeners and damage source behaviour has been experimentally simulated using standard H-N sources. 
The methodology presented in this study would be applicable in the current form to structural damage detection under varying operational loads and would be investigated in future studies.
Application of hierarchical Bayesian unmixing models in river sediment source apportionment
NASA Astrophysics Data System (ADS)
Blake, Will; Smith, Hugh; Navas, Ana; Bodé, Samuel; Goddard, Rupert; Zou Kuzyk, Zou; Lennard, Amy; Lobb, David; Owens, Phil; Palazon, Leticia; Petticrew, Ellen; Gaspar, Leticia; Stock, Brian; Boeckx, Pascal; Semmens, Brice
2016-04-01
Fingerprinting and unmixing concepts are used widely across environmental disciplines for forensic evaluation of pollutant sources. In aquatic and marine systems, this includes tracking the source of organic and inorganic pollutants in water and linking problem sediment to soil erosion and land use sources. It is, however, the particular complexity of ecological systems that has driven creation of the most sophisticated mixing models, primarily to (i) evaluate diet composition in complex ecological food webs, (ii) inform population structure and (iii) explore animal movement. In the context of the new hierarchical Bayesian unmixing model, MIXSIAR, developed to characterise intra-population niche variation in ecological systems, we evaluate the linkage between ecological 'prey' and 'consumer' concepts and river basin sediment 'source' and sediment 'mixtures' to exemplify the value of ecological modelling tools to river basin science. Recent studies have outlined advantages presented by Bayesian unmixing approaches in handling complex source and mixture datasets while dealing appropriately with uncertainty in parameter probability distributions. MixSIAR is unique in that it allows individual fixed and random effects associated with mixture hierarchy, i.e. factors that might exert an influence on model outcome for mixture groups, to be explored within the source-receptor framework. This offers new and powerful ways of interpreting river basin apportionment data. In this contribution, key components of the model are evaluated in the context of common experimental designs for sediment fingerprinting studies namely simple, nested and distributed catchment sampling programmes. 
Illustrative examples using geochemical and compound specific stable isotope datasets are presented and used to discuss best practice with specific attention to (1) the tracer selection process, (2) incorporation of fixed effects relating to sample timeframe and sediment type in the modelling process, (3) deriving and using informative priors in sediment fingerprinting context and (4) transparency of the process and replication of model results by other users.
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield had the highest R2 (0.82, achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
Objectively combining AR5 instrumental period and paleoclimate climate sensitivity evidence
NASA Astrophysics Data System (ADS)
Lewis, Nicholas; Grünwald, Peter
2018-03-01
Combining instrumental period evidence regarding equilibrium climate sensitivity with largely independent paleoclimate proxy evidence should enable a more constrained sensitivity estimate to be obtained. Previous, subjective Bayesian approaches involved selection of a prior probability distribution reflecting the investigators' beliefs about climate sensitivity. Here a recently developed approach employing two different statistical methods—objective Bayesian and frequentist likelihood-ratio—is used to combine instrumental period and paleoclimate evidence based on data presented and assessments made in the IPCC Fifth Assessment Report. Probabilistic estimates from each source of evidence are represented by posterior probability density functions (PDFs) of physically-appropriate form that can be uniquely factored into a likelihood function and a noninformative prior distribution. The three-parameter form is shown accurately to fit a wide range of estimated climate sensitivity PDFs. The likelihood functions relating to the probabilistic estimates from the two sources are multiplicatively combined and a prior is derived that is noninformative for inference from the combined evidence. A posterior PDF that incorporates the evidence from both sources is produced using a single-step approach, which avoids the order-dependency that would arise if Bayesian updating were used. Results are compared with an alternative approach using the frequentist signed root likelihood ratio method. Results from these two methods are effectively identical, and provide a 5-95% range for climate sensitivity of 1.1-4.05 K (median 1.87 K).
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; ...
2017-10-17
In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated; reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan
In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated; reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
Adkison, Milo D.; Peterman, R.M.
1996-01-01
Bayesian methods have been proposed to estimate optimal escapement goals, using both knowledge about physical determinants of salmon productivity and stock-recruitment data. The Bayesian approach has several advantages over many traditional methods for estimating stock productivity: it allows integration of information from diverse sources and provides a framework for decision-making that takes into account uncertainty reflected in the data. However, results can be critically dependent on details of implementation of this approach. For instance, unintended and unwarranted confidence about stock-recruitment relationships can arise if the range of relationships examined is too narrow, if too few discrete alternatives are considered, or if data are contradictory. This unfounded confidence can result in a suboptimal choice of a spawning escapement goal.
Strelioff, Christopher C; Crutchfield, James P; Hübler, Alfred W
2007-07-01
Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k, from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
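Model-order selection of the kind described in this abstract can be sketched with the closed-form evidence of a Markov chain under Dirichlet priors. This is an illustrative sketch rather than the authors' implementation. The symmetric Dirichlet prior with alpha = 1, the synthetic first-order binary sequence, and the choice to condition every order on the same initial `max_order` symbols (so that the evidences are comparable) are assumptions.

```python
import random
from collections import Counter
from math import lgamma


def log_evidence(seq, order, alphabet, max_order, alpha=1.0):
    """Log marginal likelihood of `seq` under an order-`order` Markov chain.

    Each context's next-symbol distribution gets a symmetric Dirichlet(alpha)
    prior, so the evidence is a product of Dirichlet-multinomial terms.
    """
    counts = Counter()
    for i in range(max_order, len(seq)):
        context = tuple(seq[i - order:i])  # empty tuple when order == 0
        counts[(context, seq[i])] += 1
    context_totals = Counter()
    for (context, _), n in counts.items():
        context_totals[context] += n
    alpha_total = alpha * len(alphabet)
    log_ev = 0.0
    for context, n_c in context_totals.items():
        log_ev += lgamma(alpha_total) - lgamma(alpha_total + n_c)
        for symbol in alphabet:
            log_ev += lgamma(alpha + counts[(context, symbol)]) - lgamma(alpha)
    return log_ev


# Assumed synthetic data: a first-order binary chain that repeats the
# previous symbol with probability 0.9.
rng = random.Random(1)
seq = [0]
for _ in range(1999):
    stay = rng.random() < 0.9
    seq.append(seq[-1] if stay else 1 - seq[-1])

orders = [0, 1, 2, 3]
evidences = {k: log_evidence(seq, k, alphabet=(0, 1), max_order=3) for k in orders}
best_order = max(evidences, key=evidences.get)
```

With a uniform prior over the candidate orders, the posterior over orders is proportional to these evidences, so the order with the largest log evidence is the most probable one. Higher orders fit the counts at least as well but are penalised automatically by the integration over their larger parameter space.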
Fast model updating coupling Bayesian inference and PGD model reduction
NASA Astrophysics Data System (ADS)
Rubio, Paul-Baptiste; Louf, François; Chamoin, Ludovic
2018-04-01
The paper focuses on a coupled Bayesian-Proper Generalized Decomposition (PGD) approach for the real-time identification and updating of numerical models. The purpose is to use the most general case of Bayesian inference theory in order to address inverse problems and to deal with different sources of uncertainties (measurement and model errors, stochastic parameters). In order to do so with a reasonable CPU cost, the idea is to replace the direct model called for Monte-Carlo sampling by a PGD reduced model, and in some cases directly compute the probability density functions from the obtained analytical formulation. This procedure is first applied to a welding control example with the updating of a deterministic parameter. In the second application, the identification of a stochastic parameter is studied through a glued assembly example.
A fuzzy Bayesian approach to flood frequency estimation with imprecise historical information
Kiss, Andrea; Viglione, Alberto; Viertl, Reinhard; Blöschl, Günter
2016-01-01
This paper presents a novel framework that links imprecision (through a fuzzy approach) and stochastic uncertainty (through a Bayesian approach) in estimating flood probabilities from historical flood information and systematic flood discharge data. The method exploits the linguistic characteristics of historical source material to construct membership functions, which may be wider or narrower, depending on the vagueness of the statements. The membership functions are either included in the prior distribution or the likelihood function to obtain a fuzzy version of the flood frequency curve. The viability of the approach is demonstrated by three case studies that differ in terms of their hydromorphological conditions (from an Alpine river with bedrock profile to a flat lowland river with extensive flood plains) and historical source material (including narratives, town and county meeting protocols, flood marks and damage accounts). The case studies are presented in order of increasing fuzziness (the Rhine at Basel, Switzerland; the Werra at Meiningen, Germany; and the Tisza at Szeged, Hungary). Incorporating imprecise historical information is found to reduce the range between the 5% and 95% Bayesian credibility bounds of the 100 year floods by 45% and 61% for the Rhine and Werra case studies, respectively. The strengths and limitations of the framework are discussed relative to alternative (non-fuzzy) methods. The fuzzy Bayesian inference framework provides a flexible methodology that fits the imprecise nature of linguistic information on historical floods as available in historical written documentation. PMID:27840456
A fuzzy Bayesian approach to flood frequency estimation with imprecise historical information
NASA Astrophysics Data System (ADS)
Salinas, José Luis; Kiss, Andrea; Viglione, Alberto; Viertl, Reinhard; Blöschl, Günter
2016-09-01
This paper presents a novel framework that links imprecision (through a fuzzy approach) and stochastic uncertainty (through a Bayesian approach) in estimating flood probabilities from historical flood information and systematic flood discharge data. The method exploits the linguistic characteristics of historical source material to construct membership functions, which may be wider or narrower, depending on the vagueness of the statements. The membership functions are either included in the prior distribution or the likelihood function to obtain a fuzzy version of the flood frequency curve. The viability of the approach is demonstrated by three case studies that differ in terms of their hydromorphological conditions (from an Alpine river with bedrock profile to a flat lowland river with extensive flood plains) and historical source material (including narratives, town and county meeting protocols, flood marks and damage accounts). The case studies are presented in order of increasing fuzziness (the Rhine at Basel, Switzerland; the Werra at Meiningen, Germany; and the Tisza at Szeged, Hungary). Incorporating imprecise historical information is found to reduce the range between the 5% and 95% Bayesian credibility bounds of the 100 year floods by 45% and 61% for the Rhine and Werra case studies, respectively. The strengths and limitations of the framework are discussed relative to alternative (non-fuzzy) methods. The fuzzy Bayesian inference framework provides a flexible methodology that fits the imprecise nature of linguistic information on historical floods as available in historical written documentation.
Heuristic Bayesian segmentation for discovery of coexpressed genes within genomic regions.
Pehkonen, Petri; Wong, Garry; Törönen, Petri
2010-01-01
Segmentation aims to separate homogeneous areas from the sequential data, and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation in locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requirement for user-given parameters. In order to perform the segmentation within a reasonable time, we use heuristics. Most of the heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished by using asymptotic model selection methods like the Bayesian information criterion. Such methods are based on some simplification, which can limit their usage. In this paper, we propose a Bayesian model selection to choose the most proper result from heuristic segmentation. Our Bayesian model presents a simple prior for the segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method in yeast cell-cycle gene expression data reveals potential active and passive regions of the genome.
Shankle, William R; Pooley, James P; Steyvers, Mark; Hara, Junko; Mangrola, Tushar; Reisberg, Barry; Lee, Michael D
2013-01-01
Determining how cognition affects functional abilities is important in Alzheimer disease and related disorders. A total of 280 patients (normal or Alzheimer disease and related disorders) received a total of 1514 assessments using the functional assessment staging test (FAST) procedure and the MCI Screen. A hierarchical Bayesian cognitive processing model was created by embedding a signal detection theory model of the MCI Screen-delayed recognition memory task into a hierarchical Bayesian framework. The signal detection theory model used latent parameters of discriminability (memory process) and response bias (executive function) to predict, simultaneously, recognition memory performance for each patient and each FAST severity group. The observed recognition memory data did not distinguish the 6 FAST severity stages, but the latent parameters completely separated them. The latent parameters were also used successfully to transform the ordinal FAST measure into a continuous measure reflecting the underlying continuum of functional severity. Hierarchical Bayesian cognitive processing models applied to recognition memory data from clinical practice settings accurately translated a latent measure of cognition into a continuous measure of functional severity for both individuals and FAST groups. Such a translation links 2 levels of brain information processing and may enable more accurate correlations with other levels, such as those characterized by biomarkers.
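The two latent parameters named in this abstract, discriminability and response bias, have standard point estimates in equal-variance Gaussian signal detection theory. The sketch below shows only that textbook calculation from a hit rate and a false-alarm rate; it is not the hierarchical Bayesian model of the paper, and the function name and example rates are illustrative assumptions.

```python
from statistics import NormalDist


def sdt_parameters(hit_rate, false_alarm_rate):
    """Equal-variance Gaussian SDT point estimates.

    Returns (d_prime, criterion). Both rates must lie strictly
    between 0 and 1. This is the textbook formula, not the
    hierarchical model described in the abstract.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa              # discriminability (memory process)
    criterion = -0.5 * (z_hit + z_fa)   # response bias (0 means unbiased)
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical rates for one patient on a recognition memory task.
    d_prime, criterion = sdt_parameters(0.8, 0.2)
    print(f"d' = {d_prime:.3f}, criterion = {criterion:.3f}")
```

A hierarchical treatment would place group-level priors over these per-patient parameters rather than estimating each one in isolation.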
Struchen, R; Vial, F; Andersson, M G
2017-04-26
Delayed reporting of health data may hamper the early detection of infectious diseases in surveillance systems. Furthermore, combining multiple data streams, e.g. aiming at improving a system's sensitivity, can be challenging. In this study, we used a Bayesian framework where the result is presented as the value of evidence, i.e. the likelihood ratio for the evidence under outbreak versus baseline conditions. Based on a historical data set of routinely collected cattle mortality events, we evaluated outbreak detection performance (sensitivity, time to detection, in-control run length) under the Bayesian approach among three scenarios: presence of delayed data reporting, but not accounting for it; presence of delayed data reporting accounted for; and absence of delayed data reporting (i.e. an ideal system). Performance on larger and smaller outbreaks was compared with a classical approach, considering syndromes separately or combined. We found that the Bayesian approach performed better than the classical approach, especially for the smaller outbreaks. Furthermore, the Bayesian approach performed about as well when delayed reporting was accounted for as when it was absent. We argue that the value of evidence framework may be suitable for surveillance systems with multiple syndromes and delayed reporting of data.
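The value of evidence described here is a likelihood ratio, so it can be illustrated with a few lines of code. The sketch below assumes Poisson-distributed weekly counts and conditionally independent syndromes; both are illustrative assumptions, as are the function names and rates, and none of this is taken from the study itself.

```python
import math


def poisson_log_pmf(count, rate):
    """Log probability of observing `count` events under a Poisson rate."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def value_of_evidence(count, baseline_rate, outbreak_rate):
    """Likelihood ratio P(count | outbreak) / P(count | baseline)."""
    log_ratio = (poisson_log_pmf(count, outbreak_rate)
                 - poisson_log_pmf(count, baseline_rate))
    return math.exp(log_ratio)


def combined_value_of_evidence(observations):
    """Multiply per-syndrome ratios, assuming conditional independence.

    `observations` is an iterable of (count, baseline_rate, outbreak_rate).
    """
    log_total = 0.0
    for count, baseline_rate, outbreak_rate in observations:
        log_total += (poisson_log_pmf(count, outbreak_rate)
                      - poisson_log_pmf(count, baseline_rate))
    return math.exp(log_total)


if __name__ == "__main__":
    # Hypothetical week: 10 deaths seen, 5 expected at baseline, 10 in an outbreak.
    print(value_of_evidence(10, 5.0, 10.0))
```

A ratio above 1 favours the outbreak hypothesis and a ratio below 1 favours baseline. Accounting for reporting delay would amount to scaling the expected rates by the fraction of events reported so far.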
Bayesian source term estimation of atmospheric releases in urban areas using LES approach.
Xue, Fei; Kikumoto, Hideki; Li, Xiaofeng; Ooka, Ryozo
2018-05-05
The estimation of source information from limited measurements of a sensor network is a challenging inverse problem, which can be viewed as an assimilation process of the observed concentration data and the predicted concentration data. When dealing with releases in built-up areas, the predicted data are generally obtained by the Reynolds-averaged Navier-Stokes (RANS) equations, which yield building-resolving results; however, RANS-based models are outperformed by large-eddy simulation (LES) in the predictions of both airflow and dispersion. Therefore, it is important to explore the possibility of improving the estimation of the source parameters by using the LES approach. In this paper, a novel source term estimation method is proposed based on the LES approach using Bayesian inference. The source-receptor relationship is obtained by solving the adjoint equations constructed using the time-averaged flow field simulated by the LES approach based on the gradient diffusion hypothesis. A wind tunnel experiment with a constant point source downwind of a single building model is used to evaluate the performance of the proposed method, which is compared with that of the existing method using a RANS model. The results show that the proposed method reduces the errors of source location and release strength by 77% and 28%, respectively.
ESS++: a C++ objected-oriented algorithm for Bayesian stochastic search model exploration
Bottolo, Leonardo; Langley, Sarah R.; Petretto, Enrico; Tiret, Laurence; Tregouet, David; Richardson, Sylvia
2011-01-01
Summary: ESS++ is a C++ implementation of a fully Bayesian variable selection approach for single and multiple response linear regression. ESS++ works well both when the number of observations is larger than the number of predictors and in the ‘large p, small n’ case. In the current version, ESS++ can handle several hundred observations, thousands of predictors and a few responses simultaneously. The core engine of ESS++ for the selection of relevant predictors is based on Evolutionary Monte Carlo. Our implementation is open source, allowing community-based alterations and improvements. Availability: C++ source code and documentation including compilation instructions are available under GNU licence at http://bgx.org.uk/software/ESS.html. Contact: l.bottolo@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21233165
OGLE-2008-BLG-355Lb: A massive planet around a late-type star
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koshimoto, N.; Sumi, T.; Fukagawa, M.
2014-06-20
We report the discovery of a massive planet, OGLE-2008-BLG-355Lb. The light curve analysis indicates a planet:host mass ratio of q = 0.0118 ± 0.0006 at a separation of 0.877 ± 0.010 Einstein radii. We do not measure a significant microlensing parallax signal and do not have high angular resolution images that could detect the planetary host star. Therefore, we do not have a direct measurement of the host star mass. A Bayesian analysis, assuming that all host stars have equal probability to host a planet with the measured mass ratio, implies a host star mass of $M_h = 0.37^{+0.30}_{-0.17}\,M_\odot$ and a companion of mass $M_P = 4.6^{+3.7}_{-2.2}\,M_J$, at a projected separation of $r_\perp = 1.70^{+0.29}_{-0.30}$ AU. The implied distance to the planetary system is $D_L = 6.8 \pm 1.1$ kpc. A planetary system with the properties preferred by the Bayesian analysis may be a challenge to the core accretion model of planet formation, as the core accretion model predicts that massive planets are far more likely to form around more massive host stars. This core accretion model prediction is not consistent with our Bayesian prior of an equal probability of host stars of all masses to host a planet with the measured mass ratio. Thus, if the core accretion model prediction is right, we should expect that follow-up high angular resolution observations will detect a host star with a mass in the upper part of the range allowed by the Bayesian analysis. That is, the host would probably be a K or G dwarf.
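The quoted companion mass follows directly from the mass ratio and the inferred host mass. The sketch below reproduces that arithmetic for the central values only; the solar-to-Jupiter mass conversion (about 1047.6) is an outside constant, not a figure from the abstract, and the function name is illustrative.

```python
# Approximate mass of the Sun in Jupiter masses (external constant, assumed).
M_SUN_IN_M_JUP = 1047.6


def companion_mass_mjup(mass_ratio, host_mass_msun):
    """Companion mass in Jupiter masses from q = M_P / M_h."""
    return mass_ratio * host_mass_msun * M_SUN_IN_M_JUP


if __name__ == "__main__":
    # Central values from the abstract: q = 0.0118, M_h = 0.37 solar masses.
    mass = companion_mass_mjup(0.0118, 0.37)
    print(f"M_P is about {mass:.1f} Jupiter masses")
```

The central value comes out near 4.6 Jupiter masses, in agreement with the abstract. The asymmetric uncertainties there come from the full Bayesian posterior and cannot be recovered by this point calculation.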
Iliev, Filip L.; Stanev, Valentin G.; Vesselinov, Velimir V.
2018-01-01
Factor analysis is broadly used as a powerful unsupervised machine learning tool for reconstruction of hidden features in recorded mixtures of signals. In the case of a linear approximation, the mixtures can be decomposed by a variety of model-free Blind Source Separation (BSS) algorithms. Most of the available BSS algorithms consider an instantaneous mixing of signals, while the case when the mixtures are linear combinations of signals with delays is less explored. Especially difficult is the case when the number of sources of the signals with delays is unknown and has to be determined from the data as well. To address this problem, in this paper, we present a new method based on Nonnegative Matrix Factorization (NMF) that is capable of identifying: (a) the unknown number of the sources, (b) the delays and speed of propagation of the signals, and (c) the locations of the sources. Our method can be used to decompose records of mixtures of signals with delays emitted by an unknown number of sources in a nondispersive medium, based only on recorded data. This is the case, for example, when electromagnetic signals from multiple antennas are received asynchronously; or mixtures of acoustic or seismic signals recorded by sensors located at different positions; or when a shift in frequency is induced by the Doppler effect. By applying our method to synthetic datasets, we demonstrate its ability to identify the unknown number of sources as well as the waveforms, the delays, and the strengths of the signals. Using Bayesian analysis, we also evaluate estimation uncertainties and identify the region of likelihood where the positions of the sources can be found. PMID:29518126
Iliev, Filip L; Stanev, Valentin G; Vesselinov, Velimir V; Alexandrov, Boian S
2018-01-01
Factor analysis is broadly used as a powerful unsupervised machine learning tool for reconstruction of hidden features in recorded mixtures of signals. In the case of a linear approximation, the mixtures can be decomposed by a variety of model-free Blind Source Separation (BSS) algorithms. Most of the available BSS algorithms consider an instantaneous mixing of signals, while the case when the mixtures are linear combinations of signals with delays is less explored. Especially difficult is the case when the number of sources of the signals with delays is unknown and has to be determined from the data as well. To address this problem, in this paper, we present a new method based on Nonnegative Matrix Factorization (NMF) that is capable of identifying: (a) the unknown number of the sources, (b) the delays and speed of propagation of the signals, and (c) the locations of the sources. Our method can be used to decompose records of mixtures of signals with delays emitted by an unknown number of sources in a nondispersive medium, based only on recorded data. This is the case, for example, when electromagnetic signals from multiple antennas are received asynchronously; or mixtures of acoustic or seismic signals recorded by sensors located at different positions; or when a shift in frequency is induced by the Doppler effect. By applying our method to synthetic datasets, we demonstrate its ability to identify the unknown number of sources as well as the waveforms, the delays, and the strengths of the signals. Using Bayesian analysis, we also evaluate estimation uncertainties and identify the region of likelihood where the positions of the sources can be found.
Bowhead whale localization using asynchronous hydrophones in the Chukchi Sea.
Warner, Graham A; Dosso, Stan E; Hannay, David E; Dettmer, Jan
2016-07-01
This paper estimates bowhead whale locations and uncertainties using non-linear Bayesian inversion of their modally-dispersed calls recorded on asynchronous recorders in the Chukchi Sea, Alaska. Bowhead calls were recorded on a cluster of 7 asynchronous ocean-bottom hydrophones that were separated by 0.5-9.2 km. A warping time-frequency analysis is used to extract relative mode arrival times as a function of frequency for nine frequency-modulated whale calls that dispersed in the shallow water environment. Each call was recorded on multiple hydrophones and the mode arrival times are inverted for: the whale location in the horizontal plane, source instantaneous frequency (IF), water sound-speed profile, seabed geoacoustic parameters, relative recorder clock drifts, and residual error standard deviations, all with estimated uncertainties. A simulation study shows that accurate prior environmental knowledge is not required for accurate localization as long as the inversion treats the environment as unknown. Joint inversion of multiple recorded calls is shown to substantially reduce uncertainties in location, source IF, and relative clock drift. Whale location uncertainties are estimated to be 30-160 m and relative clock drift uncertainties are 3-26 ms.
NASA Astrophysics Data System (ADS)
Smith, J. P.; Owens, P. N.; Gaspar, L.; Lobb, D. A.; Petticrew, E. L.
2015-12-01
An understanding of sediment redistribution processes and the main sediment sources within a watershed is needed to support watershed management strategies. The fingerprinting technique is increasingly being recognized as a method for establishing the source of the sediment transported within watersheds. However, the different behaviour of the various fingerprinting properties has been recognized as a major limitation of the technique, and the uncertainty associated with tracer selection needs to be addressed. There are also questions associated with which modelling approach (frequentist or Bayesian) is the best to unmix complex environmental mixtures, such as river sediment. This study aims to compare and evaluate the differences between fingerprinting predictions provided by a Bayesian unmixing model (MixSIAR) using different groups of tracer properties for use in sediment source identification. We used fallout radionuclides (e.g. 137Cs) and geochemical elements (e.g. As) as conventional fingerprinting properties, and colour parameters as emerging properties; both alone and in combination. These fingerprinting properties are being used (i.e. Koiter et al., 2013; Barthod et al., 2015) to determine the proportional contributions of fine sediment in the South Tobacco Creek Watershed, an agricultural watershed located in Manitoba, Canada. We show that the unmixing model using a combination of fallout radionuclides and geochemical tracers gave similar results to the model based on colour parameters. Furthermore, we show that a model that combines all tracers (i.e. radionuclide/geochemical and colour) gave similar results, showing that sediment sources change from predominantly topsoil in the upper reaches of the watershed to channel bank and bedrock outcrop material in the lower reaches. Barthod LRM et al. (2015). Selecting color-based tracers and classifying sediment sources in the assessment of sediment dynamics using sediment source fingerprinting. J Environ Qual. 
doi:10.2134/jeq2015.01.0043. Koiter AJ et al. (2013). Investigating the role of connectivity and scale in assessing the sources of sediment in an agricultural watershed in the Canadian prairies using sediment source fingerprinting. J Soils Sediments, 13, 1676-1691.
Source Partitioning of Methane Emissions and its Seasonality in the U.S. Midwest
NASA Astrophysics Data System (ADS)
Chen, Zichong; Griffis, Timothy J.; Baker, John M.; Millet, Dylan B.; Wood, Jeffrey D.; Dlugokencky, Edward J.; Andrews, Arlyn E.; Sweeney, Colm; Hu, Cheng; Kolka, Randall K.
2018-02-01
The methane (CH₄) budget and its source partitioning are poorly constrained in the Midwestern United States. We used tall tower (185 m) aerodynamic flux measurements and atmospheric scale factor Bayesian inversions to constrain the monthly budget and to partition the total budget into natural (e.g., wetlands) and anthropogenic (e.g., livestock, waste, and natural gas) sources for the period June 2016 to September 2017. Aerodynamic flux observations indicated that the landscape was a CH₄ source with a mean annual CH₄ flux of +13.7 ± 0.34 nmol m⁻² s⁻¹ and was rarely a net sink. The scale factor Bayesian inversion analyses revealed a mean annual source of +12.3 ± 2.1 nmol m⁻² s⁻¹. Flux partitioning revealed that the anthropogenic source (7.8 ± 1.6 Tg CH₄ yr⁻¹) was 1.5 times greater than the bottom-up gridded United States Environmental Protection Agency inventory, in which livestock and oil/gas sources were underestimated by 1.8-fold and 1.3-fold, respectively. Wetland emissions (4.0 ± 1.2 Tg CH₄ yr⁻¹) were the second largest source, accounting for 34% of the total budget. The temporal variability of total CH₄ emissions was dominated by wetlands with peak emissions occurring in August. In contrast, emissions from oil/gas and other anthropogenic sources showed relatively weak seasonality.
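The partitioning figures in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the total budget is the sum of the anthropogenic and wetland terms, which is consistent with the stated 34% share but is not spelled out in the abstract; the variable and function names are illustrative.

```python
# Central values quoted in the abstract, in Tg CH4 per year.
ANTHROPOGENIC = 7.8
WETLAND = 4.0


def wetland_share(anthropogenic, wetland):
    """Wetland fraction of the total, assuming total = anthropogenic + wetland."""
    return wetland / (anthropogenic + wetland)


def implied_inventory(anthropogenic, ratio_to_inventory):
    """Inventory value implied by 'anthropogenic is ratio times the inventory'."""
    return anthropogenic / ratio_to_inventory


if __name__ == "__main__":
    share = wetland_share(ANTHROPOGENIC, WETLAND)
    inventory = implied_inventory(ANTHROPOGENIC, 1.5)
    print(f"wetland share: {100 * share:.0f}%")
    print(f"implied inventory anthropogenic source: {inventory:.1f} Tg CH4 per year")
```

Under that assumption the wetland share rounds to 34%, and the inventory figure implied by the 1.5-fold ratio is about 5.2 Tg CH₄ per year.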
Praveen, Paurush; Fröhlich, Holger
2013-01-01
Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available.
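The Noisy-OR idea mentioned here has a compact generic form: an interaction is supported unless every information source independently fails to support it. The sketch below shows that generic combination rule only; the exact parameterisation used in the paper is not given in the abstract, and the function name, the optional leak term, and the example confidences are illustrative assumptions.

```python
def noisy_or(supports, leak=0.0):
    """Generic noisy-OR combination of per-source support values.

    `supports` holds one value in [0, 1] per information source for a
    single candidate edge. `leak` is an optional background probability
    that the edge exists with no support at all.
    """
    prob_no_support = 1.0 - leak
    for support in supports:
        prob_no_support *= 1.0 - support  # this source fails to support the edge
    return 1.0 - prob_no_support


if __name__ == "__main__":
    # Hypothetical confidences from a pathway database, GO terms and domain data.
    print(noisy_or([0.9, 0.1, 0.2]))
```

Because the product is dominated by the largest support value, one strongly supporting source is enough to give a high prior edge probability. This matches the abstract's description of picking up the strongest support.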
On-line Bayesian model updating for structural health monitoring
NASA Astrophysics Data System (ADS)
Rocchetta, Roberto; Broggi, Matteo; Huchet, Quentin; Patelli, Edoardo
2018-03-01
Fatigue-induced cracking is a dangerous failure mechanism which affects mechanical components subject to alternating load cycles. System health monitoring should be adopted to identify cracks which can jeopardise the structure. Real-time damage detection may fail in the identification of the cracks due to different sources of uncertainty which have been poorly assessed or even fully neglected. In this paper, a novel efficient and robust procedure is used for the detection of crack locations and lengths in mechanical components. A Bayesian model updating framework is employed, which allows accounting for relevant sources of uncertainty. The idea underpinning the approach is to identify the most probable crack consistent with the experimental measurements. To tackle the computational cost of the Bayesian approach, an emulator is adopted for replacing the computationally costly Finite Element model. To improve the overall robustness of the procedure, different numerical likelihoods, measurement noises and imprecision in the value of model parameters are analysed and their effects quantified. The accuracy of the stochastic updating and the efficiency of the numerical procedure are discussed. An experimental aluminium frame and a numerical model of a typical car suspension arm are used to demonstrate the applicability of the approach.
A Bayesian approach to multi-messenger astronomy: identification of gravitational-wave host galaxies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fan, XiLong; Messenger, Christopher; Heng, Ik Siong
We present a general framework for incorporating astrophysical information into Bayesian parameter estimation techniques used by gravitational wave data analysis to facilitate multi-messenger astronomy. Since the progenitors of transient gravitational wave events, such as compact binary coalescences, are likely to be associated with a host galaxy, improvements to the source sky location estimates through the use of host galaxy information are explored. To demonstrate how host galaxy properties can be included, we simulate a population of compact binary coalescences and show that for ∼8.5% of simulations within 200 Mpc, the top 10 most likely galaxies account for ∼50% of the total probability of hosting a gravitational wave source. The true gravitational wave source host galaxy is in the top 10 galaxy candidates ∼10% of the time. Furthermore, we show that by including host galaxy information, a better estimate of the inclination angle of a compact binary gravitational wave source can be obtained. We also demonstrate the flexibility of our method by incorporating the use of either the B or K band into our analysis.
Estimation of gross land-use change and its uncertainty using a Bayesian data assimilation approach
NASA Astrophysics Data System (ADS)
Levy, Peter; van Oijen, Marcel; Buys, Gwen; Tomlinson, Sam
2018-03-01
We present a method for estimating land-use change using a Bayesian data assimilation approach. The approach provides a general framework for combining multiple disparate data sources with a simple model. This allows us to constrain estimates of gross land-use change with reliable national-scale census data, whilst retaining the detailed information available from several other sources. Eight different data sources, with three different data structures, were combined in our posterior estimate of land use and land-use change, and other data sources could easily be added in future. The tendency for observations to underestimate gross land-use change is accounted for by allowing for a skewed distribution in the likelihood function. The data structure produced has high temporal and spatial resolution, and is appropriate for dynamic process-based modelling. Uncertainty is propagated appropriately into the output, so we have a full posterior distribution of output and parameters. The data are available in the widely used netCDF file format from http://eidc.ceh.ac.uk/.
THREAT ANTICIPATION AND DECEPTIVE REASONING USING BAYESIAN BELIEF NETWORKS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Allgood, Glenn O; Olama, Mohammed M; Lake, Joe E
Recent events highlight the need for tools to anticipate threats posed by terrorists. Assessing these threats requires combining information from disparate data sources such as analytic models, simulations, historical data, sensor networks, and user judgments. These disparate data can be combined in a coherent, analytically defensible, and understandable manner using a Bayesian belief network (BBN). In this paper, we develop a BBN threat anticipatory model based on a deceptive reasoning algorithm using a network engineering process that treats the probability distributions of the BBN nodes within the broader context of the system development process.
Model-based Bayesian filtering of cardiac contaminants from biomedical recordings.
Sameni, R; Shamsollahi, M B; Jutten, C
2008-05-01
Electrocardiogram (ECG) and magnetocardiogram (MCG) signals are among the most considerable sources of noise for other biomedical signals. In some recent works, a Bayesian filtering framework has been proposed for denoising the ECG signals. In this paper, it is shown that this framework may be effectively used for removing cardiac contaminants such as the ECG, MCG and ballistocardiographic artifacts from different biomedical recordings such as the electroencephalogram, electromyogram and also for canceling maternal cardiac signals from fetal ECG/MCG. The proposed method is evaluated on simulated and real signals.
Comparing Bayesian stable isotope mixing models: Which tools are best for sediments?
NASA Astrophysics Data System (ADS)
Morris, David; Macko, Stephen
2016-04-01
Bayesian stable isotope mixing models have received much attention as a means of coping with multiple sources and uncertainty in isotope ecology (e.g. Phillips et al., 2014), enabling the probabilistic determination of the contributions made by each food source to the total diet of the organism in question. We have applied these techniques to marine sediments for the first time. The sediments of the Chukchi Sea and Beaufort Sea offer an opportunity to utilize these models for organic geochemistry, as there are three likely sources of organic carbon; pelagic phytoplankton, sea ice algae and terrestrial material from rivers and coastal erosion, as well as considerable variation in the marine δ13C values. Bayesian mixing models using bulk δ13C and δ15N data from Shelf Basin Interaction samples allow for the probabilistic determination of the contributions made by each of the sources to the organic carbon budget, and can be compared with existing source contribution estimates based upon biomarker models (e.g. Belicka & Harvey, 2009, Faux, Belicka, & Rodger Harvey, 2011). The δ13C of this preserved material varied from -22.1 to -16.7‰ (mean -19.4±1.3‰), while δ15N varied from 4.1 to 7.6‰ (mean 5.7±1.1‰). Using the SIAR model, we found that water column productivity was the source of between 50 and 70% of the organic carbon buried in this portion of the western Arctic with the remainder mainly supplied by sea ice algal productivity (25-35%) and terrestrial inputs (15%). With many mixing models now available, this study will compare SIAR with MixSIAR and the new FRUITS model. Monte Carlo modeling of the mixing polygon will be used to validate the models, and hierarchical models will be utilised to glean more information from the data set.
Recognition of degraded handwritten digits using dynamic Bayesian networks
NASA Astrophysics Data System (ADS)
Likforman-Sulem, Laurence; Sigelle, Marc
2007-01-01
We investigate in this paper the application of dynamic Bayesian networks (DBNs) to the recognition of handwritten digits. The main idea is to couple two separate HMMs into various architectures. First, a vertical HMM and a horizontal HMM are built observing the evolving streams of image columns and image rows respectively. Then, two coupled architectures are proposed to model interactions between these two streams and to capture the 2D nature of character images. Experiments performed on the MNIST handwritten digit database show that coupled architectures yield better recognition performance than non-coupled ones. Additional experiments conducted on artificially degraded (broken) characters demonstrate that coupled architectures cope better with such degradation than non-coupled ones and than discriminative methods such as SVMs.
Probabilistic models in human sensorimotor control
Wolpert, Daniel M.
2009-01-01
Sensory and motor uncertainty form a fundamental constraint on human sensorimotor control. Bayesian decision theory (BDT) has emerged as a unifying framework to understand how the central nervous system performs optimal estimation and control in the face of such uncertainty. BDT has two components: Bayesian statistics and decision theory. Here we review Bayesian statistics and show how it applies to estimating the state of the world and our own body. Recent results suggest that when learning novel tasks we are able to learn the statistical properties of both the world and our own sensory apparatus so as to perform estimation using Bayesian statistics. We review studies which suggest that humans can combine multiple sources of information to form maximum likelihood estimates, can incorporate prior beliefs about possible states of the world so as to generate maximum a posteriori estimates and can use Kalman filter-based processes to estimate time-varying states. Finally, we review Bayesian decision theory in motor control and how the central nervous system processes errors to determine loss functions and optimal actions. We review results that suggest we plan movements based on statistics of our actions that result from signal-dependent noise on our motor outputs. Taken together these studies provide a statistical framework for how the motor system performs in the presence of uncertainty. PMID:17628731
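The cue-combination and maximum a posteriori estimates mentioned in the abstract above have a simple closed form when every quantity is Gaussian. The following is a minimal sketch under that assumption; the function name and the numerical values (a visual cue, a haptic cue, and a prior) are hypothetical and are not taken from the reviewed studies.

```python
def combine_gaussian(mu_a, var_a, mu_b, var_b):
    """Fuse two independent Gaussian estimates by precision weighting.

    Returns the mean and variance of the combined estimate. The more
    reliable (lower-variance) input receives the larger weight.
    """
    precision = 1.0 / var_a + 1.0 / var_b
    mean = (mu_a / var_a + mu_b / var_b) / precision
    return mean, 1.0 / precision


# Hypothetical visual and haptic estimates of an object's width (cm).
ml_mean, ml_var = combine_gaussian(10.0, 1.0, 12.0, 4.0)

# Treating a Gaussian prior as one more "cue" turns the maximum
# likelihood estimate into a maximum a posteriori estimate.
map_mean, map_var = combine_gaussian(ml_mean, ml_var, 8.0, 4.0)

print(ml_mean, ml_var)
print(map_mean, map_var)
```

The combined variance is always smaller than either input variance, which is the sense in which integrating cues reduces uncertainty.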
Gamalo-Siebers, Margaret; Savic, Jasmina; Basu, Cynthia; Zhao, Xin; Gopalakrishnan, Mathangi; Gao, Aijun; Song, Guochen; Baygani, Simin; Thompson, Laura; Xia, H Amy; Price, Karen; Tiwari, Ram; Carlin, Bradley P
2017-07-01
Children represent a large underserved population of "therapeutic orphans," as an estimated 80% of children are treated off-label. However, pediatric drug development often faces substantial challenges, including economic, logistical, technical, and ethical barriers, among others. Among many efforts trying to remove these barriers, increased recent attention has been paid to extrapolation; that is, the leveraging of available data from adults or older age groups to draw conclusions for the pediatric population. The Bayesian statistical paradigm is natural in this setting, as it permits the combining (or "borrowing") of information across disparate sources, such as the adult and pediatric data. In this paper, authored by the pediatric subteam of the Drug Information Association Bayesian Scientific Working Group and Adaptive Design Working Group, we develop, illustrate, and provide suggestions on Bayesian statistical methods that could be used to design improved pediatric development programs that use all available information in the most efficient manner. A variety of relevant Bayesian approaches are described, several of which are illustrated through 2 case studies: extrapolating adult efficacy data to expand the labeling for Remicade to include pediatric ulcerative colitis and extrapolating adult exposure-response information for antiepileptic drugs to pediatrics. Copyright © 2017 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Thomsen, Nanna I.; Binning, Philip J.; McKnight, Ursula S.; Tuxen, Nina; Bjerg, Poul L.; Troldborg, Mads
2016-05-01
A key component in risk assessment of contaminated sites is the formulation of a conceptual site model (CSM). A CSM is a simplified representation of reality and forms the basis for the mathematical modeling of contaminant fate and transport at the site. The CSM should therefore identify the most important site-specific features and processes that may affect the contaminant transport behavior at the site. However, the development of a CSM will always be associated with uncertainties due to limited data and lack of understanding of the site conditions. CSM uncertainty is often found to be a major source of model error and it should therefore be accounted for when evaluating uncertainties in risk assessments. We present a Bayesian belief network (BBN) approach for constructing CSMs and assessing their uncertainty at contaminated sites. BBNs are graphical probabilistic models that are effective for integrating quantitative and qualitative information, and thus can strengthen decisions when empirical data are lacking. The proposed BBN approach facilitates a systematic construction of multiple CSMs, and then determines the belief in each CSM using a variety of data types and/or expert opinion at different knowledge levels. The developed BBNs combine data from desktop studies and initial site investigations with expert opinion to assess which of the CSMs are more likely to reflect the actual site conditions. The method is demonstrated on a Danish field site, contaminated with chlorinated ethenes. Four different CSMs are developed by combining two contaminant source zone interpretations (presence or absence of a separate phase contamination) and two geological interpretations (fractured or unfractured clay till).
The beliefs in each of the CSMs are assessed sequentially based on data from three investigation stages (a screening investigation, a more detailed investigation, and an expert consultation) to demonstrate that the belief can be updated as more information becomes available.
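The sequential updating of belief across investigation stages can be sketched with plain Bayes' rule. This is a simplification, not the authors' Bayesian belief network: each stage is collapsed into a single likelihood per conceptual site model, and all likelihood values below are hypothetical.

```python
def update(beliefs, likelihoods):
    """One Bayes-rule update: posterior is proportional to prior times likelihood."""
    unnormalised = {m: beliefs[m] * likelihoods[m] for m in beliefs}
    total = sum(unnormalised.values())
    return {m: v / total for m, v in unnormalised.items()}


# Four CSMs: source zone interpretation x geological interpretation.
models = [
    "separate phase, fractured",
    "separate phase, unfractured",
    "no separate phase, fractured",
    "no separate phase, unfractured",
]

# Start from equal belief in every CSM.
beliefs = {m: 1.0 / len(models) for m in models}

# Hypothetical likelihood of each stage's evidence under each CSM.
stages = {
    "screening investigation": [0.5, 0.4, 0.3, 0.3],
    "detailed investigation": [0.6, 0.2, 0.3, 0.1],
    "expert consultation": [0.7, 0.3, 0.4, 0.2],
}

for stage, values in stages.items():
    beliefs = update(beliefs, dict(zip(models, values)))
    best = max(beliefs, key=beliefs.get)
    print(f"after {stage}: most believed CSM = {best} ({beliefs[best]:.2f})")
```

Each stage's posterior becomes the next stage's prior, which is what allows the belief to be revised as more information becomes available.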
NASA Astrophysics Data System (ADS)
Lundquist, K. A.; Jensen, D. D.; Lucas, D. D.
2017-12-01
Atmospheric source reconstruction allows for the probabilistic estimate of source characteristics of an atmospheric release using observations of the release. Performance of the inversion depends partially on the temporal frequency and spatial scale of the observations. The objective of this study is to quantify the sensitivity of the source reconstruction method to sparse spatial and temporal observations. To this end, simulations of atmospheric transport of noble gases are created for the 2006 nuclear test at the Punggye-ri nuclear test site. Synthetic observations are collected from the simulation, and are taken as "ground truth". Data denial techniques are used to progressively coarsen the temporal and spatial resolution of the synthetic observations, while the source reconstruction model seeks to recover the true input parameters from the synthetic observations. Reconstructed parameters considered here are source location, source timing and source quantity. Reconstruction is achieved by running an ensemble of thousands of dispersion model runs that sample from a uniform distribution of the input parameters. Machine learning is used to train a computationally efficient surrogate model from the ensemble simulations. Monte Carlo sampling and Bayesian inversion are then used in conjunction with the surrogate model to quantify the posterior probability density functions of source input parameters. This research seeks to inform decision makers of the tradeoffs between more expensive, high frequency observations and less expensive, low frequency observations.
Bayesian characterization of uncertainty in species interaction strengths.
Wolf, Christopher; Novak, Mark; Gitelman, Alix I
2017-06-01
Considerable effort has been devoted to the estimation of species interaction strengths. This effort has focused primarily on statistical significance testing and obtaining point estimates of parameters that contribute to interaction strength magnitudes, leaving the characterization of uncertainty associated with those estimates unconsidered. We consider a means of characterizing the uncertainty of a generalist predator's interaction strengths by formulating an observational method for estimating a predator's prey-specific per capita attack rates as a Bayesian statistical model. This formulation permits the explicit incorporation of multiple sources of uncertainty. A key insight is the informative nature of several so-called non-informative priors that have been used in modeling the sparse data typical of predator feeding surveys. We introduce to ecology a new neutral prior and provide evidence for its superior performance. We use a case study to consider the attack rates in a New Zealand intertidal whelk predator, and we illustrate not only that Bayesian point estimates can be made to correspond with those obtained by frequentist approaches, but also that estimation uncertainty as described by 95% intervals is more useful and biologically realistic using the Bayesian method. In particular, unlike in bootstrap confidence intervals, the lower bounds of the Bayesian posterior intervals for attack rates do not include zero when a predator-prey interaction is in fact observed. We conclude that the Bayesian framework provides a straightforward, probabilistic characterization of interaction strength uncertainty, enabling future considerations of both the deterministic and stochastic drivers of interaction strength and their impact on food webs.
The Star Blended with the MOA-2008-BLG-310 Source Is Not the Exoplanet Host Star
NASA Astrophysics Data System (ADS)
Bhattacharya, A.; Bennett, D. P.; Anderson, J.; Bond, I. A.; Gould, A.; Batista, V.; Beaulieu, J. P.; Fouqué, P.; Marquette, J. B.; Pogge, R.
2017-08-01
High-resolution Hubble Space Telescope (HST) image analysis of the MOA-2008-BLG-310 microlens system indicates that the excess flux at the location of the source found in the discovery paper cannot primarily be due to the lens star because it does not match the lens-source relative proper motion, $\mu_{\rm rel}$, predicted by the microlens models. This excess flux is most likely to be due to an unrelated star that happens to be located in close proximity to the source star. Two epochs of HST observations indicate proper motion for this blend star that is typical of a random bulge star but is not consistent with a companion to the source or lens stars if the flux is dominated by only one star, aside from the lens. We consider models in which the excess flux is due to a combination of an unrelated star and the lens star, and this yields a 95% confidence level upper limit on the lens star brightness of $I_L > 22.44$ and $V_L > 23.62$. A Bayesian analysis using a standard Galactic model and these magnitude limits yields a host star mass of $M_h = 0.21^{+0.21}_{-0.09}\,M_\odot$ and a planet mass of $m_p = 23.4^{+23.9}_{-9.9}\,M_\oplus$ at a projected separation of $a_\perp = 1.12^{+0.16}_{-0.17}$ au. This result illustrates that excess flux in a high-resolution image of a microlens-source system need not be due to the lens. It is important to check that the lens-source relative proper motion is consistent with the microlensing prediction. The high-resolution image analysis techniques developed in this paper can be used to verify the WFIRST exoplanet microlensing survey mass measurements.
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm⁻¹ were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy.
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R² value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R² (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield had the highest R² (0.82, achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Park, Eun Sug; Symanski, Elaine; Han, Daikwon; Spiegelman, Clifford
2015-06-01
A major difficulty with assessing source-specific health effects is that source-specific exposures cannot be measured directly; rather, they need to be estimated by a source-apportionment method such as multivariate receptor modeling. The uncertainty in source apportionment (uncertainty in source-specific exposure estimates and model uncertainty due to the unknown number of sources and identifiability conditions) has been largely ignored in previous studies. Also, spatial dependence of multipollutant data collected from multiple monitoring sites has not yet been incorporated into multivariate receptor modeling. The objectives of this project are (1) to develop a multipollutant approach that incorporates both sources of uncertainty in source-apportionment into the assessment of source-specific health effects and (2) to develop enhanced multivariate receptor models that can account for spatial correlations in the multipollutant data collected from multiple sites. We employed a Bayesian hierarchical modeling framework consisting of multivariate receptor models, health-effects models, and a hierarchical model on latent source contributions. For the health model, we focused on the time-series design in this project. Each combination of number of sources and identifiability conditions (additional constraints on model parameters) defines a different model. We built a set of plausible models with extensive exploratory data analyses and with information from previous studies, and then computed posterior model probability to estimate model uncertainty. Parameter estimation and model uncertainty estimation were implemented simultaneously by Markov chain Monte Carlo (MCMC) methods. We validated the methods using simulated data. We illustrated the methods using PM2.5 (particulate matter ≤ 2.5 μm in aerodynamic diameter) speciation data and mortality data from Phoenix, Arizona, and Houston, Texas.
The Phoenix data included counts of cardiovascular deaths and daily PM2.5 speciation data from 1995-1997. The Houston data included respiratory mortality data and 24-hour PM2.5 speciation data sampled every six days from a region near the Houston Ship Channel in years 2002-2005. We also developed a Bayesian spatial multivariate receptor modeling approach that, while simultaneously dealing with the unknown number of sources and identifiability conditions, incorporated spatial correlations in the multipollutant data collected from multiple sites into the estimation of source profiles and contributions based on the discrete process convolution model for multivariate spatial processes. This new modeling approach was applied to 24-hour ambient air concentrations of 17 volatile organic compounds (VOCs) measured at nine monitoring sites in Harris County, Texas, during years 2000 to 2005. Simulation results indicated that our methods were accurate in identifying the true model and estimated parameters were close to the true values. The results from our methods agreed in general with previous studies on the source apportionment of the Phoenix data in terms of estimated source profiles and contributions. However, we had a greater number of statistically insignificant findings, which was likely a natural consequence of incorporating uncertainty in the estimated source contributions into the health-effects parameter estimation. For the Houston data, a model with five sources (that seemed to be Sulfate-Rich Secondary Aerosol, Motor Vehicles, Industrial Combustion, Soil/Crustal Matter, and Sea Salt) showed the highest posterior model probability among the candidate models considered when fitted simultaneously to the PM2.5 and mortality data. There was a statistically significant positive association between respiratory mortality and same-day PM2.5 concentrations attributed to one of the sources (probably industrial combustion). 
The Bayesian spatial multivariate receptor modeling approach applied to the VOC data led to a highest posterior model probability for a model with five sources (that seemed to be refinery, petrochemical production, gasoline evaporation, natural gas, and vehicular exhaust) among several candidate models, with the number of sources varying between three and seven and with different identifiability conditions. Our multipollutant approach assessing source-specific health effects is more advantageous than a single-pollutant approach in that it can estimate total health effects from multiple pollutants and can also identify emission sources that are responsible for adverse health effects. Our Bayesian approach can incorporate not only uncertainty in the estimated source contributions, but also model uncertainty that has not been addressed in previous studies on assessing source-specific health effects. The new Bayesian spatial multivariate receptor modeling approach enables predictions of source contributions at unmonitored sites, minimizing exposure misclassification and providing improved exposure estimates along with their uncertainty estimates, as well as accounting for uncertainty in the number of sources and identifiability conditions.
An Uncertainty Quantification Framework for Prognostics and Condition-Based Monitoring
NASA Technical Reports Server (NTRS)
Sankararaman, Shankar; Goebel, Kai
2014-01-01
This paper presents a computational framework for uncertainty quantification in prognostics in the context of condition-based monitoring of aerospace systems. The different sources of uncertainty and the various uncertainty quantification activities in condition-based prognostics are outlined in detail, and it is demonstrated that the Bayesian subjective approach is suitable for interpreting uncertainty in online monitoring. A state-space model-based framework for prognostics, that can rigorously account for the various sources of uncertainty, is presented. Prognostics consists of two important steps. First, the state of the system is estimated using Bayesian tracking, and then, the future states of the system are predicted until failure, thereby computing the remaining useful life of the system. The proposed framework is illustrated using the power system of a planetary rover test-bed, which is being developed and studied at NASA Ames Research Center.
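The two prognostic steps described above (estimate the current state, then predict forward until failure) can be sketched as a Monte Carlo propagation. This is an illustration only and not the authors' framework: the Bayesian tracking step is assumed to have already produced a Gaussian summary of the state, the degradation model is a hypothetical constant-rate decline, and all names and numbers are invented for the example.

```python
import random


def remaining_useful_life(state, rate, threshold, dt=1.0, max_steps=10000):
    """Step a constant-rate degradation model until the failure threshold."""
    steps = 0
    while state > threshold and steps < max_steps:
        state -= rate * dt
        steps += 1
    return steps * dt


def rul_samples(state_mean, state_sd, rate_mean, rate_sd, threshold,
                n=1000, seed=0):
    """Propagate uncertainty in the state and in the rate to a RUL distribution."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        state = rng.gauss(state_mean, state_sd)          # state uncertainty
        rate = max(rng.gauss(rate_mean, rate_sd), 1e-6)  # model uncertainty
        samples.append(remaining_useful_life(state, rate, threshold))
    return samples


# Hypothetical battery example: charge 80 +/- 2 units, draining at
# 1.0 +/- 0.1 units per time step, failure declared at 20 units.
samples = sorted(rul_samples(80.0, 2.0, 1.0, 0.1, 20.0))
print("median RUL:", samples[len(samples) // 2])
print("90% interval:", samples[50], "to", samples[949])
```

Reporting an interval rather than a single number reflects the point made in the abstract that the remaining useful life inherits uncertainty from every source feeding the prediction.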
Holmes, Tyson H.; Lewis, David B.
2014-01-01
Bayesian estimation techniques offer a systematic and quantitative approach for synthesizing data drawn from the literature to model immunological systems. As detailed here, the practitioner begins with a theoretical model and then sequentially draws information from source data sets and/or published findings to inform estimation of model parameters. Options are available to weigh these various sources of information differentially per objective measures of their corresponding scientific strengths. This approach is illustrated in depth through a carefully worked example for a model of decline in T-cell receptor excision circle content of peripheral T cells during development and aging. Estimates from this model indicate that 21 years of age is plausible for the developmental timing of mean age of onset of decline in T-cell receptor excision circle content of peripheral T cells. PMID:25179832
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Le; Timbie, Peter T.; Bunn, Emory F.
In this paper, we present a new Bayesian semi-blind approach for foreground removal in observations of the 21 cm signal measured by interferometers. The technique, which we call H I Expectation–Maximization Independent Component Analysis (HIEMICA), is an extension of the Independent Component Analysis technique developed for two-dimensional (2D) cosmic microwave background maps to three-dimensional (3D) 21 cm cosmological signals measured by interferometers. This technique provides a fully Bayesian inference of power spectra and maps and separates the foregrounds from the signal based on the diversity of their power spectra. Relying only on the statistical independence of the components, this approach can jointly estimate the 3D power spectrum of the 21 cm signal, as well as the 2D angular power spectrum and the frequency dependence of each foreground component, without any prior assumptions about the foregrounds. This approach has been tested extensively by applying it to mock data from interferometric 21 cm intensity mapping observations under idealized assumptions of instrumental effects. We also discuss the impact when the noise properties are not known completely. As a first step toward solving the 21 cm power spectrum analysis problem, we compare the semi-blind HIEMICA technique to the commonly used Principal Component Analysis. Under the same idealized circumstances, the proposed technique provides significantly improved recovery of the power spectrum. This technique can be applied in a straightforward manner to all 21 cm interferometric observations, including epoch of reionization measurements, and can be extended to single-dish observations as well.
Source partitioning of methane emissions and its seasonality in the U.S. Midwest
Zichong Chen; Timothy J. Griffis; John M. Baker; Dylan B. Millet; Jeffrey D. Wood; Edward J. Dlugokencky; Arlyn E. Andrews; Colm Sweeney; Cheng Hu; Randall K. Kolka
2018-01-01
The methane (CH4) budget and its source partitioning are poorly constrained in the Midwestern United States. We used tall tower (185 m) aerodynamic flux measurements and atmospheric scale factor Bayesian inversions to constrain the monthly budget and to partition the total budget into natural (e.g., wetlands) and anthropogenic (e.g., livestock,...
Bayesian Rationality in Evaluating Multiple Testimonies: Incorporating the Role of Coherence
ERIC Educational Resources Information Center
Harris, Adam J. L.; Hahn, Ulrike
2009-01-01
Routinely in day-to-day life, as well as in formal settings such as the courtroom, people must aggregate information they receive from different sources. One intuitively important but underresearched factor in this context is the degree to which the reports from different sources fit together, that is, their coherence. The authors examine a…
USDA-ARS's Scientific Manuscript database
Mixed stock analysis (MSA) is a powerful tool used in the management and conservation of numerous species. Its function is to estimate the sources of contributions in a mixture of populations of a species, as well as to estimate the probabilities that individuals originated at a source. Considerable...
Evaluating science arguments: evidence, uncertainty, and argument strength.
Corner, Adam; Hahn, Ulrike
2009-09-01
Public debates about socioscientific issues are increasingly prevalent, but the public response to messages about, for example, climate change, does not always seem to match the seriousness of the problem identified by scientists. Is there anything unique about appeals based on scientific evidence? Do people evaluate science and nonscience arguments differently? In an attempt to apply a systematic framework to people's evaluation of science arguments, the authors draw on the Bayesian approach to informal argumentation. The Bayesian approach permits questions about how people evaluate science arguments to be posed and comparisons to be made between the evaluation of science and nonscience arguments. In an experiment involving three separate argument evaluation tasks, the authors investigated whether people's evaluations of science and nonscience arguments differed in any meaningful way. Although some differences were observed in the relative strength of science and nonscience arguments, the evaluation of science arguments was determined by the same factors as nonscience arguments. Our results suggest that science communicators wishing to construct a successful appeal can make use of the Bayesian framework to distinguish strong and weak arguments. 2009 APA, all rights reserved.
Efficient Posterior Probability Mapping Using Savage-Dickey Ratios
Penny, William D.; Ridgway, Gerard R.
2013-01-01
Statistical Parametric Mapping (SPM) is the dominant paradigm for mass-univariate analysis of neuroimaging data. More recently, a Bayesian approach termed Posterior Probability Mapping (PPM) has been proposed as an alternative. PPM offers two advantages: (i) inferences can be made about effect size thus lending a precise physiological meaning to activated regions, (ii) regions can be declared inactive. This latter facility is most parsimoniously provided by PPMs based on Bayesian model comparisons. To date these comparisons have been implemented by an Independent Model Optimization (IMO) procedure which separately fits null and alternative models. This paper proposes a more computationally efficient procedure based on Savage-Dickey approximations to the Bayes factor, and Taylor-series approximations to the voxel-wise posterior covariance matrices. Simulations show the accuracy of this Savage-Dickey-Taylor (SDT) method to be comparable to that of IMO. Results on fMRI data show excellent agreement between SDT and IMO for second-level models, and reasonable agreement for first-level models. This Savage-Dickey test is a Bayesian analogue of the classical SPM-F and allows users to implement model comparison in a truly interactive manner. PMID:23533640
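The Savage-Dickey ratio underlying the procedure above states that, for a null model nested in the alternative at a point value of the parameter, the Bayes factor in favour of the null equals the posterior density at that point divided by the prior density at that point, both evaluated under the alternative model. The sketch below shows the scalar Gaussian case only; it omits the Taylor-series approximations and multivariate contrasts of the paper, and the function names and numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def savage_dickey_bf01(post_mean, post_sd, prior_sd, prior_mean=0.0):
    """Bayes factor for the null (effect = 0) against the alternative.

    Only the alternative model has to be fitted; its prior and posterior
    are both evaluated at the null value of zero.
    """
    return normal_pdf(0.0, post_mean, post_sd) / normal_pdf(0.0, prior_mean, prior_sd)


# Posterior concentrated near zero: evidence for "no effect" (BF01 > 1).
print(savage_dickey_bf01(post_mean=0.0, post_sd=0.5, prior_sd=2.0))

# Posterior well away from zero: evidence for an effect (BF01 < 1).
print(savage_dickey_bf01(post_mean=3.0, post_sd=0.5, prior_sd=2.0))
```

Fitting one model instead of two is the source of the computational saving relative to fitting null and alternative models separately.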
Faint Object Detection in Multi-Epoch Observations via Catalog Data Fusion
NASA Astrophysics Data System (ADS)
Budavári, Tamás; Szalay, Alexander S.; Loredo, Thomas J.
2017-03-01
Astronomy in the time-domain era faces several new challenges. One of them is the efficient use of observations obtained at multiple epochs. The work presented here addresses faint object detection and describes an incremental strategy for separating real objects from artifacts in ongoing surveys. The idea is to produce low-threshold single-epoch catalogs and to accumulate information across epochs. This is in contrast to more conventional strategies based on co-added or stacked images. We adopt a Bayesian approach, addressing object detection by calculating the marginal likelihoods for hypotheses asserting that there is no object or one object in a small image patch containing at most one cataloged source at each epoch. The object-present hypothesis interprets the sources in a patch at different epochs as arising from a genuine object; the no-object hypothesis interprets candidate sources as spurious, arising from noise peaks. We study the detection probability for constant-flux objects in a Gaussian noise setting, comparing results based on single and stacked exposures to results based on a series of single-epoch catalog summaries. Our procedure amounts to generalized cross-matching: it is the product of a factor accounting for the matching of the estimated fluxes of the candidate sources and a factor accounting for the matching of their estimated directions. We find that probabilistic fusion of multi-epoch catalogs can detect sources with similar sensitivity and selectivity compared to stacking. The probabilistic cross-matching framework underlying our approach plays an important role in maintaining detection sensitivity and points toward generalizations that could accommodate variability and complex object structure.
Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Bellato, Cláudia M; Motilal, Lambert; Zhang, Dapeng
2014-01-15
Cacao (Theobroma cacao L.), the source of cocoa, is an economically important tropical crop. One problem with the premium cacao market is contamination with off-types adulterating raw premium material. Accurate determination of the genetic identity of single cacao beans is essential for ensuring cocoa authentication. Using nanofluidic single nucleotide polymorphism (SNP) genotyping with 48 SNP markers, we generated SNP fingerprints for small quantities of DNA extracted from the seed coat of single cacao beans. On the basis of the SNP profiles, we identified an assumed adulterant variety, which was unambiguously distinguished from the authentic beans by multilocus matching. Assignment tests based on both Bayesian clustering analysis and allele frequency clearly separated all 30 authentic samples from the non-authentic samples. Distance-based principal coordinate analysis further supported these results. The nanofluidic SNP protocol, together with forensic statistical tools, is sufficiently robust to establish authentication and to verify gourmet cacao varieties. This method shows significant potential for practical application.
Multivariate neural biomarkers of emotional states are categorically distinct
Kragel, Philip A.
2015-01-01
Understanding how emotions are represented neurally is a central aim of affective neuroscience. Despite decades of neuroimaging efforts addressing this question, it remains unclear whether emotions are represented as distinct entities, as predicted by categorical theories, or are constructed from a smaller set of underlying factors, as predicted by dimensional accounts. Here, we capitalize on multivariate statistical approaches and computational modeling to directly evaluate these theoretical perspectives. We elicited discrete emotional states using music and films during functional magnetic resonance imaging scanning. Distinct patterns of neural activation predicted the emotion category of stimuli and tracked subjective experience. Bayesian model comparison revealed that combining dimensional and categorical models of emotion best characterized the information content of activation patterns. Surprisingly, categorical and dimensional aspects of emotion experience captured unique and opposing sources of neural information. These results indicate that diverse emotional states are poorly differentiated by simple models of valence and arousal, and that activity within separable neural systems can be mapped to unique emotion categories. PMID:25813790
Bayesian statistics applied to the location of the source of explosions at Stromboli Volcano, Italy
Saccorotti, G.; Chouet, B.; Martini, M.; Scarpa, R.
1998-01-01
We present a method for determining the location and spatial extent of the source of explosions at Stromboli Volcano, Italy, based on a Bayesian inversion of the slowness vector derived from frequency-slowness analyses of array data. The method searches for source locations that minimize the error between the expected and observed slowness vectors. For a given set of model parameters, the conditional probability density function of slowness vectors is approximated by a Gaussian distribution of expected errors. The method is tested with synthetics using a five-layer velocity model derived for the north flank of Stromboli and a smoothed velocity model derived from a power-law approximation of the layered structure. Application to data from Stromboli allows for a detailed examination of uncertainties in source location due to experimental errors and incomplete knowledge of the Earth model. Although the solutions are not constrained in the radial direction, excellent resolution is achieved in both transverse and depth directions. Under the assumption that the horizontal extent of the source does not exceed the crater dimension, the 90% confidence region in the estimate of the explosive source location corresponds to a small volume extending from a depth of about 100 m to a maximum depth of about 300 m beneath the active vents, with a maximum likelihood source region located in the 120- to 180-m-depth interval.
Praveen, Paurush; Fröhlich, Holger
2013-01-01
Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available. PMID:23826291
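The Noisy-OR idea mentioned in this abstract can be illustrated with the textbook noisy-OR combination rule. This is a minimal sketch, not the authors' model: the function name `noisy_or`, the `leak` parameter, and the interpretation of each input as an independent per-source support probability are assumptions made here for illustration.

```python
from math import prod


def noisy_or(supports, leak=0.0):
    """Textbook noisy-OR: probability that at least one source 'fires'.

    supports: per-source probabilities in [0, 1] that a source, on its own,
              supports the interaction (hypothetical inputs).
    leak:     probability that the interaction holds with no source support.
    """
    if not all(0.0 <= s <= 1.0 for s in supports) or not 0.0 <= leak <= 1.0:
        raise ValueError("all probabilities must lie in [0, 1]")
    # The interaction is absent only if the leak and every source fail.
    return 1.0 - (1.0 - leak) * prod(1.0 - s for s in supports)
```

Because the rule multiplies failure probabilities, a single strongly supporting source dominates the result, which matches the abstract's description of picking up the strongest support among sources.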
Greenhouse Gas Source Attribution: Measurements Modeling and Uncertainty Quantification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Zhen; Safta, Cosmin; Sargsyan, Khachik
2014-09-01
In this project we have developed atmospheric measurement capabilities and a suite of atmospheric modeling and analysis tools that are well suited for verifying emissions of greenhouse gases (GHGs) on an urban-through-regional scale. We have for the first time applied the Community Multiscale Air Quality (CMAQ) model to simulate atmospheric CO2. This will allow for the examination of regional-scale transport and distribution of CO2 along with air pollutants traditionally studied using CMAQ at relatively high spatial and temporal resolution with the goal of leveraging emissions verification efforts for both air quality and climate. We have developed a bias-enhanced Bayesian inference approach that can remedy the well-known problem of transport model errors in atmospheric CO2 inversions. We have tested the approach using data and model outputs from the TransCom3 global CO2 inversion comparison project. We have also performed two prototyping studies on inversion approaches in the generalized convection-diffusion context. One of these studies employed Polynomial Chaos Expansion to accelerate the evaluation of a regional transport model and enable efficient Markov Chain Monte Carlo sampling of the posterior for Bayesian inference. The other approach uses deterministic inversion of a convection-diffusion-reaction system in the presence of uncertainty. These approaches should, in principle, be applicable to realistic atmospheric problems with moderate adaptation. We outline a regional greenhouse gas source inference system that integrates (1) two approaches of atmospheric dispersion simulation and (2) a class of Bayesian inference and uncertainty quantification algorithms. We use two different and complementary approaches to simulate atmospheric dispersion. Specifically, we use a Eulerian chemical transport model CMAQ and a Lagrangian Particle Dispersion Model - FLEXPART-WRF.
These two models share the same WRF assimilated meteorology fields, making it possible to perform a hybrid simulation, in which the Eulerian model (CMAQ) can be used to compute the initial condition needed by the Lagrangian model, while the source-receptor relationships for a large state vector can be efficiently computed using the Lagrangian model in its backward mode. In addition, CMAQ has a complete treatment of atmospheric chemistry of a suite of traditional air pollutants, many of which could help attribute GHGs from different sources. The inference of emissions sources using atmospheric observations is cast as a Bayesian model calibration problem, which is solved using a variety of Bayesian techniques, such as the bias-enhanced Bayesian inference algorithm, which accounts for the intrinsic model deficiency, Polynomial Chaos Expansion to accelerate model evaluation and Markov Chain Monte Carlo sampling, and Karhunen-Loève (KL) Expansion to reduce the dimensionality of the state space. We have established an atmospheric measurement site in Livermore, CA and are collecting continuous measurements of CO2, CH4 and other species that are typically co-emitted with these GHGs. Measurements of co-emitted species can assist in attributing the GHGs to different emissions sectors. Automatic calibrations using traceable standards are performed routinely for the gas-phase measurements. We are also collecting standard meteorological data at the Livermore site as well as planetary boundary height measurements using a ceilometer. The location of the measurement site is well suited to sample air transported between the San Francisco Bay area and the California Central Valley.
A new Bayesian Earthquake Analysis Tool (BEAT)
NASA Astrophysics Data System (ADS)
Vasyura-Bathke, Hannes; Dutta, Rishabh; Jónsson, Sigurjón; Mai, Martin
2017-04-01
Modern earthquake source estimation studies increasingly use non-linear optimization strategies to estimate kinematic rupture parameters, often considering geodetic and seismic data jointly. However, the optimization process is complex and consists of several steps that need to be followed in the earthquake parameter estimation procedure. These include pre-describing or modeling the fault geometry, calculating the Green's Functions (often assuming a layered elastic half-space), and estimating the distributed final slip and possibly other kinematic source parameters. Recently, Bayesian inference has become popular for estimating posterior distributions of earthquake source model parameters given measured/estimated/assumed data and model uncertainties. For instance, some research groups consider uncertainties of the layered medium and propagate these to the source parameter uncertainties. Other groups make use of informative priors to reduce the model parameter space. In addition, innovative sampling algorithms have been developed that efficiently explore the often high-dimensional parameter spaces. Compared to earlier studies, these improvements have resulted in overall more robust source model parameter estimates that include uncertainties. However, the computational demands of these methods are high and estimation codes are rarely distributed along with the published results. Even if codes are made available, it is often difficult to assemble them into a single optimization framework as they are typically coded in different programing languages. Therefore, further progress and future applications of these methods/codes are hampered, while reproducibility and validation of results has become essentially impossible. 
In the spirit of providing open-access and modular codes to facilitate progress and reproducible research in earthquake source estimations, we undertook the effort of producing BEAT, a Python package that comprises all the above-mentioned features in one single programming environment. The package is built on top of the pyrocko seismological toolbox (www.pyrocko.org) and makes use of the pymc3 module for Bayesian statistical model fitting. BEAT is an open-source package (https://github.com/hvasbath/beat) and we encourage and solicit contributions to the project. In this contribution, we present our strategy for developing BEAT, show application examples, and discuss future developments.
Bayesian historical earthquake relocation: an example from the 1909 Taipei earthquake
Minson, Sarah E.; Lee, William H.K.
2014-01-01
Locating earthquakes from the beginning of the modern instrumental period is complicated by the fact that there are few good-quality seismograms and what traveltimes do exist may be corrupted by both large phase-pick errors and clock errors. Here, we outline a Bayesian approach to simultaneous inference of not only the hypocentre location but also the clock errors at each station and the origin time of the earthquake. This methodology improves the solution for the source location and also provides an uncertainty analysis on all of the parameters included in the inversion. As an example, we applied this Bayesian approach to the well-studied 1909 Mw 7 Taipei earthquake. While our epicentre location and origin time for the 1909 Taipei earthquake are consistent with earlier studies, our focal depth is significantly shallower suggesting a higher seismic hazard to the populous Taipei metropolitan area than previously supposed.
Approximate Bayesian computation for spatial SEIR(S) epidemic models.
Brown, Grant D; Porter, Aaron T; Oleson, Jacob J; Hinman, Jessica A
2018-02-01
Approximate Bayesian Computation (ABC) provides an attractive approach to estimation in complex Bayesian inferential problems for which evaluation of the kernel of the posterior distribution is impossible or computationally expensive. These highly parallelizable techniques have been successfully applied to many fields, particularly in cases where more traditional approaches such as Markov chain Monte Carlo (MCMC) are impractical. In this work, we demonstrate the application of approximate Bayesian inference to spatially heterogeneous Susceptible-Exposed-Infectious-Removed (SEIR) stochastic epidemic models. These models have a tractable posterior distribution; however, MCMC techniques nevertheless become computationally infeasible for moderately sized problems. We discuss the practical implementation of these techniques via the open source ABSEIR package for R. The performance of ABC relative to traditional MCMC methods in a small problem is explored under simulation, as well as in the spatially heterogeneous context of the 2014 epidemic of Chikungunya in the Americas. Copyright © 2017 Elsevier Ltd. All rights reserved.
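The basic ABC rejection scheme behind this family of methods can be sketched in a few lines. The example below is an illustration under stated assumptions and does not reproduce the ABSEIR package or the spatial SEIR model of the paper: it uses a non-spatial Reed-Frost chain-binomial epidemic without an exposed compartment, a uniform prior on the per-contact infection probability `beta`, and the final outbreak size as the only summary statistic. All function names, parameters, and default values are hypothetical.

```python
import random


def simulate_final_size(beta, rng, n=50, i0=1):
    """Simulate a Reed-Frost chain-binomial epidemic and return its final size.

    Each susceptible escapes infection from each infectious individual
    independently with probability (1 - beta) per generation.
    """
    susceptible, infectious, total = n - i0, i0, i0
    while infectious > 0 and susceptible > 0:
        p_infect = 1.0 - (1.0 - beta) ** infectious
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < p_infect)
        susceptible -= new_cases
        total += new_cases
        infectious = new_cases
    return total


def abc_rejection(observed, rng, n_draws=2000, tol=5, prior=(0.0, 0.1)):
    """Keep prior draws whose simulated final size is within tol of the data."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(*prior)  # draw a candidate from the prior
        if abs(simulate_final_size(beta, rng) - observed) <= tol:
            accepted.append(beta)  # candidate is an approximate posterior draw
    return accepted
```

Calling `abc_rejection(30, random.Random(0))` returns a list of accepted `beta` values that approximates the posterior. No likelihood is ever evaluated, and each draw is independent of the others, which is why such schemes parallelize easily.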
Applying Bayesian belief networks in rapid response situations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gibson, William L; Deborah, Leishman, A.; Van Eeckhout, Edward
2008-01-01
The authors have developed an enhanced Bayesian analysis tool called the Integrated Knowledge Engine (IKE) for monitoring and surveillance. The enhancements are suited for Rapid Response Situations where decisions must be made based on uncertain and incomplete evidence from many diverse and heterogeneous sources. The enhancements extend the probabilistic results of the traditional Bayesian analysis by (1) better quantifying uncertainty arising from model parameter uncertainty and uncertain evidence, (2) optimizing the collection of evidence to reach conclusions more quickly, and (3) allowing the analyst to determine the influence of the remaining evidence that cannot be obtained in the time allowed. These extended features give the analyst and decision maker a better comprehension of the adequacy of the acquired evidence and hence the quality of the hurried decisions. They also describe two example systems where the above features are highlighted.
A novel Bayesian approach to acoustic emission data analysis.
Agletdinov, E; Pomponi, E; Merson, D; Vinogradov, A
2016-12-01
The acoustic emission (AE) technique is a popular tool for materials characterization and non-destructive testing. Originating from the stochastic motion of defects in solids, AE is a random process by nature. A challenging problem arises whenever an attempt is made to identify specific points corresponding to changes in the trends in the fluctuating AE time series. A general Bayesian framework is proposed for the analysis of AE time series, aimed at automatically finding the breakpoints that signal a crossover in the dynamics of the underlying AE sources. Copyright © 2016 Elsevier B.V. All rights reserved.
Bayesian Integration of Information in Hippocampal Place Cells
Madl, Tamas; Franklin, Stan; Chen, Ke; Montaldi, Daniela; Trappl, Robert
2014-01-01
Accurate spatial localization requires a mechanism that corrects for errors, which might arise from inaccurate sensory information or neuronal noise. In this paper, we propose that Hippocampal place cells might implement such an error correction mechanism by integrating different sources of information in an approximately Bayes-optimal fashion. We compare the predictions of our model with physiological data from rats. Our results suggest that useful predictions regarding the firing fields of place cells can be made based on a single underlying principle, Bayesian cue integration, and that such predictions are possible using a remarkably small number of model parameters. PMID:24603429
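Bayesian cue integration, the principle named in this abstract, has a simple closed form when each cue is modelled as an independent Gaussian estimate of the same quantity. The sketch below shows that standard precision-weighted rule. It is not the authors' place-cell model, and the function name and inputs are illustrative assumptions.

```python
def integrate_cues(means, variances):
    """Combine independent Gaussian cues about one quantity.

    Returns the posterior mean and variance under a flat prior:
    each cue is weighted by its precision (inverse variance).
    """
    if len(means) != len(variances) or not means:
        raise ValueError("need one variance per cue and at least one cue")
    if any(v <= 0.0 for v in variances):
        raise ValueError("variances must be positive")
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision
```

For example, `integrate_cues([0.0, 10.0], [1.0, 4.0])` pulls the estimate toward the more reliable first cue, and the combined variance is smaller than either input variance.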
McLeod, Lianne; Bharadwaj, Lalita; Epp, Tasha; Waldner, Cheryl L.
2017-01-01
Groundwater drinking water supply surveillance data were accessed to summarize water quality delivered as public and private water supplies in southern Saskatchewan as part of an exposure assessment for epidemiologic analyses of associations between water quality and type 2 diabetes or cardiovascular disease. Arsenic in drinking water has been linked to a variety of chronic diseases and previous studies have identified multiple wells with arsenic above the drinking water standard of 0.01 mg/L; therefore, arsenic concentrations were of specific interest. Principal components analysis was applied to obtain principal component (PC) scores to summarize mixtures of correlated parameters identified as health standards and those identified as aesthetic objectives in the Saskatchewan Drinking Water Quality Standards and Objective. Ordinary, universal, and empirical Bayesian kriging were used to interpolate arsenic concentrations and PC scores in southern Saskatchewan, and the results were compared. Empirical Bayesian kriging performed best across all analyses, based on having the greatest number of variables for which the root mean square error was lowest. While all of the kriging methods appeared to underestimate high values of arsenic and PC scores, empirical Bayesian kriging was chosen to summarize large scale geographic trends in groundwater-sourced drinking water quality and assess exposure to mixtures of trace metals and ions. PMID:28914824
Bayesian analysis of energy and count rate data for detection of low count rate radioactive sources
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klumpp, John
We propose a radiation detection system which generates its own discrete sampling distribution based on past measurements of background. The advantage to this approach is that it can take into account variations in background with respect to time, location, energy spectra, detector-specific characteristics (i.e. different efficiencies at different count rates and energies), etc. This would therefore be a 'machine learning' approach, in which the algorithm updates and improves its characterization of background over time. The system would have a 'learning mode,' in which it measures and analyzes background count rates, and a 'detection mode,' in which it compares measurements from an unknown source against its unique background distribution. By characterizing and accounting for variations in the background, general purpose radiation detectors can be improved with little or no increase in cost. The statistical and computational techniques to perform this kind of analysis have already been developed. The necessary signal analysis can be accomplished using existing Bayesian algorithms which account for multiple channels, multiple detectors, and multiple time intervals. Furthermore, Bayesian machine-learning techniques have already been developed which, with trivial modifications, can generate appropriate decision thresholds based on the comparison of new measurements against a nonparametric sampling distribution. (authors)
Bayesian Travel Time Inversion adopting Gaussian Process Regression
NASA Astrophysics Data System (ADS)
Mauerberger, S.; Holschneider, M.
2017-12-01
A major application in seismology is the determination of seismic velocity models. Travel time measurements put an integral constraint on the velocity between source and receiver. We provide insight into travel time inversion from a correlation-based Bayesian point of view. To this end, the concept of Gaussian process regression is adopted to estimate a velocity model. The non-linear travel time integral is approximated by a first-order Taylor expansion. A heuristic covariance describes correlations amongst observations and the a priori model. That approach enables us to assess a proxy of the Bayesian posterior distribution at ordinary computational cost. Neither multidimensional numerical integration nor excessive sampling is necessary. Instead of stacking the data, we suggest building the posterior distribution progressively. Incorporating only a single piece of evidence at a time accounts for the deficit of linearization. As a result, the most probable model is given by the posterior mean, whereas uncertainties are described by the posterior covariance. As a proof of concept, a synthetic, purely 1D model is addressed. A single source accompanied by multiple receivers is considered on top of a model comprising a discontinuity. We consider travel times of both phases, the direct and the reflected wave, corrupted by noise. The regions left and right of the interface are assumed independent, with the squared exponential kernel serving as covariance.
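Gaussian process regression with a squared exponential kernel, the building block named in this abstract, can be sketched for pointwise observations as follows. This is a generic illustration only: it does not implement the linearized travel time integral or the progressive update described above, and the kernel hyperparameters, noise level, and function names are assumptions chosen for the example.

```python
import math


def sq_exp_kernel(x1, x2, variance=1.0, length=1.0):
    """Squared exponential covariance between two scalar locations."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_chol(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_posterior(x_train, y_train, x_new, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at one new location."""
    gram = [[sq_exp_kernel(a, b) + (noise_var if i == j else 0.0)
             for j, b in enumerate(x_train)]
            for i, a in enumerate(x_train)]
    low = cholesky(gram)
    k_star = [sq_exp_kernel(x_new, a) for a in x_train]
    alpha = solve_chol(low, y_train)   # (K + noise I)^{-1} y
    v = solve_chol(low, k_star)        # (K + noise I)^{-1} k_*
    mean = sum(ks * al for ks, al in zip(k_star, alpha))
    var = sq_exp_kernel(x_new, x_new) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, var
```

Near the observations the posterior mean follows the data and the variance shrinks toward the noise level. Far from them the estimate reverts to the prior mean of zero with the prior variance, which is how the posterior covariance expresses poorly constrained regions of a model.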
Ting, Chih-Chung; Yu, Chia-Chen; Maloney, Laurence T.
2015-01-01
In Bayesian decision theory, knowledge about the probabilities of possible outcomes is captured by a prior distribution and a likelihood function. The prior reflects past knowledge and the likelihood summarizes current sensory information. The two combined (integrated) form a posterior distribution that allows estimation of the probability of different possible outcomes. In this study, we investigated the neural mechanisms underlying Bayesian integration using a novel lottery decision task in which both prior knowledge and likelihood information about reward probability were systematically manipulated on a trial-by-trial basis. Consistent with Bayesian integration, as sample size increased, subjects tended to weigh likelihood information more compared with prior information. Using fMRI in humans, we found that the medial prefrontal cortex (mPFC) correlated with the mean of the posterior distribution, a statistic that reflects the integration of prior knowledge and likelihood of reward probability. Subsequent analysis revealed that both prior and likelihood information were represented in mPFC and that the neural representations of prior and likelihood in mPFC reflected changes in the behaviorally estimated weights assigned to these different sources of information in response to changes in the environment. Together, these results establish the role of mPFC in prior-likelihood integration and highlight its involvement in representing and integrating these distinct sources of information. PMID:25632152
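The observation that larger samples shift weight from the prior to the likelihood follows directly from conjugate updating. The sketch below assumes a Beta prior on reward probability and binomial sample evidence. This modelling choice and the function names are illustrative assumptions, not necessarily the model fitted in the study.

```python
def posterior_mean(prior_a, prior_b, successes, n):
    """Mean of the Beta(prior_a + successes, prior_b + n - successes) posterior."""
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    return (prior_a + successes) / (prior_a + prior_b + n)


def likelihood_weight(prior_a, prior_b, n):
    """Weight w on the sample frequency in the posterior mean.

    posterior mean = w * (successes / n) + (1 - w) * prior mean,
    so w grows toward 1 as the sample size n increases.
    """
    return n / (prior_a + prior_b + n)
```

With a Beta(2, 2) prior, the weight on the sample frequency is 0.2 for a single observation and about 0.71 for ten observations, so the same prior matters less as evidence accumulates.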
Uncertain deduction and conditional reasoning.
Evans, Jonathan St B T; Thompson, Valerie A; Over, David E
2015-01-01
There has been a paradigm shift in the psychology of deductive reasoning. Many researchers no longer think it is appropriate to ask people to assume premises and decide what necessarily follows, with the results evaluated by binary extensional logic. Most everyday and scientific inference is made from more or less confidently held beliefs and not assumptions, and the relevant normative standard is Bayesian probability theory. We argue that the study of "uncertain deduction" should directly ask people to assign probabilities to both premises and conclusions, and report an experiment using this method. We assess this reasoning by two Bayesian metrics: probabilistic validity and coherence according to probability theory. On both measures, participants perform above chance in conditional reasoning, but they do much better when statements are grouped as inferences, rather than evaluated in separate tasks.
A Bayesian Machine Learning Model for Estimating Building Occupancy from Open Source Data
Stewart, Robert N.; Urban, Marie L.; Duchscherer, Samantha E.; ...
2016-01-01
Understanding building occupancy is critical to a wide array of applications including natural hazards loss analysis, green building technologies, and population distribution modeling. Due to the expense of directly monitoring buildings, scientists rely in addition on a wide and disparate array of ancillary and open source information including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods which refer to a loose collection of formal and informal techniques for fusing data together to create viable content for building occupancy estimation. In this paper, we add to the current state of the art by introducing the Population Data Tables (PDT), a Bayesian based informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that retains in the final estimation uncertainty emerging from data, expert judgment, and model parameterization. PDT probabilistically estimates ambient occupancy in units of people/1000 ft² for over 50 building types at the national and sub-national level with the goal of providing global coverage. The challenge of global coverage led to the development of an interdisciplinary geospatial informatics system tool that provides the framework for capturing, storing, and managing open source data, handling subject matter expertise, carrying out Bayesian analytics as well as visualizing and exporting occupancy estimation results. We present the PDT project, situate the work within the larger community, and report on the progress of this multi-year project.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahoney, Christine M.; Kelly, Ryan T.; Alexander, M. L.
Key elements regarding the use of non-radioactive ionization sources will be presented as related to explosives detection by mass spectrometry and ion mobility spectrometry. Various non-radioactive ionization sources will be discussed along with associated ionization mechanisms pertaining to specific sample types.
Cultural Geography Model Validation
2010-03-01
the Cultural Geography Model (CGM), a government owned, open source multi-agent system utilizing Bayesian networks, queuing systems, the Theory of...referent determined either from theory or SME opinion. 4. CGM Overview The CGM is a government-owned, open source, data driven multi-agent social...HSCB, validation, social network analysis ABSTRACT: In the current warfighting environment, the military needs robust modeling and simulation (M&S
Finite‐fault Bayesian inversion of teleseismic body waves
Clayton, Brandon; Hartzell, Stephen; Moschetti, Morgan P.; Minson, Sarah E.
2017-01-01
Inverting geophysical data has provided fundamental information about the behavior of earthquake rupture. However, inferring kinematic source model parameters for finite‐fault ruptures is an intrinsically underdetermined problem (the problem of nonuniqueness), because we are restricted to finite noisy observations. Although many studies use least‐squares techniques to make the finite‐fault problem tractable, these methods generally lack the ability to apply non‐Gaussian error analysis and the imposition of nonlinear constraints. However, the Bayesian approach can be employed to find a Gaussian or non‐Gaussian distribution of all probable model parameters, while utilizing nonlinear constraints. We present case studies to quantify the resolving power and associated uncertainties using only teleseismic body waves in a Bayesian framework to infer the slip history for a synthetic case and two earthquakes: the 2011 Mw 7.1 Van, east Turkey, earthquake and the 2010 Mw 7.2 El Mayor–Cucapah, Baja California, earthquake. In implementing the Bayesian method, we further present two distinct solutions to investigate the uncertainties by performing the inversion with and without velocity structure perturbations. We find that the posterior ensemble becomes broader when including velocity structure variability and introduces a spatial smearing of slip. Using the Bayesian framework solely on teleseismic body waves, we find rake is poorly constrained by the observations and rise time is poorly resolved when slip amplitude is low.
Improved Bayesian Infrasonic Source Localization for regional infrasound
Blom, Philip S.; Marcillo, Omar; Arrowsmith, Stephen J.
2015-10-20
The Bayesian Infrasonic Source Localization (BISL) methodology is examined and simplified providing a generalized method of estimating the source location and time for an infrasonic event and the mathematical framework is used therein. The likelihood function describing an infrasonic detection used in BISL has been redefined to include the von Mises distribution developed in directional statistics and propagation-based, physically derived celerity-range and azimuth deviation models. Frameworks for constructing propagation-based celerity-range and azimuth deviation statistics are presented to demonstrate how stochastic propagation modelling methods can be used to improve the precision and accuracy of the posterior probability density function describing the source localization. Infrasonic signals recorded at a number of arrays in the western United States produced by rocket motor detonations at the Utah Test and Training Range are used to demonstrate the application of the new mathematical framework and to quantify the improvement obtained by using the stochastic propagation modelling methods. Moreover, using propagation-based priors, the spatial and temporal confidence bounds of the source decreased by more than 40 per cent in all cases and by as much as 80 per cent in one case. Further, the accuracy of the estimates remained high, keeping the ground truth within the 99 per cent confidence bounds for all cases.
Hanks, E.M.; Hooten, M.B.; Baker, F.A.
2011-01-01
Ecological spatial data often come from multiple sources, varying in extent and accuracy. We describe a general approach to reconciling such data sets through the use of the Bayesian hierarchical framework. This approach provides a way for the data sets to borrow strength from one another while allowing for inference on the underlying ecological process. We apply this approach to study the incidence of eastern spruce dwarf mistletoe (Arceuthobium pusillum) in Minnesota black spruce (Picea mariana). A Minnesota Department of Natural Resources operational inventory of black spruce stands in northern Minnesota found mistletoe in 11% of surveyed stands, while a small, specific-pest survey found mistletoe in 56% of the surveyed stands. We reconcile these two surveys within a Bayesian hierarchical framework and predict that 35-59% of black spruce stands in northern Minnesota are infested with dwarf mistletoe. © 2011 by the Ecological Society of America.
Bayesian LASSO, scale space and decision making in association genetics.
Pasanen, Leena; Holmström, Lasse; Sillanpää, Mikko J
2015-01-01
LASSO is a penalized regression method that facilitates model fitting in situations where there are as many, or even more explanatory variables than observations, and only a few variables are relevant in explaining the data. We focus on the Bayesian version of LASSO and consider four problems that need special attention: (i) controlling false positives, (ii) multiple comparisons, (iii) collinearity among explanatory variables, and (iv) the choice of the tuning parameter that controls the amount of shrinkage and the sparsity of the estimates. The particular application considered is association genetics, where LASSO regression can be used to find links between chromosome locations and phenotypic traits in a biological organism. However, the proposed techniques are relevant also in other contexts where LASSO is used for variable selection. We separate the true associations from false positives using the posterior distribution of the effects (regression coefficients) provided by Bayesian LASSO. We propose to solve the multiple comparisons problem by using simultaneous inference based on the joint posterior distribution of the effects. Bayesian LASSO also tends to distribute an effect among collinear variables, making detection of an association difficult. We propose to solve this problem by considering not only individual effects but also their functionals (i.e. sums and differences). Finally, whereas in Bayesian LASSO the tuning parameter is often regarded as a random variable, we adopt a scale space view and consider a whole range of fixed tuning parameters, instead. The effect estimates and the associated inference are considered for all tuning parameters in the selected range and the results are visualized with color maps that provide useful insights into data and the association problem considered. The methods are illustrated using two sets of artificial data and one real data set, all representing typical settings in association genetics.
A combined Fuzzy and Naive Bayesian strategy can be used to assign event codes to injury narratives.
Marucci-Wellman, H; Lehto, M; Corns, H
2011-12-01
Bayesian methods show promise for classifying injury narratives from large administrative datasets into cause groups. This study examined a combined approach where two Bayesian models (Fuzzy and Naïve) were used to either classify a narrative or select it for manual review. Injury narratives were extracted from claims filed with a worker's compensation insurance provider between January 2002 and December 2004. Narratives were separated into a training set (n=11,000) and prediction set (n=3,000). Expert coders assigned two-digit Bureau of Labor Statistics Occupational Injury and Illness Classification event codes to each narrative. Fuzzy and Naïve Bayesian models were developed using manually classified cases in the training set. Two semi-automatic machine coding strategies were evaluated. The first strategy assigned cases for manual review if the Fuzzy and Naïve models disagreed on the classification. The second strategy selected additional cases for manual review from the Agree dataset using prediction strength to reach a level of 50% computer coding and 50% manual coding. When agreement alone was used as the filtering strategy, the majority were coded by the computer (n=1,928, 64%) leaving 36% for manual review. The overall combined (human plus computer) sensitivity was 0.90 and positive predictive value (PPV) was >0.90 for 11 of 18 2-digit event categories. Implementing the 2nd strategy improved results with an overall sensitivity of 0.95 and PPV >0.90 for 17 of 18 categories. A combined Naïve-Fuzzy Bayesian approach can classify some narratives with high accuracy and identify others most beneficial for manual review, reducing the burden on human coders.
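The first filtering strategy described above (code automatically only when both models agree) can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `route_cases` and the use of plain lists of predicted event codes are assumptions made for the example.

```python
def route_cases(fuzzy_preds, naive_preds):
    """Split cases into computer-coded and manual-review groups.

    fuzzy_preds, naive_preds: predicted event codes, one per narrative,
    in the same order (hypothetical inputs for illustration).
    Returns (auto, manual): auto holds (case_index, agreed_code) pairs,
    manual holds the indices on which the two models disagree.
    """
    auto, manual = [], []
    for i, (fuzzy_code, naive_code) in enumerate(zip(fuzzy_preds, naive_preds)):
        if fuzzy_code == naive_code:
            auto.append((i, fuzzy_code))   # both models agree: accept the code
        else:
            manual.append(i)               # disagreement: send to a human coder
    return auto, manual
```

The second strategy in the abstract would additionally move low-prediction-strength cases from `auto` to `manual` until the desired computer/manual split is reached.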
Toward a probabilistic acoustic emission source location algorithm: A Bayesian approach
NASA Astrophysics Data System (ADS)
Schumacher, Thomas; Straub, Daniel; Higgins, Christopher
2012-09-01
Acoustic emissions (AE) are stress waves initiated by sudden strain releases within a solid body. These can be caused by internal mechanisms such as crack opening or propagation, crushing, or rubbing of crack surfaces. One application for the AE technique in the field of Structural Engineering is Structural Health Monitoring (SHM). With piezo-electric sensors mounted to the surface of the structure, stress waves can be detected, recorded, and stored for later analysis. An important step in quantitative AE analysis is the estimation of the stress wave source locations. Commonly, source location results are presented in a rather deterministic manner as spatial and temporal points, excluding information about uncertainties and errors. Due to variability in the material properties and uncertainty in the mathematical model, measures of uncertainty are needed beyond best-fit point solutions for source locations. This paper introduces a novel holistic framework for the development of a probabilistic source location algorithm. Bayesian analysis methods with Markov Chain Monte Carlo (MCMC) simulation are employed where all source location parameters are described with posterior probability density functions (PDFs). The proposed methodology is applied to an example employing data collected from a realistic section of a reinforced concrete bridge column. The selected approach is general and has the advantage that it can be extended and refined efficiently. Results are discussed and future steps to improve the algorithm are suggested.
Bayesian reconstruction of transmission within outbreaks using genomic variants.
De Maio, Nicola; Worby, Colin J; Wilson, Daniel J; Stoesser, Nicole
2018-04-01
Pathogen genome sequencing can reveal details of transmission histories and is a powerful tool in the fight against infectious disease. In particular, within-host pathogen genomic variants identified through heterozygous nucleotide base calls are a potential source of information to identify linked cases and infer direction and time of transmission. However, using such data effectively to model disease transmission presents a number of challenges, including differentiating genuine variants from those observed due to sequencing error, as well as the specification of a realistic model for within-host pathogen population dynamics. Here we propose a new Bayesian approach to transmission inference, BadTrIP (BAyesian epiDemiological TRansmission Inference from Polymorphisms), that explicitly models evolution of pathogen populations in an outbreak, transmission (including transmission bottlenecks), and sequencing error. BadTrIP enables the inference of host-to-host transmission from pathogen sequencing data and epidemiological data. By assuming that genomic variants are unlinked, our method does not require the computationally intensive and unreliable reconstruction of individual haplotypes. Using simulations we show that BadTrIP is robust in most scenarios and can accurately infer transmission events by efficiently combining information from genetic and epidemiological sources; thanks to its realistic model of pathogen evolution and the inclusion of epidemiological data, BadTrIP is also more accurate than existing approaches. BadTrIP is distributed as an open source package (https://bitbucket.org/nicofmay/badtrip) for the phylogenetic software BEAST2. We apply our method to reconstruct transmission history at the early stages of the 2014 Ebola outbreak, showcasing the power of within-host genomic variants to reconstruct transmission events.
Tan, Sarah; Makela, Susanna; Heller, Daliah; Konty, Kevin; Balter, Sharon; Zheng, Tian; Stark, James H
2018-06-01
Existing methods to estimate the prevalence of chronic hepatitis C (HCV) in New York City (NYC) are limited in scope and fail to assess hard-to-reach subpopulations with highest risk such as injecting drug users (IDUs). To address these limitations, we employ a Bayesian multi-parameter evidence synthesis model to systematically combine multiple sources of data, account for bias in certain data sources, and provide unbiased HCV prevalence estimates with associated uncertainty. Our approach improves on previous estimates by explicitly accounting for injecting drug use and including data from high-risk subpopulations such as the incarcerated, and is more inclusive, utilizing ten NYC data sources. In addition, we derive two new equations to allow age at first injecting drug use data for former and current IDUs to be incorporated into the Bayesian evidence synthesis, a first for this type of model. Our estimated overall HCV prevalence as of 2012 among NYC adults aged 20-59 years is 2.78% (95% CI 2.61-2.94%), which represents between 124,900 and 140,000 chronic HCV cases. These estimates suggest that HCV prevalence in NYC is higher than previously indicated from household surveys (2.2%) and the surveillance system (2.37%), and that HCV transmission is increasing among young injecting adults in NYC. An ancillary benefit from our results is an estimate of current IDUs aged 20-59 in NYC: 0.58% or 27,600 individuals. Copyright © 2018 Elsevier B.V. All rights reserved.
Phan, Kevin; Xie, Ashleigh; Kumar, Narendra; Wong, Sophia; Medi, Caroline; La Meir, Mark; Yan, Tristan D
2015-08-01
Simplified maze procedures involving radiofrequency, cryoenergy and microwave energy sources have been increasingly utilized for surgical treatment of atrial fibrillation as an alternative to the traditional cut-and-sew approach. In the absence of direct comparisons, a Bayesian network meta-analysis is another alternative to assess the relative effect of different treatments, using indirect evidence. A Bayesian meta-analysis of indirect evidence was performed using 16 published randomized trials identified from 6 databases. Rank probability analysis was used to rank each intervention in terms of their probability of having the best outcome. Sinus rhythm prevalence beyond the 12-month follow-up was similar between the cut-and-sew, microwave and radiofrequency approaches, which were all ranked better than cryoablation (respectively, 39, 36, and 25 vs 1%). The cut-and-sew maze was ranked worst in terms of mortality outcomes compared with microwave, radiofrequency and cryoenergy (2 vs 19, 34, and 24%, respectively). The cut-and-sew maze procedure was associated with significantly lower stroke rates compared with microwave ablation [odds ratio <0.01; 95% confidence interval 0.00, 0.82], and ranked the best in terms of pacemaker requirements compared with microwave, radiofrequency and cryoenergy (81 vs 14, and 1, <0.01% respectively). Bayesian rank probability analysis shows that the cut-and-sew approach is associated with the best outcomes in terms of sinus rhythm prevalence and stroke outcomes, and remains the gold standard approach for AF treatment. Given the limitations of indirect comparison analysis, these results should be viewed with caution and not over-interpreted. © The Author 2014. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.
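Rank probabilities of the kind quoted above are typically read off posterior samples: for each draw, find the treatment with the best outcome, then report how often each treatment comes out on top. The sketch below assumes the posterior draws are already available as a list of dictionaries (a hypothetical data layout; the paper does not describe its code).

```python
def prob_best(draws, higher_is_better=True):
    """Estimate P(treatment is best) from posterior draws.

    draws: list of dicts mapping treatment name -> sampled outcome
           (all dicts share the same keys; layout assumed for illustration).
    """
    counts = {name: 0 for name in draws[0]}
    pick = max if higher_is_better else min
    for draw in draws:
        best = pick(draw, key=draw.get)  # treatment with the best value in this draw
        counts[best] += 1
    n = len(draws)
    return {name: count / n for name, count in counts.items()}
```

For outcomes where lower is better (for example mortality), pass `higher_is_better=False`.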
NASA Astrophysics Data System (ADS)
Rosenheim, B. E.; Firesinger, D.; Roberts, M. L.; Burton, J. R.; Khan, N.; Moyer, R. P.
2016-12-01
Radiocarbon (14C) sediment core chronologies benefit from a high density of dates, even when precision of individual dates is sacrificed. This is demonstrated by a combined approach of rapid 14C analysis of CO2 gas generated from carbonates and organic material coupled with Bayesian statistical modeling. Analysis of 14C is facilitated by the gas ion source on the Continuous Flow Accelerator Mass Spectrometry (CFAMS) system at the Woods Hole Oceanographic Institution's National Ocean Sciences Accelerator Mass Spectrometry facility. This instrument is capable of producing a 14C determination of +/- 100 14C y precision every 4-5 minutes, with limited sample handling (dissolution of carbonates and/or combustion of organic carbon in evacuated containers). Rapid analysis allows over-preparation of samples to include replicates at each depth and/or comparison of different sample types at particular depths in a sediment or peat core. Analysis priority is given to depths that have the least chronologic precision as determined by Bayesian modeling of the chronology of calibrated ages. Use of such a statistical approach to determine the order in which samples are run ensures that the chronology constantly improves so long as material is available for the analysis of chronologic weak points. Ultimately, accuracy of the chronology is determined by the material that is actually being dated, and our combined approach allows testing of different constituents of the organic carbon pool and the carbonate minerals within a core. We will present preliminary results from a deep-sea sediment core abundant in deep-sea foraminifera as well as coastal wetland peat cores to demonstrate statistical improvements in sediment- and peat-core chronologies obtained by increasing the quantity and decreasing the quality of individual dates.
Predicting Football Matches Results using Bayesian Networks for English Premier League (EPL)
NASA Astrophysics Data System (ADS)
Razali, Nazim; Mustapha, Aida; Yatim, Faiz Ahmad; Aziz, Ruhaya Ab
2017-08-01
Modeling association football results has become increasingly popular in the last few years, and many different prediction models have been proposed with the aim of evaluating the attributes that lead a football team to lose, draw or win a match. Three types of approaches have been considered for predicting football match results: statistical approaches, machine learning approaches and Bayesian approaches. Lately, many studies on football prediction models have used Bayesian approaches. This paper proposes a Bayesian Network (BN) to predict the results of football matches in terms of home win (H), away win (A) and draw (D). The English Premier League (EPL) for the three seasons 2010-2011, 2011-2012 and 2012-2013 has been selected and reviewed. K-fold cross validation has been used for testing the accuracy of the prediction model. The required information about the football data is sourced from a legitimate site at http://www.football-data.co.uk. BNs achieved a predictive accuracy of 75.09% on average across the three seasons. It is hoped that the results could be used as the benchmark output for future research in predicting football match results.
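As a reminder of what K-fold cross validation does, the sketch below generates the train/test index splits. It is a generic illustration with contiguous folds; the abstract does not state the value of K or whether the matches were shuffled, so those details are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Fold sizes differ by at most one; every sample appears in exactly
    one test fold.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size
```

The reported accuracy would then be the average, over folds, of the fraction of held-out matches whose H/A/D outcome is predicted correctly.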
NASA Astrophysics Data System (ADS)
Blecic, Jasmina; Harrington, Joseph; Bowman, Matthew O.; Cubillos, Patricio E.; Stemm, Madison; Foster, Andrew
2014-11-01
We present a new, open-source, Thermochemical Equilibrium Abundances (TEA) code that calculates the abundances of gaseous molecular species. TEA uses the Gibbs-free-energy minimization method with an iterative Lagrangian optimization scheme. It initializes the radiative-transfer calculation in our Bayesian Atmospheric Radiative Transfer (BART) code. Given elemental abundances, TEA calculates molecular abundances for a particular temperature and pressure or a list of temperature-pressure pairs. The code is tested against the original method developed by White et al. (1958), the analytic method developed by Burrows and Sharp (1999), and the Newton-Raphson method implemented in the open-source Chemical Equilibrium with Applications (CEA) code. TEA is written in Python and is available to the community via the open-source development site GitHub.com. We also present BART applied to eclipse depths of the exoplanet WASP-43b, constraining atmospheric thermal and chemical parameters. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G. JB holds a NASA Earth and Space Science Fellowship.
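For orientation, Gibbs-free-energy minimization for an ideal gas mixture is commonly written in the following form. The notation is an assumption made here and may differ from the code's own: $x_i$ is the number of moles of species $i$, $g_i^0(T)$ its standard-state chemical potential, $P$ the pressure in units of the standard pressure, $a_{ij}$ the number of atoms of element $j$ in species $i$, $b_j$ the total abundance of element $j$, and $\pi_j$ the Lagrange multipliers.

```latex
\begin{align}
\frac{G(x)}{RT} &= \sum_{i=1}^{n} x_i \left[ \frac{g_i^{0}(T)}{RT} + \ln P + \ln\frac{x_i}{\bar{x}} \right],
\qquad \bar{x} = \sum_{i=1}^{n} x_i, \\
\text{subject to}\quad & \sum_{i=1}^{n} a_{ij}\, x_i = b_j, \quad j = 1,\dots,m, \\
\mathcal{L}(x,\pi) &= \frac{G(x)}{RT} - \sum_{j=1}^{m} \pi_j \left( \sum_{i=1}^{n} a_{ij}\, x_i - b_j \right).
\end{align}
```

An iterative scheme of the kind the abstract mentions would repeatedly solve the stationarity conditions of $\mathcal{L}$ around the current estimate of $x$ until the abundances converge.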
Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM☆
López, J.D.; Litvak, V.; Espinosa, J.J.; Friston, K.; Barnes, G.R.
2014-01-01
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (Minimum Norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy—an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardise the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational Free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm. PMID:24041874
NASA Astrophysics Data System (ADS)
D'Addabbo, Annarita; Refice, Alberto; Lovergine, Francesco P.; Pasquariello, Guido
2018-03-01
High-resolution, remotely sensed images of the Earth surface have been proven to be of help in producing detailed flood maps, thanks to their synoptic overview of the flooded area and frequent revisits. However, flood scenarios can be complex situations, requiring the integration of different data in order to provide accurate and robust flood information. Several processing approaches have been recently proposed to efficiently combine and integrate heterogeneous information sources. In this paper, we introduce DAFNE, a Matlab®-based, open source toolbox, conceived to produce flood maps from remotely sensed and other ancillary information, through a data fusion approach. DAFNE is based on Bayesian Networks, and is composed of several independent modules, each one performing a different task. Multi-temporal and multi-sensor data can be easily handled, with the possibility of following the evolution of an event through multi-temporal output flood maps. Each DAFNE module can be easily modified or upgraded to meet different user needs. The DAFNE suite is presented together with an example of its application.
Inferring source attribution from a multiyear multisource data set of Salmonella in Minnesota.
Ahlstrom, C; Muellner, P; Spencer, S E F; Hong, S; Saupe, A; Rovira, A; Hedberg, C; Perez, A; Muellner, U; Alvarez, J
2017-12-01
Salmonella enterica is a global health concern because of its widespread association with foodborne illness. Bayesian models have been developed to attribute the burden of human salmonellosis to specific sources with the ultimate objective of prioritizing intervention strategies. Important considerations of source attribution models include the evaluation of the quality of input data, assessment of whether attribution results logically reflect the data trends and identification of patterns within the data that might explain the detailed contribution of different sources to the disease burden. Here, more than 12,000 non-typhoidal Salmonella isolates from human, bovine, porcine, chicken and turkey sources that originated in Minnesota were analysed. A modified Bayesian source attribution model (available in a dedicated R package), accounting for non-sampled sources of infection, attributed 4,672 human cases to sources assessed here. Most (60%) cases were attributed to chicken, although there was a spike in cases attributed to a non-sampled source in the second half of the study period. Molecular epidemiological analysis methods were used to supplement risk modelling, and a visual attribution application was developed to facilitate data exploration and comprehension of the large multiyear data set assessed here. A large amount of within-source diversity and low similarity between sources was observed, and visual exploration of data provided clues into variations driving the attribution modelling results. Results from this pillared approach provided first attribution estimates for Salmonella in Minnesota and offer an understanding of current data gaps as well as key pathogen population features, such as serotype frequency, similarity and diversity across the sources. Results here will be used to inform policy and management strategies ultimately intended to prevent and control Salmonella infection in the state. © 2017 Blackwell Verlag GmbH.
The Chandra Source Catalog 2.0: Spectral Properties
NASA Astrophysics Data System (ADS)
McCollough, Michael L.; Siemiginowska, Aneta; Burke, Douglas; Nowak, Michael A.; Primini, Francis Anthony; Laurino, Omar; Nguyen, Dan T.; Allen, Christopher E.; Anderson, Craig S.; Budynkiewicz, Jamie A.; Chen, Judy C.; Civano, Francesca Maria; D'Abrusco, Raffaele; Doe, Stephen M.; Evans, Ian N.; Evans, Janet D.; Fabbiano, Giuseppina; Gibbs, Danny G., II; Glotfelty, Kenny J.; Graessle, Dale E.; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; Houck, John C.; Lauer, Jennifer L.; Lee, Nicholas P.; Martínez-Galarza, Juan Rafael; McDowell, Jonathan C.; Miller, Joseph; McLaughlin, Warren; Morgan, Douglas L.; Mossman, Amy E.; Nichols, Joy S.; Paxson, Charles; Plummer, David A.; Rots, Arnold H.; Sundheim, Beth A.; Tibbetts, Michael; Van Stone, David W.; Zografou, Panagoula; Chandra Source Catalog Team
2018-01-01
The second release of the Chandra Source Catalog (CSC) contains all sources identified from sixteen years' worth of publicly accessible observations. The vast majority of these sources have been observed with the ACIS detector and have spectral information in the 0.5-7 keV energy range. Here we describe the methods used to automatically derive spectral properties for each source detected by the standard processing pipeline and included in the final CSC. The sources with high signal to noise ratio (exceeding 150 net counts) were fit in Sherpa (the modeling and fitting application from the Chandra Interactive Analysis of Observations package) using wstat as a fit statistic and a Bayesian draws method to determine errors. Three models were fit to each source: an absorbed power-law, blackbody, and Bremsstrahlung emission. The fitted parameter values for the power-law, blackbody, and Bremsstrahlung models were included in the catalog with the calculated flux for each model. The CSC also provides the source energy fluxes computed from the normalizations of predefined absorbed power-law, black-body, Bremsstrahlung, and APEC models needed to match the observed net X-ray counts. For sources that have been observed multiple times, a Bayesian Blocks analysis will have been performed (see the Primini et al. poster) and the most significant block will have a joint fit performed for the mentioned spectral models. In addition, we provide access to data products for each source: a file with source spectrum, the background spectrum, and the spectral response of the detector. Hardness ratios were calculated for each source between pairs of energy bands (soft, medium and hard). This work has been supported by NASA under contract NAS 8-03060 to the Smithsonian Astrophysical Observatory for operation of the Chandra X-ray Center.
An ecological genetic delineation of local seed-source provenance for ecological restoration
Krauss, Siegfried L; Sinclair, Elizabeth A; Bussell, John D; Hobbs, Richard J
2013-01-01
An increasingly important practical application of the analysis of spatial genetic structure within plant species is to help define the extent of local provenance seed collection zones that minimize negative impacts in ecological restoration programs. Here, we derive seed sourcing guidelines from a novel range-wide assessment of spatial genetic structure of 24 populations of Banksia menziesii (Proteaceae), a widely distributed Western Australian tree of significance in local ecological restoration programs. An analysis of molecular variance (AMOVA) of 100 amplified fragment length polymorphism (AFLP) markers revealed significant genetic differentiation among populations (ΦPT = 0.18). Pairwise population genetic dissimilarity was correlated with geographic distance, but not environmental distance derived from 15 climate variables, suggesting overall neutrality of these markers with regard to these climate variables. Nevertheless, Bayesian outlier analysis identified four markers potentially under selection, although these were not correlated with the climate variables. We calculated a global R-statistic using analysis of similarities (ANOSIM) to test the statistical significance of population differentiation and to infer a threshold seed collection zone distance of ∼60 km (all markers) and 100 km (outlier markers) when genetic distance was regressed against geographic distance. Population pairs separated by >60 km were, on average, twice as likely to be significantly genetically differentiated than population pairs separated by <60 km, suggesting that habitat-matched sites within a 30-km radius around a restoration site genetically defines a local provenance seed collection zone for B. menziesii. Our approach is a novel probability-based practical solution for the delineation of a local seed collection zone to minimize negative genetic impacts in ecological restoration. PMID:23919158
Determining the Intensity of a Point-Like Source Observed on the Background of an Extended Source
NASA Astrophysics Data System (ADS)
Kornienko, Y. V.; Skuratovskiy, S. I.
2014-12-01
The problem of determining the time dependence of the intensity of a point-like source in the presence of atmospheric blur is formulated and solved using the Bayesian statistical approach. The point-like source is assumed to be observed against the background of an extended source whose brightness is constant in time but unknown. A system of equations for the optimal statistical estimation of the sequence of intensity values at the observation moments is obtained. The problem is particularly relevant for studying gravitational mirages, which appear when a quasar is observed through the gravitational field of a distant galaxy.
J-PLUS: Morphological Classification of Compact and Extended Sources by PDF Analysis
NASA Astrophysics Data System (ADS)
López-Sanjuan, C.; Vázquez-Ramió, H.; Varela, J.; Spinoso, D.; Cristóbal-Hornillos, D.; Viironen, K.; Muniesa, D.; J-PLUS Collaboration
2017-10-01
We present a morphological classification of J-PLUS EDR sources into compact (i.e. stars) and extended (i.e. galaxies). The classification is based on Bayesian modelling of the concentration distribution, including observational errors and magnitude + sky position priors. We provide the star / galaxy probability of each source computed from the gri images. The comparison with the SDSS number counts supports our classification up to r ~ 21. The 31.7 deg² analysed comprise 150k stars and 101k galaxies.
An efficient Bayesian meta-analysis approach for studying cross-phenotype genetic associations
Majumdar, Arunabha; Haldar, Tanushree; Bhattacharya, Sourabh; Witte, John S.
2018-01-01
Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package ‘CPBayes’ implementing the proposed method. PMID:29432419
Moving in time: Bayesian causal inference explains movement coordination to auditory beats
Elliott, Mark T.; Wing, Alan M.; Welchman, Andrew E.
2014-01-01
Many everyday skilled actions depend on moving in time with signals that are embedded in complex auditory streams (e.g. musical performance, dancing or simply holding a conversation). Such behaviour is apparently effortless; however, it is not known how humans combine auditory signals to support movement production and coordination. Here, we test how participants synchronize their movements when there are potentially conflicting auditory targets to guide their actions. Participants tapped their fingers in time with two simultaneously presented metronomes of equal tempo, but differing in phase and temporal regularity. Synchronization therefore depended on integrating the two timing cues into a single-event estimate or treating the cues as independent and thereby selecting one signal over the other. We show that a Bayesian inference process explains the situations in which participants choose to integrate or separate signals, and predicts motor timing errors. Simulations of this causal inference process demonstrate that this model provides a better description of the data than other plausible models. Our findings suggest that humans exploit a Bayesian inference process to control movement timing in situations where the origin of auditory signals needs to be resolved. PMID:24850915
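When two timing cues are treated as arising from a single event, the standard Bayesian way to integrate them is a reliability-weighted average. The sketch below shows only that integration step under assumed independent Gaussian noise on each cue; the full causal-inference model in the paper also weighs the alternative that the cues come from separate sources, which is not reproduced here.

```python
def integrate_cues(t1, sigma1, t2, sigma2):
    """Fuse two noisy timing cues into a single-event estimate.

    t1, t2: cue onset times; sigma1, sigma2: their standard deviations
    (independent Gaussian noise is an assumption of this sketch).
    Returns (estimate, variance) of the fused timing estimate.
    """
    w1 = 1.0 / sigma1 ** 2          # reliability (precision) of cue 1
    w2 = 1.0 / sigma2 ** 2          # reliability (precision) of cue 2
    estimate = (w1 * t1 + w2 * t2) / (w1 + w2)
    variance = 1.0 / (w1 + w2)      # never larger than either cue's variance
    return estimate, variance
```

The more regular (lower-variance) metronome pulls the fused estimate toward itself, which is the qualitative pattern the integration account predicts.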
IMAGINE: Interstellar MAGnetic field INference Engine
NASA Astrophysics Data System (ADS)
Steininger, Theo
2018-03-01
IMAGINE (Interstellar MAGnetic field INference Engine) performs inference on generic parametric models of the Galaxy. The modular open source framework uses highly optimized tools and technology such as the MultiNest sampler (ascl:1109.006) and the information field theory framework NIFTy (ascl:1302.013) to create an instance of the Milky Way based on a set of parameters for physical observables, using Bayesian statistics to judge the mismatch between measured data and model prediction. The flexibility of the IMAGINE framework allows for simple refitting for newly available data sets and makes state-of-the-art Bayesian methods easily accessible particularly for random components of the Galactic magnetic field.
Uncertain deduction and conditional reasoning
Evans, Jonathan St. B. T.; Thompson, Valerie A.; Over, David E.
2015-01-01
There has been a paradigm shift in the psychology of deductive reasoning. Many researchers no longer think it is appropriate to ask people to assume premises and decide what necessarily follows, with the results evaluated by binary extensional logic. Most everyday and scientific inference is made from more or less confidently held beliefs and not assumptions, and the relevant normative standard is Bayesian probability theory. We argue that the study of “uncertain deduction” should directly ask people to assign probabilities to both premises and conclusions, and report an experiment using this method. We assess this reasoning by two Bayesian metrics: probabilistic validity and coherence according to probability theory. On both measures, participants perform above chance in conditional reasoning, but they do much better when statements are grouped as inferences, rather than evaluated in separate tasks. PMID:25904888
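Probabilistic validity is usually defined in terms of uncertainty, where the uncertainty of a statement is one minus its probability: a set of judgments conforms to p-validity when the uncertainty of the conclusion does not exceed the summed uncertainties of the premises. The check below is a minimal sketch of that definition; the function name and the small numerical tolerance are choices made for this example, not part of the reported experiment.

```python
def is_p_valid(premise_probs, conclusion_prob, tol=1e-12):
    """Check a participant's judgments against probabilistic validity.

    premise_probs: probabilities assigned to each premise.
    conclusion_prob: probability assigned to the conclusion.
    Returns True when the conclusion's uncertainty is no greater than
    the sum of the premises' uncertainties.
    """
    premise_uncertainty = sum(1.0 - p for p in premise_probs)
    conclusion_uncertainty = 1.0 - conclusion_prob
    return conclusion_uncertainty <= premise_uncertainty + tol
```

For example, with premises judged at 0.9 and 0.8, any conclusion probability of 0.7 or higher conforms to p-validity.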
MOA-2008-BLG-379Lb: A massive planet from a high magnification event with a faint source
DOE Office of Scientific and Technical Information (OSTI.GOV)
Suzuki, D.; Sumi, T.; Fukagawa, M.
2014-01-10
We report on the analysis of the high-magnification microlensing event MOA-2008-BLG-379, which has a strong microlensing anomaly at its peak due to a massive planet with a mass ratio of $q = 6.9 \times 10^{-3}$. Because the faint source star crosses the large resonant caustic, the planetary signal dominates the light curve. This is unusual for planetary microlensing events, and as a result, the planetary nature of this light curve was not immediately noticed. The planetary nature of the event was found when the Microlensing Observations in Astrophysics (MOA) Collaboration conducted a systematic study of binary microlensing events previously identified by the MOA alert system. We have conducted a Bayesian analysis based on a standard Galactic model to estimate the physical parameters of the lens system. This yields a host star mass of $M_L = 3.3^{+1.7}_{-1.2}\,M_\odot$ orbited by a planet of mass $m_P = 0.56^{+0.24}_{-0.27}\,M_{\rm Jup}$ at an orbital separation of $a = 3.3^{+1.3}_{-1.2}$ AU at a distance of $D_L = 4.1^{+1.7}_{-1.9}$ kpc. The faint source magnitude of $I_S = 21.30$ and relatively high lens-source relative proper motion of $\mu_{\rm rel} = 7.6 \pm 1.6$ mas yr$^{-1}$ imply that high angular resolution adaptive optics or Hubble Space Telescope observations are likely to be able to detect the source star, which would determine the masses and distance of the planet and its host star.
Faint Object Detection in Multi-Epoch Observations via Catalog Data Fusion
DOE Office of Scientific and Technical Information (OSTI.GOV)
Budavári, Tamás; Szalay, Alexander S.; Loredo, Thomas J.
Astronomy in the time-domain era faces several new challenges. One of them is the efficient use of observations obtained at multiple epochs. The work presented here addresses faint object detection and describes an incremental strategy for separating real objects from artifacts in ongoing surveys. The idea is to produce low-threshold single-epoch catalogs and to accumulate information across epochs. This is in contrast to more conventional strategies based on co-added or stacked images. We adopt a Bayesian approach, addressing object detection by calculating the marginal likelihoods for hypotheses asserting that there is no object or one object in a small image patch containing at most one cataloged source at each epoch. The object-present hypothesis interprets the sources in a patch at different epochs as arising from a genuine object; the no-object hypothesis interprets candidate sources as spurious, arising from noise peaks. We study the detection probability for constant-flux objects in a Gaussian noise setting, comparing results based on single and stacked exposures to results based on a series of single-epoch catalog summaries. Our procedure amounts to generalized cross-matching: it is the product of a factor accounting for the matching of the estimated fluxes of the candidate sources and a factor accounting for the matching of their estimated directions. We find that probabilistic fusion of multi-epoch catalogs can detect sources with similar sensitivity and selectivity compared to stacking. The probabilistic cross-matching framework underlying our approach plays an important role in maintaining detection sensitivity and points toward generalizations that could accommodate variability and complex object structure.
NASA Astrophysics Data System (ADS)
Lew, E. J.; Butenhoff, C. L.; Karmakar, S.; Rice, A. L.; Khalil, A. K.
2017-12-01
Methane is the second most important greenhouse gas after carbon dioxide. In efforts to control emissions, a careful examination of the methane budget and source strengths is required. To determine methane surface fluxes, Bayesian methods are often used to provide top-down constraints. Inverse modeling derives unknown fluxes using observed methane concentrations, a chemical transport model (CTM) and prior information. The Bayesian inversion reduces prior flux uncertainties by exploiting information content in the data. While the Bayesian formalism produces internal error estimates of source fluxes, systematic or external errors that arise from user choices in the inversion scheme are often much larger. Here we examine model sensitivity and uncertainty of our inversion under different observation data sets and CTM grid resolution. We compare posterior surface fluxes using the data product GLOBALVIEW-CH4 against the event-level molar mixing ratio data available from NOAA. GLOBALVIEW-CH4 is a collection of CH4 concentration estimates from 221 sites, collected by 12 laboratories, that have been interpolated and extracted to provide weekly records from 1984-2008. In contrast, the event-level NOAA data record field measurements of methane mixing ratios from 102 sites and contain sampling frequency irregularities and gaps in time. Furthermore, the sampling platform types used by the data sets (fixed surface, tower, ship and aircraft sites) may influence the posterior flux estimates. To explore the sensitivity of the posterior surface fluxes to the observation network geometry, inversions composed of all sites, only aircraft, only ship, only tower and only fixed surface sites are performed and compared. We also investigate the sensitivity of the error reduction to the resolution of the GEOS-Chem simulation (4°×5° vs 2°×2.5°) used to calculate the response matrix.
Using a higher resolution grid decreased the model-data error at most sites, thereby increasing the information at that site. These different inversions—event-level and interpolated data, higher and lower resolutions—are compared using an ensemble of descriptive and comparative statistics. Analyzing the sensitivity of the inverse model leads to more accurate estimates of the methane source category uncertainty.
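The link between model-data error and error reduction can be seen in a deliberately simplified, one-parameter version of a Gaussian Bayesian inversion. This is only a sketch under stated assumptions, not the authors' GEOS-Chem inversion: a single scalar flux, independent Gaussian observation errors, and a hypothetical function name. Each observation adds h²/σ² to the posterior precision, so a smaller model-data error σ at a site yields a larger error reduction.

```python
import math


def scalar_flux_inversion(prior_mean, prior_sd, sensitivities, observations, obs_sds):
    """Gaussian posterior for one flux x, given y_i = h_i * x + noise_i (assumed model).

    Returns (posterior mean, posterior sd, error reduction), where
    error reduction = 1 - posterior_sd / prior_sd.
    """
    precision = 1.0 / prior_sd ** 2            # prior contribution
    weighted_sum = prior_mean / prior_sd ** 2
    for h, y, sd in zip(sensitivities, observations, obs_sds):
        precision += h * h / sd ** 2           # information added by this observation
        weighted_sum += h * y / sd ** 2
    post_var = 1.0 / precision
    post_mean = weighted_sum * post_var
    post_sd = math.sqrt(post_var)
    return post_mean, post_sd, 1.0 - post_sd / prior_sd


# Same observation, two assumed model-data error levels (illustrative numbers only)
coarse = scalar_flux_inversion(0.0, 1.0, [1.0], [2.0], [1.0])
fine = scalar_flux_inversion(0.0, 1.0, [1.0], [2.0], [0.5])
```

In this toy setting the second call, with the smaller assumed model-data error, gives the larger error reduction.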
NASA Astrophysics Data System (ADS)
Mazrou, H.; Bezoubiri, F.
2018-07-01
In this work, a new program developed in the MATLAB environment and supported by the Bayesian software WinBUGS has been combined with the traditional unfolding codes MAXED and GRAVEL to evaluate a neutron spectrum from Bonner sphere counts measured around a shielded 241AmBe-based neutron irradiator located at a Secondary Standards Dosimetry Laboratory (SSDL) at CRNA. In the first step, the results obtained by the standalone Bayesian program were compared with those from MAXED and GRAVEL assuming a Monte Carlo default spectrum. The Bayesian program uses a parametric neutron spectrum model based on a linear superposition of three components: a thermal Maxwellian distribution, an epithermal component (1/E behavior), and Watt fission and evaporation models to represent the fast component. Through the selection of new upper limits for some free parameters of both considered models, taking into account the physical characteristics of the irradiation source, good agreement with the MAXED and GRAVEL results was obtained for the investigated integral quantities, i.e. fluence rate and ambient dose equivalent rate. The difference was generally below 4% for the investigated parameters, suggesting the reliability of the proposed models. In the second step, the Bayesian results obtained from the previous calculations were used as initial guess spectra for the traditional unfolding codes MAXED and GRAVEL to derive the solution spectra. Here again the results were in very good agreement, confirming the stability of the Bayesian solution.
NASA Astrophysics Data System (ADS)
Sheldrake, T. E.; Aspinall, W. P.; Odbert, H. M.; Wadge, G.; Sparks, R. S. J.
2017-07-01
Following a cessation in eruptive activity it is important to understand how a volcano will behave in the future and when it may next erupt. Such an assessment can be based on the volcano's long-term pattern of behaviour and insights into its current state via monitoring observations. We present a Bayesian network that integrates these two strands of evidence to forecast future eruptive scenarios using expert elicitation. The Bayesian approach provides a framework to quantify the magmatic causes in terms of volcanic effects (i.e., eruption and unrest). In October 2013, an expert elicitation was performed to populate a Bayesian network designed to help forecast future eruptive (in-)activity at Soufrière Hills Volcano. The Bayesian network was devised to assess the state of the shallow magmatic system, as a means to forecast the future eruptive activity in the context of the long-term behaviour at similar dome-building volcanoes. The findings highlight coherence amongst experts when interpreting the current behaviour of the volcano, but reveal considerable ambiguity when relating this to longer patterns of volcanism at dome-building volcanoes, as a class. By asking questions in terms of magmatic causes, the Bayesian approach highlights the importance of using short-term unrest indicators from monitoring data as evidence in long-term forecasts at volcanoes. Furthermore, it highlights potential biases in the judgements of volcanologists and identifies sources of uncertainty in terms of magmatic causes rather than scenario-based outcomes.
Flood quantile estimation at ungauged sites by Bayesian networks
NASA Astrophysics Data System (ADS)
Mediero, L.; Santillán, D.; Garrote, L.
2012-04-01
Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. The most common technique used is multiple regression analysis, which relates physical and climatic basin characteristics to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals in the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence, which have been widely and successfully applied to many scientific fields like medicine and informatics, but their application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as it captures non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows the different sources of estimation uncertainty to be taken into account, as they give a probability distribution as a result. A homogeneous region in the Tagus Basin was selected as a case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a sufficiently large data set. 
As observational data are scarce, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as a basis. The learnt Bayesian network was validated by the reliability diagram, the Brier Score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. In summary, flood quantile estimation through Bayesian networks supplies information about the prediction uncertainty, because a probability distribution function of discharges is given as the result. Therefore, the Bayesian network model has application as a decision support tool for water resources planning and management.
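Of the validation measures listed, the Brier score is the simplest to state: it is the mean squared difference between the forecast probability of an event and its binary outcome, with 0 being a perfect score. The sketch below is a generic illustration with a hypothetical function name and made-up numbers, not the authors' validation code.

```python
def brier_score(forecast_probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not forecast_probs or len(forecast_probs) != len(outcomes):
        raise ValueError("need equally long, non-empty sequences")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative forecasts of "discharge exceeds a threshold" and what was observed
score = brier_score([0.9, 0.2], [1, 0])
```

A forecaster who always states 0.5 scores 0.25 regardless of the outcomes, which gives a rough reference point for reading such scores.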
NASA Astrophysics Data System (ADS)
Kubo, H.; Asano, K.; Iwata, T.; Aoi, S.
2014-12-01
Previous studies of the period-dependent source characteristics of the 2011 Tohoku earthquake (e.g., Koper et al., 2011; Lay et al., 2012) were based on short- and long-period source models obtained using different methods. Kubo et al. (2013) obtained source models of the 2011 Tohoku earthquake using waveform data in multiple period bands with a common inversion method and discussed its period-dependent source characteristics. In this study, to resolve the spatiotemporal source rupture behavior of this event in more detail, we introduce a new fault surface model with a finer sub-fault size and estimate the source models in multiple period bands using a Bayesian inversion method combined with a multi-time-window method. Three components of velocity waveforms at 25 stations of K-NET, KiK-net, and F-net of NIED are used in this analysis. The target period band is 10-100 s. We divide this period band into three period bands (10-25 s, 25-50 s, and 50-100 s) and estimate a kinematic source model in each period band using a Bayesian inversion method with MCMC sampling (e.g., Fukuda & Johnson, 2008; Minson et al., 2013, 2014). The parameterization of spatiotemporal slip distribution follows the multi-time-window method (Hartzell & Heaton, 1983). The Green's functions are calculated by the 3D FDM (GMS; Aoi & Fujiwara, 1999) using a 3D velocity structure model (JIVSM; Koketsu et al., 2012). The assumed fault surface model is based on the Pacific plate boundary of JIVSM and is divided into 384 subfaults of about 16 × 16 km². The estimated source models in multiple period bands show the following source image: (1) First deep rupture off Miyagi at 0-60 s toward down-dip, mostly radiating relatively short period (10-25 s) seismic waves. (2) Shallow rupture off Miyagi at 45-90 s toward up-dip with long duration, radiating long period (50-100 s) seismic waves. (3) Second deep rupture off Miyagi at 60-105 s toward down-dip, radiating longer period seismic waves than those of the first deep rupture. 
(4) Deep rupture off Fukushima at 90-135 s. The dominant-period difference of the seismic-wave radiation between two deep ruptures off Miyagi may result from the mechanism that small-scale heterogeneities on the fault are removed by the first rupture. This difference can be also interpreted by the concept of multi-scale dynamic rupture (Ide & Aochi, 2005).
A comprehensive Probabilistic Tsunami Hazard Assessment for the city of Naples (Italy)
NASA Astrophysics Data System (ADS)
Anita, G.; Tonini, R.; Selva, J.; Sandri, L.; Pierdominici, S.; Faenza, L.; Zaccarelli, L.
2012-12-01
A comprehensive Probabilistic Tsunami Hazard Assessment (PTHA) should consider different tsunamigenic sources (seismic events, slide failures, volcanic eruptions) to calculate the hazard on given target sites. This implies a multi-disciplinary analysis of all natural tsunamigenic sources, in a multi-hazard/risk framework, which also considers the effects of interaction/cascade events. We present the ongoing effort to develop a comprehensive PTHA for the city of Naples (Italy) including all types of sources located in the Tyrrhenian Sea, as developed within the Italian project ByMuR (Bayesian Multi-Risk Assessment). The project combines a multi-hazard/risk approach to treat the interactions among different hazards, and a Bayesian approach to handle the uncertainties. The natural potential tsunamigenic sources analyzed are: 1) submarine seismic sources located on active faults in the Tyrrhenian Sea and close to the Southern Italian shoreline (we also consider the effects of the inshore seismic sources and the associated active faults, for which we provide the rupture properties), 2) mass failures and collapses around the target area (spatially identified on the basis of their propensity to failure), and 3) volcanic sources mainly identified by pyroclastic flows and collapses from the volcanoes in the Neapolitan area (Vesuvius, Campi Flegrei and Ischia). All these natural sources are here preliminarily analyzed and combined, in order to provide a complete picture of a PTHA for the city of Naples. In addition, the treatment of interaction/cascade effects is formally discussed in the case of significant temporary variations in the short-term PTHA due to an earthquake.
UKIRT-2017-BLG-001Lb: A Giant Planet Detected through the Dust
NASA Astrophysics Data System (ADS)
Shvartzvald, Y.; Calchi Novati, S.; Gaudi, B. S.; Bryden, G.; Nataf, D. M.; Penny, M. T.; Beichman, C.; Henderson, C. B.; Jacklin, S.; Schlafly, E. F.; Huston, M. J.
2018-04-01
We report the discovery of a giant planet in event UKIRT-2017-BLG-001, detected by the United Kingdom Infrared Telescope (UKIRT) microlensing survey. The mass ratio between the planet and its host is $q = 1.50^{+0.17}_{-0.14} \times 10^{-3}$, about 1.5 times the Jupiter/Sun mass ratio. The event lies $0.35^\circ$ from the Galactic center and suffers from high extinction of $A_K = 1.68$. Therefore, it could be detected only by a near-infrared (NIR) survey. The field also suffers from large spatial differential extinction, which makes it difficult to estimate the source properties required to derive the angular Einstein radius. Nevertheless, we find evidence suggesting that the source is located in the far disk. If correct, this would be the first source star of a microlensing event to be identified as belonging to the far disk. We estimate the lens mass and distance using a Bayesian analysis to find that the planet’s mass is $1.28^{+0.37}_{-0.44}\,M_J$, and it orbits a $0.81^{+0.21}_{-0.27}\,M_\odot$ star at an instantaneous projected separation of $4.18^{+0.96}_{-0.88}$ au. The system is at a distance of $6.3^{+1.6}_{-2.1}$ kpc, and so likely resides in the Galactic bulge. In addition, we find a non-standard extinction curve in this field, in agreement with previous results toward high-extinction fields near the Galactic center.
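As a quick consistency check on the quoted central values, the planet and host masses from the Bayesian analysis should reproduce the measured mass ratio. The sketch below assumes an approximate Jupiter-to-Sun mass ratio of 9.546e-4, a value that is not given in the abstract, and it ignores the quoted uncertainties.

```python
# Approximate Jupiter/Sun mass ratio; an assumed constant, not taken from the abstract
JUPITER_TO_SOLAR_MASS = 9.546e-4


def mass_ratio(planet_mass_mjup, host_mass_msun):
    """Planet/host mass ratio q from a planet mass in M_J and a host mass in M_sun."""
    return planet_mass_mjup * JUPITER_TO_SOLAR_MASS / host_mass_msun


q = mass_ratio(1.28, 0.81)                      # central values from the abstract
relative_to_jupiter_sun = q / JUPITER_TO_SOLAR_MASS
```

With these inputs q comes out close to the reported 1.50 × 10⁻³, and the ratio to the Jupiter/Sun value is a little under 1.6, in line with the "about 1.5 times" stated in the abstract.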
sourceR: Classification and source attribution of infectious agents among heterogeneous populations
French, Nigel
2017-01-01
Zoonotic diseases are a major cause of morbidity, and productivity losses in both human and animal populations. Identifying the source of food-borne zoonoses (e.g. an animal reservoir or food product) is crucial for the identification and prioritisation of food safety interventions. For many zoonotic diseases it is difficult to attribute human cases to sources of infection because there is little epidemiological information on the cases. However, microbial strain typing allows zoonotic pathogens to be categorised, and the relative frequencies of the strain types among the sources and in human cases allows inference on the likely source of each infection. We introduce sourceR, an R package for quantitative source attribution, aimed at food-borne diseases. It implements a Bayesian model using strain-typed surveillance data from both human cases and source samples, capable of identifying important sources of infection. The model measures the force of infection from each source, allowing for varying survivability, pathogenicity and virulence of pathogen strains, and varying abilities of the sources to act as vehicles of infection. A Bayesian non-parametric (Dirichlet process) approach is used to cluster pathogen strain types by epidemiological behaviour, avoiding model overfitting and allowing detection of strain types associated with potentially high “virulence”. sourceR is demonstrated using Campylobacter jejuni isolate data collected in New Zealand between 2005 and 2008. Chicken from a particular poultry supplier was identified as the major source of campylobacteriosis, which is qualitatively similar to results of previous studies using the same dataset. Additionally, the software identifies a cluster of 9 multilocus sequence types with abnormally high ‘virulence’ in humans. sourceR enables straightforward attribution of cases of zoonotic infection to putative sources of infection. 
As sourceR develops, we intend it to become an important and flexible resource for food-borne disease attribution studies. PMID:28558033
Quantitative estimation of source complexity in tsunami-source inversion
NASA Astrophysics Data System (ADS)
Dettmer, Jan; Cummins, Phil R.; Hawkins, Rhys; Jakir Hossen, M.
2016-04-01
This work analyses tsunami waveforms to infer the spatiotemporal evolution of sea-surface displacement (the tsunami source) caused by earthquakes or other sources. Since the method considers sea-surface displacement directly, no assumptions about the fault or seafloor deformation are required. While this approach has no ability to study seismic aspects of rupture, it greatly simplifies the tsunami source estimation, making it much less dependent on subjective fault and deformation assumptions. This results in a more accurate sea-surface displacement evolution in the source region. The spatial discretization is by wavelet decomposition represented by a trans-D Bayesian tree structure. Wavelet coefficients are sampled by a reversible jump algorithm and additional coefficients are only included when required by the data. Therefore, source complexity is consistent with data information (parsimonious) and the method can adapt locally in both time and space. Since the source complexity is unknown and locally adapts, no regularization is required, resulting in more meaningful displacement magnitudes. By estimating displacement uncertainties in a Bayesian framework we can study the effect of parametrization choice on the source estimate. Uncertainty arises from observation errors and limitations in the parametrization to fully explain the observations. As a result, parametrization choice is closely related to uncertainty estimation and profoundly affects inversion results. Therefore, parametrization selection should be included in the inference process. Our inversion method is based on Bayesian model selection, a process which includes the choice of parametrization in the inference process and makes it data driven. A trans-dimensional (trans-D) model for the spatio-temporal discretization is applied here to include model selection naturally and efficiently in the inference by sampling probabilistically over parameterizations. 
The trans-D process results in better uncertainty estimates since the parametrization adapts parsimoniously (in both time and space) according to the local data resolving power and the uncertainty about the parametrization choice is included in the uncertainty estimates. We apply the method to the tsunami waveforms recorded for the great 2011 Japan tsunami. All data are recorded on high-quality sensors (ocean-bottom pressure sensors, GPS gauges, and DART buoys). The sea-surface Green's functions are computed by JAGURS and include linear dispersion effects. By treating the noise level at each gauge as unknown, individual gauge contributions to the source estimate are appropriately and objectively weighted. The results show previously unreported detail of the source, quantify uncertainty spatially, and produce excellent data fits. The source estimate shows an elongated peak trench-ward from the hypocentre that closely follows the trench, indicating significant sea-floor deformation near the trench. Also notable is a bi-modal (negative to positive) displacement feature in the northern part of the source near the trench. The feature has ~2 m amplitude and is clearly resolved by the data with low uncertainties.
The discounting model selector: Statistical software for delay discounting applications.
Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A
2017-05-01
Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executed approximate Bayesian model selection methods on user-supplied temporal discounting data and computed the effective delay 50 (ED50) from the best performing model. The software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). The results of independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to those of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that the true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature in addition to the data from the source paper. Comparisons of model-selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-source software are discussed, and a review of possible expansions of this software is provided. © 2017 Society for the Experimental Analysis of Behavior.
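One common way to carry out approximate Bayesian model selection is to convert each candidate model's Bayesian information criterion (BIC) into an approximate posterior model probability, assuming equal prior probabilities. The abstract does not state which approximation the software uses, so the BIC route below is an assumption, as are the function names and the two example discounting forms (hyperbolic V = A/(1 + kD) and exponential V = A·exp(-kD)). ED50 is the delay at which the discounted value falls to half of A.

```python
import math


def bic_model_probabilities(bics):
    """Approximate posterior model probabilities from BIC values (equal priors assumed)."""
    best = min(bics)                                   # subtract the minimum for stability
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]


def ed50_hyperbolic(k):
    """ED50 for V = A / (1 + k * D): solve 1 / (1 + k * D) = 1/2."""
    return 1.0 / k


def ed50_exponential(k):
    """ED50 for V = A * exp(-k * D): solve exp(-k * D) = 1/2."""
    return math.log(2.0) / k


# Illustrative BIC values for two candidate models (made-up numbers)
probs = bic_model_probabilities([100.0, 102.0])
```

The ED50 would then be read off the model with the highest approximate probability, which is why the measure remains comparable across models whose rate parameters are not.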
Ultrafast current imaging by Bayesian inversion
Somnath, Suhas; Law, Kody J. H.; Morozovska, Anna; Maksymovych, Petro; Kim, Yunseok; Lu, Xiaoli; Alexe, Marin; Archibald, Richard K; Kalinin, Sergei V; Jesse, Stephen; Vasudevan, Rama K
2016-01-01
Spectroscopic measurement of current-voltage curves in scanning probe microscopy is the earliest and one of the most common methods for characterizing local energy-dependent electronic properties, providing insight into superconductive, semiconductor, and memristive behaviors. However, the quasistatic nature of these measurements renders them extremely slow. Here, we demonstrate a fundamentally new approach for dynamic spectroscopic current imaging via full information capture and Bayesian inference analysis. This "general-mode I-V" method allows rates three orders of magnitude faster than presently possible. The technique is demonstrated by acquiring I-V curves in ferroelectric nanocapacitors, yielding >100,000 I-V curves in <20 minutes. This allows detection of switching currents in the nanoscale capacitors, as well as determination of the dielectric constant. These experiments show the potential for the use of full information capture and Bayesian inference towards extracting physics from rapid I-V measurements, and can be used for transport measurements in both atomic force and scanning tunneling microscopy. The data was analyzed using pycroscopy - an open-source python package available at https://github.com/pycroscopy/pycroscopy
A solution to the static frame validation challenge problem using Bayesian model selection
Grigoriu, M. D.; Field, R. V.
2007-12-23
Within this paper, we provide a solution to the static frame validation challenge problem (see this issue) in a manner that is consistent with the guidelines provided by the Validation Challenge Workshop tasking document. The static frame problem is constructed such that variability in material properties is known to be the only source of uncertainty in the system description, but there is ignorance on the type of model that best describes this variability. Hence both types of uncertainty, aleatoric and epistemic, are present and must be addressed. Our approach is to consider a collection of competing probabilistic models for the material properties, and calibrate these models to the information provided; models of different levels of complexity and numerical efficiency are included in the analysis. A Bayesian formulation is used to select the optimal model from the collection, which is then used for the regulatory assessment. Lastly, Bayesian credible intervals are used to provide a measure of confidence to our regulatory assessment.
pyblocxs: Bayesian Low-Counts X-ray Spectral Analysis in Sherpa
NASA Astrophysics Data System (ADS)
Siemiginowska, A.; Kashyap, V.; Refsdal, B.; van Dyk, D.; Connors, A.; Park, T.
2011-07-01
Typical X-ray spectra have low counts and should be modeled using the Poisson distribution. However, the χ² statistic is often applied as an alternative, with the data assumed to follow a Gaussian distribution. Various weightings of the statistic or binning of the data are used to overcome the low-count issues. However, such modifications introduce biases and/or a loss of information. Standard modeling packages such as XSPEC and Sherpa provide the Poisson likelihood and allow computation of rudimentary MCMC chains, but so far do not allow for setting up a full Bayesian model. We have implemented a sophisticated Bayesian MCMC-based algorithm to carry out spectral fitting of low-count sources in the Sherpa environment. The code is a Python extension to Sherpa and allows fitting a predefined Sherpa model to high-energy X-ray spectral data and other generic data. We present the algorithm and discuss several issues related to the implementation, including flexible definition of priors and allowing for variations in the calibration information.
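The Poisson likelihood that the abstract recommends for low-count spectra is easy to write down directly. The sketch below is plain Python and does not use or imitate the Sherpa or pyblocxs interfaces; the function names are hypothetical. It gives the Poisson log-likelihood for binned counts and the related Cash statistic, C = 2 Σ (m_i - d_i ln m_i), which differs from -2 ln L only by a term that does not depend on the model.

```python
import math


def poisson_log_likelihood(counts, model_counts):
    """ln L = sum_i [ d_i * ln(m_i) - m_i - ln(d_i!) ] for observed d_i and predicted m_i > 0."""
    return sum(
        d * math.log(m) - m - math.lgamma(d + 1)
        for d, m in zip(counts, model_counts)
    )


def cash_statistic(counts, model_counts):
    """Cash statistic: -2 ln L with the model-independent ln(d_i!) term dropped."""
    return 2.0 * sum(m - d * math.log(m) for d, m in zip(counts, model_counts))


# Bins with zero or one count are handled without any rebinning (illustrative numbers)
observed = [0, 1, 3, 0, 2]
predicted = [0.4, 1.1, 2.5, 0.3, 1.8]
log_like = poisson_log_likelihood(observed, predicted)
c_stat = cash_statistic(observed, predicted)
```

A χ² statistic weighted by the observed counts would be undefined in the empty bins above, which is one reason low-count data are usually binned or reweighted before χ² fitting.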
Model selection and Bayesian inference for high-resolution seabed reflection inversion.
Dettmer, Jan; Dosso, Stan E; Holland, Charles W
2009-02-01
This paper applies Bayesian inference, including model selection and posterior parameter inference, to inversion of seabed reflection data to resolve sediment structure at a spatial scale below the pulse length of the acoustic source. A practical approach to model selection is used, employing the Bayesian information criterion to decide on the number of sediment layers needed to sufficiently fit the data while satisfying parsimony to avoid overparametrization. Posterior parameter inference is carried out using an efficient Metropolis-Hastings algorithm for high-dimensional models, and results are presented as marginal-probability depth distributions for sound velocity, density, and attenuation. The approach is applied to plane-wave reflection-coefficient inversion of single-bounce data collected on the Malta Plateau, Mediterranean Sea, which indicate complex fine structure close to the water-sediment interface. This fine structure is resolved in the geoacoustic inversion results in terms of four layers within the upper meter of sediments. The inversion results are in good agreement with parameter estimates from a gravity core taken at the experiment site.
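Posterior sampling of the kind described here is often introduced through the basic random-walk Metropolis-Hastings algorithm. The sketch below is that textbook version for a single parameter, not the efficient high-dimensional sampler used in the paper; the function name, the Gaussian proposal, and the toy target density are all assumptions made for illustration.

```python
import math
import random


def metropolis_hastings(log_posterior, start, step_sd, n_samples, seed=0):
    """Random-walk Metropolis-Hastings for one parameter with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio); the proposal terms cancel
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)                 # a rejected move repeats the current state
    return samples


def toy_log_posterior(x):
    """Unnormalised log density of a standard normal, standing in for a real posterior."""
    return -0.5 * x * x


chain = metropolis_hastings(toy_log_posterior, start=0.0, step_sd=1.0, n_samples=20000, seed=1)
chain_mean = sum(chain) / len(chain)
```

Marginal distributions such as the depth profiles mentioned in the abstract are then estimated from histograms of the retained samples.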
Bayesian Knowledge Fusion in Prognostics and Health Management—A Case Study
NASA Astrophysics Data System (ADS)
Rabiei, Masoud; Modarres, Mohammad; Mohammad-Djafari, Ali
2011-03-01
In the past few years, a research effort has been in progress at University of Maryland to develop a Bayesian framework based on Physics of Failure (PoF) for risk assessment and fleet management of aging airframes. Despite significant achievements in modelling of crack growth behavior using fracture mechanics, it is still of great interest to find practical techniques for monitoring the crack growth instances using nondestructive inspection and to integrate such inspection results with the fracture mechanics models to improve the predictions. The ultimate goal of this effort is to develop an integrated probabilistic framework for utilizing all of the available information to come up with enhanced (less uncertain) predictions for structural health of the aircraft in future missions. Such information includes material level fatigue models and test data, health monitoring measurements and inspection field data. In this paper, a case study of using Bayesian fusion technique for integrating information from multiple sources in a structural health management problem is presented.
Bayesian Source Attribution of Salmonellosis in South Australia.
Glass, K; Fearnley, E; Hocking, H; Raupach, J; Veitch, M; Ford, L; Kirk, M D
2016-03-01
Salmonellosis is a significant cause of foodborne gastroenteritis in Australia, and rates of illness have increased over recent years. We adopt a Bayesian source attribution model to estimate the contribution of different animal reservoirs to illness due to Salmonella spp. in South Australia between 2000 and 2010, together with 95% credible intervals (CrI). We excluded known travel associated cases and those of rare subtypes (fewer than 20 human cases or fewer than 10 isolates from included sources over the 11-year period), and the remaining 76% of cases were classified as sporadic or outbreak associated. Source-related parameters were included to allow for different handling and consumption practices. We attributed 35% (95% CrI: 20-49) of sporadic cases to chicken meat and 37% (95% CrI: 23-53) of sporadic cases to eggs. Of outbreak-related cases, 33% (95% CrI: 20-62) were attributed to chicken meat and 59% (95% CrI: 29-75) to eggs. A comparison of alternative model assumptions indicated that biases due to possible clustering of samples from sources had relatively minor effects on these estimates. Analysis of source-related parameters showed higher risk of illness from contaminated eggs than from contaminated chicken meat, suggesting that consumption and handling practices potentially play a bigger role in illness due to eggs, considering low Salmonella prevalence on eggs. Our results strengthen the evidence that eggs and chicken meat are important vehicles for salmonellosis in South Australia. © 2015 Society for Risk Analysis.
Sirichamorn, Yotsawate; Adema, Frits A C B; Gravendeel, Barbara; van Welzen, Peter C
2012-11-01
Palaeotropic Derris-like taxa (family Fabaceae, tribe Millettieae) comprise 6-9 genera. They are well known as important sources of the toxin rotenone, which is used as an organic insecticide and fish poison. However, their phylogenetic relationships and classification are still problematic due to insufficient sampling and high morphological variability. Fifty species of palaeotropic Derris-like taxa were sampled, which is more than in former studies. Three chloroplast genes (trnK-matK, trnL-F IGS, and psbA-trnH IGS) and nuclear ribosomal ITS/5.8S were analyzed using parsimony and Bayesian methods. Parsimony and Bayesian analyses of individual and combined markers show more or less similar tree topologies (only varying in terminal branches). The old-world monophyletic genera Aganope, Brachypterum, and Leptoderris are distinct from Derris s.s., and their generic status is here confirmed. Aganope may be classified into two or three subgeneric taxa. Paraderris has to be included in Derris s.s. to form a monophyletic group. The genera Philenoptera, Deguelia, and Lonchocarpus are monophyletic and distinct from each other and clearly separate from Derris s.s. Morphologically highly similar species of Derris s.s. are shown to be unrelated. Our study shows that previous infrageneric classifications of Derris are incorrect. Paraderris elliptica may contain several cryptic lineages that need further investigation. The concept of the genus Derris s.s. should be reorganized with a new generic circumscription by including Paraderris but excluding Brachypterum. Synapomorphic morphological features will be examined in future studies, and the status of the newly defined Derris and its closely related taxa will be formalized.
Coping with Trial-to-Trial Variability of Event Related Signals: A Bayesian Inference Approach
NASA Technical Reports Server (NTRS)
Ding, Mingzhou; Chen, Youghong; Knuth, Kevin H.; Bressler, Steven L.; Schroeder, Charles E.
2005-01-01
In electro-neurophysiology, single-trial brain responses to a sensory stimulus or a motor act are commonly assumed to result from the linear superposition of a stereotypic event-related signal (e.g. the event-related potential or ERP) that is invariant across trials and some ongoing brain activity often referred to as noise. To extract the signal, one performs an ensemble average of the brain responses over many identical trials to attenuate the noise. To date, this simple signal-plus-noise (SPN) model has been the dominant approach in cognitive neuroscience. Mounting empirical evidence has shown that the assumptions underlying this model may be overly simplistic. More realistic models have been proposed that account for the trial-to-trial variability of the event-related signal as well as the possibility of multiple differentially varying components within a given ERP waveform. The variable-signal-plus-noise (VSPN) model, which has been demonstrated to provide the foundation for separation and characterization of multiple differentially varying components, has the potential to provide a rich source of information for questions related to neural functions that complement the SPN model. Thus, being able to estimate the amplitude and latency of each ERP component on a trial-by-trial basis provides a critical link between the perceived benefits of the VSPN model and its many concrete applications. In this paper we describe a Bayesian approach to deal with this issue and the resulting strategy is referred to as the differentially Variable Component Analysis (dVCA). We compare the performance of dVCA on simulated data with Independent Component Analysis (ICA) and analyze neurobiological recordings from monkeys performing cognitive tasks.
Roksandic, Mirjana; Nikitović, Dejana; Rodríguez Suárez, Roberto; Smith, David; Kanik, Nadine; García Jordá, Dailys; Buhay, William M.
2017-01-01
The general lack of well-preserved juvenile skeletal remains from Caribbean archaeological sites has, in the past, prevented evaluations of juvenile dietary changes. Canímar Abajo (Cuba), with a large number of well-preserved juvenile and adult skeletal remains, provided a unique opportunity to fully assess juvenile paleodiets from an ancient Caribbean population. Ages for the start and the end of weaning and possible food sources used for weaning were inferred by combining the results of two Bayesian probability models that help to reduce some of the uncertainties inherent to bone collagen isotope based paleodiet reconstructions. Bone collagen (31 juveniles, 18 adult females) was used for carbon and nitrogen isotope analyses. The isotope results were assessed using two Bayesian probability models: Weaning Ages Reconstruction with Nitrogen isotopes and Stable Isotope Analyses in R. Breast milk seems to have been the most important protein source until two years of age with some supplementary food such as tropical fruits and root cultigens likely introduced earlier. After two, juvenile diets were likely continuously supplemented by starch rich foods such as root cultigens and legumes. By the age of three, the model results suggest that the weaning process was completed. Additional indications suggest that animal marine/riverine protein and maize, while part of the Canímar Abajo female diets, were likely not used to supplement juvenile diets. The combined use of both models here provided a more complete assessment of the weaning process for an ancient Caribbean population, indicating not only the start and end ages of weaning but also the relative importance of different food sources for different age juveniles. PMID:28459816
Merging Digital Surface Models Implementing Bayesian Approaches
NASA Astrophysics Data System (ADS)
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and obtaining many measurements is difficult or very costly; the problem of the lack of data can then be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was able to improve the quality of the DSMs and to improve some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
Chinique de Armas, Yadira; Roksandic, Mirjana; Nikitović, Dejana; Rodríguez Suárez, Roberto; Smith, David; Kanik, Nadine; García Jordá, Dailys; Buhay, William M
2017-01-01
The general lack of well-preserved juvenile skeletal remains from Caribbean archaeological sites has, in the past, prevented evaluations of juvenile dietary changes. Canímar Abajo (Cuba), with a large number of well-preserved juvenile and adult skeletal remains, provided a unique opportunity to fully assess juvenile paleodiets from an ancient Caribbean population. Ages for the start and the end of weaning and possible food sources used for weaning were inferred by combining the results of two Bayesian probability models that help to reduce some of the uncertainties inherent to bone collagen isotope based paleodiet reconstructions. Bone collagen (31 juveniles, 18 adult females) was used for carbon and nitrogen isotope analyses. The isotope results were assessed using two Bayesian probability models: Weaning Ages Reconstruction with Nitrogen isotopes and Stable Isotope Analyses in R. Breast milk seems to have been the most important protein source until two years of age with some supplementary food such as tropical fruits and root cultigens likely introduced earlier. After two, juvenile diets were likely continuously supplemented by starch rich foods such as root cultigens and legumes. By the age of three, the model results suggest that the weaning process was completed. Additional indications suggest that animal marine/riverine protein and maize, while part of the Canímar Abajo female diets, were likely not used to supplement juvenile diets. The combined use of both models here provided a more complete assessment of the weaning process for an ancient Caribbean population, indicating not only the start and end ages of weaning but also the relative importance of different food sources for different age juveniles.
Approximate Bayesian estimation of extinction rate in the Finnish Daphnia magna metapopulation.
Robinson, John D; Hall, David W; Wares, John P
2013-05-01
Approximate Bayesian computation (ABC) is useful for parameterizing complex models in population genetics. In this study, ABC was applied to simultaneously estimate parameter values for a model of metapopulation coalescence and test two alternatives to a strict metapopulation model in the well-studied network of Daphnia magna populations in Finland. The models shared four free parameters: the subpopulation genetic diversity (θS), the rate of gene flow among patches (4Nm), the founding population size (N0) and the metapopulation extinction rate (e) but differed in the distribution of extinction rates across habitat patches in the system. The three models had either a constant extinction rate in all populations (strict metapopulation), one population that was protected from local extinction (i.e. a persistent source), or habitat-specific extinction rates drawn from a distribution with specified mean and variance. Our model selection analysis favoured the model including a persistent source population over the two alternative models. Of the closest 750,000 data sets in Euclidean space, 78% were simulated under the persistent source model (estimated posterior probability = 0.769). This fraction increased to more than 85% when only the closest 150,000 data sets were considered (estimated posterior probability = 0.774). Approximate Bayesian computation was then used to estimate parameter values that might produce the observed set of summary statistics. Our analysis provided posterior distributions for e that included the point estimate obtained from previous data from the Finnish D. magna metapopulation. Our results support the use of ABC and population genetic data for testing the strict metapopulation model and parameterizing complex models of demography. © 2013 Blackwell Publishing Ltd.
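The rejection-based model choice described in this abstract (simulate data sets under each candidate model, keep the simulations closest to the observed summary statistics in Euclidean distance, and read off the fraction contributed by each model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `abc_model_probs` and `make_toy_model` are hypothetical, the toy simulators are simple Gaussian samplers standing in for the coalescent models, and only the raw acceptance fraction is computed.

```python
import math
import random


def abc_model_probs(observed, simulators, n_sims, n_keep, rng):
    """Approximate posterior model probabilities by rejection ABC.

    observed   -- tuple of observed summary statistics
    simulators -- dict mapping model name -> function(rng) -> summary statistics
    n_sims     -- simulations per model (equal prior weight on every model)
    n_keep     -- number of closest simulations to retain
    """
    draws = []
    for name, simulate in simulators.items():
        for _ in range(n_sims):
            stats = simulate(rng)
            # Euclidean distance between simulated and observed summaries
            draws.append((math.dist(observed, stats), name))
    draws.sort(key=lambda pair: pair[0])
    kept = draws[:n_keep]
    # Fraction of retained simulations that came from each model
    return {name: sum(1 for _, m in kept if m == name) / n_keep
            for name in simulators}


def make_toy_model(mean):
    """Hypothetical stand-in simulator: returns (sample mean, sample variance)."""
    def simulate(rng):
        xs = [rng.gauss(mean, 1.0) for _ in range(20)]
        m = sum(xs) / len(xs)
        v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
        return (m, v)
    return simulate
```

Calling `abc_model_probs(observed, {"strict": make_toy_model(0.0), "persistent": make_toy_model(2.0)}, 500, 100, random.Random(0))` returns one fraction per model; shrinking `n_keep` corresponds to the tighter acceptance threshold mentioned in the abstract. In practice the summary statistics would usually be rescaled before computing distances, which this sketch omits.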
NASA Astrophysics Data System (ADS)
Kiyan, Duygu; Rath, Volker; Delhaye, Robert
2017-04-01
The frequency- and time-domain airborne electromagnetic (AEM) data collected under the Tellus projects of the Geological Survey of Ireland (GSI) represent a wealth of information on the multi-dimensional electrical structure of Ireland's near-surface. Our project, which was funded by GSI under the framework of their Short Call Research Programme, aims to develop and implement inverse techniques based on various Bayesian methods for these densely sampled data. We have developed a highly flexible toolbox in Python for the one-dimensional inversion of AEM data along the flight lines. The computational core is based on an adapted frequency- and time-domain forward modelling core derived from the well-tested open-source code AirBeo, which was developed by the CSIRO (Australia) and the AMIRA consortium. Three different inversion methods have been implemented: (i) Tikhonov-type inversion including optimal regularisation methods (Aster et al., 2012; Zhdanov, 2015), (ii) Bayesian MAP inversion in parameter and data space (e.g. Tarantola, 2005), and (iii) Full Bayesian inversion with Markov Chain Monte Carlo (Sambridge and Mosegaard, 2002; Mosegaard and Sambridge, 2002), all including different forms of spatial constraints. The methods have been tested on synthetic and field data. This contribution will introduce the toolbox and present case studies on the AEM data from the Tellus projects.
Lead isotope ratios for bullets, forensic evaluation in a Bayesian paradigm.
Sjåstad, Knut-Endre; Lucy, David; Andersen, Tom
2016-01-01
Forensic science is a discipline concerned with collection, examination and evaluation of physical evidence related to criminal cases. The results from the activities of the forensic scientist may ultimately be presented to the court in such a way that the triers of fact understand the implications of the data. Forensic science has been, and still is, driven by development of new technology, and in the last two decades evaluation of evidence based on logical reasoning and Bayesian statistics has reached some level of general acceptance within the forensic community. Tracing of lead fragments of unknown origin to a given source of ammunition is a task that might be of interest to the Court. Use of data from lead isotope ratio analysis interpreted within a Bayesian framework has been shown to be a suitable method to guide the Court in drawing its conclusion for such a task. In this work we have used isotopic composition of lead from small arms projectiles (cal. .22) and developed an approach based on Bayesian statistics and likelihood ratio calculation. The likelihood ratio is a single quantity that provides a measure of the value of evidence that can be used in the deliberation of the court. Copyright © 2015 Elsevier B.V. All rights reserved.
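As a rough illustration of the likelihood ratio mentioned in this abstract, the sketch below evaluates LR = p(E | same source) / p(E | different source) for a single measured isotope ratio. Everything here is an assumption made for illustration: the two univariate normal densities, the function names, and any numbers passed in are hypothetical and do not come from the paper, which works with measured lead isotope compositions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(x, source_mean, source_sd, pop_mean, pop_sd):
    """LR = p(x | fragment comes from the suspected source)
          / p(x | fragment comes from the background population)."""
    return normal_pdf(x, source_mean, source_sd) / normal_pdf(x, pop_mean, pop_sd)


def posterior_odds(prior_odds, lr):
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * lr
```

An LR above 1 supports the same-source proposition and an LR below 1 supports the alternative; the odds form keeps the scientist's contribution (the LR) separate from the prior odds, which are a matter for the court.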
Fully probabilistic earthquake source inversion on teleseismic scales
NASA Astrophysics Data System (ADS)
Stähler, Simon; Sigloch, Karin
2017-04-01
Seismic source inversion is a non-linear problem in seismology where not just the earthquake parameters but also estimates of their uncertainties are of great practical importance. We have developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. These unknowns are parameterised efficiently by harnessing as prior knowledge solutions from a large number of non-Bayesian inversions. The source time function is expressed as a weighted sum of a small number of empirical orthogonal functions, which were derived from a catalogue of >1000 source time functions (STFs) by a principal component analysis. We use a likelihood model based on the cross-correlation misfit between observed and predicted waveforms. The resulting ensemble of solutions provides full uncertainty and covariance information for the source parameters, and permits propagating these source uncertainties into travel time estimates used for seismic tomography. The computational effort is such that routine, global estimation of earthquake mechanisms and source time functions from teleseismic broadband waveforms is feasible. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, more commonly used as sample-by-sample measures of misfit based on distances between individual time samples. 
From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on signal decorrelation D of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. References: Stähler, S. C. and Sigloch, K.: Fully probabilistic seismic source inversion - Part 1: Efficient parameterisation, Solid Earth, 5, 1055-1069, doi:10.5194/se-5-1055-2014, 2014. Stähler, S. C. and Sigloch, K.: Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances, Solid Earth, 7, 1521-1536, doi:10.5194/se-7-1521-2016, 2016.
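The misfit and likelihood described in this abstract can be sketched in a few lines: the decorrelation D = 1 - CC between an observed and a modelled waveform, and the log-density of a log-normal distribution evaluated at D. This is a simplified, single-trace sketch under stated assumptions: the correlation is taken at zero lag only, the function names are hypothetical, and `mu` and `sigma` are placeholder parameters (the abstract states that the moments depend mainly on SNR and station geometry, and that the full likelihood is multivariate).

```python
import math


def cross_correlation_coefficient(observed, modelled):
    """Zero-lag normalised cross-correlation CC of two equal-length traces."""
    num = sum(o * m for o, m in zip(observed, modelled))
    den = math.sqrt(sum(o * o for o in observed) * sum(m * m for m in modelled))
    return num / den


def decorrelation(observed, modelled):
    """Misfit D = 1 - CC; 0 for identical shapes, 2 for sign-flipped shapes."""
    return 1.0 - cross_correlation_coefficient(observed, modelled)


def lognormal_loglik(d, mu, sigma):
    """Log-density of a log-normal distribution with log-mean mu and log-sd sigma."""
    if d <= 0.0:
        return -math.inf  # the log-normal has no mass at or below zero
    return (-math.log(d * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(d) - mu) ** 2 / (2.0 * sigma ** 2))
```

Because CC is normalised by the trace energies, D is insensitive to overall amplitude, which is one reason a correlation-based misfit can behave differently from sample-by-sample norms.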
Drinking water treatment plants rely on purification of contaminated source waters to provide communities with potable water. One group of possible contaminants are enteric viruses. Measurement of viral quantities in environmental water systems are often performed using polymeras...
NASA Astrophysics Data System (ADS)
Arendt, Carli A.; Aciego, Sarah M.; Hetland, Eric A.
2015-05-01
The implementation of isotopic tracers as constraints on source contributions has become increasingly relevant to understanding Earth surface processes. Interpretation of these isotopic tracers has become more accessible with the development of Bayesian Monte Carlo (BMC) mixing models, which allow uncertainty in mixing end-members and provide methodology for systems with multicomponent mixing. This study presents an open source multiple isotope BMC mixing model that is applicable to Earth surface environments with sources exhibiting distinct end-member isotopic signatures. Our model is first applied to new δ18O and δD measurements from the Athabasca Glacier, which showed expected seasonal melt evolution trends and vigorously assessed the statistical relevance of the resulting fraction estimations. To highlight the broad applicability of our model to a variety of Earth surface environments and relevant isotopic systems, we expand our model to two additional case studies: deriving melt sources from δ18O, δD, and 222Rn measurements of Greenland Ice Sheet bulk water samples and assessing nutrient sources from ɛNd and 87Sr/86Sr measurements of Hawaiian soil cores. The model produces results for the Greenland Ice Sheet and Hawaiian soil data sets that are consistent with the originally published fractional contribution estimates. The advantage of this method is that it quantifies the error induced by variability in the end-member compositions, unrealized by the models previously applied to the above case studies. Results from all three case studies demonstrate the broad applicability of this statistical BMC isotopic mixing model for estimating source contribution fractions in a variety of Earth surface systems.
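A minimal sketch of the Monte Carlo mixing idea described in this abstract, reduced to two end-members and one tracer: draw a mixing fraction from a uniform prior, draw end-member compositions from their uncertainty distributions, and accept the draw with probability proportional to the Gaussian likelihood of the measured mixture. The function name, the Gaussian end-member and measurement models, and all numerical values are illustrative assumptions, not the published open source model, which handles multiple isotopes and more than two sources.

```python
import math
import random


def mixing_fraction_samples(mix_value, mix_sd, member_a, member_b, n_draws, rng):
    """Posterior samples of the fraction f contributed by end-member A.

    mix_value, mix_sd  -- measured tracer value of the mixture and its uncertainty
    member_a, member_b -- (mean, sd) of the tracer value in each end-member
    """
    kept = []
    for _ in range(n_draws):
        f = rng.random()                      # uniform prior on [0, 1)
        a = rng.gauss(*member_a)              # end-member A composition
        b = rng.gauss(*member_b)              # end-member B composition
        predicted = f * a + (1.0 - f) * b     # linear two-component mixing
        z = (predicted - mix_value) / mix_sd
        # Rejection step: accept with probability likelihood / max likelihood
        if rng.random() < math.exp(-0.5 * z * z):
            kept.append(f)
    return kept
```

The spread of the retained fractions reflects both the measurement uncertainty and the variability of the end-member compositions, which is the source of error the abstract says earlier models did not quantify.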
Benchmarking for Bayesian Reinforcement Learning
Ernst, Damien; Couëtoux, Adrien
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed. PMID:27304891
Benchmarking for Bayesian Reinforcement Learning.
Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.
The 5-10 keV AGN luminosity function at 0.01 < z < 4.0
NASA Astrophysics Data System (ADS)
Fotopoulou, S.; Buchner, J.; Georgantopoulos, I.; Hasinger, G.; Salvato, M.; Georgakakis, A.; Cappelluti, N.; Ranalli, P.; Hsu, L. T.; Brusa, M.; Comastri, A.; Miyaji, T.; Nandra, K.; Aird, J.; Paltani, S.
2016-03-01
The active galactic nuclei (AGN) X-ray luminosity function traces actively accreting supermassive black holes and is essential for the study of the properties of the AGN population, black hole evolution, and galaxy-black hole coevolution. Up to now, the AGN luminosity function has been estimated several times in soft (0.5-2 keV) and hard X-rays (2-10 keV). AGN selection in these energy ranges often suffers from identification and redshift incompleteness and, at the same time, photoelectric absorption can obscure a significant amount of the X-ray radiation. We estimate the evolution of the luminosity function in the 5-10 keV band, where we effectively avoid the absorbed part of the spectrum, rendering absorption corrections unnecessary up to NH ~ 10^23 cm^-2. Our dataset is a compilation of six wide, and deep fields: MAXI, HBSS, XMM-COSMOS, Lockman Hole, XMM-CDFS, AEGIS-XD, Chandra-COSMOS, and Chandra-CDFS. This extensive sample of ~1110 AGN (0.01 < z < 4.0, 41 < log Lx < 46) is 98% redshift complete with 68% spectroscopic redshifts. For sources lacking a spectroscopic redshift estimation we use the probability distribution function of photometric redshift estimation specifically tuned for AGN, and a flat probability distribution function for sources with no redshift information. We use Bayesian analysis to select the best parametric model from simple pure luminosity and pure density evolution to more complicated luminosity and density evolution and luminosity-dependent density evolution (LDDE). We estimate the model parameters that describe best our dataset separately for each survey and for the combined sample. We show that, according to Bayesian model selection, the preferred model for our dataset is the LDDE. Our estimation of the AGN luminosity function does not require any assumption on the AGN absorption and is in good agreement with previous works in the 2-10 keV energy band based on X-ray hardness ratios to model the absorption in AGN up to redshift three. Our sample does not show evidence of a rapid decline of the AGN luminosity function up to redshift four.
NASA Astrophysics Data System (ADS)
Itter, M.; Finley, A. O.; Hooten, M.; Higuera, P. E.; Marlon, J. R.; McLachlan, J. S.; Kelly, R.
2016-12-01
Sediment charcoal records are used in paleoecological analyses to identify individual local fire events and to estimate fire frequency and regional biomass burned at centennial to millennial time scales. Methods to identify local fire events based on sediment charcoal records have been well developed over the past 30 years; however, an integrated statistical framework for fire identification is still lacking. We build upon existing paleoecological methods to develop a hierarchical Bayesian point process model for local fire identification and estimation of fire return intervals. The model is unique in that it combines sediment charcoal records from multiple lakes across a region in a spatially-explicit fashion leading to estimation of a joint, regional fire return interval in addition to lake-specific local fire frequencies. Further, the model estimates a joint regional charcoal deposition rate free from the effects of local fires that can be used as a measure of regional biomass burned over time. Finally, the hierarchical Bayesian approach allows for tractable error propagation such that estimates of fire return intervals reflect the full range of uncertainty in sediment charcoal records. Specific sources of uncertainty addressed include sediment age models, the separation of local versus regional charcoal sources, and generation of a composite charcoal record. The model is applied to sediment charcoal records from a dense network of lakes in the Yukon Flats region of Alaska. The multivariate joint modeling approach results in improved estimates of regional charcoal deposition with reduced uncertainty in the identification of individual fire events and local fire return intervals compared to individual lake approaches. Modeled individual-lake fire return intervals range from 100 to 500 years with a regional interval of roughly 200 years. Regional charcoal deposition to the network of lakes is correlated up to 50 kilometers. Finally, the joint regional charcoal deposition rate exhibits changes over time coincident with major climatic and vegetation shifts over the past 10,000 years. Ongoing work will use the regional charcoal deposition rate to estimate changes in biomass burned as a function of climate variability and regional vegetation pattern.
MPN estimation of qPCR target sequence recoveries from whole cell calibrator samples.
Sivaganesan, Mano; Siefring, Shawn; Varma, Manju; Haugland, Richard A
2011-12-01
DNA extracts from enumerated target organism cells (calibrator samples) have been used for estimating Enterococcus cell equivalent densities in surface waters by a comparative cycle threshold (Ct) qPCR analysis method. To compare surface water Enterococcus density estimates from different studies by this approach, either a consistent source of calibrator cells must be used or the estimates must account for any differences in target sequence recoveries from different sources of calibrator cells. In this report we describe two methods for estimating target sequence recoveries from whole cell calibrator samples based on qPCR analyses of their serially diluted DNA extracts and most probable number (MPN) calculation. The first method employed a traditional MPN calculation approach. The second method employed a Bayesian hierarchical statistical modeling approach and a Markov chain Monte Carlo (MCMC) simulation method to account for the uncertainty in these estimates associated with different individual samples of the cell preparations, different dilutions of the DNA extracts and different qPCR analytical runs. The two methods were applied to estimate mean target sequence recoveries per cell from two different lots of a commercially available source of enumerated Enterococcus cell preparations. The mean target sequence recovery estimates (and standard errors) per cell from Lot A and B cell preparations by the Bayesian method were 22.73 (3.4) and 11.76 (2.4), respectively, when the data were adjusted for potential false positive results. Means were similar for the traditional MPN approach, which cannot comparably assess uncertainty in the estimates. Cell numbers and estimates of recoverable target sequences in calibrator samples prepared from the two cell sources were also used to estimate cell equivalent and target sequence quantities recovered from surface water samples in a comparative Ct method. 
Our results illustrate the utility of the Bayesian method in accounting for uncertainty, the high degree of precision attainable by the MPN approach and the need to account for the differences in target sequence recoveries from different calibrator sample cell sources when they are used in the comparative Ct method. Published by Elsevier B.V.
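For readers unfamiliar with the traditional MPN calculation referred to in this abstract, the following sketch computes the maximum-likelihood estimate of a target density from presence/absence results across a dilution series. It assumes the standard Poisson model, in which a replicate containing an amount v of the original sample is positive with probability 1 - exp(-lambda * v); the function name, the bracketing interval and the bisection solver are choices made for this illustration and are not taken from the paper, and no false-positive adjustment is included.

```python
import math


def mpn_estimate(amounts, n_replicates, n_positive, lo=1e-9, hi=1e9, iters=200):
    """Most probable number (density per unit amount) from a dilution series.

    amounts      -- amount of original sample in each replicate, per dilution
    n_replicates -- number of replicates tested at each dilution
    n_positive   -- number of positive replicates at each dilution
    """
    if sum(n_positive) == 0 or list(n_positive) == list(n_replicates):
        raise ValueError("MPN is undefined when all results are negative or all positive")
    total = sum(n * v for n, v in zip(n_replicates, amounts))

    def score(lam):
        # Likelihood equation: sum_i p_i v_i / (1 - exp(-lam v_i)) = sum_i n_i v_i
        lhs = sum(p * v / -math.expm1(-lam * v)
                  for p, v in zip(n_positive, amounts) if p)
        return lhs - total

    # score() decreases monotonically in lam, so bisect on a log scale
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

This point estimate carries no statement of run-to-run or sample-to-sample variability on its own, which is the gap the hierarchical Bayesian approach in the abstract is meant to address.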
Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM.
López, J D; Litvak, V; Espinosa, J J; Friston, K; Barnes, G R
2014-01-01
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (Minimum Norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy, an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardise the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational Free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm. © 2013. Published by Elsevier Inc. All rights reserved.
Bayesian inference on EMRI signals using low frequency approximations
NASA Astrophysics Data System (ADS)
Ali, Asad; Christensen, Nelson; Meyer, Renate; Röver, Christian
2012-07-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions.
Multivariate neural biomarkers of emotional states are categorically distinct.
Kragel, Philip A; LaBar, Kevin S
2015-11-01
Understanding how emotions are represented neurally is a central aim of affective neuroscience. Despite decades of neuroimaging efforts addressing this question, it remains unclear whether emotions are represented as distinct entities, as predicted by categorical theories, or are constructed from a smaller set of underlying factors, as predicted by dimensional accounts. Here, we capitalize on multivariate statistical approaches and computational modeling to directly evaluate these theoretical perspectives. We elicited discrete emotional states using music and films during functional magnetic resonance imaging scanning. Distinct patterns of neural activation predicted the emotion category of stimuli and tracked subjective experience. Bayesian model comparison revealed that combining dimensional and categorical models of emotion best characterized the information content of activation patterns. Surprisingly, categorical and dimensional aspects of emotion experience captured unique and opposing sources of neural information. These results indicate that diverse emotional states are poorly differentiated by simple models of valence and arousal, and that activity within separable neural systems can be mapped to unique emotion categories. © The Author (2015). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Separation in Logistic Regression: Causes, Consequences, and Control.
Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg
2018-04-01
Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
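As a rough illustration of why separation yields infinite estimates and how a likelihood penalty removes the problem, the sketch below fits a one-covariate, no-intercept logistic model to a perfectly separated toy data set. The data, the function names, and the choice of a Gaussian (ridge) penalty with prior standard deviation 2 are all assumptions made for this example; they are not taken from the paper, which may favour other penalties.

```python
import math

def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logistic model (numerically stable form)."""
    total = 0.0
    for x, y in zip(xs, ys):
        sign = 1.0 if y == 1 else -1.0
        # log P(y | x) = -log(1 + exp(-sign * beta * x))
        total -= math.log1p(math.exp(-sign * beta * x))
    return total

def penalized_gradient(beta, xs, ys, prior_sd):
    """Derivative of: log-likelihood - beta^2 / (2 * prior_sd^2)."""
    grad = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-beta * x))
        grad += (y - p) * x
    return grad - beta / prior_sd ** 2

def penalized_estimate(xs, ys, prior_sd, lo=0.0, hi=100.0, tol=1e-10):
    """Find the root of the penalized gradient by bisection.

    The penalized objective is strictly concave, so its gradient is strictly
    decreasing. The bracket [lo, hi] is an assumption that suits this toy data,
    where the association is positive.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if penalized_gradient(mid, xs, ys, prior_sd) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical toy data: x < 0 always gives y = 0, x > 0 always gives y = 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# Without a penalty the log-likelihood keeps rising as beta grows.
unpenalized = [log_likelihood(b, xs, ys) for b in (1.0, 5.0, 10.0)]

# With the penalty the maximizer is finite.
beta_hat = penalized_estimate(xs, ys, prior_sd=2.0)
```

The Gaussian penalty corresponds to a normal prior on the coefficient, which is one way to read the abstract's remark that such methods can be interpreted as Bayesian analyses with weakly informative priors.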
Gustafsson, Mats G; Wallman, Mikael; Wickenberg Bolin, Ulrika; Göransson, Hanna; Fryknäs, M; Andersson, Claes R; Isaksson, Anders
2010-06-01
Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Recently many promising approaches for determination of an upper bound for the error rate of a single classifier have been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real world applications where the test set sizes are less than a few hundred. The source of this problem is the fact that the CI is determined exclusively by the result on the test examples. In other words, there is no information at all provided by the uniform prior density distribution employed, which reflects complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to study a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice. It is demonstrated how a refined non-uniform prior density distribution can be obtained by means of the ME principle using empirical results from a few designs and tests using non-overlapping sets of examples. Experimental results show that ME based priors improve the CIs when applied to four quite different simulated and two real world data sets. An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier. Copyright 2010 Elsevier B.V. All rights reserved.
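The dependence of the conventional bound on test set size can be sketched as follows, assuming a uniform prior so that the posterior for the error rate after observing a given number of errors on n test examples is Beta(errors + 1, n - errors + 1). The function names and the example counts are illustrative assumptions; the ME-based prior of the paper is not implemented here.

```python
import math

def posterior_cdf(x, errors, n):
    """CDF at x of the Beta(errors + 1, n - errors + 1) posterior.

    For integer parameters the regularized incomplete beta function
    equals a binomial tail sum, so no special-function library is needed.
    """
    m = n + 1
    return sum(math.comb(m, j) * x ** j * (1.0 - x) ** (m - j)
               for j in range(errors + 1, m + 1))

def upper_bound(errors, n, level=0.95, tol=1e-10):
    """One-sided upper credible bound on the error rate, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, errors, n) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Same observed error rate (10%), two hypothetical test set sizes.
bound_small_test_set = upper_bound(errors=5, n=50)
bound_large_test_set = upper_bound(errors=50, n=500)
```

With the observed error rate held fixed, the bound for the 50-example test set is noticeably looser than for the 500-example one, which is the behaviour the abstract describes for small test sets.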
Quantifying Uncertainty in Near Surface Electromagnetic Imaging Using Bayesian Methods
NASA Astrophysics Data System (ADS)
Blatter, D. B.; Ray, A.; Key, K.
2017-12-01
Geoscientists commonly use electromagnetic methods to image the Earth's near surface. Field measurements of EM fields are made (often with the aid of an artificial EM source) and then used to infer near surface electrical conductivity via a process known as inversion. In geophysics, the standard inversion tool kit is robust and can provide an estimate of the Earth's near surface conductivity that is both geologically reasonable and compatible with the measured field data. However, standard inverse methods struggle to provide a sense of the uncertainty in the estimate they provide. This is because the task of finding an Earth model that explains the data to within measurement error is non-unique - that is, there are many, many such models; but the standard methods provide only one "answer." An alternative method, known as Bayesian inversion, seeks to explore the full range of Earth model parameters that can adequately explain the measured data, rather than attempting to find a single, "ideal" model. Bayesian inverse methods can therefore provide a quantitative assessment of the uncertainty inherent in trying to infer near surface conductivity from noisy, measured field data. This study applies a Bayesian inverse method (called trans-dimensional Markov chain Monte Carlo) to transient airborne EM data previously collected over Taylor Valley - one of the McMurdo Dry Valleys in Antarctica. Our results confirm the reasonableness of previous estimates (made using standard methods) of near surface conductivity beneath Taylor Valley. In addition, we demonstrate quantitatively the uncertainty associated with those estimates. We demonstrate that Bayesian inverse methods can provide quantitative uncertainty to estimates of near surface conductivity.
Prediction and assimilation of surf-zone processes using a Bayesian network: Part I: Forward models
Plant, Nathaniel G.; Holland, K. Todd
2011-01-01
Prediction of coastal processes, including waves, currents, and sediment transport, can be obtained from a variety of detailed geophysical-process models with many simulations showing significant skill. This capability supports a wide range of research and applied efforts that can benefit from accurate numerical predictions. However, the predictions are only as accurate as the data used to drive the models and, given the large temporal and spatial variability of the surf zone, inaccuracies in data are unavoidable such that useful predictions require corresponding estimates of uncertainty. We demonstrate how a Bayesian-network model can be used to provide accurate predictions of wave-height evolution in the surf zone given very sparse and/or inaccurate boundary-condition data. The approach is based on a formal treatment of a data-assimilation problem that takes advantage of significant reduction of the dimensionality of the model system. We demonstrate that predictions of a detailed geophysical model of the wave evolution are reproduced accurately using a Bayesian approach. In this surf-zone application, forward prediction skill was 83%, and uncertainties in the model inputs were accurately transferred to uncertainty in output variables. We also demonstrate that if modeling uncertainties were not conveyed to the Bayesian network (i.e., perfect data or model were assumed), then overly optimistic prediction uncertainties were computed. More consistent predictions and uncertainties were obtained by including model-parameter errors as a source of input uncertainty. Improved predictions (skill of 90%) were achieved because the Bayesian network simultaneously estimated optimal parameters while predicting wave heights.
Diagnosis of combined faults in Rotary Machinery by Non-Naive Bayesian approach
NASA Astrophysics Data System (ADS)
Asr, Mahsa Yazdanian; Ettefagh, Mir Mohammad; Hassannejad, Reza; Razavi, Seyed Naser
2017-02-01
When combined faults happen in different parts of rotating machines, their features are profoundly dependent. Experts are familiar with the characteristics of individual faults, and enough data are available from single faults, but a problem arises when the faults are combined and the separation of characteristics becomes complex. Therefore, experts cannot give exact information about the symptoms of a combined fault and its quality. In this paper, a novel method is proposed to overcome this drawback. The core idea of the method is to identify a combined fault without using combined-fault features in the training data set; only individual fault features are used in the training step. For this purpose, after data acquisition and resampling of the obtained vibration signals, Empirical Mode Decomposition (EMD) is utilized to decompose multi-component signals into Intrinsic Mode Functions (IMFs). Using the correlation coefficient, proper IMFs for feature extraction are selected. In the feature extraction step, the Shannon energy entropy of the IMFs is extracted as well as statistical features. Most of the extracted features are strongly dependent. To account for this, a Non-Naive Bayesian Classifier (NNBC) is used, which relaxes the fundamental assumption of Naive Bayes, i.e., the independence among features. To demonstrate the superiority of NNBC, other counterpart methods, including the Normal Naive Bayesian classifier, the Kernel Naive Bayesian classifier and Back Propagation Neural Networks, were applied and the classification results compared. Experimental vibration signals, collected from an automobile gearbox, were used to verify the effectiveness of the proposed method. During the classification process, only the features related individually to the healthy state, bearing failure and gear failures were used for training the classifier, while combined fault features (combined gear and bearing failures) were used as test data.
The achieved probabilities for the test data show that the combined fault can be identified with a high success rate.
The Chandra Source Catalog: X-ray Aperture Photometry
NASA Astrophysics Data System (ADS)
Kashyap, Vinay; Primini, F. A.; Glotfelty, K. J.; Anderson, C. S.; Bonaventura, N. R.; Chen, J. C.; Davis, J. E.; Doe, S. M.; Evans, I. N.; Evans, J. D.; Fabbiano, G.; Galle, E. C.; Gibbs, D. G., II; Grier, J. D.; Hain, R.; Hall, D. M.; Harbo, P. N.; He, X.; Houck, J. C.; Karovska, M.; Lauer, J.; McCollough, M. L.; McDowell, J. C.; Miller, J. B.; Mitschang, A. W.; Morgan, D. L.; Nichols, J. S.; Nowak, M. A.; Plummer, D. A.; Refsdal, B. L.; Rots, A. H.; Siemiginowska, A. L.; Sundheim, B. A.; Tibbetts, M. S.; van Stone, D. W.; Winkelman, S. L.; Zografou, P.
2009-09-01
The Chandra Source Catalog (CSC) represents a reanalysis of the entire ACIS and HRC imaging observations over the 9-year Chandra mission. We describe here the method by which fluxes are measured for detected sources. Source detection is carried out on a uniform basis, using the CIAO tool wavdetect. Source fluxes are estimated post-facto using a Bayesian method that accounts for background, spatial resolution effects, and contamination from nearby sources. We use gamma-function prior distributions, which could be either non-informative, or in case there exist previous observations of the same source, strongly informative. The current implementation is however limited to non-informative priors. The resulting posterior probability density functions allow us to report the flux and a robust credible range on it.
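The role of a gamma prior can be illustrated with the textbook gamma-Poisson conjugate update, sketched below. This is a deliberately simplified picture: it ignores the background, spatial resolution, and nearby-source contamination terms that the catalog method accounts for, and the function names, prior parameters, and count values are all hypothetical.

```python
def gamma_poisson_update(prior_shape, prior_rate, counts, exposure):
    """Posterior (shape, rate) of a Gamma prior on a Poisson intensity.

    A Gamma(shape, rate) prior combined with `counts` events observed over
    `exposure` gives a Gamma(shape + counts, rate + exposure) posterior.
    """
    return prior_shape + counts, prior_rate + exposure

def gamma_mean(shape, rate):
    return shape / rate

def gamma_mode(shape, rate):
    return (shape - 1.0) / rate if shape >= 1.0 else 0.0

# Assumed flat (improper) prior on the intensity: shape 1, rate 0.
flat_shape, flat_rate = 1.0, 0.0
post_shape, post_rate = gamma_poisson_update(flat_shape, flat_rate, counts=10, exposure=2.0)

# Informative prior built from a hypothetical earlier observation
# (30 counts in 5 exposure units), then updated with the new data.
prev_shape, prev_rate = gamma_poisson_update(flat_shape, flat_rate, counts=30, exposure=5.0)
inf_shape, inf_rate = gamma_poisson_update(prev_shape, prev_rate, counts=10, exposure=2.0)
```

Because the posterior stays in the gamma family, a previous observation of the same source can be folded in simply by using its posterior as the next prior, which is the sense in which such a prior can be strongly informative.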
NASA Astrophysics Data System (ADS)
Norros, Veera; Laine, Marko; Lignell, Risto; Thingstad, Frede
2017-10-01
Methods for extracting empirically and theoretically sound parameter values are urgently needed in aquatic ecosystem modelling to describe key flows and their variation in the system. Here, we compare three Bayesian formulations for mechanistic model parameterization that differ in their assumptions about the variation in parameter values between various datasets: 1) global analysis - no variation, 2) separate analysis - independent variation and 3) hierarchical analysis - variation arising from a shared distribution defined by hyperparameters. We tested these methods, using computer-generated and empirical data, coupled with simplified and reasonably realistic plankton food web models, respectively. While all methods were adequate, the simulated example demonstrated that a well-designed hierarchical analysis can result in the most accurate and precise parameter estimates and predictions, due to its ability to combine information across datasets. However, our results also highlighted sensitivity to hyperparameter prior distributions as an important caveat of hierarchical analysis. In the more complex empirical example, hierarchical analysis was able to combine precise identification of parameter values with reasonably good predictive performance, although the ranking of the methods was less straightforward. We conclude that hierarchical Bayesian analysis is a promising tool for identifying key ecosystem-functioning parameters and their variation from empirical datasets.
NASA Astrophysics Data System (ADS)
Agapiou, Sergios; Burger, Martin; Dashti, Masoumeh; Helin, Tapio
2018-04-01
We consider the inverse problem of recovering an unknown functional parameter u in a separable Banach space, from a noisy observation vector y of its image through a known possibly non-linear map $\mathcal{G}$. We adopt a Bayesian approach to the problem and consider Besov space priors (see Lassas et al (2009 Inverse Problems Imaging 3 87-122)), which are well-known for their edge-preserving and sparsity-promoting properties and have recently attracted wide attention especially in the medical imaging community. Our key result is to show that in this non-parametric setup the maximum a posteriori (MAP) estimates are characterized by the minimizers of a generalized Onsager-Machlup functional of the posterior. This is done independently for the so-called weak and strong MAP estimates, which as we show coincide in our context. In addition, we prove a form of weak consistency for the MAP estimators in the infinitely informative data limit. Our results are remarkable for two reasons: first, the prior distribution is non-Gaussian and does not meet the smoothness conditions required in previous research on non-parametric MAP estimates. Second, the result analytically justifies existing uses of the MAP estimate in finite but high dimensional discretizations of Bayesian inverse problems with the considered Besov priors.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
NASA Astrophysics Data System (ADS)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
NASA Astrophysics Data System (ADS)
Yin, Ping; Mu, Lan; Madden, Marguerite; Vena, John E.
2014-10-01
Lung cancer is the second most commonly diagnosed cancer in both men and women in Georgia, USA. However, the spatio-temporal patterns of lung cancer risk in Georgia have not been fully studied. Hierarchical Bayesian models are used here to explore the spatio-temporal patterns of lung cancer incidence risk by race and gender in Georgia for the period of 2000-2007. With the census tract level as the spatial scale and the 2-year period aggregation as the temporal scale, we compare a total of seven Bayesian spatio-temporal models including two under a separate modeling framework and five under a joint modeling framework. One joint model outperforms others based on the deviance information criterion. Results show that the northwest region of Georgia has consistently high lung cancer incidence risk for all population groups during the study period. In addition, there are inverse relationships between the socioeconomic status and the lung cancer incidence risk among all Georgian population groups, and the relationships in males are stronger than those in females. By mapping more reliable variations in lung cancer incidence risk at a relatively fine spatio-temporal scale for different Georgian population groups, our study aims to better support healthcare performance assessment, etiological hypothesis generation, and health policy making.
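For readers unfamiliar with the deviance information criterion (DIC) used for the model ranking, the sketch below computes it for a toy normal-mean model with known variance. The model, the data, and the three hand-picked "posterior samples" are assumptions for illustration only and have nothing to do with the spatio-temporal models of the study; in practice the samples would come from an MCMC run.

```python
import math

def deviance(mu, data, sigma=1.0):
    """Deviance D(mu) = -2 * log-likelihood for a normal model with known sigma."""
    log_lik = sum(
        -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)
        for y in data
    )
    return -2.0 * log_lik

def dic(samples, data, sigma=1.0):
    """Return (DIC, p_D) from posterior samples of mu.

    p_D = mean deviance - deviance at the posterior mean (effective number of parameters)
    DIC = mean deviance + p_D
    """
    mean_deviance = sum(deviance(mu, data, sigma) for mu in samples) / len(samples)
    mu_bar = sum(samples) / len(samples)
    p_d = mean_deviance - deviance(mu_bar, data, sigma)
    return mean_deviance + p_d, p_d

# Hypothetical data and stand-in posterior samples.
data = [0.8, 1.0, 1.2]
samples = [0.9, 1.0, 1.1]
dic_value, p_d = dic(samples, data)
```

Among candidate models fitted to the same data, the one with the smaller DIC is preferred, since the criterion trades goodness of fit (mean deviance) against effective complexity (p_D).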
Shankle, William R.; Pooley, James P.; Steyvers, Mark; Hara, Junko; Mangrola, Tushar; Reisberg, Barry; Lee, Michael D.
2012-01-01
Determining how cognition affects functional abilities is important in Alzheimer’s disease and related disorders (ADRD). 280 patients (normal or ADRD) received a total of 1,514 assessments using the Functional Assessment Staging Test (FAST) procedure and the MCI Screen (MCIS). A hierarchical Bayesian cognitive processing (HBCP) model was created by embedding a signal detection theory (SDT) model of the MCIS delayed recognition memory task into a hierarchical Bayesian framework. The SDT model used latent parameters of discriminability (memory process) and response bias (executive function) to predict, simultaneously, recognition memory performance for each patient and each FAST severity group. The observed recognition memory data did not distinguish the six FAST severity stages, but the latent parameters completely separated them. The latent parameters were also used successfully to transform the ordinal FAST measure into a continuous measure reflecting the underlying continuum of functional severity. HBCP models applied to recognition memory data from clinical practice settings accurately translated a latent measure of cognition to a continuous measure of functional severity for both individuals and FAST groups. Such a translation links two levels of brain information processing, and may enable more accurate correlations with other levels, such as those characterized by biomarkers. PMID:22407225
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Bayesian Analysis of High Dimensional Classification
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Subhadeep; Liang, Faming
2009-12-01
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. In these cases, there is a lot of interest in searching for sparse models in the high-dimensional regression (or classification) setup. We first discuss two common challenges for analyzing high-dimensional data. The first is the curse of dimensionality: the complexity of many existing algorithms scales exponentially with the dimensionality of the space, so the algorithms soon become computationally intractable and therefore inapplicable in many real applications. The second is multicollinearity among the predictors, which severely slows down the algorithm. In order to make Bayesian analysis operational in high dimensions, we propose a novel 'Hierarchical stochastic approximation Monte Carlo' algorithm (HSAMC), which overcomes the curse of dimensionality and the multicollinearity of predictors in high dimensions, and which also possesses a self-adjusting mechanism to avoid local minima separated by high energy barriers. Models and methods are illustrated by simulations inspired by the field of genomics. Numerical results indicate that HSAMC can work as a general model selection sampler in high-dimensional complex model spaces.
Zhe, Shandian; Xu, Zenglin; Qi, Yuan; Yu, Peng
2014-01-01
A key step for Alzheimer's disease (AD) study is to identify associations between genetic variations and intermediate phenotypes (e.g., brain structures). At the same time, it is crucial to develop a noninvasive means for AD diagnosis. Although these two tasks (association discovery and disease diagnosis) have been treated separately by a variety of approaches, they are tightly coupled due to their common biological basis. We hypothesize that the two tasks can potentially benefit each other by a joint analysis, because (i) the association study discovers correlated biomarkers from different data sources, which may help improve diagnosis accuracy, and (ii) the disease status may help identify disease-sensitive associations between genetic variations and MRI features. Based on this hypothesis, we present a new sparse Bayesian approach for joint association study and disease diagnosis. In this approach, common latent features are extracted from different data sources based on sparse projection matrices and used to predict multiple disease severity levels based on Gaussian process ordinal regression; in return, the disease status is used to guide the discovery of relationships between the data sources. The sparse projection matrices not only reveal the associations but also select groups of biomarkers related to AD. To learn the model from data, we develop an efficient variational expectation maximization algorithm. Simulation results demonstrate that our approach achieves higher accuracy in both predicting ordinal labels and discovering associations between data sources than alternative methods. We apply our approach to an imaging genetics dataset of AD. Our joint analysis approach not only identifies meaningful and interesting associations between genetic variations, brain structures, and AD status, but also achieves significantly higher accuracy for predicting ordinal AD stages than the competing methods.
The Chandra Source Catalog 2.0
NASA Astrophysics Data System (ADS)
Evans, Ian N.; Allen, Christopher E.; Anderson, Craig S.; Budynkiewicz, Jamie A.; Burke, Douglas; Chen, Judy C.; Civano, Francesca Maria; D'Abrusco, Raffaele; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Gibbs, Danny G., II; Glotfelty, Kenny J.; Graessle, Dale E.; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; Houck, John C.; Lauer, Jennifer L.; Laurino, Omar; Lee, Nicholas P.; Martínez-Galarza, Juan Rafael; McCollough, Michael L.; McDowell, Jonathan C.; McLaughlin, Warren; Miller, Joseph; Morgan, Douglas L.; Mossman, Amy E.; Nguyen, Dan T.; Nichols, Joy S.; Nowak, Michael A.; Paxson, Charles; Plummer, David A.; Primini, Francis Anthony; Rots, Arnold H.; Siemiginowska, Aneta; Sundheim, Beth A.; Tibbetts, Michael; Van Stone, David W.; Zografou, Panagoula
2018-01-01
The current version of the Chandra Source Catalog (CSC) continues to be well utilized by the astronomical community. Usage over the past year has continued to average more than 15,000 searches per month. Version 1.1 of the CSC, released in 2010, includes properties and data for 158,071 detections, corresponding to 106,586 distinct X-ray sources on the sky. The second major release of the catalog, CSC 2.0, will be made available to the user community in early 2018, and preliminary lists of detections and sources are available now. Release 2.0 will roughly triple the size of the current version of the catalog to an estimated 375,000 detections, corresponding to ~315,000 unique X-ray sources. Compared to release 1.1, the limiting sensitivity for compact sources in CSC 2.0 is significantly enhanced. This improvement is achieved by using a two-stage approach that involves stacking (co-adding) multiple observations of the same field prior to source detection, and then using an improved source detection approach that enables us to detect point sources down to ~5 net counts on-axis for exposures shorter than ~15 ks. In addition to enhanced source detection capabilities, improvements to the Bayesian aperture photometry code included in release 2.0 provide robust photometric probability density functions (PDFs) in crowded fields even for low count detections. All post-aperture photometry properties (e.g., hardness ratios, source variability) work directly from the PDFs in release 2.0.
CSC 2.0 also adds a Bayesian Blocks analysis of the multi-band aperture photometry PDFs to identify multiple observations of the same source that have similar photometric properties, and therefore can be analyzed simultaneously to improve S/N. We briefly describe these and other updates that significantly enhance the scientific utility of CSC 2.0 when compared to the earlier catalog release. This work has been supported by NASA under contract NAS 8-03060 to the Smithsonian Astrophysical Observatory for operation of the Chandra X-ray Center.
Bonomi, Massimiliano; Pellarin, Riccardo; Kim, Seung Joong; Russel, Daniel; Sundin, Bryan A.; Riffle, Michael; Jaschob, Daniel; Ramsden, Richard; Davis, Trisha N.; Muller, Eric G. D.; Sali, Andrej
2014-01-01
The use of in vivo Förster resonance energy transfer (FRET) data to determine the molecular architecture of a protein complex in living cells is challenging due to data sparseness, sample heterogeneity, signal contributions from multiple donors and acceptors, unequal fluorophore brightness, photobleaching, flexibility of the linker connecting the fluorophore to the tagged protein, and spectral cross-talk. We addressed these challenges by using a Bayesian approach that produces the posterior probability of a model, given the input data. The posterior probability is defined as a function of the dependence of our FRET metric FRETR on a structure (forward model), a model of noise in the data, as well as prior information about the structure, relative populations of distinct states in the sample, forward model parameters, and data noise. The forward model was validated against kinetic Monte Carlo simulations and in vivo experimental data collected on nine systems of known structure. In addition, our Bayesian approach was validated by a benchmark of 16 protein complexes of known structure. Given the structures of each subunit of the complexes, models were computed from synthetic FRETR data with a distance root-mean-squared deviation error of 14 to 17 Å. The approach is implemented in the open-source Integrative Modeling Platform, allowing us to determine macromolecular structures through a combination of in vivo FRETR data and data from other sources, such as electron microscopy and chemical cross-linking. PMID:25139910
Vernon, Ian; Liu, Junli; Goldstein, Michael; Rowe, James; Topping, Jen; Lindsey, Keith
2018-01-02
Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure which can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data of various forms. The correct analysis of such models usually requires a global parameter search, over a high dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, which is designed to address the high dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but which are extremely fast to evaluate are embedded within an iterative history match: an efficient method to search high dimensional spaces within a more formal statistical setting, while incorporating major sources of uncertainty. The approach is demonstrated via application to a model of hormonal crosstalk in Arabidopsis root development, which has 32 rate parameters, for which we identify the sets of rate parameter values that lead to acceptable matches between model output and observed trend data. The multiple insights into the model's structure that this analysis provides are discussed. The methodology is applied to a second related model, and the biological consequences of the resulting comparison, including the evaluation of gene functions, are described. Bayesian uncertainty analysis for complex models using both emulators and history matching is shown to be a powerful technique that can greatly aid the study of a large class of systems biology models.
It both provides insight into model behaviour and identifies the sets of rate parameters of interest.
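The history-matching idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the one-parameter `toy_model`, the constant emulator variance, the made-up observation and variances, and the cutoff of 3 on the implausibility measure (a common convention in the history-matching literature, not a value stated in the abstract).

```python
import numpy as np

def implausibility(em_mean, em_var, z, obs_var, disc_var):
    """Standardised distance between the emulator prediction and the observation z."""
    return np.abs(z - em_mean) / np.sqrt(em_var + obs_var + disc_var)

def toy_model(x):
    # Hypothetical stand-in for an expensive systems biology model.
    return 3.0 * x + 1.0

# Candidate values of a single rate parameter (a real search is high dimensional).
candidates = np.linspace(0.0, 2.0, 201)

# Stand-in emulator: the mean is the toy model itself and the emulator
# variance is a small constant. A real emulator would be fitted to a
# limited number of model runs.
em_mean = toy_model(candidates)
em_var = np.full_like(candidates, 0.01)

z = 4.0          # observed value (made up)
obs_var = 0.04   # observation error variance (made up)
disc_var = 0.01  # model discrepancy variance (made up)

scores = implausibility(em_mean, em_var, z, obs_var, disc_var)
non_implausible = candidates[scores < 3.0]  # keep points that are not ruled out
print(non_implausible.min(), non_implausible.max(), non_implausible.size)
```

In an iterative history match, the retained region would be resampled, the emulator refitted there, and the cut repeated in further waves.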
NASA Astrophysics Data System (ADS)
Roostaee, M.; Deng, Z.
2017-12-01
State environmental agencies are required by the Clean Water Act to assess all waterbodies and evaluate potential sources of impairment. Spatial and temporal distributions of water quality parameters are critical in identifying Critical Source Areas (CSAs). However, because of limited funding and the large number of waterbodies, available monitoring stations are typically sparse, with intermittent periods of data collection. Hence, scarcity of water quality data is a major obstacle to addressing sources of pollution through management strategies. In this study, the spatiotemporal Bayesian Maximum Entropy (BME) method is employed to model the inherent temporal and spatial variability of measured water quality indicators such as Dissolved Oxygen (DO) concentration for the Turkey Creek Watershed. Turkey Creek is located in northern Louisiana and has been on the 303(d) list for DO impairment since 2014 in the Louisiana Water Quality Inventory Reports due to agricultural practices. The BME method has been shown to provide more accurate estimates than purely spatial methods by incorporating the space/time distribution and the uncertainty in available soft and hard measured data. The model is used to estimate DO concentration at unmonitored locations and times and subsequently to identify CSAs. The USDA's crop-specific land cover data layers of the watershed were then used to determine the practices and changes that led to low DO concentration in the identified CSAs. Primary results revealed that cultivation of corn and soybean, as well as urban runoff, are the main contributors to low dissolved oxygen in the Turkey Creek Watershed.
Capturing changes in flood risk with Bayesian approaches for flood damage assessment
NASA Astrophysics Data System (ADS)
Vogel, Kristin; Schröter, Kai; Kreibich, Heidi; Thieken, Annegret; Müller, Meike; Sieg, Tobias; Laudan, Jonas; Kienzler, Sarah; Weise, Laura; Merz, Bruno; Scherbaum, Frank
2016-04-01
Flood risk is a function of hazard as well as of exposure and vulnerability. All three components are under change over space and time and have to be considered for reliable damage estimations and risk analyses, since this is the basis for an efficient, adaptable risk management. Hitherto, models for estimating flood damage are comparatively simple and cannot sufficiently account for changing conditions. The Bayesian network approach allows for a multivariate modeling of complex systems without relying on expert knowledge about physical constraints. In a Bayesian network each model component is considered to be a random variable. The way of interactions between those variables can be learned from observations or be defined by expert knowledge. Even a combination of both is possible. Moreover, the probabilistic framework captures uncertainties related to the prediction and provides a probability distribution for the damage instead of a point estimate. The graphical representation of Bayesian networks helps to study the change of probabilities for changing circumstances and may thus simplify the communication between scientists and public authorities. In the framework of the DFG-Research Training Group "NatRiskChange" we aim to develop Bayesian networks for flood damage and vulnerability assessments of residential buildings and companies under changing conditions. A Bayesian network learned from data, collected over the last 15 years in flooded regions in the Elbe and Danube catchments (Germany), reveals the impact of many variables like building characteristics, precaution and warning situation on flood damage to residential buildings. While the handling of incomplete and hybrid (discrete mixed with continuous) data are the most challenging issues in the study on residential buildings, a similar study, that focuses on the vulnerability of small to medium sized companies, bears new challenges. 
Because the company study relies on a much smaller data set for the determination of the model parameters, overly complex models should be avoided. A so-called Markov Blanket approach aims at the identification of the most relevant factors and constructs a Bayesian network based on those findings. With our approach we want to exploit a major advantage of Bayesian networks, which is their ability to consider dependencies not only pairwise, but to capture the joint effects and interactions of driving forces. Hence, the flood damage network not only shows the impact of precaution on the building damage separately, but also reveals the mutual effects of precaution and the quality of warning for a variety of flood settings. Thus, it allows for a consideration of changing conditions and different courses of action and forms a novel and valuable tool for decision support. This study is funded by the Deutsche Forschungsgemeinschaft (DFG) within the research training program GRK 2043/1 "NatRiskChange - Natural hazards and risks in a changing world" at the University of Potsdam.
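The joint effect of precaution and warning on damage, as described above, can be sketched with a three-node discrete Bayesian network queried by brute-force enumeration. The network structure and all probabilities below are hypothetical and chosen only for illustration; they are not values from the study.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_PRECAUTION = {1: 0.4, 0: 0.6}
P_WARNING = {1: 0.5, 0: 0.5}
# P(high damage | precaution, warning)
P_DAMAGE = {(0, 0): 0.8, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.2}

def joint(p, w, d):
    """Joint probability of one full assignment of the three binary variables."""
    p_d = P_DAMAGE[(p, w)]
    return P_PRECAUTION[p] * P_WARNING[w] * (p_d if d == 1 else 1.0 - p_d)

def query(target, evidence):
    """P(target | evidence), summing the joint over all consistent states."""
    num = den = 0.0
    for p, w, d in product((0, 1), repeat=3):
        state = {"precaution": p, "warning": w, "damage": d}
        if any(state[k] != v for k, v in evidence.items()):
            continue
        prob = joint(p, w, d)
        den += prob
        if all(state[k] == v for k, v in target.items()):
            num += prob
    return num / den

# Effect of precaution alone, then precaution together with a good warning.
print(query({"damage": 1}, {"precaution": 1}))
print(query({"damage": 1}, {"precaution": 1, "warning": 1}))
# Diagnostic direction: how likely was precaution, given high damage?
print(query({"precaution": 1}, {"damage": 1}))
```

Enumeration is only practical for very small networks; it is used here because it makes the conditioning explicit.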
MixSIAR: A Bayesian stable isotope mixing model for characterizing intrapopulation niche variation
Background/Question/Methods The science of stable isotope mixing models has tended towards the development of modeling products (e.g. IsoSource, MixSIR, SIAR), where methodological advances or syntheses of the current state of the art are published in parity with software packa...
Space Object Detection and Tracking Within a Finite Set Statistics Framework
2017-04-13
Software for source extraction. Astronomy and Astrophysics Supplement Series, 117(2):393–404, 1996. [4] William M. Bolstad. Introduction to Bayesian...Urban, T Corbin, G Wycoff, Ulrich Bastian, Peter Schwekendiek, and A Wicenec. The tycho-2 catalogue of the 2.5 million brightest stars. Astronomy and
Why environmental scientists are becoming Bayesians
James S. Clark
2005-01-01
Advances in computational statistics provide a general framework for the high dimensional models typically needed for ecological inference and prediction. Hierarchical Bayes (HB) represents a modelling structure with capacity to exploit diverse sources of information, to accommodate influences that are unknown (or unknowable), and to draw inference on large numbers of...
This paper addresses the general problem of estimating at arbitrary locations the value of an unobserved quantity that varies over space, such as ozone concentration in air or nitrate concentrations in surface groundwater, on the basis of approximate measurements of the quantity ...
Modeling Statistical Insensitivity: Sources of Suboptimal Behavior
ERIC Educational Resources Information Center
Gagliardi, Annie; Feldman, Naomi H.; Lidz, Jeffrey
2017-01-01
Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the…
Tackling environmental, economic, and social sustainability issues with community stakeholders will often lead to choices that are costly, complex and uncertain. A formal process with proper guidance is needed to understand the issues, identify sources of disagreement, consider t...
Case studies in Bayesian microbial risk assessments.
Kennedy, Marc C; Clough, Helen E; Turner, Joanne
2009-12-21
The quantification of uncertainty and variability is a key component of quantitative risk analysis. Recent advances in Bayesian statistics make it ideal for integrating multiple sources of information, of different types and quality, and providing a realistic estimate of the combined uncertainty in the final risk estimates. We present two case studies related to foodborne microbial risks. In the first, we combine models to describe the sequence of events resulting in illness from consumption of milk contaminated with VTEC O157. We used Monte Carlo simulation to propagate uncertainty in some of the inputs to computer models describing the farm and pasteurisation process. Resulting simulated contamination levels were then assigned to consumption events from a dietary survey. Finally we accounted for uncertainty in the dose-response relationship and uncertainty due to limited incidence data to derive uncertainty about yearly incidences of illness in young children. Options for altering the risk were considered by running the model with different hypothetical policy-driven exposure scenarios. In the second case study we illustrate an efficient Bayesian sensitivity analysis for identifying the most important parameters of a complex computer code that simulated VTEC O157 prevalence within a managed dairy herd. This was carried out in 2 stages, first to screen out the unimportant inputs, then to perform a more detailed analysis on the remaining inputs. The method works by building a Bayesian statistical approximation to the computer code using a number of known code input/output pairs (training runs). We estimated that the expected total number of children aged 1.5-4.5 who become ill due to VTEC O157 in milk is 8.6 per year, with 95% uncertainty interval (0,11.5). The most extreme policy we considered was banning on-farm pasteurisation of milk, which reduced the estimate to 6.4 with 95% interval (0,11). 
In the second case study the effective number of inputs was reduced from 30 to 7 in the screening stage, and just 2 inputs were found to explain 82.8% of the output variance. A combined total of 500 runs of the computer code were used. These case studies illustrate the use of Bayesian statistics to perform detailed uncertainty and sensitivity analyses, integrating multiple information sources in a way that is both rigorous and efficient.
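The Monte Carlo propagation of input uncertainty mentioned in the first case study can be sketched as follows. The distributions, the exponential dose-response form, and every numerical value are assumptions made for illustration and are not taken from the case studies.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 10_000

# Hypothetical input uncertainties (not taken from the case studies).
dose = rng.lognormal(mean=0.0, sigma=1.0, size=n_draws)  # organisms per serving
r = rng.beta(2.0, 50.0, size=n_draws)                    # dose-response parameter

# Assumed exponential dose-response model: probability of illness per serving.
p_ill = 1.0 - np.exp(-r * dose)

# Summarise the propagated uncertainty with a mean and a 95% interval.
mean_risk = p_ill.mean()
lo, hi = np.percentile(p_ill, [2.5, 97.5])
print(mean_risk, lo, hi)
```

Each draw pairs one sampled input set with one model evaluation, so the spread of `p_ill` reflects the combined uncertainty of both inputs.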
Bayesian network interface for assisting radiology interpretation and education
NASA Astrophysics Data System (ADS)
Duda, Jeffrey; Botzolakis, Emmanuel; Chen, Po-Hao; Mohan, Suyash; Nasrallah, Ilya; Rauschecker, Andreas; Rudie, Jeffrey; Bryan, R. Nick; Gee, James; Cook, Tessa
2018-03-01
In this work, we present the use of Bayesian networks for radiologist decision support during clinical interpretation. This computational approach has the advantage of avoiding incorrect diagnoses that result from known human cognitive biases such as anchoring bias, framing effect, availability bias, and premature closure. To integrate Bayesian networks into clinical practice, we developed an open-source web application that provides diagnostic support for a variety of radiology disease entities (e.g., basal ganglia diseases, bone lesions). The Clinical tool presents the user with a set of buttons representing clinical and imaging features of interest. These buttons are used to set the value for each observed feature. As features are identified, the conditional probabilities for each possible diagnosis are updated in real time. Additionally, using sensitivity analysis, the interface may be set to inform the user which remaining imaging features provide maximum discriminatory information to choose the most likely diagnosis. The Case Submission tools allow the user to submit a validated case and the associated imaging features to a database, which can then be used for future tuning/testing of the Bayesian networks. These submitted cases are then reviewed by an assigned expert using the provided QC tool. The Research tool presents users with cases with previously labeled features and a chosen diagnosis, for the purpose of performance evaluation. Similarly, the Education page presents cases with known features, but provides real time feedback on feature selection.
Alderman, Phillip D.; Stanfill, Bryan
2016-10-06
Recent international efforts have brought renewed emphasis on the comparison of different agricultural systems models. Thus far, analysis of model-ensemble simulated results has not clearly differentiated between ensemble prediction uncertainties due to model structural differences per se and those due to parameter value uncertainties. Additionally, despite increasing use of Bayesian parameter estimation approaches with field-scale crop models, inadequate attention has been given to the full posterior distributions for estimated parameters. The objectives of this study were to quantify the impact of parameter value uncertainty on prediction uncertainty for modeling spring wheat phenology using Bayesian analysis and to assess the relative contributions of model-structure-driven and parameter-value-driven uncertainty to overall prediction uncertainty. This study used a random walk Metropolis algorithm to estimate parameters for 30 spring wheat genotypes using nine phenology models based on multi-location trial data for days to heading and days to maturity. Across all cases, parameter-driven uncertainty accounted for between 19 and 52% of predictive uncertainty, while model-structure-driven uncertainty accounted for between 12 and 64%. Here, this study demonstrated the importance of quantifying both model-structure- and parameter-value-driven uncertainty when assessing overall prediction uncertainty in modeling spring wheat phenology. More generally, Bayesian parameter estimation provided a useful framework for quantifying and analyzing sources of prediction uncertainty.
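A random walk Metropolis sampler of the kind named in the abstract above can be sketched on a deliberately simple problem. The target here is an assumption for illustration (the mean of Gaussian data with known standard deviation and a flat prior), not one of the phenology models; the step size, chain length, and burn-in are likewise arbitrary choices.

```python
import numpy as np

def log_posterior(theta, data, sigma=1.0):
    # Flat prior on theta and Gaussian likelihood with known sigma (assumptions).
    return -0.5 * np.sum((data - theta) ** 2) / sigma**2

def random_walk_metropolis(data, n_steps=20_000, step=0.3, start=0.0, seed=0):
    rng = np.random.default_rng(seed)
    chain = np.empty(n_steps)
    theta, lp = start, log_posterior(start, data)
    for i in range(n_steps):
        # Symmetric Gaussian proposal centred on the current value.
        proposal = theta + step * rng.standard_normal()
        lp_prop = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        chain[i] = theta
    return chain

# Synthetic data with true mean 2.0 (made up for the example).
data = np.random.default_rng(42).normal(loc=2.0, scale=1.0, size=50)
chain = random_walk_metropolis(data)
samples = chain[2_000:]  # discard burn-in
print(samples.mean(), samples.std())
```

Keeping the full set of retained draws, rather than only a point estimate, is what makes the posterior distribution of each parameter available for propagating parameter uncertainty into predictions.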
Chowdhury, Rasheda Arman; Lina, Jean Marc; Kobayashi, Eliane; Grova, Christophe
2013-01-01
Localizing the generators of epileptic activity in the brain using Electro-EncephaloGraphy (EEG) or Magneto-EncephaloGraphy (MEG) signals is of particular interest during the pre-surgical investigation of epilepsy. Epileptic discharges can be detectable from background brain activity, provided they are associated with spatially extended generators. Using realistic simulations of epileptic activity, this study evaluates the ability of distributed source localization methods to accurately estimate the location of the generators and their sensitivity to the spatial extent of such generators when using MEG data. Source localization methods based on two types of realistic models have been investigated: (i) brain activity may be modeled using cortical parcels and (ii) brain activity is assumed to be locally smooth within each parcel. A Data Driven Parcellization (DDP) method was used to segment the cortical surface into non-overlapping parcels and diffusion-based spatial priors were used to model local spatial smoothness within parcels. These models were implemented within the Maximum Entropy on the Mean (MEM) and the Hierarchical Bayesian (HB) source localization frameworks. We proposed new methods in this context and compared them with other standard ones using Monte Carlo simulations of realistic MEG data involving sources of several spatial extents and depths. Detection accuracy of each method was quantified using Receiver Operating Characteristic (ROC) analysis and localization error metrics. Our results showed that methods implemented within the MEM framework were sensitive to all spatial extents of the sources ranging from 3 cm² to 30 cm², regardless of the number and size of the parcels defining the model. To reach a similar level of accuracy within the HB framework, a model using parcels larger than the size of the sources should be considered. PMID:23418485
Bayesian analysis of the break in DAMPE lepton spectra
NASA Astrophysics Data System (ADS)
Niu, Jia-Shu; Li, Tianjun; Ding, Ran; Zhu, Bin; Xue, Hui-Fang; Wang, Yang
2018-04-01
Recently, DAMPE has released its first results on the high-energy cosmic-ray electrons and positrons (CREs) from about 25 GeV to 4.6 TeV, which directly detect a break at $\sim 1$ TeV. This result gives us an excellent opportunity to study the source of the CREs excess. In this work, we used the data for proton and helium flux (from AMS-02 and CREAM), $\bar{p}/p$ ratio (from AMS-02), positron flux (from AMS-02) and CREs flux (from DAMPE without the peak signal point at $\sim 1.4$ TeV) to do global fitting simultaneously, which can account for the influence from the propagation model, the nuclei and electron primary source injection, and the secondary lepton production precisely. For an extra source to interpret the excess in lepton spectrum, we consider two separate scenarios (pulsar and dark matter annihilation via leptonic channels) to construct the bump ($\gtrsim 100$ GeV) and the break at $\sim 1$ TeV. The result shows that (i) in the pulsar scenario, the spectral index of the injection should be $\nu_{\mathrm{psr}} \sim 0.65$ and the cut-off should be $R_c \sim 650$ GV; (ii) in dark matter scenario, the dark matter particle's mass is $m_\chi \sim 1208$ GeV, and the cross section is $\langle \sigma v \rangle \sim 1.48 \times 10^{-23}\ \mathrm{cm^3\,s^{-1}}$. Moreover, in the dark matter scenario, the $\tau\bar{\tau}$ annihilation channel is highly suppressed, and a DM model is built to satisfy the fitting results.
Dictionary Learning Algorithms for Sparse Representation
Kreutz-Delgado, Kenneth; Murray, Joseph F.; Rao, Bhaskar D.; Engan, Kjersti; Lee, Te-Won; Sejnowski, Terrence J.
2010-01-01
Algorithms for data-driven learning of domain-specific overcomplete dictionaries are developed to obtain maximum likelihood and maximum a posteriori dictionary estimates based on the use of Bayesian models with concave/Schur-concave (CSC) negative log priors. Such priors are appropriate for obtaining sparse representations of environmental signals within an appropriately chosen (environmentally matched) dictionary. The elements of the dictionary can be interpreted as concepts, features, or words capable of succinct expression of events encountered in the environment (the source of the measured signals). This is a generalization of vector quantization in that one is interested in a description involving a few dictionary entries (the proverbial “25 words or less”), but not necessarily as succinct as one entry. To learn an environmentally adapted dictionary capable of concise expression of signals generated by the environment, we develop algorithms that iterate between a representative set of sparse representations found by variants of FOCUSS and an update of the dictionary using these sparse representations. Experiments were performed using synthetic data and natural images. For complete dictionaries, we demonstrate that our algorithms have improved performance over other independent component analysis (ICA) methods, measured in terms of signal-to-noise ratios of separated sources. In the overcomplete case, we show that the true underlying dictionary and sparse sources can be accurately recovered. In tests with natural images, learned overcomplete dictionaries are shown to have higher coding efficiency than complete dictionaries; that is, images encoded with an over-complete dictionary have both higher compression (fewer bits per pixel) and higher accuracy (lower mean square error). PMID:12590811
NASA Astrophysics Data System (ADS)
Yao, Jiachi; Xiang, Yang; Qian, Sichong; Li, Shengyang; Wu, Shaowei
2017-11-01
In order to separate and identify the combustion noise and the piston slap noise of a diesel engine, a noise source separation and identification method that combines a binaural sound localization method and blind source separation method is proposed. During a diesel engine noise and vibration test, because a diesel engine has many complex noise sources, a lead covering method was carried out on a diesel engine to isolate other interference noise from the No. 1-5 cylinders. Only the No. 6 cylinder parts were left bare. Two microphones that simulated the human ears were utilized to measure the radiated noise signals 1 m away from the diesel engine. First, a binaural sound localization method was adopted to separate the noise sources that are in different places. Then, for noise sources that are in the same place, a blind source separation method is utilized to further separate and identify the noise sources. Finally, a coherence function method, continuous wavelet time-frequency analysis method, and prior knowledge of the diesel engine are combined to further identify the separation results. The results show that the proposed method can effectively separate and identify the combustion noise and the piston slap noise of a diesel engine. The frequencies of the combustion noise and the piston slap noise are concentrated at 4350 Hz and 1988 Hz, respectively. Compared with the blind source separation method alone, the proposed method has superior separation and identification effects, and the separation results have fewer interference components from other noise.
NASA Astrophysics Data System (ADS)
Wang, Q. J.; Robertson, D. E.; Haines, C. L.
2009-02-01
Irrigation is important to many agricultural businesses but also has implications for catchment health. A considerable body of knowledge exists on how irrigation management affects farm business and catchment health. However, this knowledge is fragmentary; is available in many forms such as qualitative and quantitative; is dispersed in scientific literature, technical reports, and the minds of individuals; and is of varying degrees of certainty. Bayesian networks allow the integration of dispersed knowledge into quantitative systems models. This study describes the development, validation, and application of a Bayesian network model of farm irrigation in the Shepparton Irrigation Region of northern Victoria, Australia. In this first paper we describe the process used to integrate a range of sources of knowledge to develop a model of farm irrigation. We describe the principal model components and summarize the reaction to the model and its development process by local stakeholders. Subsequent papers in this series describe model validation and the application of the model to assess the regional impact of historical and future management intervention.
VizieR Online Data Catalog: Giant HII regions BOND abundances (Vale Asari+, 2016)
NASA Astrophysics Data System (ADS)
Vale Asari, N.; Stasinska, G.; Morisset, C.; Cid Fernandes, R.
2017-10-01
BOND determines nitrogen and oxygen gas-phase abundances by using strong and semistrong lines and comparing them to a grid of photoionization models in a Bayesian framework. The code is written in Python and its source is publicly available at http://bond.ufsc.br. The grid of models presented here is included in the 3MdB data base (Morisset, Delgado-Inglada & Flores-Fajardo 2015RMxAA..51..103M, see https://sites.google.com/site/mexicanmillionmodels/) under the reference 'BOND'. The Bayesian posterior probability calculated by BOND stands on two pillars: our grid of models and our choice of observational constraints (from which we calculate our likelihoods). We discuss each of these in turn. (2 data files).
Kinematic Structural Modelling in Bayesian Networks
NASA Astrophysics Data System (ADS)
Schaaf, Alexander; de la Varga, Miguel; Florian Wellmann, J.
2017-04-01
We commonly capture our knowledge about the spatial distribution of distinct geological lithologies in the form of 3-D geological models. Several methods exist to create these models, each with its own strengths and limitations. We present here an approach to combine the functionalities of two modeling approaches - implicit interpolation and kinematic modelling methods - into one framework, while explicitly considering parameter uncertainties and thus model uncertainty. In recent work, we proposed an approach to implement implicit modelling algorithms into Bayesian networks. This was done to address the issues of input data uncertainty and integration of geological information from varying sources in the form of geological likelihood functions. However, one general shortcoming of implicit methods is that they usually do not take any physical constraints into consideration, which can result in unrealistic model outcomes and artifacts. On the other hand, kinematic structural modelling intends to reconstruct the history of a geological system based on physically driven kinematic events. This type of modelling incorporates simplified, physical laws into the model, at the cost of a substantial increment of usable uncertain parameters. In the work presented here, we show an integration of these two different modelling methodologies, taking advantage of the strengths of both of them. First, we treat the two types of models separately, capturing the information contained in the kinematic models and their specific parameters in the form of likelihood functions, in order to use them in the implicit modelling scheme. We then go further and combine the two modelling approaches into one single Bayesian network. This enables the direct flow of information between the parameters of the kinematic modelling step and the implicit modelling step and links the exclusive input data and likelihoods of the two different modelling algorithms into one probabilistic inference framework. 
In addition, we use the capabilities of Noddy to analyze the topology of structural models to demonstrate how topological information, such as the connectivity of two layers across an unconformity, can be used as a likelihood function. In an application to a synthetic case study, we show that our approach leads to a successful combination of the two different modelling concepts. Specifically, we show that we derive ensemble realizations of implicit models that now incorporate the knowledge of the kinematic aspects, representing an important step forward in the integration of knowledge and a corresponding estimation of uncertainties in structural geological models.
Mapping malaria risk among children in Côte d'Ivoire using Bayesian geo-statistical models.
Raso, Giovanna; Schur, Nadine; Utzinger, Jürg; Koudou, Benjamin G; Tchicaya, Emile S; Rohner, Fabian; N'goran, Eliézer K; Silué, Kigbafori D; Matthys, Barbara; Assi, Serge; Tanner, Marcel; Vounatsou, Penelope
2012-05-09
In Côte d'Ivoire, an estimated 767,000 disability-adjusted life years are due to malaria, placing the country at position number 14 with regard to the global burden of malaria. Risk maps are important to guide control interventions, and hence, the aim of this study was to predict the geographical distribution of malaria infection risk in children aged <16 years in Côte d'Ivoire at high spatial resolution. Using different data sources, a systematic review was carried out to compile and geo-reference survey data on Plasmodium spp. infection prevalence in Côte d'Ivoire, focusing on children aged <16 years. The period from 1988 to 2007 was covered. A suite of Bayesian geo-statistical logistic regression models was fitted to analyse malaria risk. Non-spatial models with and without exchangeable random effect parameters were compared to stationary and non-stationary spatial models. Non-stationarity was modelled assuming that the underlying spatial process is a mixture of separate stationary processes in each ecological zone. The best fitting model based on the deviance information criterion was used to predict Plasmodium spp. infection risk for entire Côte d'Ivoire, including uncertainty. Overall, 235 data points at 170 unique survey locations with malaria prevalence data for individuals aged <16 years were extracted. Most data points (n = 182, 77.4%) were collected between 2000 and 2007. A Bayesian non-stationary regression model showed the best fit with annualized rainfall and maximum land surface temperature identified as significant environmental covariates. This model was used to predict malaria infection risk at non-sampled locations. High-risk areas were mainly found in the north-central and western area, while relatively low-risk areas were located in the north at the country border, in the north-east, in the south-east around Abidjan, and in the central-west between two high prevalence areas. 
The malaria risk map at high spatial resolution gives an important overview of the geographical distribution of the disease in Côte d'Ivoire. It is a useful tool for the national malaria control programme and can be utilized for spatial targeting of control interventions and rational resource allocation. PMID:22571469
NASA Astrophysics Data System (ADS)
Raj, R.; Hamm, N. A. S.; van der Tol, C.; Stein, A.
2015-08-01
Gross primary production (GPP), separated from flux tower measurements of net ecosystem exchange (NEE) of CO2, is used increasingly to validate process-based simulators and remote sensing-derived estimates of simulated GPP at various time steps. Proper validation should include the uncertainty associated with this separation at different time steps. This can be achieved by using a Bayesian framework. In this study, we estimated the uncertainty in GPP at half-hourly time steps. We used a non-rectangular hyperbola (NRH) model to separate GPP from flux tower measurements of NEE at the Speulderbos forest site, The Netherlands. The NRH model included the variables that influence GPP, in particular radiation and temperature. In addition, the NRH model provided a robust empirical relationship between radiation and GPP by including the degree of curvature of the light response curve. Parameters of the NRH model were fitted to the measured NEE data for every 10-day period during the growing season (April to October) in 2009. Adopting a Bayesian approach, we defined the prior distribution of each NRH parameter. Markov chain Monte Carlo (MCMC) simulation was used to update the prior distribution of each NRH parameter. This allowed us to estimate the uncertainty in the separated GPP at half-hourly time steps. This yielded the posterior distribution of GPP at each half hour and allowed the quantification of uncertainty. The time series of posterior distributions thus obtained allowed us to estimate the uncertainty at daily time steps. We compared informative with non-informative prior distributions of the NRH parameters. The results showed that both choices of prior produced similar posterior distributions of GPP. This will provide relevant and important information for the validation of process-based simulators in the future. Furthermore, the obtained posterior distributions of NEE and the NRH parameters are of interest for a range of applications.
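The light-response part of an NRH model can be sketched as follows. This is a minimal illustration using the standard textbook form of the non-rectangular hyperbola, not necessarily the parameterization used in the study; the temperature dependence mentioned in the abstract is omitted, and the names nrh_gpp, alpha, p_max and theta are assumptions chosen for this example.

```python
import math

def nrh_gpp(radiation, alpha, p_max, theta):
    """Non-rectangular hyperbola light response (smaller root of the quadratic
    theta*P^2 - (alpha*I + p_max)*P + alpha*I*p_max = 0).

    radiation: incident radiation I (any consistent unit)
    alpha:     initial slope of the light response curve
    p_max:     asymptotic GPP at saturating light
    theta:     degree of curvature, 0 < theta <= 1
    """
    linear = alpha * radiation + p_max
    # The discriminant is non-negative whenever 0 < theta <= 1.
    disc = linear * linear - 4.0 * theta * alpha * radiation * p_max
    return (linear - math.sqrt(disc)) / (2.0 * theta)
```

In a Bayesian fit of the kind described, alpha, p_max and theta would each receive a prior and be sampled by MCMC; evaluating this function on every posterior draw would then give a distribution of GPP at each time step.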
Bayesian estimation of a source term of radiation release with approximately known nuclide ratios
NASA Astrophysics Data System (ADS)
Tichý, Ondřej; Šmídl, Václav; Hofman, Radek
2016-04-01
We are concerned with estimation of a source term in the case of an accidental release from a known location, e.g. a power plant. Usually, the source term of an accidental release of radiation comprises a mixture of nuclides. Gamma dose rate measurements do not provide direct information on the source term composition. However, physical properties of the respective nuclides (deposition properties, decay half-life) can be used when uncertain information on nuclide ratios is available, e.g. from a known reactor inventory. The proposed method is based on a linear inverse model where the observation vector y arises as a linear combination y = Mx of a source-receptor-sensitivity (SRS) matrix M and the source term x. The task is to estimate the unknown source term x. The problem is ill-conditioned and further regularization is needed to obtain a reasonable solution. In this contribution, we assume that the nuclide ratios of the release are known with some degree of uncertainty. This knowledge is used to form the prior covariance matrix of the source term x. Due to uncertainty in the ratios, the diagonal elements of the covariance matrix are considered to be unknown. Positivity of the source term estimate is guaranteed by using a multivariate truncated Gaussian distribution. Following the Bayesian approach, we estimate all parameters of the model from the data so that y, M, and the known ratios are the only inputs of the method. Since inference of the model is intractable, we follow the Variational Bayes method, yielding an iterative algorithm for estimation of all model parameters. Performance of the method is studied on a simulated 6-hour power plant release where 3 nuclides are released and 2 nuclide ratios are approximately known. A comparison with the method with unknown nuclide ratios will be given to demonstrate the usefulness of the proposed approach.
This research is supported by EEA/Norwegian Financial Mechanism under project MSMT-28477/2014 Source-Term Determination of Radionuclide Releases by Inverse Atmospheric Dispersion Modelling (STRADI).
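To make the linear inverse model y = Mx concrete, the sketch below solves a much simpler stand-in problem: ridge-regularized least squares with a positivity constraint, minimized by projected gradient descent. It is not the Variational Bayes algorithm of the abstract and does not use nuclide ratios; the function name nonneg_ridge and the parameters lam and iters are assumptions made for this illustration.

```python
def nonneg_ridge(M, y, lam=1e-3, iters=5000):
    """Minimize ||y - M x||^2 + lam * ||x||^2 subject to x >= 0.

    M is a list of rows (the SRS matrix), y a list of observations.
    Projected gradient descent: take a gradient step, then clip at zero.
    """
    rows, cols = len(M), len(M[0])
    # Squared Frobenius norm bounds the squared spectral norm, so this is a
    # safe Lipschitz constant for the gradient.
    lipschitz = 2.0 * (sum(v * v for row in M for v in row) + lam)
    step = 1.0 / lipschitz
    x = [0.0] * cols
    for _ in range(iters):
        resid = [sum(M[i][j] * x[j] for j in range(cols)) - y[i]
                 for i in range(rows)]
        grad = [2.0 * sum(M[i][j] * resid[i] for i in range(rows))
                + 2.0 * lam * x[j] for j in range(cols)]
        x = [max(0.0, x[j] - step * grad[j]) for j in range(cols)]
    return x
```

The penalty lam plays the role that the prior covariance plays in the Bayesian formulation, and the clipping step is a crude counterpart of the truncated Gaussian prior that enforces positivity there.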
Pires, Sara M; Hald, Tine
2010-02-01
Salmonella is a major cause of human gastroenteritis worldwide. To prioritize interventions and assess the effectiveness of efforts to reduce illness, it is important to attribute salmonellosis to the responsible sources. Studies have suggested that some Salmonella subtypes have a higher health impact than others. Likewise, some food sources appear to have a higher impact than others. Knowledge of variability in the impact of subtypes and sources may provide valuable added information for research, risk management, and public health strategies. We developed a Bayesian model that attributes illness to specific sources and allows for a better estimation of the differences in the ability of Salmonella subtypes and food types to result in reported salmonellosis. The model accommodates data for multiple years and is based on the Danish Salmonella surveillance. The number of sporadic cases caused by different Salmonella subtypes is estimated as a function of the prevalence of these subtypes in the animal-food sources, the amount of food consumed, subtype-related factors, and source-related factors. Our results showed relative differences between Salmonella subtypes in their ability to cause disease. These differences presumably represent multiple factors, such as differences in survivability through the food chain and/or pathogenicity. The relative importance of the source-dependent factors varied considerably over the years, reflecting, among others, variability in the surveillance programs for the different animal sources. The presented model requires estimation of fewer parameters than a previously developed model, and thus allows for a better estimation of these factors to result in reported human disease. In addition, a comparison of the results of the same model using different sets of typing data revealed that the model can be applied to data with less discriminatory power, which is the only data available in many countries. 
In conclusion, the model allows for the estimation of relative differences between Salmonella subtypes and sources, providing results that will benefit future risk assessment or risk ranking purposes.
Determining X-ray source intensity and confidence bounds in crowded fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Primini, F. A.; Kashyap, V. L., E-mail: fap@head.cfa.harvard.edu
We present a rigorous description of the general problem of aperture photometry in high-energy astrophysics photon-count images, in which the statistical noise model is Poisson, not Gaussian. We compute the full posterior probability density function for the expected source intensity for various cases of interest, including the important cases in which both source and background apertures contain contributions from the source, and when multiple source apertures partially overlap. A Bayesian approach offers the advantages of allowing one to (1) include explicit prior information on source intensities, (2) propagate posterior distributions as priors for future observations, and (3) use Poisson likelihoods, making the treatment valid in the low-counts regime. Elements of this approach have been implemented in the Chandra Source Catalog.
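The simplest case of the problem described above can be sketched on a grid: one source aperture with expected counts s + b and one source-free background aperture with expected counts area_ratio * b. The code assumes flat priors on s and b and marginalizes over b numerically; it ignores source leakage into the background aperture and overlapping apertures, which the paper treats. All names here (source_posterior, area_ratio, s_max, b_max) are assumptions for illustration.

```python
import math

def log_poisson(k, mu):
    """Log of the Poisson probability of k counts given expected counts mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def source_posterior(c_src, c_bkg, area_ratio, s_max, b_max, n=200):
    """Grid posterior density p(s | counts), marginalized over the background b.

    c_src: counts in the source aperture, expected value s + b
    c_bkg: counts in the background aperture, expected value area_ratio * b
    Flat priors on 0 < s < s_max and 0 < b < b_max (midpoint grid).
    """
    s_grid = [(i + 0.5) * s_max / n for i in range(n)]
    b_grid = [(j + 0.5) * b_max / n for j in range(n)]
    post = []
    for s in s_grid:
        total = 0.0
        for b in b_grid:
            total += math.exp(log_poisson(c_src, s + b)
                              + log_poisson(c_bkg, area_ratio * b))
        post.append(total)
    norm = sum(post) * (s_max / n)  # normalize so the density integrates to 1
    return s_grid, [p / norm for p in post]
```

Because the likelihood is Poisson rather than Gaussian, the posterior stays well defined even when the aperture holds only a handful of counts, which is the low-counts regime the abstract refers to.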
Wilcox, Thomas P; Zwickl, Derrick J; Heath, Tracy A; Hillis, David M
2002-11-01
Four New World genera of dwarf boas (Exiliboa, Trachyboa, Tropidophis, and Ungaliophis) have been placed by many systematists in a single group (traditionally called Tropidophiidae). However, the monophyly of this group has been questioned in several studies. Moreover, the overall relationships among basal snake lineages, including the placement of the dwarf boas, are poorly understood. We obtained mtDNA sequence data for 12S, 16S, and intervening tRNA-val genes from 23 species of snakes representing most major snake lineages, including all four genera of New World dwarf boas. We then examined the phylogenetic position of these species by estimating the phylogeny of the basal snakes. Our phylogenetic analysis suggests that New World dwarf boas are not monophyletic. Instead, we find Exiliboa and Ungaliophis to be most closely related to sand boas (Erycinae), boas (Boinae), and advanced snakes (Caenophidea), whereas Tropidophis and Trachyboa form an independent clade that separated relatively early in snake radiation. Our estimate of snake phylogeny differs significantly in other ways from some previous estimates of snake phylogeny. For instance, pythons do not cluster with boas and sand boas, but instead show a strong relationship with Loxocemus and Xenopeltis. Additionally, uropeltids cluster strongly with Cylindrophis, and together are embedded in what has previously been considered the macrostomatan radiation. These relationships are supported by both bootstrapping (parametric and nonparametric approaches) and Bayesian analysis, although Bayesian support values are consistently higher than those obtained from nonparametric bootstrapping. Simulations show that Bayesian support values represent much better estimates of phylogenetic accuracy than do nonparametric bootstrap support values, at least under the conditions of our study. Copyright 2002 Elsevier Science (USA)
Gajic-Veljanoski, Olga; Cheung, Angela M; Bayoumi, Ahmed M; Tomlinson, George
2016-05-30
Bivariate random-effects meta-analysis (BVMA) is a method of data synthesis that accounts for treatment effects measured on two outcomes. BVMA gives more precise estimates of the population mean and predicted values than two univariate random-effects meta-analyses (UVMAs). BVMA also addresses bias from incomplete reporting of outcomes. A few tutorials have covered technical details of BVMA of categorical or continuous outcomes. Limited guidance is available on how to analyze datasets that include trials with mixed continuous-binary outcomes where treatment effects on one outcome or the other are not reported. Given the advantages of Bayesian BVMA for handling missing outcomes, we present a tutorial for Bayesian BVMA of incompletely reported treatment effects on mixed bivariate outcomes. This step-by-step approach can serve as a model for our intended audience, the methodologist familiar with Bayesian meta-analysis, looking for practical advice on fitting bivariate models. To facilitate application of the proposed methods, we include our WinBUGS code. As an example, we use aggregate-level data from published trials to demonstrate the estimation of the effects of vitamin K and bisphosphonates on two correlated bone outcomes, fracture, and bone mineral density. We present datasets where reporting of the pairs of treatment effects on both outcomes was 'partially' complete (i.e., pairs completely reported in some trials), and we outline steps for modeling the incompletely reported data. To assess what is gained from the additional work required by BVMA, we compare the resulting estimates to those from separate UVMAs. We discuss methodological findings and make four recommendations. Copyright © 2015 John Wiley & Sons, Ltd.
Xu, Wei-Wei; Hu, Shen-Jiang; Wu, Tao
2017-07-01
Antithrombotic therapy using new oral anticoagulants (NOACs) in patients with atrial fibrillation (AF) has been generally shown to have a favorable risk-benefit profile. Since there has been dispute about the risks of gastrointestinal bleeding (GIB) and intracranial hemorrhage (ICH), we sought to conduct a systematic review and network meta-analysis using Bayesian inference to analyze the risks of GIB and ICH in AF patients taking NOACs. We analyzed data from 20 randomized controlled trials of 91 671 AF patients receiving anticoagulants, antiplatelet drugs, or placebo. Bayesian network meta-analysis of two different evidence networks was performed using a binomial likelihood model, based on a network in which different agents (and doses) were treated as separate nodes. Odds ratios (ORs) and 95% confidence intervals (CIs) were modeled using Markov chain Monte Carlo methods. Indirect comparisons with the Bayesian model confirmed that aspirin+clopidogrel significantly increased the risk of GIB in AF patients compared to the placebo (OR 0.33, 95% CI 0.01-0.92). Warfarin was identified as greatly increasing the risk of ICH compared to edoxaban 30 mg (OR 3.42, 95% CI 1.22-7.24) and dabigatran 110 mg (OR 3.56, 95% CI 1.10-8.45). We further ranked the NOACs for the lowest risk of GIB (apixaban 5 mg) and ICH (apixaban 5 mg, dabigatran 110 mg, and edoxaban 30 mg). Bayesian network meta-analysis of treatment of non-valvular AF patients with anticoagulants suggested that NOACs do not increase risks of GIB and/or ICH, compared to each other.
NASA Astrophysics Data System (ADS)
O'Shaughnessy, Richard; Lange, Jacob; Healy, James; Lousto, Carlos; Shoemaker, Deirdre; Lovelace, Geoffrey; Scheel, Mark
2016-03-01
In this talk, we apply a procedure to reconstruct the parameters of sufficiently massive coalescing compact binaries via direct comparison with numerical relativity simulations. We illustrate how to use only comparisons between synthetic data and these simulations to reconstruct properties of a synthetic candidate source. We demonstrate using selected examples that we can reconstruct posterior distributions obtained by other Bayesian methods with our sparse grid. We describe how followup simulations can corroborate and improve our understanding of a candidate signal.
Incorporating Biological Knowledge into Evaluation of Causal Regulatory Hypothesis
NASA Technical Reports Server (NTRS)
Chrisman, Lonnie; Langley, Pat; Bay, Stephen; Pohorille, Andrew; DeVincenzi, D. (Technical Monitor)
2002-01-01
Biological data can be scarce and costly to obtain. The small number of samples available typically limits statistical power and makes reliable inference of causal relations extremely difficult. However, we argue that statistical power can be increased substantially by incorporating prior knowledge and data from diverse sources. We present a Bayesian framework that combines information from different sources and we show empirically that this lets one make correct causal inferences with small sample sizes that otherwise would be impossible.
pytc: Open-Source Python Software for Global Analyses of Isothermal Titration Calorimetry Data.
Duvvuri, Hiranmayi; Wheeler, Lucas C; Harms, Michael J
2018-05-08
Here we describe pytc, an open-source Python package for global fits of thermodynamic models to multiple isothermal titration calorimetry experiments. Key features include simplicity, the ability to implement new thermodynamic models, a robust maximum likelihood fitter, a fast Bayesian Markov-Chain Monte Carlo sampler, rigorous implementation, extensive documentation, and full cross-platform compatibility. pytc fitting can be done using an application program interface or via a graphical user interface. It is available for download at https://github.com/harmslab/pytc .
An All-Sky Search for Wide Binaries in the SUPERBLINK Proper Motion Catalog
NASA Astrophysics Data System (ADS)
Hartman, Zachary; Lepine, Sebastien
2017-01-01
We present initial results from an all-sky search for Common Proper Motion (CPM) binaries in the SUPERBLINK all-sky proper motion catalog of 2.8 million stars with proper motions greater than 40 mas/yr, which has been recently enhanced with data from the GAIA mission. We initially search the SUPERBLINK catalog for pairs of stars with angular separations up to 1 degree and proper motion difference less than 40 mas/yr. In order to determine which of these pairs are real binaries, we develop a Bayesian analysis to calculate probabilities of true companionship based on a combination of proper motion magnitude, angular separation, and proper motion differences. The analysis reveals that the SUPERBLINK catalog most likely contains ~40,000 genuine common proper motion binaries. We provide initial estimates of the distances and projected physical separations of these wide binaries.
Resistive loads powered by separate or by common electrical sources
NASA Technical Reports Server (NTRS)
Appelbaum, J.
1989-01-01
In designing a multiple load electrical system, the designer may wish to compare the performance of two setups: a common electrical source powering all loads, or separate electrical sources powering individual loads. Three types of electrical sources (an ideal voltage source, an ideal current source, and a solar cell source) powering resistive loads were analyzed for their performance in separate and common source systems. A mathematical proof is given, for each case, indicating the merit of the separate or common source system. The main conclusions are: (1) identical resistive loads powered by ideal voltage sources perform the same in both system setups, (2) nonidentical resistive loads powered by ideal voltage sources perform the same in both system setups, (3) nonidentical resistive loads powered by ideal current sources have higher performance in separate source systems, and (4) nonidentical resistive loads powered by solar cells have higher performance in a common source system for a wide range of load resistances.
Fossen, Erlend I.; Ekrem, Torbjørn; Nilsson, Anders N.; Bergsten, Johannes
2016-01-01
Abstract The chiefly Holarctic Hydrobius species complex (Coleoptera, Hydrophilidae) currently consists of Hydrobius arcticus Kuwert, 1890, and three morphological variants of Hydrobius fuscipes (Linnaeus, 1758): var. fuscipes, var. rottenbergii and var. subrotundus in northern Europe. Here molecular and morphological data are used to test the species boundaries in this species complex. Three gene segments (COI, H3 and ITS2) were sequenced and analyzed with Bayesian methods to infer phylogenetic relationships. The Generalized Mixed Yule Coalescent (GMYC) model and two versions of the Bayesian species delimitation method BPP, with or without an a priori defined guide tree (v2.2 & v3.0), were used to evaluate species limits. External and male genital characters of primarily Fennoscandian specimens were measured and statistically analyzed to test for significant differences in quantitative morphological characters. The four morphotypes formed separate genetic clusters on gene trees and were delimited as separate species by GMYC and by both versions of BPP, despite specimens of Hydrobius fuscipes var. fuscipes and Hydrobius fuscipes var. subrotundus being sympatric. Hydrobius arcticus and Hydrobius fuscipes var. rottenbergii could only be separated genetically with ITS2, and were delimited statistically with GMYC on ITS2 and with BPP on the combined data. In addition, six or seven potentially cryptic species of the Hydrobius fuscipes complex from regions outside northern Europe were delimited genetically. Although some overlap was found, the mean values of six male genital characters were significantly different between the morphotypes (p < 0.001). Morphological characters previously presumed to be diagnostic were less reliable to separate Hydrobius fuscipes var. fuscipes from Hydrobius fuscipes var. subrotundus, but characters in the literature for Hydrobius arcticus and Hydrobius fuscipes var. rottenbergii were diagnostic. 
Overall, morphological and molecular evidence strongly suggest that Hydrobius arcticus and the three morphological variants of Hydrobius fuscipes are separate species and Hydrobius rottenbergii Gerhardt, 1872, stat. n. and Hydrobius subrotundus Stephens, 1829, stat. n. are elevated to valid species. An identification key to northern European species of Hydrobius is provided. PMID:27081333
NASA Astrophysics Data System (ADS)
Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens
2018-06-01
This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter along with high resolution sub-bottom profiling for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea, Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting 6 and 12 kHz. The high density of seafloor soundings allowed extracting backscatter layers for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle express a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA analysis is based on the transformation of these five layers into two principal components that comprise most of the data variability. These principal components were clustered in five classes after running an external cluster validation test.
In general, both methods, MLC and PCA, separated the various sediment types effectively, showing good agreement (kappa >0.7) with the Bayesian approach, which also correlates well with ground truth data (r2 > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joint interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for a better understanding of seafloor backscatter patchiness and to discriminate acoustically similar classes in different geological/bathymetric settings.
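The principle stated above, that backscatter at a single incidence angle is normally distributed for each sediment type, leads directly to a class-assignment rule via Bayes' theorem. The sketch below illustrates only that rule, not the study's full procedure (which also estimates the number of classes); the function name class_posteriors and the (prior, mean, std) tuples are hypothetical.

```python
import math

def class_posteriors(value, classes):
    """Posterior probability of each acoustic class for one backscatter value.

    classes: list of (prior, mean, std) tuples, one Gaussian per sediment type.
    Returns a list of probabilities in the same order, summing to 1.
    """
    weights = []
    for prior, mean, std in classes:
        z = (value - mean) / std
        pdf = math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
        weights.append(prior * pdf)  # prior times Gaussian likelihood
    total = sum(weights)
    return [w / total for w in weights]
```

A sounding would then be assigned to the class with the largest posterior probability, and the posterior itself indicates how ambiguous that assignment is.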
Statistical Inference in the Learning of Novel Phonetic Categories
ERIC Educational Resources Information Center
Zhao, Yuan
2010-01-01
Learning a phonetic category (or any linguistic category) requires integrating different sources of information. A crucial unsolved problem for phonetic learning is how this integration occurs: how can we update our previous knowledge about a phonetic category as we hear new exemplars of the category? One model of learning is Bayesian Inference,…
Bayesian Analysis of Recognition Memory: The Case of the List-Length Effect
ERIC Educational Resources Information Center
Dennis, Simon; Lee, Michael D.; Kinnell, Angela
2008-01-01
Recognition memory experiments are an important source of empirical constraints for theories of memory. Unfortunately, standard methods for analyzing recognition memory data have problems that are often severe enough to prevent clear answers being obtained. A key example is whether longer lists lead to poorer recognition performance. The presence…
Analogical and category-based inference: a theoretical integration with Bayesian causal models.
Holyoak, Keith J; Lee, Hee Seung; Lu, Hongjing
2010-11-01
A fundamental issue for theories of human induction is to specify constraints on potential inferences. For inferences based on shared category membership, an analogy, and/or a relational schema, it appears that the basic goal of induction is to make accurate and goal-relevant inferences that are sensitive to uncertainty. People can use source information at various levels of abstraction (including both specific instances and more general categories), coupled with prior causal knowledge, to build a causal model for a target situation, which in turn constrains inferences about the target. We propose a computational theory in the framework of Bayesian inference and test its predictions (parameter-free for the cases we consider) in a series of experiments in which people were asked to assess the probabilities of various causal predictions and attributions about a target on the basis of source knowledge about generative and preventive causes. The theory proved successful in accounting for systematic patterns of judgments about interrelated types of causal inferences, including evidence that analogical inferences are partially dissociable from overall mapping quality.
Experimental Verification of Bayesian Planet Detection Algorithms with a Shaped Pupil Coronagraph
NASA Astrophysics Data System (ADS)
Savransky, D.; Groff, T. D.; Kasdin, N. J.
2010-10-01
We evaluate the feasibility of applying Bayesian detection techniques to discovering exoplanets using high contrast laboratory data with simulated planetary signals. Background images are generated at the Princeton High Contrast Imaging Lab (HCIL), with a coronagraphic system utilizing a shaped pupil and two deformable mirrors (DMs) in series. Estimates of the electric field at the science camera are used to correct for quasi-static speckle and produce symmetric high contrast dark regions in the image plane. Planetary signals are added in software, or via a physical star-planet simulator which adds a second off-axis point source before the coronagraph with a beam recombiner, calibrated to a fixed contrast level relative to the source. We produce a variety of images, with varying integration times and simulated planetary brightness. We then apply automated detection algorithms such as matched filtering to attempt to extract the planetary signals. This allows us to evaluate the efficiency of these techniques in detecting planets in a high noise regime and eliminating false positives, as well as to test existing algorithms for calculating the required integration times for these techniques to be applicable.
Functional Interaction Network Construction and Analysis for Disease Discovery.
Wu, Guanming; Haw, Robin
2017-01-01
Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, thereby providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of the data by using network modules, and increasing the statistical power of the analysis. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures on how this functional interaction network is constructed by integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example on how to use ReactomeFIViz for performing network-based data analysis for a list of genes.
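The classifier step described above can be illustrated with a small Bernoulli naive Bayes model in which each protein pair is described by binary evidence features (for example, whether some data source supports an interaction). This is a generic sketch with add-one smoothing, not the ReactomeFIViz implementation; the feature encoding and the function names train_naive_bayes and predict_proba are assumptions.

```python
import math

def train_naive_bayes(features, labels):
    """Fit a Bernoulli naive Bayes model with add-one (Laplace) smoothing.

    features: list of 0/1 lists, one per protein pair
    labels:   list of 0/1 values (1 = known functional interaction)
    """
    n_feat = len(features[0])
    model = {}
    for cls in (0, 1):
        rows = [f for f, lab in zip(features, labels) if lab == cls]
        prior = (len(rows) + 1) / (len(labels) + 2)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_feat)]
        model[cls] = (prior, probs)
    return model

def predict_proba(model, x):
    """Posterior probability that feature vector x belongs to class 1."""
    log_scores = {}
    for cls, (prior, probs) in model.items():
        score = math.log(prior)
        for xj, pj in zip(x, probs):
            score += math.log(pj if xj else 1.0 - pj)
        log_scores[cls] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    return exp_scores[1] / (exp_scores[0] + exp_scores[1])
```

Pairs whose predicted probability exceeds a chosen cutoff would be added to the network as predicted interactions, alongside those extracted directly from curated pathways.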
NASA Astrophysics Data System (ADS)
Hopcroft, Peter O.; Valdes, Paul J.; Kaplan, Jed O.
2018-04-01
The observed rise in atmospheric methane (CH4) from 375 ppbv during the Last Glacial Maximum (LGM: 21,000 years ago) to 680 ppbv during the late preindustrial era is not well understood. Atmospheric chemistry considerations implicate an increase in CH4 sources, but process-based estimates fail to reproduce the required amplitude. CH4 stable isotopes provide complementary information that can help constrain the underlying causes of the increase. We combine Earth System model simulations of the late preindustrial and LGM CH4 cycles, including process-based estimates of the isotopic discrimination of vegetation, in a box model of atmospheric CH4 and its isotopes. Using a Bayesian approach, we show how model-based constraints and ice core observations may be combined in a consistent probabilistic framework. The resultant posterior distributions point to a strong reduction in wetland and other biogenic CH4 emissions during the LGM, with a modest increase in the geological source, or potentially natural or anthropogenic fires, accounting for the observed enrichment of δ13CH4.
Evolutionary history and spatiotemporal dynamics of dengue virus type 1 in Asia.
Sun, Yan; Meng, Shengli
2013-06-01
Previous studies showed that DENV-1 was transmitted from monkeys to humans approximately 125 years ago. However, there has been no comprehensive analysis of the phylogeography and population dynamics of Asian DENV-1. Here, we adopt a Bayesian phylogeographic approach to investigate the evolutionary history and phylogeography of Asian DENV-1 using envelope (E) protein gene sequences of 450 viruses isolated from 1954 to 2010 throughout 18 Asian countries and regions. Bayesian phylogeographic analyses indicate that the high rates of viral migration possibly follow long-distance human travel in Southeast Asia. Our study highlights that Southeast Asian countries have acted as the main viral sources of the dengue epidemics in East Asia. The results reveal that the time to the most recent common ancestor (TMRCA) of Asian DENV-1 is 1906 (95% HPD, years 1897-1915). We show that the spatial dissemination of the virus is the major source of DENV-1 outbreaks in the different localities and leads to subsequent establishment and expansion of the virus in these areas. Copyright © 2013 Elsevier B.V. All rights reserved.
Using a pseudo-dynamic source inversion approach to improve earthquake source imaging
NASA Astrophysics Data System (ADS)
Zhang, Y.; Song, S. G.; Dalguer, L. A.; Clinton, J. F.
2014-12-01
Imaging a high-resolution spatio-temporal slip distribution of an earthquake rupture is a core research goal in seismology. In general we expect to obtain a higher quality source image by improving the observational input data (e.g. using more higher quality near-source stations). However, recent studies show that increasing the surface station density alone does not significantly improve source inversion results (Custodio et al. 2005; Zhang et al. 2014). We introduce correlation structures between the kinematic source parameters: slip, rupture velocity, and peak slip velocity (Song et al. 2009; Song and Dalguer 2013) in the non-linear source inversion. The correlation structures are physical constraints derived from rupture dynamics that effectively regularize the model space and may improve source imaging. We name this approach pseudo-dynamic source inversion. We investigate the effectiveness of this pseudo-dynamic source inversion method by inverting low frequency velocity waveforms from a synthetic dynamic rupture model of a buried vertical strike-slip event (Mw 6.5) in a homogeneous half space. In the inversion, we use a genetic algorithm in a Bayesian framework (Moneli et al. 2008), and a dynamically consistent regularized Yoffe function (Tinti, et al. 2005) was used for a single-window slip velocity function. We search for local rupture velocity directly in the inversion, and calculate the rupture time using a ray-tracing technique. We implement both auto- and cross-correlation of slip, rupture velocity, and peak slip velocity in the prior distribution. Our results suggest that kinematic source model estimates capture the major features of the target dynamic model. The estimated rupture velocity closely matches the target distribution from the dynamic rupture model, and the derived rupture time is smoother than the one we searched directly. 
By implementing both auto- and cross-correlation of kinematic source parameters, in comparison to traditional smoothing constraints, we are in effect regularizing the model space in a more physics-based manner without losing resolution of the source image. Further investigation is needed to tune the related parameters of pseudo-dynamic source inversion and the relative weighting between the prior and the likelihood function in the Bayesian inversion.
Lyons, Michael A.; Yang, Raymond S.H.; Mayeno, Arthur N.; Reisfeld, Brad
2008-01-01
Background: One problem of interpreting population-based biomonitoring data is the reconstruction of corresponding external exposure in cases where no such data are available. Objectives: We demonstrate the use of a computational framework that integrates physiologically based pharmacokinetic (PBPK) modeling, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of environmental chloroform source concentrations consistent with human biomonitoring data. The biomonitoring data consist of chloroform blood concentrations measured as part of the Third National Health and Nutrition Examination Survey (NHANES III), and for which no corresponding exposure data were collected. Methods: We used a combined PBPK and shower exposure model to consider several routes and sources of exposure: ingestion of tap water, inhalation of ambient household air, and inhalation and dermal absorption while showering. We determined posterior distributions for chloroform concentration in tap water and ambient household air using U.S. Environmental Protection Agency Total Exposure Assessment Methodology (TEAM) data as prior distributions for the Bayesian analysis. Results: Posterior distributions for exposure indicate that 95% of the population represented by the NHANES III data had likely chloroform exposures ≤ 67 μg/L in tap water and ≤ 0.02 μg/L in ambient household air. Conclusions: Our results demonstrate the application of computer simulation to aid in the interpretation of human biomonitoring data in the context of the exposure–health evaluation–risk assessment continuum. These results should be considered as a demonstration of the method and can be improved with the addition of more detailed data. PMID:18709138
A Bayesian Approach to Real-Time Earthquake Phase Association
NASA Astrophysics Data System (ADS)
Benz, H.; Johnson, C. E.; Earle, P. S.; Patton, J. M.
2014-12-01
Real-time location of seismic events requires a robust and extremely efficient means of associating and identifying seismic phases with hypothetical sources. An association algorithm converts a series of phase arrival times into a catalog of earthquake hypocenters. The classical approach, based on time-space stacking of the locus of possible hypocenters for each phase arrival using the principle of acoustic reciprocity, has been in use now for many years. One of the most significant problems that has emerged over time with this approach is related to the extreme variations in seismic station density throughout the global seismic network. To address this problem we have developed a novel Bayesian association algorithm, which looks at the association problem as a dynamically evolving complex system of "many to many relationships". While the end result must be an array of one to many relations (one earthquake, many phases), during the association process the situation is quite different. Both the evolving possible hypocenters and the relationships between phases and all nascent hypocenters are many to many (many earthquakes, many phases). The computational framework we are using to address this is a responsive, NoSQL graph database where the earthquake-phase associations are represented as intersecting Bayesian Learning Networks. The approach directly addresses the network inhomogeneity issue while at the same time allowing the inclusion of other kinds of data (e.g., seismic beams, station noise characteristics, priors on estimated location of the seismic source) by representing the locus of intersecting hypothetical loci for a given datum as joint probability density functions.
Real-time inversions for finite fault slip models and rupture geometry based on high-rate GPS data
Minson, Sarah E.; Murray, Jessica R.; Langbein, John O.; Gomberg, Joan S.
2015-01-01
We present an inversion strategy capable of using real-time high-rate GPS data to simultaneously solve for a distributed slip model and fault geometry in real time as a rupture unfolds. We employ Bayesian inference to find the optimal fault geometry and the distribution of possible slip models for that geometry using a simple analytical solution. By adopting an analytical Bayesian approach, we can solve this complex inversion problem (including calculating the uncertainties on our results) in real time. Furthermore, since the joint inversion for distributed slip and fault geometry can be computed in real time, the time required to obtain a source model of the earthquake does not depend on the computational cost. Instead, the time required is controlled by the duration of the rupture and the time required for information to propagate from the source to the receivers. We apply our modeling approach, called Bayesian Evidence-based Fault Orientation and Real-time Earthquake Slip, to the 2011 Tohoku-oki earthquake, 2003 Tokachi-oki earthquake, and a simulated Hayward fault earthquake. In all three cases, the inversion recovers the magnitude, spatial distribution of slip, and fault geometry in real time. Since our inversion relies on static offsets estimated from real-time high-rate GPS data, we also present performance tests of various approaches to estimating quasi-static offsets in real time. We find that the raw high-rate time series are the best data to use for determining the moment magnitude of the event, but slightly smoothing the raw time series helps stabilize the inversion for fault geometry.
NASA Astrophysics Data System (ADS)
Jakkareddy, Pradeep S.; Balaji, C.
2016-09-01
This paper employs the Bayesian-based Metropolis-Hastings Markov chain Monte Carlo (MH-MCMC) algorithm to solve the inverse heat transfer problem of determining the spatially varying heat transfer coefficient from a flat plate with flush-mounted discrete heat sources, using temperatures measured at the bottom of the plate. The Nusselt number is assumed to be of the form Nu = a Re^b (x/l)^c. To input reasonable values of 'a' and 'b' into the inverse problem, limited two-dimensional conjugate convection simulations were first done with Comsol. Guided by these, different values of 'a' and 'b' are input to a computationally less complex problem of conjugate conduction in the flat plate (15 mm thickness), and temperature distributions are obtained at the bottom of the plate, which is a more convenient location for measuring temperatures without disturbing the flow. Since the goal of this work is to demonstrate the efficacy of the Bayesian approach in accurately retrieving 'a' and 'b', numerically generated temperatures with known values of 'a' and 'b' are treated as 'surrogate' experimental data. The inverse problem is then solved by repeatedly using the forward solutions together with the MH-MCMC approach. To speed up the estimation, the forward model is replaced by an artificial neural network. The mean, maximum a posteriori estimate, and standard deviation of the estimated parameters 'a' and 'b' are reported. The robustness of the proposed method is examined by synthetically adding noise to the temperatures.
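The retrieval step described in this abstract can be illustrated with a minimal random-walk Metropolis sketch. It is not the authors' code and rests on several assumptions: the forward model is reduced to Nu = a Re^b at a fixed location (the (x/l)^c factor and the conduction solve are omitted), the surrogate data are noise-free Nusselt numbers rather than plate temperatures, the priors are uniform on (0, 1), and the names RE_VALUES, SIGMA, forward, log_posterior and metropolis are hypothetical.

```python
import math
import random

# Assumed toy setup: Reynolds numbers, "true" parameters and noise level are illustrative.
RE_VALUES = [1000.0, 2000.0, 5000.0, 10000.0, 20000.0]
TRUE_A, TRUE_B = 0.3, 0.6
SIGMA = 1.0  # assumed standard deviation of the measurement error on Nu


def forward(a, b):
    """Simplified forward model: Nu = a * Re**b at each Reynolds number."""
    return [a * re ** b for re in RE_VALUES]


# Surrogate "experimental" data generated from known parameters.
DATA = forward(TRUE_A, TRUE_B)


def log_posterior(a, b):
    """Gaussian log-likelihood plus uniform priors on (0, 1) for both parameters."""
    if not (0.0 < a < 1.0 and 0.0 < b < 1.0):
        return -math.inf
    residuals = [(m - d) / SIGMA for m, d in zip(forward(a, b), DATA)]
    return -0.5 * sum(r * r for r in residuals)


def metropolis(n_steps, start, step=(0.01, 0.002), seed=0):
    """Random-walk Metropolis sampler; returns the list of visited (a, b) states."""
    rng = random.Random(seed)
    a, b = start
    lp = log_posterior(a, b)
    chain = []
    for _ in range(n_steps):
        a_new = a + rng.gauss(0.0, step[0])
        b_new = b + rng.gauss(0.0, step[1])
        lp_new = log_posterior(a_new, b_new)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            a, b, lp = a_new, b_new, lp_new
        chain.append((a, b))
    return chain


chain = metropolis(20000, start=(0.25, 0.62))
kept = chain[5000:]  # discard burn-in
mean_a = sum(s[0] for s in kept) / len(kept)
mean_b = sum(s[1] for s in kept) / len(kept)
print(f"posterior mean a = {mean_a:.3f}, b = {mean_b:.3f}")
```

Because 'a' and 'b' are strongly correlated in this model, an isotropic random walk mixes slowly; a practical implementation would likely tune the proposal or sample in (log a, b).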
NASA Astrophysics Data System (ADS)
Tonini, R.; Anita, G.
2011-12-01
In both worldwide and regional historical catalogues, most tsunamis are caused by earthquakes, and only a minor percentage is attributed to all other, non-seismic sources. On the other hand, tsunami hazard and risk studies are often applied to very specific areas, where this global trend can be different or even inverted, depending on the kind of potential tsunamigenic sources that characterize the case study. So far, few probabilistic approaches consider the contribution of landslides and/or phenomena derived from volcanic activity, i.e. pyroclastic flows and flank collapses, as predominant in the PTHA, partly because of the difficulty of estimating the corresponding recurrence times. These considerations are valid, for example, for the city of Naples, Italy, which is surrounded by a complex active volcanic system (Vesuvio, Campi Flegrei, Ischia) that presents a significant number of potential tsunami sources of non-seismic origin compared to the seismic ones. In this work we present the preliminary results of a probabilistic multi-source tsunami hazard assessment applied to Naples. The method to estimate the uncertainties will be based on Bayesian inference. This is the first step towards a more comprehensive task which will provide a tsunami risk quantification for this town in the frame of the Italian national project ByMuR (http://bymur.bo.ingv.it). This ongoing three-year project has the final objective of developing a Bayesian multi-risk methodology to quantify the risk related to different natural hazards (volcanoes, earthquakes and tsunamis) applied to the city of Naples.
Brayanov, Jordan B.
2010-01-01
Which is heavier: a pound of lead or a pound of feathers? This classic trick question belies a simple but surprising truth: when lifted, the pound of lead feels heavier—a phenomenon known as the size–weight illusion. To estimate the weight of an object, our CNS combines two imperfect sources of information: a prior expectation, based on the object's appearance, and direct sensory information from lifting it. Bayes' theorem (or Bayes' law) defines the statistically optimal way to combine multiple information sources for maximally accurate estimation. Here we asked whether the mechanisms for combining these information sources produce statistically optimal weight estimates for both perceptions and actions. We first studied the ability of subjects to hold one hand steady when the other removed an object from it, under conditions in which sensory information about the object's weight sometimes conflicted with prior expectations based on its size. Since the ability to steady the supporting hand depends on the generation of a motor command that accounts for lift timing and object weight, hand motion can be used to gauge biases in weight estimation by the motor system. We found that these motor system weight estimates reflected the integration of prior expectations with real-time proprioceptive information in a Bayesian, statistically optimal fashion that discounted unexpected sensory information. This produces a motor size–weight illusion that consistently biases weight estimates toward prior expectations. In contrast, when subjects compared the weights of two objects, their perceptions defied Bayes' law, exaggerating the value of unexpected sensory information. This produces a perceptual size–weight illusion that biases weight perceptions away from prior expectations. We term this effect “anti-Bayesian” because the bias is opposite that seen in Bayesian integration. 
Our findings suggest that two fundamentally different strategies for the integration of prior expectations with sensory information coexist in the nervous system for weight estimation. PMID:20089821
Sokhey, Taegh; Gaebler-Spira, Deborah; Kording, Konrad P.
2017-01-01
Background: It is important to understand the motor deficits of children with Cerebral Palsy (CP). Our understanding of this motor disorder can be enriched by computational models of motor control. One crucial stage in generating movement involves combining uncertain information from different sources, and deficits in this process could contribute to reduced motor function in children with CP. Healthy adults can integrate previously-learned information (prior) with incoming sensory information (likelihood) in a close-to-optimal way when estimating object location, consistent with the use of Bayesian statistics. However, there are few studies investigating how children with CP perform sensorimotor integration. We compare sensorimotor estimation in children with CP and age-matched controls using a model-based analysis to understand the process. Methods and findings: We examined Bayesian sensorimotor integration in children with CP, aged between 5 and 12 years old, with Gross Motor Function Classification System (GMFCS) levels 1–3 and compared their estimation behavior with age-matched typically-developing (TD) children. We used a simple sensorimotor estimation task which requires participants to combine probabilistic information from different sources: a likelihood distribution (current sensory information) with a prior distribution (learned target information). In order to examine sensorimotor integration, we quantified how participants weighted statistical information from the two sources (prior and likelihood) and compared this to the statistically optimal weighting. We found that the weighting of statistical information in children with CP was as statistically efficient as that of TD children. Conclusions: We conclude that Bayesian sensorimotor integration is not impaired in children with CP and therefore does not contribute to their motor deficits.
Future research has the potential to enrich our understanding of motor disorders by investigating the stages of motor processing set out by computational models. Therapeutic interventions should exploit the ability of children with CP to use statistical information. PMID:29186196
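The comparison between observed and statistically optimal weighting can be made concrete for the textbook case of a Gaussian prior and a Gaussian likelihood. The sketch below is an illustration under that assumption, not the study's analysis code; the function names and the through-the-prior-mean regression used to estimate a participant's weight are hypothetical choices.

```python
def optimal_sensory_weight(prior_var, likelihood_var):
    """Weight on the sensory observation that minimizes the variance of the estimate."""
    return prior_var / (prior_var + likelihood_var)


def combine(prior_mean, prior_var, observation, likelihood_var):
    """Posterior mean and variance for a Gaussian prior and Gaussian likelihood."""
    w = optimal_sensory_weight(prior_var, likelihood_var)
    mean = w * observation + (1.0 - w) * prior_mean
    var = prior_var * likelihood_var / (prior_var + likelihood_var)
    return mean, var


def fitted_weight(observations, estimates, prior_mean):
    """Empirical sensory weight: least-squares slope of the participant's estimates
    against the observations, both measured relative to the prior mean."""
    num = sum((o - prior_mean) * (e - prior_mean)
              for o, e in zip(observations, estimates))
    den = sum((o - prior_mean) ** 2 for o in observations)
    return num / den


# A noisy cue (variance 3) against a narrow prior (variance 1): rely mostly on the prior.
w_opt = optimal_sensory_weight(1.0, 3.0)
print(w_opt, combine(0.0, 1.0, 4.0, 3.0))
```

A fitted weight close to the optimal one indicates near-optimal integration; a weight near 1 means the participant ignores the prior, and a weight near 0 means the sensory cue is ignored.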
Revisiting the 2004 Sumatra-Andaman earthquake in a Bayesian framework
NASA Astrophysics Data System (ADS)
Bletery, Q.; Sladen, A.; Jiang, J.; Simons, M.
2015-12-01
The 2004 Mw 9.25 Sumatra-Andaman earthquake is the largest seismic event of the modern instrumental era. Despite considerable effort to analyze the characteristics of its rupture, the different available observations have proven difficult to integrate jointly into a finite-fault slip model. In particular, the critical near-field geodetic records contain variable and significant post-seismic signal (between 2 weeks and 2 months) while the satellite altimetry records of the associated tsunami are affected by various sources of uncertainties (e.g. source rupture velocity, meso-scale oceanic currents). In this study, we investigate the quasi-static slip distribution of the Sumatra-Andaman earthquake by carefully accounting for the different sources of uncertainties in the joint inversion of an extended set of geodetic and tsunami data. To do so, we use non-diagonal covariance matrices reflecting both data and model uncertainties in a fully Bayesian inversion framework. As model errors are particularly large for mega-earthquakes, we also rely on advanced simulation codes (normal mode theory on a layered spherical Earth for the static displacement field and non-hydrostatic equations for the tsunami) and account for the 3D curvature of the megathrust interface to reduce the associated epistemic uncertainties. The fully Bayesian inversion framework then enables us to derive the families of possible models compatible with the unevenly distributed and sometimes ambiguous measurements. We find two regions of high slip at latitudes 3°-4°N and 7°-8°N with amplitudes that probably reached values as large as 40 m and possibly larger. Such amounts of slip were not proposed by previous studies, which might have been biased by smoothing regularizations. We also find significant slip (around 20 m) offshore of the Andaman Islands that is absent in earlier studies. Furthermore, we find that the rupture very likely involved shallow slip, with the possibility of reaching the trench.
Probabilistic selection of high-redshift quasars
NASA Astrophysics Data System (ADS)
Mortlock, Daniel J.; Patel, Mitesh; Warren, Stephen J.; Hewett, Paul C.; Venemans, Bram P.; McMahon, Richard G.; Simpson, Chris
2012-01-01
High-redshift quasars (HZQs) with redshifts of z ≳ 6 are so rare that any photometrically selected sample of sources with HZQ-like colours is likely to be dominated by Galactic stars and brown dwarfs scattered from the stellar locus. It is impractical to re-observe all such candidates, so an alternative approach was developed in which Bayesian model comparison techniques are used to calculate the probability that a candidate is a HZQ, Pq, by combining models of the quasar and star populations with the photometric measurements of the object. This method was motivated specifically by the large number of HZQ candidates identified by cross-matching the UKIRT (United Kingdom Infrared Telescope) Infrared Deep Sky Survey (UKIDSS) Large Area Survey (LAS) to the Sloan Digital Sky Survey (SDSS): in the ? covered by the LAS in the UKIDSS Eighth Data Release (DR8) there are ~9 × 10^3 real astronomical point sources with the measured colours of the target quasars, of which only ~10 are expected to be HZQs. Applying Bayesian model comparison to the sample reveals that most sources with HZQ-like colours have Pq ≲ 0.1 and can be confidently rejected without the need for any further observations. In the case of the UKIDSS DR8 LAS, there were just 107 candidates with Pq ≥ 0.1; these objects were prioritized for re-observation by ranking according to Pq (and their likely redshift, which was also inferred from the photometric data). Most candidates were rejected after one or two (moderate-depth) photometric measurements by recalculating Pq using the new data. That left 12 confirmed HZQs, six of which were previously identified in the SDSS and six of which were new UKIDSS discoveries. The high efficiency of this Bayesian selection method suggests that it could usefully be extended to other HZQ surveys (e.g.
searches by the Panoramic Survey Telescope And Rapid Response System, Pan-STARRS, or the Visible and Infrared Survey Telescope for Astronomy, VISTA) as well as to other searches for rare objects.
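The role of population priors in Pq can be seen in a two-hypothesis sketch. This is a strong simplification of the method summarized above: the paper's population models are collapsed into two expected source counts and a single likelihood value per hypothesis, and the function name and the numbers (taken loosely from the counts quoted in the abstract) are illustrative assumptions.

```python
def quasar_probability(like_quasar, like_star, n_quasar, n_star):
    """Posterior probability that a candidate is a quasar.

    like_* : likelihood of the observed photometry under each population model
    n_*    : expected number of sources of each type (acts as the prior weight)
    """
    weight_quasar = n_quasar * like_quasar
    weight_star = n_star * like_star
    return weight_quasar / (weight_quasar + weight_star)


# With ~10 quasars among ~9000 look-alike stars, equally good fits give a tiny Pq.
print(quasar_probability(1.0, 1.0, 10, 9000))
# The photometry must favour the quasar model by a factor ~1000 to reach Pq ~ 0.5.
print(quasar_probability(1000.0, 1.0, 10, 9000))
```

Recomputing Pq after follow-up photometry amounts to multiplying each likelihood by the likelihood of the new measurement under the same model.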
NASA Astrophysics Data System (ADS)
Dai, H.; Chen, X.; Ye, M.; Song, X.; Zachara, J. M.
2016-12-01
Sensitivity analysis has been an important tool in groundwater modeling to identify the influential parameters. Among various sensitivity analysis methods, the variance-based global sensitivity analysis has gained popularity for its model independence characteristic and capability of providing accurate sensitivity measurements. However, the conventional variance-based method only considers uncertainty contribution of single model parameters. In this research, we extended the variance-based method to consider more uncertainty sources and developed a new framework to allow flexible combinations of different uncertainty components. We decompose the uncertainty sources into a hierarchical three-layer structure: scenario, model and parametric. Furthermore, each layer of uncertainty source is capable of containing multiple components. An uncertainty and sensitivity analysis framework was then constructed following this three-layer structure using Bayesian network. Different uncertainty components are represented as uncertain nodes in this network. Through the framework, variance-based sensitivity analysis can be implemented with great flexibility of using different grouping strategies for uncertainty components. The variance-based sensitivity analysis thus is improved to be able to investigate the importance of an extended range of uncertainty sources: scenario, model, and other different combinations of uncertainty components which can represent certain key model system processes (e.g., groundwater recharge process, flow reactive transport process). For test and demonstration purposes, the developed methodology was implemented into a test case of real-world groundwater reactive transport modeling with various uncertainty sources. The results demonstrate that the new sensitivity analysis method is able to estimate accurate importance measurements for any uncertainty sources which were formed by different combinations of uncertainty components. 
The new methodology can provide useful information for environmental management and decision-makers to formulate policies and strategies.
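For orientation, the conventional single-parameter index that this work extends can be estimated with a "pick-freeze" Monte Carlo scheme. The sketch below is only that baseline, not the hierarchical Bayesian-network framework of the abstract; the toy model y = 2*x1 + x2 with independent standard normal inputs and all function names are assumptions chosen so that the exact first-order index of x1 is 4/5.

```python
import random


def model(x1, x2):
    """Toy model with a known answer: Var(y) = 5, of which x1 contributes 4."""
    return 2.0 * x1 + x2


def first_order_index_x1(n=20000, seed=0):
    """Pick-freeze estimate of the first-order Sobol index of x1.

    Two model runs share the same x1 but use independent draws of x2; the
    covariance between the outputs isolates the variance explained by x1.
    """
    rng = random.Random(seed)
    y_a, y_c = [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        x2_resampled = rng.gauss(0.0, 1.0)
        y_a.append(model(x1, x2))
        y_c.append(model(x1, x2_resampled))
    mean_a = sum(y_a) / n
    mean_c = sum(y_c) / n
    cov = sum((a - mean_a) * (c - mean_c) for a, c in zip(y_a, y_c)) / n
    var = sum((a - mean_a) ** 2 for a in y_a) / n
    return cov / var


s1 = first_order_index_x1()
print(f"estimated S1 = {s1:.3f} (exact value 0.8)")
```

Grouping several inputs (for example, all parameters of one process) and freezing them together gives the index of the group, which is the kind of flexible grouping the abstract describes.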
The Chandra Source Catalog: X-ray Aperture Photometry
NASA Astrophysics Data System (ADS)
Kashyap, Vinay; Primini, F. A.; Glotfelty, K. J.; Anderson, C. S.; Bonaventura, N. R.; Chen, J. C.; Davis, J. E.; Doe, S. M.; Evans, I. N.; Evans, J. D.; Fabbiano, G.; Galle, E.; Gibbs, D. G.; Grier, J. D.; Hain, R.; Hall, D. M.; Harbo, P. N.; He, X.; Houck, J. C.; Karovska, M.; Lauer, J.; McCollough, M. L.; McDowell, J. C.; Miller, J. B.; Mitschang, A. W.; Morgan, D. L.; Nichols, J. S.; Nowak, M. A.; Plummer, D. A.; Refsdal, B. L.; Rots, A. H.; Siemiginowska, A. L.; Sundheim, B. A.; Tibbetts, M. S.; Van Stone, D. W.; Winkelman, S. L.; Zografou, P.
2009-01-01
The Chandra Source Catalog represents a reanalysis of the entire ACIS and HRC imaging observations over the 9-year Chandra mission. Source detection is carried out on a uniform basis, using the CIAO tool wavdetect, and source fluxes are estimated post-facto using a Bayesian method that accounts for background, spatial resolution effects, and contamination from nearby sources. We use gamma-function prior distributions, which could be either non-informative, or in case there exist previous observations of the same source, strongly informative. The resulting posterior probability density functions allow us to report the flux and a robust credible range on it. We also determine limiting sensitivities at arbitrary locations in the field using the same formulation. This work was supported by CXC NASA contracts NAS8-39073 (VK) and NAS8-03060 (CSC).
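The effect of a gamma prior on a count-based flux estimate can be sketched with the conjugate gamma-Poisson update. This is a deliberately reduced illustration: background, PSF fractions and contamination from neighbouring sources, which the catalog's method accounts for, are all ignored, and the prior parameters, function names and sampling-based credible interval are assumptions rather than the catalog's procedure.

```python
import math
import random


def gamma_posterior(counts, exposure, alpha=1.0, beta=0.0):
    """Posterior Gamma(shape, rate) for a Poisson rate, given a Gamma(alpha, beta) prior."""
    return alpha + counts, beta + exposure


def summarize(shape, rate, n_draws=20000, level=0.9, seed=0):
    """Posterior mean, standard deviation and an equal-tailed credible interval
    (the interval is estimated from random draws, not computed exactly)."""
    mean = shape / rate
    sd = math.sqrt(shape) / rate
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); scale is the inverse of the rate.
    draws = sorted(rng.gammavariate(shape, 1.0 / rate) for _ in range(n_draws))
    lo = draws[int((1.0 - level) / 2.0 * n_draws)]
    hi = draws[int((1.0 + level) / 2.0 * n_draws) - 1]
    return mean, sd, (lo, hi)


# 9 counts in 4 units of exposure with a weak Gamma(1, 1) prior.
shape, rate = gamma_posterior(9, 4.0, alpha=1.0, beta=1.0)
print(summarize(shape, rate))

# A strongly informative prior from an earlier observation pulls the estimate toward it.
print(summarize(*gamma_posterior(9, 4.0, alpha=100.0, beta=100.0)))
```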
Chee, S Y
2015-05-25
The mitochondrial DNA (mtDNA) cytochrome oxidase I (COI) gene has been universally and successfully utilized as a barcoding gene, mainly because it can be amplified easily, applied across a wide range of taxa, and results can be obtained cheaply and quickly. However, in rare cases, the gene can fail to distinguish between species, particularly when exposed to highly sensitive methods of data analysis, such as the Bayesian method, or when taxa have undergone introgressive hybridization, over-splitting, or incomplete lineage sorting. Such cases require the use of alternative markers, and nuclear DNA markers are commonly used. In this study, a dendrogram produced by Bayesian analysis of an mtDNA COI dataset was compared with that of a nuclear DNA ATPS-α dataset, in order to evaluate the efficiency of COI in barcoding Malaysian nerites (Neritidae). In the COI dendrogram, most of the species were in individual clusters, except for two species: Nerita chamaeleon and N. histrio. These two species were placed in the same subcluster, whereas in the ATPS-α dendrogram they were in their own subclusters. Analysis of the ATPS-α gene also placed the two genera of nerites (Nerita and Neritina) in separate clusters, whereas COI gene analysis placed both genera in the same cluster. Therefore, in the case of the Neritidae, the ATPS-α gene is a better barcoding gene than the COI gene.
Bayesian estimation of multicomponent relaxation parameters in magnetic resonance fingerprinting.
McGivney, Debra; Deshmane, Anagha; Jiang, Yun; Ma, Dan; Badve, Chaitra; Sloan, Andrew; Gulani, Vikas; Griswold, Mark
2018-07-01
To estimate multiple components within a single voxel in magnetic resonance fingerprinting when the number and types of tissues comprising the voxel are not known a priori. Multiple tissue components within a single voxel are potentially separable with magnetic resonance fingerprinting as a result of differences in signal evolutions of each component. The Bayesian framework for inverse problems provides a natural and flexible setting for solving this problem when the tissue composition per voxel is unknown. Assuming that only a few entries from the dictionary contribute to a mixed signal, sparsity-promoting priors can be placed upon the solution. An iterative algorithm is applied to compute the maximum a posteriori estimator of the posterior probability density to determine the magnetic resonance fingerprinting dictionary entries that contribute most significantly to mixed or pure voxels. Simulation results show that the algorithm is robust in finding the component tissues of mixed voxels. Preliminary in vivo data confirm this result, and show good agreement in voxels containing pure tissue. The Bayesian framework and algorithm shown provide accurate solutions for the partial-volume problem in magnetic resonance fingerprinting. The flexibility of the method will allow further study into different priors and hyperpriors that can be applied in the model. Magn Reson Med 80:159-170, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md
2014-01-01
Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to solve the missing values, which include k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803
Probabilistic dose-response modeling: case study using dichloromethane PBPK model results.
Marino, Dale J; Starr, Thomas B
2007-12-01
A revised assessment of dichloromethane (DCM) has recently been reported that examines the influence of human genetic polymorphisms on cancer risks using deterministic PBPK and dose-response modeling in the mouse combined with probabilistic PBPK modeling in humans. This assessment utilized Bayesian techniques to optimize kinetic variables in mice and humans with mean values from posterior distributions used in the deterministic modeling in the mouse. To supplement this research, a case study was undertaken to examine the potential impact of probabilistic rather than deterministic PBPK and dose-response modeling in mice on subsequent unit risk factor (URF) determinations. Four separate PBPK cases were examined based on the exposure regimen of the NTP DCM bioassay. These were (a) Same Mouse (single draw of all PBPK inputs for both treatment groups); (b) Correlated BW-Same Inputs (single draw of all PBPK inputs for both treatment groups except for bodyweights (BWs), which were entered as correlated variables); (c) Correlated BW-Different Inputs (separate draws of all PBPK inputs for both treatment groups except that BWs were entered as correlated variables); and (d) Different Mouse (separate draws of all PBPK inputs for both treatment groups). Monte Carlo PBPK inputs reflect posterior distributions from Bayesian calibration in the mouse that had been previously reported. A minimum of 12,500 PBPK iterations were undertaken, in which dose metrics, i.e., mg DCM metabolized by the GST pathway/L tissue/day for lung and liver were determined. For dose-response modeling, these metrics were combined with NTP tumor incidence data that were randomly selected from binomial distributions. Resultant potency factors (0.1/ED(10)) were coupled with probabilistic PBPK modeling in humans that incorporated genetic polymorphisms to derive URFs. 
Results show that there was relatively little difference, i.e., <10% in central tendency and upper percentile URFs, regardless of the case evaluated. Independent draws of PBPK inputs resulted in the slightly higher URFs. Results were also comparable to corresponding values from the previously reported deterministic mouse PBPK and dose-response modeling approach that used LED(10)s to derive potency factors. This finding indicated that the adjustment from ED(10) to LED(10) in the deterministic approach for DCM compensated for variability resulting from probabilistic PBPK and dose-response modeling in the mouse. Finally, results show a similar degree of variability in DCM risk estimates from a number of different sources including the current effort even though these estimates were developed using very different techniques. Given the variety of different approaches involved, 95th percentile-to-mean risk estimate ratios of 2.1-4.1 represent reasonable bounds on variability estimates regarding probabilistic assessments of DCM.
NASA Astrophysics Data System (ADS)
Caticha, Ariel
2007-11-01
What is information? Is it physical? We argue that in a Bayesian theory the notion of information must be defined in terms of its effects on the beliefs of rational agents. Information is whatever constrains rational beliefs and therefore it is the force that induces us to change our minds. This problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), which is designed for updating from arbitrary priors given information in the form of arbitrary constraints, includes as special cases both MaxEnt (which allows arbitrary constraints) and Bayes' rule (which allows arbitrary priors). Thus, ME unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme that allows us to handle problems that lie beyond the reach of either of the two methods separately. I conclude with a couple of simple illustrative examples.
A Bayesian technique for improving the sensitivity of the atmospheric neutrino L/E analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blake, A. S. T.; Chapman, J. D.; Thomson, M. A.
This paper outlines a method for improving the precision of atmospheric neutrino oscillation measurements. One experimental signature for these oscillations is an observed deficit in the rate of νμ charged-current interactions with an oscillatory dependence on Lν/Eν, where Lν is the neutrino propagation distance and Eν is the neutrino energy. For contained-vertex atmospheric neutrino interactions, the Lν/Eν resolution varies significantly from event to event. The precision of the oscillation measurement can be improved by incorporating information on Lν/Eν resolution into the oscillation analysis. In the analysis presented here, a Bayesian technique is used to estimate the Lν/Eν resolution of observed atmospheric neutrinos on an event-by-event basis. By separating the events into bins of Lν/Eν resolution in the oscillation analysis, a significant improvement in oscillation sensitivity can be achieved.
NASA Technical Reports Server (NTRS)
Wiegmann, Douglas A.
2005-01-01
The NASA Aviation Safety Program (AvSP) has defined several products that will potentially modify airline and/or ATC operations, enhance aircraft systems, and improve the identification of potential hazardous situations within the National Airspace System (NAS). Consequently, there is a need to develop methods for evaluating the potential safety benefit of each of these intervention products so that resources can be effectively invested to produce the judgments to develop Bayesian Belief Networks (BBN's) that model the potential impact that specific interventions may have. Specifically, the present report summarizes methodologies for improving the elicitation of probability estimates during expert evaluations of AvSP products for use in BBN's. The work involved joint efforts between Professor James Luxhoj from Rutgers University and researchers at the University of Illinois. The Rutgers' project to develop BBN's received funding by NASA entitled "Probabilistic Decision Support for Evaluating Technology Insertion and Assessing Aviation Safety System Risk." The proposed project was funded separately but supported the existing Rutgers' program.
Radiation Source Mapping with Bayesian Inverse Methods
Hykes, Joshua M.; Azmy, Yousry Y.
2017-03-22
In this work, we present a method to map the spectral and spatial distributions of radioactive sources using a limited number of detectors. Locating and identifying radioactive materials is important for border monitoring, in accounting for special nuclear material in processing facilities, and in cleanup operations following a radioactive material spill. Most methods to analyze these types of problems make restrictive assumptions about the distribution of the source. In contrast, the source mapping method presented here allows an arbitrary three-dimensional distribution in space and a gamma peak distribution in energy. To apply the method, the problem is cast as an inverse problem where the system’s geometry and material composition are known and fixed, while the radiation source distribution is sought. A probabilistic Bayesian approach is used to solve the resulting inverse problem since the system of equations is ill-posed. The posterior is maximized with a Newton optimization method. The probabilistic approach also provides estimates of the confidence in the final source map prediction. A set of adjoint, discrete ordinates flux solutions, obtained in this work by the Denovo code, is required to efficiently compute detector responses from a candidate source distribution. These adjoint fluxes form the linear mapping from the state space to the response space. The test of the method’s success is simultaneously locating a set of 137Cs and 60Co gamma sources in a room. This test problem is solved using experimental measurements that we collected for this purpose. Because of the weak sources available for use in the experiment, some of the expected photopeaks were not distinguishable from the Compton continuum. However, by supplanting 14 flawed measurements (out of a total of 69) with synthetic responses computed by MCNP, the proof-of-principle source mapping was successful. The locations of the sources were predicted within 25 cm for two of the sources and 90 cm for the third, in a room with a floor plan of approximately 4 m x 4 m. Finally, the predicted source intensities were within a factor of ten of their true value.
Assessing Requirements Volatility and Risk Using Bayesian Networks
NASA Technical Reports Server (NTRS)
Russell, Michael S.
2010-01-01
There are many factors that affect the level of requirements volatility a system experiences over its lifecycle and the risk that volatility imparts. Improper requirements generation, undocumented user expectations, conflicting design decisions, and anticipated / unanticipated world states are representative of these volatility factors. Combined, these volatility factors can increase programmatic risk and adversely affect successful system development. This paper proposes that a Bayesian Network can be used to support reasonable judgments concerning the most likely sources and types of requirements volatility a developing system will experience prior to starting development and by doing so it is possible to predict the level of requirements volatility the system will experience over its lifecycle. This assessment offers valuable insight to the system's developers, particularly by providing a starting point for risk mitigation planning and execution.
Chen, Haibin; Yang, Yan; Jiang, Wei; Song, Mengjie; Wang, Ying; Xiang, Tiantian
2017-02-01
A case study on the source separation of municipal solid waste (MSW) was performed in Changsha, the capital city of Hunan Province, China. The objective of this study is to analyze the effects of different separation methods and compare their effects with citizens' attitudes and inclination. An effect evaluation method based on accuracy rate and miscellany rate was proposed to study the performance of different separation methods. A large-scale questionnaire survey was conducted to determine citizens' attitudes and inclination toward source separation. Survey result shows that the vast majority of respondents hold consciously positive attitudes toward participation in source separation. Moreover, the respondents ignore the operability of separation methods and would rather choose the complex separation method involving four or more subclassed categories. For the effects of separation methods, the site experiment result demonstrates that the relatively simple separation method involving two categories (food waste and other waste) achieves the best effect with the highest accuracy rate (83.1%) and the lowest miscellany rate (16.9%) among the proposed experimental alternatives. The outcome reflects the inconsistency between people's environmental awareness and behavior. Such inconsistency and conflict may be attributed to the lack of environmental knowledge. Environmental education is assumed to be a fundamental solution to improve the effect of source separation of MSW in Changsha. Important management tips on source separation, including the reformation of the current pay-as-you-throw (PAYT) system, are presented in this work. A case study on the source separation of municipal solid waste was performed in Changsha. An effect evaluation method based on accuracy rate and miscellany rate was proposed to study the performance of different separation methods. 
The site experiment result demonstrates that the two-category (food waste and other waste) method achieves the best effect. The inconsistency between people's inclination and the effect of source separation exists. The proposed method can be expanded to other cities to determine the most effective separation method during planning stages or to evaluate the performance of running source separation systems.
A Bayesian approach for parameter estimation and prediction using a computationally intensive model
Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...
2015-02-05
Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $$\eta (\theta )$$, where θ denotes the uncertain, best input setting. Hence the statistical model is of the form $$y=\eta (\theta )+\epsilon ,$$ where $$\epsilon $$ accounts for measurement, and possibly other, error sources. When nonlinearity is present in $$\eta (\cdot )$$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model $$\eta (\cdot )$$. This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.
NASA Astrophysics Data System (ADS)
Mustać, Marija; Tkalčić, Hrvoje; Burky, Alexander L.
2018-01-01
Moment tensor (MT) inversion studies of events in The Geysers geothermal field mostly focused on microseismicity and found a large number of earthquakes with significant non-double-couple (non-DC) seismic radiation. Here we concentrate on the largest events in the area in recent years using a hierarchical Bayesian MT inversion. Initially, we show that the non-DC components of the MT can be reliably retrieved using regional waveform data from a small number of stations. Subsequently, we present results for a number of events and show that accounting for noise correlations can lead to retrieval of a lower isotropic (ISO) component and significantly different focal mechanisms. We compute the Bayesian evidence to compare solutions obtained with different assumptions of the noise covariance matrix. Although a diagonal covariance matrix produces a better waveform fit, inversions that account for noise correlations via an empirically estimated noise covariance matrix account for interdependences of data errors and are preferred from a Bayesian point of view. This implies that improper treatment of data noise in waveform inversions can result in fitting the noise and misinterpreting the non-DC components. Finally, one of the analyzed events is characterized as predominantly DC, while the others still have significant non-DC components, probably as a result of crack opening, which is a reasonable hypothesis for The Geysers geothermal field geological setting.
NASA Astrophysics Data System (ADS)
Chen, Po-Hao; Botzolakis, Emmanuel; Mohan, Suyash; Bryan, R. N.; Cook, Tessa
2016-03-01
In radiology, diagnostic errors occur either through the failure of detection or incorrect interpretation. Errors are estimated to occur in 30-35% of all exams and contribute to 40-54% of medical malpractice litigations. In this work, we focus on reducing incorrect interpretation of known imaging features. Existing literature categorizes cognitive bias leading a radiologist to an incorrect diagnosis despite having correctly recognized the abnormal imaging features: anchoring bias, framing effect, availability bias, and premature closure. Computational methods make a unique contribution, as they do not exhibit the same cognitive biases as a human. Bayesian networks formalize the diagnostic process. They modify pre-test diagnostic probabilities using clinical and imaging features, arriving at a post-test probability for each possible diagnosis. To translate Bayesian networks to clinical practice, we implemented an entirely web-based open-source software tool. In this tool, the radiologist first selects a network of choice (e.g. basal ganglia). Then, large, clearly labeled buttons displaying salient imaging features are displayed on the screen serving both as a checklist and for input. As the radiologist inputs the value of an extracted imaging feature, the conditional probabilities of each possible diagnosis are updated. The software presents its level of diagnostic discrimination using a Pareto distribution chart, updated with each additional imaging feature. Active collaboration with the clinical radiologist is a feasible approach to software design and leads to design decisions closely coupling the complex mathematics of conditional probability in Bayesian networks with practice.
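The update from pre-test to post-test probabilities described in this abstract can be illustrated with a minimal sketch. It assumes a naive-Bayes structure (imaging features conditionally independent given the diagnosis), which is a simplification and not necessarily the network structure used in the tool; the function name, diagnoses, and all numbers below are hypothetical.

```python
def update_posterior(prior, likelihoods, observed):
    """Return post-test probabilities for each diagnosis.

    prior:       {diagnosis: pre-test probability}
    likelihoods: {diagnosis: {feature: P(feature present | diagnosis)}}
    observed:    {feature: True/False} for the features entered so far

    Assumption: features are conditionally independent given the diagnosis.
    """
    unnormalized = {}
    for dx, p in prior.items():
        for feature, present in observed.items():
            p_feat = likelihoods[dx][feature]
            # Multiply by P(observation | diagnosis) for this feature.
            p *= p_feat if present else (1.0 - p_feat)
        unnormalized[dx] = p
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observations are impossible under every diagnosis")
    return {dx: p / total for dx, p in unnormalized.items()}


# Hypothetical two-diagnosis example with one imaging feature.
prior = {"diagnosis_A": 0.5, "diagnosis_B": 0.5}
likelihoods = {
    "diagnosis_A": {"feature_1": 0.8},
    "diagnosis_B": {"feature_1": 0.2},
}
post = update_posterior(prior, likelihoods, {"feature_1": True})
print(post)
```

Each additional feature entered by the radiologist would simply add an entry to `observed`, and the post-test probabilities are renormalized over all candidate diagnoses.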
Exoplanet Biosignatures: A Framework for Their Assessment.
Catling, David C; Krissansen-Totton, Joshua; Kiang, Nancy Y; Crisp, David; Robinson, Tyler D; DasSarma, Shiladitya; Rushby, Andrew J; Del Genio, Anthony; Bains, William; Domagal-Goldman, Shawn
2018-04-20
Finding life on exoplanets from telescopic observations is an ultimate goal of exoplanet science. Life produces gases and other substances, such as pigments, which can have distinct spectral or photometric signatures. Whether or not life is found with future data must be expressed with probabilities, requiring a framework of biosignature assessment. We present a framework in which we advocate using biogeochemical "Exo-Earth System" models to simulate potential biosignatures in spectra or photometry. Given actual observations, simulations are used to find the Bayesian likelihoods of those data occurring for scenarios with and without life. The latter includes "false positives" wherein abiotic sources mimic biosignatures. Prior knowledge of factors influencing planetary inhabitation, including previous observations, is combined with the likelihoods to give the Bayesian posterior probability of life existing on a given exoplanet. Four components of observation and analysis are necessary. (1) Characterization of stellar (e.g., age and spectrum) and exoplanetary system properties, including "external" exoplanet parameters (e.g., mass and radius), to determine an exoplanet's suitability for life. (2) Characterization of "internal" exoplanet parameters (e.g., climate) to evaluate habitability. (3) Assessment of potential biosignatures within the environmental context (components 1-2), including corroborating evidence. (4) Exclusion of false positives. We propose that resulting posterior Bayesian probabilities of life's existence map to five confidence levels, ranging from "very likely" (90-100%) to "very unlikely" (<10%) inhabited. Key Words: Bayesian statistics-Biosignatures-Drake equation-Exoplanets-Habitability-Planetary science. Astrobiology 18, xxx-xxx.
Sharpe, J Danielle; Hopkins, Richard S; Cook, Robert L; Striley, Catherine W
2016-10-20
Traditional influenza surveillance relies on influenza-like illness (ILI) syndrome that is reported by health care providers. It primarily captures individuals who seek medical care and misses those who do not. Recently, Web-based data sources have been studied for application to public health surveillance, as there is a growing number of people who search, post, and tweet about their illnesses before seeking medical care. Existing research has shown some promise of using data from Google, Twitter, and Wikipedia to complement traditional surveillance for ILI. However, past studies have evaluated these Web-based sources individually or dually without comparing all 3 of them, and it would be beneficial to know which of the Web-based sources performs best in order to be considered to complement traditional methods. The objective of this study is to comparatively analyze Google, Twitter, and Wikipedia by examining which best corresponds with Centers for Disease Control and Prevention (CDC) ILI data. It was hypothesized that Wikipedia will best correspond with CDC ILI data as previous research found it to be least influenced by high media coverage in comparison with Google and Twitter. Publicly available, deidentified data were collected from the CDC, Google Flu Trends, HealthTweets, and Wikipedia for the 2012-2015 influenza seasons. Bayesian change point analysis was used to detect seasonal changes, or change points, in each of the data sources. Change points in Google, Twitter, and Wikipedia that occurred during the exact week, 1 preceding week, or 1 week after the CDC's change points were compared with the CDC data as the gold standard. All analyses were conducted using the R package "bcp" version 4.0.0 in RStudio version 0.99.484 (RStudio Inc). In addition, sensitivity and positive predictive values (PPV) were calculated for Google, Twitter, and Wikipedia. 
During the 2012-2015 influenza seasons, a high sensitivity of 92% was found for Google, whereas the PPV for Google was 85%. A low sensitivity of 50% was calculated for Twitter; a low PPV of 43% was found for Twitter also. Wikipedia had the lowest sensitivity of 33% and lowest PPV of 40%. Of the 3 Web-based sources, Google had the best combination of sensitivity and PPV in detecting Bayesian change points in influenza-related data streams. Findings demonstrated that change points in Google, Twitter, and Wikipedia data occasionally aligned well with change points captured in CDC ILI data, yet these sources did not detect all changes in CDC data and should be further studied and developed.
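The comparison of change points against the CDC reference can be sketched as follows. This is a hedged illustration only: the abstract states that a match is a change point in the same week or within 1 week either side, but the exact counting rules are not given, so the matching logic, function name, and week numbers here are assumptions.

```python
def sensitivity_and_ppv(reference_weeks, candidate_weeks, tolerance=1):
    """Compare candidate change points with reference change points.

    A candidate week matches a reference week when the two differ by at
    most `tolerance` weeks (assumed matching rule).
    """
    reference = set(reference_weeks)
    candidate = set(candidate_weeks)
    # Reference change points that the candidate source detected.
    detected = {r for r in reference
                if any(abs(r - c) <= tolerance for c in candidate)}
    # Candidate change points that correspond to a reference change point.
    matched = {c for c in candidate
               if any(abs(c - r) <= tolerance for r in reference)}
    sensitivity = len(detected) / len(reference) if reference else float("nan")
    ppv = len(matched) / len(candidate) if candidate else float("nan")
    return sensitivity, ppv


# Hypothetical week indices, not data from the study.
sens, ppv = sensitivity_and_ppv([5, 12, 30, 41], [6, 12, 20])
print(sens, ppv)
```

Sensitivity here is the share of reference change points that were detected, and PPV is the share of candidate change points that correspond to a reference change point.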
Turi, Christina E; Murch, Susan J
2013-07-09
Ethnobotanical research and the study of plants used for rituals, ceremonies and to connect with the spirit world have led to the discovery of many novel psychoactive compounds such as nicotine, caffeine, and cocaine. In North America, spiritual and ceremonial uses of plants are well documented and can be accessed online via the University of Michigan's Native American Ethnobotany Database. The objective of the study was to compare Residual, Bayesian, Binomial and Imprecise Dirichlet Model (IDM) analyses of ritual, ceremonial and spiritual plants in Moerman's ethnobotanical database and to identify genera that may be good candidates for the discovery of novel psychoactive compounds. The database was queried with the following format "Family Name AND Ceremonial OR Spiritual" for 263 North American botanical families. Spiritual and ceremonial flora consisted of 86 families with 517 species belonging to 292 genera. Spiritual taxa were then grouped further into ceremonial medicines and items categories. Residual, Bayesian, Binomial and IDM analyses were performed to identify over- and under-utilized families. The 4 statistical approaches were in good agreement when identifying under-utilized families but large families (>393 species) were underemphasized by Binomial, Bayesian and IDM approaches for over-utilization. Residual, Binomial, and IDM analysis identified similar families as over-utilized in the medium (92-392 species) and small (<92 species) classes. The families Apiaceae, Asteraceae, Ericaceae, Pinaceae and Salicaceae were identified as significantly over-utilized as ceremonial medicines in medium and large sized families. Analysis of genera within the Apiaceae and Asteraceae suggests that the genera Ligusticum and Artemisia are good candidates for facilitating the discovery of novel psychoactive compounds. The 4 statistical approaches were not consistent in the selection of over-utilization of flora. 
Residual analysis revealed overall trends that were supported by Binomial analysis when separated into small, medium and large families. The Bayesian, Binomial and IDM approaches identified different genera as potentially important. Species belonging to the genera Artemisia and Ligusticum were most consistently identified and may be valuable in future studies of ethnopharmacology. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Dawson, Colin; Gerken, LouAnn
2011-01-01
While many constraints on learning must be relatively experience-independent, past experience provides a rich source of guidance for subsequent learning. Discovering structure in some domain can inform a learner's future hypotheses about that domain. If a general property accounts for particular sub-patterns, a rational learner should not…
A Study of Bayesian Estimation and Comparison of Response Time Models in Item Response Theory
ERIC Educational Resources Information Center
Suh, Hongwook
2010-01-01
Response time has been regarded as an important source for investigating the relationship between human performance and response speed. It is important to examine the relationship between response time and item characteristics, especially in the perspective of the relationship between response time and various factors that affect examinee's…
USDA-ARS's Scientific Manuscript database
The correct identification of the source population of an invasive species is a prerequisite for defining and testing different hypotheses concerning the environmental and evolutionary factors responsible for biological invasions. The native area of invasive species may be large, barely known and/or...
Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances
NASA Astrophysics Data System (ADS)
Stähler, Simon C.; Sigloch, Karin
2016-11-01
Seismic source inversion, a central task in seismology, is concerned with the estimation of earthquake source parameters and their uncertainties. Estimating uncertainties is particularly challenging because source inversion is a non-linear problem. In a companion paper, Stähler and Sigloch (2014) developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements, a problem we address here. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, more commonly used as sample-by-sample measures of misfit based on distances between individual time samples. From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on signal decorrelation D = 1 - CC of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. 
By identifying and quantifying this likelihood function, we make D and thus waveform cross-correlation measurements usable for fully probabilistic sampling strategies, in source inversion and related applications such as seismic tomography.
Mejia Tobar, Alejandra; Hyoudou, Rikiya; Kita, Kahori; Nakamura, Tatsuhiro; Kambara, Hiroyuki; Ogata, Yousuke; Hanakawa, Takashi; Koike, Yasuharu; Yoshimura, Natsue
2017-01-01
The classification of ankle movements from non-invasive brain recordings can be applied to a brain-computer interface (BCI) to control exoskeletons, prostheses, and functional electrical stimulators for the benefit of patients with walking impairments. In this research, ankle flexion and extension tasks at two force levels in both legs were classified from cortical current sources estimated by a hierarchical variational Bayesian method, using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings. The hierarchical prior for the current source estimation from EEG was obtained from activated brain areas and their intensities from an fMRI group (second-level) analysis. The fMRI group analysis was performed on regions of interest defined over the primary motor cortex, the supplementary motor area, and the somatosensory area, which are well-known to contribute to movement control. A sparse logistic regression method was applied for a nine-class classification (eight active tasks and a resting control task) obtaining a mean accuracy of 65.64% for time series of current sources, estimated from the EEG and the fMRI signals using a variational Bayesian method, and a mean accuracy of 22.19% for the classification of the pre-processed EEG sensor signals, with a chance level of 11.11%. The higher classification accuracy of current sources, when compared to EEG classification accuracy, was attributed to the high number of sources and the different signal patterns obtained in the same vertex for different motor tasks. Since the inverse filter estimation for current sources can be done offline with the present method, the present method is applicable to real-time BCIs. 
Finally, due to the highly enhanced spatial distribution of current sources over the brain cortex, this method has the potential to identify activation patterns to design BCIs for the control of an affected limb in patients with stroke, or BCIs from motor imagery in patients with spinal cord injury.
Bayesian source term determination with unknown covariance of measurements
NASA Astrophysics Data System (ADS)
Belal, Alkomiet; Tichý, Ondřej; Šmídl, Václav
2017-04-01
Determination of a source term of release of a hazardous material into the atmosphere is a very important task for emergency response. We are concerned with the problem of estimation of the source term in the conventional linear inverse problem, $$y = Mx$$, where the vector of observations y is related to the unknown source term x through the source-receptor-sensitivity (SRS) matrix M. Since the system is typically ill-conditioned, the problem is recast as an optimization problem $$\min_{R,B}\,(y - Mx)^{T}R^{-1}(y - Mx) + x^{T}B^{-1}x$$. The first term minimizes the error of the measurements with covariance matrix R, and the second term is a regularization of the source term. Different types of regularization arise for different choices of the matrices R and B; for example, Tikhonov regularization takes the covariance matrix B to be the identity matrix multiplied by a scalar parameter. In this contribution, we adopt a Bayesian approach to make inference on the unknown source term x as well as unknown R and B. We assume the prior on x to be a Gaussian with zero mean and unknown diagonal covariance matrix B. The covariance matrix of the likelihood R is also unknown. We consider two potential choices of the structure of the matrix R. The first is a diagonal matrix and the second is a locally correlated structure using information on the topology of the measuring network. Since the inference of the model is intractable, an iterative variational Bayes algorithm is used for simultaneous estimation of all model parameters. The practical usefulness of our contribution is demonstrated on an application of the resulting algorithm to real data from the European Tracer Experiment (ETEX). This research is supported by EEA/Norwegian Financial Mechanism under project MSMT-28477/2014 Source-Term Determination of Radionuclide Releases by Inverse Atmospheric Dispersion Modelling (STRADI).
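To make the regularized objective concrete, the sketch below computes the source term x that minimizes the quadratic objective when R and B are held fixed and diagonal. This is only the inner step under stated assumptions: the abstract estimates R and B jointly with x by variational Bayes, which is not reproduced here. The function names and the toy numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def map_source_term(M, y, r_diag, b_diag):
    """Minimize (y - Mx)^T R^-1 (y - Mx) + x^T B^-1 x over x.

    Assumes fixed diagonal R = diag(r_diag) and B = diag(b_diag).
    Solves the normal equations (M^T R^-1 M + B^-1) x = M^T R^-1 y.
    """
    m, n = len(M), len(M[0])
    lhs = [[sum(M[k][i] * M[k][j] / r_diag[k] for k in range(m))
            for j in range(n)] for i in range(n)]
    for i in range(n):
        lhs[i][i] += 1.0 / b_diag[i]
    rhs = [sum(M[k][i] * y[k] / r_diag[k] for k in range(m))
           for i in range(n)]
    return solve_linear(lhs, rhs)


# Toy example: identity SRS matrix, unit variances.
x_hat = map_source_term([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0],
                        [1.0, 1.0], [1.0, 1.0])
print(x_hat)
```

With B set to a scalar multiple of the identity, this reduces to the Tikhonov case mentioned in the abstract; larger entries in `b_diag` weaken the regularization.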
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marleau, Peter; Monterial, Mateusz; Clarke, Shaun
A Bayesian approach is proposed for pulse shape discrimination of photons and neutrons in liquid organic scintillators. Instead of drawing a decision boundary, each pulse is assigned a photon or neutron confidence probability. In addition, this allows for photon and neutron classification on an event-by-event basis. The sum of those confidence probabilities is used to estimate the number of photon and neutron instances in the data. An iterative scheme, similar to an expectation-maximization algorithm for Gaussian mixtures, is used to infer the ratio of photons-to-neutrons in each measurement. Therefore, the probability space adapts to data with varying photon-to-neutron ratios. A time-correlated measurement of Am–Be and separate measurements of 137Cs, 60Co and 232Th photon sources were used to construct libraries of neutrons and photons. These libraries were then used to produce synthetic data sets with varying ratios of photons-to-neutrons. The probability-weighted method that we implemented was found to maintain a neutron acceptance rate of up to 90% up to a photon-to-neutron ratio of 2000, and performed 9% better than the decision boundary approach. Furthermore, the iterative approach appropriately changed the probability space with an increasing number of photons, which kept the neutron population estimate from unrealistically increasing.
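The iterative estimate of the photon-to-neutron mix can be sketched as an expectation-maximization style loop. This is an assumed, simplified form: it takes per-pulse likelihoods under a photon model and a neutron model as given inputs (how the abstract's libraries produce them is not specified here), and the function name and toy values are hypothetical.

```python
def estimate_photon_fraction(lik_photon, lik_neutron, n_iter=200, tol=1e-10):
    """Estimate the photon fraction and per-pulse photon probabilities.

    lik_photon[i], lik_neutron[i]: likelihood of pulse i under each class
    (assumed positive for at least one class per pulse).
    """
    frac = 0.5  # initial guess for the photon fraction
    for _ in range(n_iter):
        # E-step: photon confidence probability for each pulse.
        post = [frac * lp / (frac * lp + (1.0 - frac) * ln)
                for lp, ln in zip(lik_photon, lik_neutron)]
        # M-step: the fraction is the mean photon probability.
        new_frac = sum(post) / len(post)
        converged = abs(new_frac - frac) < tol
        frac = new_frac
        if converged:
            break
    return frac, post


# Toy, perfectly separated pulses: three photon-like, one neutron-like.
frac, post = estimate_photon_fraction([1.0, 1.0, 1.0, 0.0],
                                      [0.0, 0.0, 0.0, 1.0])
print(frac, sum(post))
```

Summing `post` gives the estimated number of photons, and `len(post) - sum(post)` the estimated number of neutrons, mirroring the abstract's use of summed confidence probabilities in place of a hard decision boundary.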
Joint Bayesian inference for near-surface explosion yield
NASA Astrophysics Data System (ADS)
Bulaevskaya, V.; Ford, S. R.; Ramirez, A. L.; Rodgers, A. J.
2016-12-01
A near-surface explosion generates seismo-acoustic motion that is related to its yield. However, the recorded motion is affected by near-source effects such as depth-of-burial, and propagation-path effects such as variable geology. We incorporate these effects in a forward model relating yield to seismo-acoustic motion, and use Bayesian inference to estimate yield given recordings of the seismo-acoustic wavefield. The Bayesian approach to this inverse problem allows us to obtain the probability distribution of plausible yield values and thus quantify the uncertainty in the yield estimate. Moreover, the sensitivity of the acoustic signal falls as a function of the depth-of-burial, while the opposite relationship holds for the seismic signal. Therefore, using both the acoustic and seismic wavefield data allows us to avoid the trade-offs associated with using only one of these signals alone. In addition, our inference framework allows for correlated features of the same data type (seismic or acoustic) to be incorporated in the estimation of yield in order to make use of as much information from the same waveform as possible. We demonstrate our approach with a historical dataset and a contemporary field experiment.
BM-Map: Bayesian Mapping of Multireads for Next-Generation Sequencing Data
Ji, Yuan; Xu, Yanxun; Zhang, Qiong; Tsui, Kam-Wah; Yuan, Yuan; Norris, Clift; Liang, Shoudan; Liang, Han
2011-01-01
Summary Next-generation sequencing (NGS) technology generates millions of short reads, which provide valuable information for various aspects of cellular activities and biological functions. A key step in NGS applications (e.g., RNA-Seq) is to map short reads to correct genomic locations within the source genome. While most reads are mapped to a unique location, a significant proportion of reads align to multiple genomic locations with equal or similar numbers of mismatches; these are called multireads. The ambiguity in mapping the multireads may lead to bias in downstream analyses. Currently, most practitioners discard the multireads in their analysis, resulting in a loss of valuable information, especially for the genes with similar sequences. To refine the read mapping, we develop a Bayesian model that computes the posterior probability of mapping a multiread to each competing location. The probabilities are used for downstream analyses, such as the quantification of gene expression. We show through simulation studies and RNA-Seq analysis of real life data that the Bayesian method yields better mapping than the current leading methods. We provide a C++ program for downloading that is being packaged into a user-friendly software. PMID:21517792
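The idea of assigning a multiread a posterior probability for each competing location can be illustrated with a deliberately simple sketch. It is not the BM-Map model: it assumes an independent per-base error rate and uses only the mismatch count at each location, and the function name, error rate, and read length are hypothetical.

```python
def multiread_posterior(mismatches, read_length, error_rate=0.01, prior=None):
    """Posterior probability that a multiread originates from each location.

    mismatches:  mismatch count of the read at each candidate location
    read_length: number of bases in the read
    error_rate:  assumed independent per-base error probability
    prior:       optional prior weight per location (uniform if omitted)
    """
    n = len(mismatches)
    prior = prior or [1.0 / n] * n
    weights = [p * (error_rate ** k) * ((1.0 - error_rate) ** (read_length - k))
               for p, k in zip(prior, mismatches)]
    total = sum(weights)
    return [w / total for w in weights]


# A read aligning to two locations with 0 and 1 mismatches.
post = multiread_posterior([0, 1], read_length=50)
print(post)
```

Downstream quantities such as expression counts could then weight each location by its probability instead of discarding the read.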
Bayes-LQAS: classifying the prevalence of global acute malnutrition
Olives, Casey; Pagano, Marcello
2010-06-09
Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classification schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications. PMID:20534159
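A minimal sketch of the kind of probability statement a Bayesian LQAS formulation can provide, assuming a conjugate Beta prior on prevalence and a binomial sample. The Beta-binomial choice, the grid integration, and the function name `posterior_prob_above` are assumptions for illustration, not details taken from the paper.

```python
import math

def posterior_prob_above(threshold, cases, n, prior_a=1.0, prior_b=1.0, grid=20000):
    """P(prevalence > threshold | data) under a Beta(prior_a, prior_b) prior.

    The posterior is Beta(prior_a + cases, prior_b + n - cases); its upper tail
    is approximated with a midpoint rule on a uniform grid over (0, 1).
    """
    a = prior_a + cases            # conjugate update of the Beta parameters
    b = prior_b + (n - cases)
    step = 1.0 / grid
    points = [(i + 0.5) * step for i in range(grid)]
    log_dens = [(a - 1) * math.log(p) + (b - 1) * math.log(1 - p) for p in points]
    peak = max(log_dens)           # rescale before exponentiating for stability
    dens = [math.exp(v - peak) for v in log_dens]
    total = sum(dens)
    above = sum(d for p, d in zip(points, dens) if p > threshold)
    return above / total

# Hypothetical survey: 12 cases in a sample of 100, flat prior, 10% threshold.
prob_high = posterior_prob_above(0.10, cases=12, n=100)
```

An informative prior (for example, from historical surveys) enters only through `prior_a` and `prior_b`, which is how prior information would change the classification in this sketch.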
Fuzzy Bayesian Network-Bow-Tie Analysis of Gas Leakage during Biomass Gasification
Yan, Fang; Xu, Kaili; Yao, Xiwen; Li, Yang
2016-01-01
Biomass gasification technology has developed rapidly in recent years, but fire and poisoning accidents caused by gas leakage restrict its development and promotion. Probabilistic safety assessment (PSA) is therefore necessary for biomass gasification systems. Bayesian network-bow-tie (BN-bow-tie) analysis was proposed by mapping bow-tie analysis into a Bayesian network (BN). The causes of gas leakage and the accidents triggered by it were obtained by bow-tie analysis, and the BN was used to identify the critical nodes of accidents by introducing three corresponding importance measures. PSA also requires definite occurrence probabilities of failure. Because failure data for biomass gasification are insufficient, the occurrence probabilities that could not be obtained from standard reliability data sources were determined by fuzzy methods based on expert judgment. An improved approach that uses expert weighting to aggregate fuzzy numbers, including triangular and trapezoidal numbers, was proposed, and the occurrence probabilities of failure were obtained. Finally, safety measures were identified based on the critical nodes. These safety measures reduced the theoretical one-year occurrence probabilities of gas leakage and the accidents caused by it to 1/10.3 of their original values. PMID:27463975
SIG-VISA: Signal-based Vertically Integrated Seismic Monitoring
NASA Astrophysics Data System (ADS)
Moore, D.; Mayeda, K. M.; Myers, S. C.; Russell, S.
2013-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software; however, while such detections may constitute a useful summary of station activity, they discard large amounts of information present in the original recorded signal. We present SIG-VISA (Signal-based Vertically Integrated Seismic Analysis), a system for seismic monitoring through Bayesian inference on seismic signals. By directly modeling the recorded signal, our approach incorporates additional information unavailable to detection-based methods, enabling higher sensitivity and more accurate localization using techniques such as waveform matching. SIG-VISA's Bayesian forward model of seismic signal envelopes includes physically-derived models of travel times and source characteristics as well as Gaussian process (kriging) statistical models of signal properties that combine interpolation of historical data with extrapolation of learned physical trends. Applying Bayesian inference, we evaluate the model on earthquakes as well as the 2009 DPRK test event, demonstrating a waveform matching effect as part of the probabilistic inference, along with results on event localization and sensitivity. In particular, we demonstrate increased sensitivity from signal-based modeling, in which the SIG-VISA signal model finds statistical evidence for arrivals even at stations for which the IMS station processing failed to register any detection.
Bayesian Atmospheric Radiative Transfer (BART) Code and Application to WASP-43b
NASA Astrophysics Data System (ADS)
Blecic, Jasmina; Harrington, Joseph; Cubillos, Patricio; Bowman, Oliver; Rojo, Patricio; Stemm, Madison; Lust, Nathaniel B.; Challener, Ryan; Foster, Austin James; Foster, Andrew S.; Blumenthal, Sarah D.; Bruce, Dylan
2016-01-01
We present a new open-source Bayesian radiative-transfer framework, Bayesian Atmospheric Radiative Transfer (BART, https://github.com/exosports/BART), and its application to WASP-43b. BART initializes a model for the atmospheric retrieval calculation, generates thousands of theoretical model spectra using parametrized pressure and temperature profiles and line-by-line radiative-transfer calculation, and employs a statistical package to compare the models with the observations. It consists of three self-sufficient modules available to the community under the reproducible-research license: the Thermochemical Equilibrium Abundances module (TEA, https://github.com/dzesmin/TEA, Blecic et al. 2015), the radiative-transfer module (Transit, https://github.com/exosports/transit), and the Multi-core Markov-chain Monte Carlo statistical module (MCcubed, https://github.com/pcubillos/MCcubed, Cubillos et al. 2015). We applied BART to all available WASP-43b secondary eclipse data from space- and ground-based observations, constraining the temperature-pressure profile and molecular abundances of the dayside atmosphere of WASP-43b. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G. JB holds a NASA Earth and Space Science Fellowship.
NASA Astrophysics Data System (ADS)
Mahieux, Arnaud; Goldstein, David B.; Varghese, Philip; Trafton, Laurence M.
2017-10-01
The vapor and particulate plumes arising from the southern polar regions of Enceladus are a key signature of what lies below the surface. Multiple Cassini instruments (INMS, CDA, CAPS, MAG, UVIS, VIMS, ISS) measured the gas-particle plume over the warm Tiger Stripe region and there have been several close flybys. Numerous observations also exist of the near-vent regions in the visible and the IR. The most likely source for these extensive geysers is a subsurface liquid reservoir of somewhat saline water and other volatiles boiling off through crevasse-like conduits into the vacuum of space. In this work, we use a DSMC code to simulate the plume as it exits a vent, considering axisymmetric conditions, in a vertical domain extending up to 10 km. Above 10 km altitude, the flow is collisionless and well modeled in a separate free molecular code. We perform a DSMC parametric and sensitivity study of the following vent parameters: vent diameter, outgassed flow density, water gas/water ice mass flow ratio, gas and ice speed, and ice grain diameter. We build parametric expressions of the plume characteristics at the 10 km upper boundary (number density, temperature, velocity) that will be used in a Bayesian inversion algorithm in order to constrain source conditions from fits to plume observations by various instruments on board the Cassini spacecraft and assess the parametric sensitivity study.
Visuospatial working memory mediates inhibitory and facilitatory guidance in preview search.
Barrett, Doug J K; Shimozaki, Steven S; Jensen, Silke; Zobay, Oliver
2016-10-01
Visual search is faster and more accurate when a subset of distractors is presented before the display containing the target. This "preview benefit" has been attributed to separate inhibitory and facilitatory guidance mechanisms during search. In the preview task the temporal cues thought to elicit inhibition and facilitation provide complementary sources of information about the likely location of the target. In this study, we use a Bayesian observer model to compare sensitivity when the temporal cues eliciting inhibition and facilitation produce complementary, and competing, sources of information. Observers searched for T-shaped targets among L-shaped distractors in 2 standard and 2 preview conditions. In the standard conditions, all the objects in the display appeared at the same time. In the preview conditions, the initial subset of distractors either stayed on the screen or disappeared before the onset of the search display, which contained the target when present. In the latter, the synchronous onset of old and new objects negates the predictive utility of stimulus-driven capture during search. The results indicate observers combine memory-driven inhibition and sensory-driven capture to reduce spatial uncertainty about the target's likely location during search. In the absence of spatially predictive onsets, memory-driven inhibition at old locations persists despite irrelevant sensory change at previewed locations. This result is consistent with a bias toward unattended objects during search via the active suppression of irrelevant capture at previously attended locations. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Leube, Philipp; Geiges, Andreas; Nowak, Wolfgang
2010-05-01
Incorporating hydrogeological data, such as head and tracer data, into stochastic models of subsurface flow and transport helps to reduce prediction uncertainty. Considering limited financial resources available for the data acquisition campaign, information needs towards the prediction goal should be satisfied in an efficient and task-specific manner. For finding the best one among a set of design candidates, an objective function is commonly evaluated, which measures the expected impact of data on prediction confidence, prior to their collection. An appropriate approach to this task should be stochastically rigorous, master non-linear dependencies between data, parameters and model predictions, and allow for a wide variety of different data types. Existing methods fail to fulfill all these requirements simultaneously. For this reason, we introduce a new method, denoted as CLUE (Cross-bred Likelihood Uncertainty Estimator), that derives the essential distributions and measures of data utility within a generalized, flexible and accurate framework. The method makes use of Bayesian GLUE (Generalized Likelihood Uncertainty Estimator) and extends it to an optimal design method by marginalizing over the yet unknown data values. Operating in a purely Bayesian Monte-Carlo framework, CLUE is a strictly formal information processing scheme free of linearizations. It provides full flexibility associated with the type of measurements (linear, non-linear, direct, indirect) and accounts for almost arbitrary sources of uncertainty (e.g. heterogeneity, geostatistical assumptions, boundary conditions, model concepts) via stochastic simulation and Bayesian model averaging. This helps to minimize the strength and impact of possible subjective prior assumptions that would be hard to defend prior to data collection. Our study focuses on evaluating two different uncertainty measures: (i) expected conditional variance and (ii) expected relative entropy of a given prediction goal.
The applicability and advantages are shown in a synthetic example. To this end, we consider a contaminant source that poses a threat to a drinking water well in an aquifer. Furthermore, we assume uncertainty in geostatistical parameters, boundary conditions and hydraulic gradient. The two mentioned measures evaluate the sensitivity of (1) general prediction confidence and (2) exceedance probability of a legal regulatory threshold value on sampling locations.
Anand, Vibha; Rosenman, Marc B; Downs, Stephen M
2013-09-01
To develop a map of disease associations exclusively using two publicly available genetic sources: the catalog of single nucleotide polymorphisms (SNPs) from the HapMap, and the catalog of Genome Wide Association Studies (GWAS) from the NHGRI, and to evaluate it with a large, long-standing electronic medical record (EMR). A computational model, In Silico Bayesian Integration of GWAS (IsBIG), was developed to learn associations among diseases using a Bayesian network (BN) framework, using only genetic data. The IsBIG model (I-Model) was re-trained using data from our EMR (M-Model). Separately, another clinical model (C-Model) was learned from this training dataset. The I-Model was compared with both the M-Model and the C-Model for power to discriminate a disease given other diseases using a test dataset from our EMR. Area under receiver operator characteristics curve was used as a performance measure. Direct associations between diseases in the I-Model were also searched in the PubMed database and in classes of the Human Disease Network (HDN). On the basis of genetic information alone, the I-Model linked a third of diseases from our EMR. When compared to the M-Model, the I-Model predicted diseases given other diseases with 94% specificity, 33% sensitivity, and 80% positive predictive value. The I-Model contained 117 direct associations between diseases. Of those associations, 20 (17%) were absent from the searches of the PubMed database; one of these was present in the C-Model. Of the direct associations in the I-Model, 7 (35%) were absent from disease classes of HDN. Using only publicly available genetic sources we have mapped associations in GWAS to a human disease map using an in silico approach. Furthermore, we have validated this disease map using phenotypic data from our EMR. Models predicting disease associations on the basis of known genetic associations alone are specific but not sensitive. 
Genetic data, as it currently exists, can only explain a fraction of the risk of a disease. Our approach makes a quantitative statement about disease variation that can be explained in an EMR on the basis of genetic associations described in the GWAS. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Open-source Software for Exoplanet Atmospheric Modeling
NASA Astrophysics Data System (ADS)
Cubillos, Patricio; Blecic, Jasmina; Harrington, Joseph
2018-01-01
I will present a suite of self-standing open-source tools to model and retrieve exoplanet spectra implemented for Python. These include: (1) a Bayesian-statistical package to run Levenberg-Marquardt optimization and Markov-chain Monte Carlo posterior sampling, (2) a package to compress line-transition data from HITRAN or Exomol without loss of information, (3) a package to compute partition functions for HITRAN molecules, (4) a package to compute collision-induced absorption, and (5) a package to produce radiative-transfer spectra of transit and eclipse exoplanet observations and atmospheric retrievals.
NASA Astrophysics Data System (ADS)
Butman, D. E.; Holtgrieve, G. W.
2017-12-01
Recent modelling studies in large catchments have estimated that in excess of 74% of the dissolved carbon dioxide found in first and second order streams originates from allochthonous sources. Stable isotopes of carbon-13 in carbon dioxide have been used to identify ground water seeps in stream systems, where decreases in δ13CO2 occur along gaining stream reaches, suggesting that carbon dioxide in ground water is more depleted than what is found in surface water due to fractionation of CO2 during emissions across the air-water interface. Although isotopes represent a chemical tracer in stream systems for potential groundwater contribution, the temporal resolution of discrete samples makes partitioning allochthonous versus autochthonous sources of CO2 difficult on hydrologically relevant time scales. Here we show results of field deployments of high-frequency dissolved CO2, O2, PAR, temperature and pH sensors from the Thornton Creek Watershed, the largest urban watershed in Seattle, WA. We present an exploration into using high resolution time series of dissolved oxygen and carbon dioxide in a dual gas approach to separate the contribution of in-stream respiration from external sources. We extend upon previous efforts to model stream metabolism across diel cycles by incorporating simultaneous direct measurements of dissolved oxygen, PCO2, and pH within an inverse modeling framework and Bayesian parameter estimation. With an initial assumption of a stoichiometric ratio of 1:1 for O2 and CO2 for autochthonous driven metabolism, we investigate positive or negative departures from this ratio as an indicator of external CO2 to the stream (terrestrial or atmospheric) and factors contributing to this flux.
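The 1:1 stoichiometric assumption can be illustrated with a deliberately simplified sketch: regress CO2 departures on the negative of O2 departures and inspect the slope. This is not the authors' inverse model; it ignores gas exchange, carbonate buffering and measurement error, it assumes both series are in the same molar units, and the function name `co2_o2_slope` is hypothetical.

```python
def co2_o2_slope(o2, co2):
    """Least-squares slope of CO2 departures against negated O2 departures.

    Under purely in-stream metabolism with 1:1 stoichiometry, each mole of O2
    consumed releases one mole of CO2, so the slope is 1. A slope above 1
    suggests CO2 in excess of what the O2 signal explains; below 1, a deficit.
    """
    mean_o2 = sum(o2) / len(o2)
    mean_co2 = sum(co2) / len(co2)
    x = [-(v - mean_o2) for v in o2]      # O2 drawdown is positive
    y = [v - mean_co2 for v in co2]       # CO2 build-up is positive
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Made-up diel series in identical molar units.
slope_balanced = co2_o2_slope([10, 9, 8, 9, 10], [5, 6, 7, 6, 5])
slope_excess = co2_o2_slope([10, 9, 8, 9, 10], [5, 7, 9, 7, 5])
```

In a full treatment the slope would be one parameter among several estimated jointly, with uncertainty, rather than a single regression coefficient.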
NASA Astrophysics Data System (ADS)
Dutta, Rishabh; Jónsson, Sigurjón; Wang, Teng; Vasyura-Bathke, Hannes
2018-04-01
Several researchers have studied the source parameters of the 2005 Fukuoka (northwestern Kyushu Island, Japan) earthquake (Mw 6.6) using teleseismic, strong motion and geodetic data. However, in all previous studies, errors of the estimated fault solutions have been neglected, making it impossible to assess the reliability of the reported solutions. We use Bayesian inference to estimate the location, geometry and slip parameters of the fault and their uncertainties using Interferometric Synthetic Aperture Radar and Global Positioning System data. The offshore location of the earthquake makes the fault parameter estimation challenging, with geodetic data coverage mostly to the southeast of the earthquake. To constrain the fault parameters, we use a priori constraints on the magnitude of the earthquake and the location of the fault with respect to the aftershock distribution and find that the estimated fault slip ranges from 1.5 to 2.5 m with decreasing probability. The marginal distributions of the source parameters show that the location of the western end of the fault is poorly constrained by the data whereas that of the eastern end, located closer to the shore, is better resolved. We propagate the uncertainties of the fault model and calculate the variability of Coulomb failure stress changes for the nearby Kego fault, located directly below Fukuoka city, showing that the main shock increased stress on the fault and brought it closer to failure.
Thibodeau, C; Monette, F; Glaus, M; Laflamme, C B
2011-01-01
The black water and grey water source-separation sanitation system aims at efficient use of energy (biogas), water and nutrients but currently lacks evidence of economic viability to be considered a credible alternative to the conventional system. This study intends to demonstrate economic viability, identify main cost contributors and assess critical influencing factors. A technico-economic model was built based on a new neighbourhood in a Canadian context. Three implementation scales of source-separation system are defined: 500, 5,000 and 50,000 inhabitants. The results show that the source-separation system is 33% to 118% more costly than the conventional system, with the larger cost differential obtained by lower source-separation system implementation scales. A sensitivity analysis demonstrates that vacuum toilet flow reduction from 1.0 to 0.25 L/flush decreases source-separation system cost between 23 and 27%. It also shows that high resource costs can be beneficial or unfavourable to the source-separation system depending on whether the vacuum toilet flow is low or normal. Therefore, the future of this configuration of the source-separation system lies mainly in vacuum toilet flow reduction or the introduction of new efficient effluent volume reduction processes (e.g. reverse osmosis).
Ye, Ye; Wagner, Michael M; Cooper, Gregory F; Ferraro, Jeffrey P; Su, Howard; Gesteland, Per H; Haug, Peter J; Millett, Nicholas E; Aronis, John M; Nowalk, Andrew J; Ruiz, Victor M; López Pineda, Arturo; Shi, Lingyun; Van Bree, Rudy; Ginter, Thomas; Tsui, Fuchiang
2017-01-01
Objectives This study evaluates the accuracy and transferability of Bayesian case detection systems (BCD) that use clinical notes from the emergency department (ED) to detect influenza cases. Methods A BCD uses natural language processing (NLP) to infer the presence or absence of clinical findings from ED notes, which are fed into a Bayesian network classifier (BN) to infer patients' diagnoses. We developed BCDs at the University of Pittsburgh Medical Center (BCDUPMC) and Intermountain Healthcare in Utah (BCDIH). At each site, we manually built a rule-based NLP and trained a Bayesian network classifier from over 40,000 ED encounters between Jan. 2008 and May 2010 using feature selection, machine learning, and an expert debiasing approach. Transferability of a BCD in this study may be impacted by seven factors: development (source) institution, development parser, application (target) institution, application parser, NLP transfer, BN transfer, and classification task. We employed an ANOVA analysis to study their impacts on BCD performance. Results Both BCDs discriminated well between influenza and non-influenza on local test cases (AUCs > 0.92). When tested for transferability using the other institution's cases, BCDUPMC discriminations declined minimally (AUC decreased from 0.95 to 0.94, p<0.01), and BCDIH discriminations declined more (from 0.93 to 0.87, p<0.0001). We attributed the BCDIH decline to the lower recall of the IH parser on UPMC notes. The ANOVA analysis showed five significant factors: development parser, application institution, application parser, BN transfer, and classification task. Conclusion We demonstrated high influenza case detection performance in two large healthcare systems in two geographically separated regions, providing evidentiary support for the use of automated case detection from routinely collected electronic clinical notes in national influenza surveillance. The transferability could be improved by training the Bayesian network classifier locally and increasing the accuracy of the NLP parser. PMID:28380048
Guo, Yanyong; Li, Zhibin; Wu, Yao; Xu, Chengcheng
2018-06-01
Bicyclists running the red light at crossing facilities increase the potential of colliding with motor vehicles. Exploring the contributing factors could improve the prediction of red-light running probability and help develop countermeasures to reduce such behaviors. However, individuals could have unobserved heterogeneities in running a red light, which make accurate prediction more challenging. Traditional models assume that factor parameters are fixed and cannot capture the varying impacts on red-light running behaviors. In this study, we employed the full Bayesian random parameters logistic regression approach to account for the unobserved heterogeneous effects. Two types of crossing facilities were considered: signalized intersection crosswalks and road segment crosswalks. Electric and conventional bikes were distinguished in the modeling. Data were collected from 16 crosswalks in the urban area of Nanjing, China. Factors such as individual characteristics, road geometric design, environmental features, and traffic variables were examined. Model comparison indicates that the full Bayesian random parameters logistic regression approach is statistically superior to the standard logistic regression model. More red-light runners are predicted at signalized intersection crosswalks than at road segment crosswalks. Factors affecting red-light running behaviors are gender, age, bike type, road width, presence of raised median, separation width, signal type, green ratio, bike and vehicle volume, and average vehicle speed. Factors associated with the unobserved heterogeneity are gender, bike type, signal type, separation width, and bike volume. Copyright © 2018 Elsevier Ltd. All rights reserved.
Self-evaluation of decision-making: A general Bayesian framework for metacognitive computation.
Fleming, Stephen M; Daw, Nathaniel D
2017-01-01
People are often aware of their mistakes, and report levels of confidence in their choices that correlate with objective performance. These metacognitive assessments of decision quality are important for the guidance of behavior, particularly when external feedback is absent or sporadic. However, a computational framework that accounts for both confidence and error detection is lacking. In addition, accounts of dissociations between performance and metacognition have often relied on ad hoc assumptions, precluding a unified account of intact and impaired self-evaluation. Here we present a general Bayesian framework in which self-evaluation is cast as a "second-order" inference on a coupled but distinct decision system, computationally equivalent to inferring the performance of another actor. Second-order computation may ensue whenever there is a separation between internal states supporting decisions and confidence estimates over space and/or time. We contrast second-order computation against simpler first-order models in which the same internal state supports both decisions and confidence estimates. Through simulations we show that second-order computation provides a unified account of different types of self-evaluation often considered in separate literatures, such as confidence and error detection, and generates novel predictions about the contribution of one's own actions to metacognitive judgments. In addition, the model provides insight into why subjects' metacognition may sometimes be better or worse than task performance. We suggest that second-order computation may underpin self-evaluative judgments across a range of domains. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
PMID:28004960
Dai, Sheng-Yun; Xu, Bing; Zhang, Yi; Li, Jian-Yu; Sun, Fei; Shi, Xin-Yuan; Qiao, Yan-Jiang
2016-09-01
Coptis chinensis (Huanglian) is a commonly used traditional Chinese medicine (TCM) herb, and alkaloids are its most important chemical constituents. In the present study, an isocratic reverse phase high performance liquid chromatography (RP-HPLC) method allowing the separation of six alkaloids in Huanglian was developed for the first time under the quality by design (QbD) principles. First, five chromatographic parameters were identified to construct a Plackett-Burman experimental design. The critical resolution, analysis time, and peak width were responses modeled by multivariate linear regression. The results showed that the percentage of acetonitrile, concentration of sodium dodecyl sulfate, and concentration of potassium phosphate monobasic were statistically significant parameters (P < 0.05). Then, the Box-Behnken experimental design was applied to further evaluate the interactions between the three parameters on selected responses. Full quadratic models were built and used to establish the analytical design space. Moreover, the reliability of the design space was estimated by the Bayesian posterior predictive distribution. The optimal separation was predicted at 40% acetonitrile, 1.7 g·mL⁻¹ of sodium dodecyl sulfate and 0.03 mol·mL⁻¹ of potassium phosphate monobasic. Finally, the accuracy profile methodology was used to validate the established HPLC method. The results demonstrated that the QbD concept could be efficiently used to develop a robust RP-HPLC analytical method for Huanglian. Copyright © 2016 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Causal Inference for Spatial Constancy across Saccades
Atsma, Jeroen; Maij, Femke; Koppen, Mathieu; Irwin, David E.; Medendorp, W. Pieter
2016-01-01
Our ability to interact with the environment hinges on creating a stable visual world despite the continuous changes in retinal input. To achieve visual stability, the brain must distinguish the retinal image shifts caused by eye movements and shifts due to movements of the visual scene. This process appears not to be flawless: during saccades, we often fail to detect whether visual objects remain stable or move, which is called saccadic suppression of displacement (SSD). How does the brain evaluate the memorized information of the presaccadic scene and the actual visual feedback of the postsaccadic visual scene in the computations for visual stability? Using a SSD task, we test how participants localize the presaccadic position of the fixation target, the saccade target or a peripheral non-foveated target that was displaced parallel or orthogonal during a horizontal saccade, and subsequently viewed for three different durations. Results showed different localization errors of the three targets, depending on the viewing time of the postsaccadic stimulus and its spatial separation from the presaccadic location. We modeled the data through a Bayesian causal inference mechanism, in which at the trial level an optimal mixing of two possible strategies, integration vs. separation of the presaccadic memory and the postsaccadic sensory signals, is applied. Fits of this model generally outperformed other plausible decision strategies for producing SSD. Our findings suggest that humans exploit a Bayesian inference process with two causal structures to mediate visual stability. PMID:26967730
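The integration-versus-separation mixing described above can be sketched with the standard Gaussian form of Bayesian causal inference. This is a generic textbook formulation, not the fitted model from the paper: the Gaussian noise and prior, the parameter values, and the function name `causal_inference_estimate` are assumptions for illustration.

```python
import math

def causal_inference_estimate(x_mem, x_vis, sd_mem, sd_vis, sd_prior,
                              p_common=0.5, mu_prior=0.0):
    """Return (posterior probability of a common cause, model-averaged estimate).

    x_mem: noisy presaccadic memory of the target position
    x_vis: noisy postsaccadic visual measurement
    Common cause   -> both signals reflect one position (integrate them).
    Separate cause -> the target moved (rely on memory and the prior only).
    """
    vm, vv, vp = sd_mem ** 2, sd_vis ** 2, sd_prior ** 2
    dm, dv = x_mem - mu_prior, x_vis - mu_prior

    # Marginal likelihood of both measurements given one shared position.
    denom = vm * vv + vm * vp + vv * vp
    like_common = math.exp(
        -0.5 * ((x_mem - x_vis) ** 2 * vp + dm ** 2 * vv + dv ** 2 * vm) / denom
    ) / (2 * math.pi * math.sqrt(denom))

    # Marginal likelihood given two independent positions.
    like_separate = math.exp(
        -0.5 * (dm ** 2 / (vm + vp) + dv ** 2 / (vv + vp))
    ) / (2 * math.pi * math.sqrt((vm + vp) * (vv + vp)))

    post_common = like_common * p_common / (
        like_common * p_common + like_separate * (1 - p_common))

    # Precision-weighted estimates under each causal structure.
    fused = (x_mem / vm + x_vis / vv + mu_prior / vp) / (1 / vm + 1 / vv + 1 / vp)
    memory_only = (x_mem / vm + mu_prior / vp) / (1 / vm + 1 / vp)

    estimate = post_common * fused + (1 - post_common) * memory_only
    return post_common, estimate

# Small displacement: integration dominates. Large displacement: separation dominates.
p_near, est_near = causal_inference_estimate(0.0, 0.0, sd_mem=2.0, sd_vis=1.0, sd_prior=10.0)
p_far, est_far = causal_inference_estimate(0.0, 20.0, sd_mem=2.0, sd_vis=1.0, sd_prior=10.0)
```

In this sketch a longer postsaccadic viewing time could be represented by a smaller `sd_vis`, which shifts both the fused estimate and the inferred causal structure.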
A Three-Body Simulation of Kepler-91: A Potential Trojan System
NASA Astrophysics Data System (ADS)
D'Angelo, Bryan Daniel
This paper presents a three-body simulation of Kepler-91 (KIC 8219268) using parameters generated by the EXONEST software package. EXONEST uses Bayesian model testing and Bayesian parameter estimation to model photometric variations and three-body motion. A close examination of the Kepler-91 light curve reveals what appears to be a third dimming event that occurs 60° out of phase with the primary transit of the confirmed planet Kepler-91b, which makes a Trojan planet in the L4 or L5 Lagrange point an enticing explanation. EXONEST is also used to model the radial velocity of Kepler-91 based on the three-body motion. The three-body analysis by EXONEST predicts a Jovian planet with mass 2.54 +/- 0.27 MJ and radius 2.37 +/- 0.25 RJ, and a Trojan planet with mass 0.44 +/- 0.26 MJ and radius 0.86 +/- 0.14 RJ that orbits an average of 60.39 +/- 3.74° out of phase with the Jovian, with a maximum separation angle of 68.4 +/- 43.74° and minimum separation angle of 52.33 +/- 3.74°. Both planets are predicted to have an inclination angle of 67.76 +/- 2.26° and eccentricity 0.073 +/- 0.004. The three-body motion predicts Kepler-91 to have a radial velocity semi-amplitude of 66.75 +/- 38.22 m/s and reduced mass times the sine of the inclination angle (mu sin i) of 0.732 +/- 0.385 MJ.
Imaging Anisotropic Layering with Bayesian Inversion of Multiple Data Types
NASA Astrophysics Data System (ADS)
Bodin, T.; Leiva, J.; Romanowicz, B. A.; Maupin, V.; Yuan, H.
2015-12-01
Anisotropic images of the upper-mantle are usually obtained by analyzing different types of seismic observables, such as surface wave dispersion curves or waveforms, SKS splitting data, or receiver functions. These different data types sample different volumes of the earth, they are sensitive to separate length-scales, and hence are associated with various levels of uncertainties. They are traditionally interpreted separately, and often result in incompatible models. We present a Bayesian inversion approach to jointly invert these different data types. Seismograms for SKS and P phases are directly inverted, thus avoiding intermediate processing steps such as numerical deconvolution or computation of splitting parameters. Probabilistic 1D profiles are obtained with a transdimensional Markov chain Monte Carlo scheme, in which the number of layers, as well as the presence or absence of anisotropy in each layer, are treated as unknown parameters. In this way, seismic anisotropy is only introduced if required by the data. The algorithm is used to resolve both isotropic and anisotropic layering down to a depth of 350 km beneath two seismic stations in North America in two different tectonic settings: the stable Canadian shield (station FFC), and the tectonically active southern Basin and Range Province (station TA-214A). In both cases, the lithosphere-asthenosphere boundary is clearly visible, and marked by a change in direction of the fast axis of anisotropy. Our study confirms that azimuthal anisotropy is a powerful tool for detecting layering in the upper mantle.
NASA Technical Reports Server (NTRS)
Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.
1990-01-01
Neural network learning procedures and statistical classification methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two approaches each have distinct advantages and disadvantages in this classification application.
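The abstract does not state the exact weighting rule, so the sketch below only illustrates one common way to weight data sources by reliability: a log-linear pool in which each source's class posterior is raised to a reliability exponent. The function name `weighted_consensus` and the example numbers are hypothetical.

```python
import math

def weighted_consensus(posteriors, weights):
    """Combine per-source class posteriors with reliability weights.

    posteriors: one list of class probabilities per data source.
    weights:    one reliability exponent per source (0 ignores the source).
    Returns the normalized combined class probabilities.
    """
    n_classes = len(posteriors[0])
    log_scores = [0.0] * n_classes
    for probs, weight in zip(posteriors, weights):
        for k in range(n_classes):
            # Floor probabilities to avoid log(0) for classes a source rules out.
            log_scores[k] += weight * math.log(max(probs[k], 1e-12))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Two made-up sources that disagree about a two-class pixel.
spectral = [0.9, 0.1]
topographic = [0.2, 0.8]
equal = weighted_consensus([spectral, topographic], [1.0, 1.0])
downweighted = weighted_consensus([spectral, topographic], [0.1, 1.0])
```

With equal weights the confident first source dominates; reducing its weight lets the second source decide the class, which is the kind of control over source influence the abstract describes.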
On the Occurrence of Wide Binaries in the Local Disk and Halo Populations
NASA Astrophysics Data System (ADS)
Hartman, Zachary; Lepine, Sebastien
2018-01-01
We present results from our search for wide binaries in the SUPERBLINK+GAIA all-sky catalog of 2.8 million high proper motion stars (μ>40 mas/yr). Through a Bayesian analysis of common proper motion pairs, we have identified highly probable wide binary/multiple systems based on statistics of their proper motion differences and angular separations. Using a reduced proper motion diagram, we determine whether these wide binaries are part of the young disk, old disk, or Galactic halo population. We examine the relative occurrence rate for very wide companions in these respective populations. All groups are found to contain a significant number of wide binary systems, with about 1 percent of the stars in each group having pairs with separations >1,000 AU.
Bayesian data analysis for newcomers.
Kruschke, John K; Liddell, Torrin M
2018-02-01
This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
NASA Astrophysics Data System (ADS)
Wellen, Christopher; Arhonditsis, George B.; Labencki, Tanya; Boyd, Duncan
2012-10-01
Regression-type, hybrid empirical/process-based models (e.g., SPARROW, PolFlow) have assumed a prominent role in efforts to estimate the sources and transport of nutrient pollution at river basin scales. However, almost no attempts have been made to explicitly accommodate interannual nutrient loading variability in their structure, despite empirical and theoretical evidence indicating that the associated source/sink processes are quite variable at annual timescales. In this study, we present two methodological approaches to accommodate interannual variability with the Spatially Referenced Regressions on Watershed attributes (SPARROW) nonlinear regression model. The first strategy uses the SPARROW model to estimate a static baseline load and climatic variables (e.g., precipitation) to drive the interannual variability. The second approach allows the source/sink processes within the SPARROW model to vary at annual timescales using dynamic parameter estimation techniques akin to those used in dynamic linear models. Model parameterization is founded upon Bayesian inference techniques that explicitly consider calibration data and model uncertainty. Our case study is the Hamilton Harbor watershed, a mixed agricultural and urban residential area located at the western end of Lake Ontario, Canada. Our analysis suggests that dynamic parameter estimation is the more parsimonious of the two strategies tested and can offer insights into the temporal structural changes associated with watershed functioning. Consistent with empirical and theoretical work, model estimated annual in-stream attenuation rates varied inversely with annual discharge. Estimated phosphorus source areas were concentrated near the receiving water body during years of high in-stream attenuation and dispersed along the main stems of the streams during years of low attenuation, suggesting that nutrient source areas are subject to interannual variability.
NASA Astrophysics Data System (ADS)
Zielke, Olaf; McDougall, Damon; Mai, Martin; Babuska, Ivo
2014-05-01
Seismic data, often augmented with geodetic data, are frequently used to invert for the spatio-temporal evolution of slip along a rupture plane. The resulting images of the slip evolution for a single event, inferred by different research teams, often vary distinctly, depending on the adopted inversion approach and rupture model parameterization. This observation raises the question of which of the provided kinematic source inversion solutions is most reliable and most robust, and, more generally, how accurate fault parameterization and solution predictions are. These issues are not addressed in "standard" source inversion approaches. Here, we present a statistical inversion approach to constrain kinematic rupture parameters from teleseismic body waves. The approach is based a) on a forward-modeling scheme that computes synthetic (body-)waves for a given kinematic rupture model, and b) on the QUESO (Quantification of Uncertainty for Estimation, Simulation, and Optimization) library that uses MCMC algorithms and Bayes theorem for sample selection. We present Bayesian inversions for rupture parameters in synthetic earthquakes (i.e. for which the exact rupture history is known) in an attempt to identify the cross-over at which further model discretization (spatial and temporal resolution of the parameter space) no longer yields a decreasing misfit. Identification of this cross-over is of importance as it reveals the resolution power of the studied data set (i.e. teleseismic body waves), enabling one to constrain kinematic earthquake rupture histories of real earthquakes at a resolution that is supported by data. In addition, the Bayesian approach allows for mapping complete posterior probability density functions of the desired kinematic source parameters, thus enabling us to rigorously assess the uncertainties in earthquake source inversions.
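As a rough illustration of how MCMC sampling maps a posterior distribution for a source parameter, the sketch below runs a plain random-walk Metropolis sampler on a one-parameter toy problem. It does not use QUESO or a real body-wave forward model: the exponential `forward` function, the uniform prior bound `prior_max`, and all numbers are assumptions made for the example.

```python
import math
import random

def forward(slip, times):
    """Toy forward model (assumed): amplitude scales with slip and decays in time."""
    return [slip * math.exp(-t) for t in times]

def log_posterior(slip, times, data, noise_sd, prior_max):
    """Gaussian misfit plus a uniform prior on [0, prior_max]."""
    if not 0.0 <= slip <= prior_max:
        return -math.inf
    predicted = forward(slip, times)
    return -0.5 * sum(((d - p) / noise_sd) ** 2 for d, p in zip(data, predicted))

def metropolis(times, data, noise_sd=0.1, prior_max=10.0,
               n_steps=5000, step_sd=0.2, seed=0):
    """Random-walk Metropolis sampler over the single slip parameter."""
    rng = random.Random(seed)
    current = prior_max / 2.0
    current_lp = log_posterior(current, times, data, noise_sd, prior_max)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal, times, data, noise_sd, prior_max)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

# Synthetic test: the "true" slip is known, as in the synthetic earthquakes above.
times = [0.25 * i for i in range(9)]
data = forward(2.0, times)
samples = metropolis(times, data)
kept = samples[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
posterior_sd = math.sqrt(sum((s - posterior_mean) ** 2 for s in kept) / len(kept))
```

The spread of the retained samples (`posterior_sd`) is what gives the uncertainty estimate; a real inversion would have many parameters and a far more expensive forward model.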
Common source-multiple load vs. separate source-individual load photovoltaic system
NASA Technical Reports Server (NTRS)
Appelbaum, Joseph
1989-01-01
A comparison of system performance is made for two possible system setups: (1) individual loads powered by separate solar cell sources; and (2) multiple loads powered by a common solar cell source. A proof for resistive loads is given that shows the advantage of a common source over a separate source photovoltaic system for a large range of loads. For identical loads, both systems perform the same.
Mohammed, Seid; Asfaw, Zeytu G
2018-01-01
The term malnutrition generally refers to both under-nutrition and over-nutrition, but this study uses the term to refer solely to a deficiency of nutrition. In Ethiopia, child malnutrition is one of the most serious public health problems and is the highest in the world. The purpose of the present study was to identify the high risk factors of malnutrition, to test different statistical models for childhood malnutrition, and then to select the preferable model using model comparison criteria. A Bayesian Gaussian regression model was used to analyze the effect of selected socioeconomic, demographic, health and environmental covariates on malnutrition in children under five years old. Inference was made using a Bayesian approach based on Markov Chain Monte Carlo (MCMC) simulation techniques in BayesX. The study found that variables such as sex of the child, preceding birth interval, age of the child, father's education level, source of water, mother's body mass index, sex of the head of household, mother's age at birth, wealth index, birth order, diarrhea, child's size at birth and duration of breast feeding showed significant effects on children's malnutrition in Ethiopia. The age of the child, mother's age at birth and mother's body mass index could also be important factors with a nonlinear effect on child malnutrition in Ethiopia. Thus, the present study emphasizes that special attention should be paid to variables such as sex of the child, preceding birth interval, father's education level, source of water, sex of the head of household, wealth index, birth order, diarrhea, child's size at birth, duration of breast feeding, age of the child, mother's age at birth and mother's body mass index to combat childhood malnutrition in developing countries.
A Bayesian Supertree Model for Genome-Wide Species Tree Reconstruction
De Oliveira Martins, Leonardo; Mallo, Diego; Posada, David
2016-01-01
Current phylogenomic data sets highlight the need for species tree methods able to deal with several sources of gene tree/species tree incongruence. At the same time, we need to make most use of all available data. Most species tree methods deal with single processes of phylogenetic discordance, namely, gene duplication and loss, incomplete lineage sorting (ILS) or horizontal gene transfer. In this manuscript, we address the problem of species tree inference from multilocus, genome-wide data sets regardless of the presence of gene duplication and loss and ILS therefore without the need to identify orthologs or to use a single individual per species. We do this by extending the idea of Maximum Likelihood (ML) supertrees to a hierarchical Bayesian model where several sources of gene tree/species tree disagreement can be accounted for in a modular manner. We implemented this model in a computer program called guenomu whose inputs are posterior distributions of unrooted gene tree topologies for multiple gene families, and whose output is the posterior distribution of rooted species tree topologies. We conducted extensive simulations to evaluate the performance of our approach in comparison with other species tree approaches able to deal with more than one leaf from the same species. Our method ranked best under simulated data sets, in spite of ignoring branch lengths, and performed well on empirical data, as well as being fast enough to analyze relatively large data sets. Our Bayesian supertree method was also very successful in obtaining better estimates of gene trees, by reducing the uncertainty in their distributions. In addition, our results show that under complex simulation scenarios, gene tree parsimony is also a competitive approach once we consider its speed, in contrast to more sophisticated models. PMID:25281847
NASA Astrophysics Data System (ADS)
Frey, M. P.; Stamm, C.; Schneider, M. K.; Reichert, P.
2011-12-01
A distributed hydrological model was used to simulate the distribution of fast runoff formation as a proxy for critical source areas for herbicide pollution in a small agricultural catchment in Switzerland. We tested to what degree predictions based on prior knowledge without local measurements could be improved upon relying on observed discharge. This learning process consisted of five steps: For the prior prediction (step 1), knowledge of the model parameters was coarse and predictions were fairly uncertain. In the second step, discharge data were used to update the prior parameter distribution. Effects of uncertainty in input data and model structure were accounted for by an autoregressive error model. This step decreased the width of the marginal distributions of parameters describing the lower boundary (percolation rates) but hardly affected soil hydraulic parameters. Residual analysis (step 3) revealed model structure deficits. We modified the model, and in the subsequent Bayesian updating (step 4) the widths of the posterior marginal distributions were reduced for most parameters compared to those of the prior. This incremental procedure led to a strong reduction in the uncertainty of the spatial prediction. Thus, despite only using spatially integrated data (discharge), the spatially distributed effect of the improved model structure can be expected to improve the spatially distributed predictions also. The fifth step consisted of a test with independent spatial data on herbicide losses and revealed ambiguous results. The comparison depended critically on the ratio of event to preevent water that was discharged. This ratio cannot be estimated from hydrological data only. The results demonstrate that the value of local data is strongly dependent on a correct model structure. An iterative procedure of Bayesian updating, model testing, and model modification is suggested.
Emerging Concepts of Data Integration in Pathogen Phylodynamics
Baele, Guy; Suchard, Marc A.; Rambaut, Andrew; Lemey, Philippe
2017-01-01
Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504
Burian, Alfred; Kainz, Martin J; Schagerl, Michael; Yasindi, Andrew
2014-06-01
1. The analysis of functional groups with a resolution to the individual species level is a basic requirement to better understand complex interactions in aquatic food webs. Species-specific stable isotope analyses are currently applied to analyse the trophic role of large zooplankton or fish species, but technical constraints complicate their application to smaller-sized plankton. 2. We investigated rotifer food assimilation during a short-term microzooplankton bloom in the East African soda lake Nakuru by developing a method for species-specific sampling of rotifers. 3. The two dominant rotifers, Brachionus plicatilis and Brachionus dimidiatus, were separated to single-species samples (purity >95%) and significantly differed in their isotopic values (4.1‰ in δ13C and 1.5‰ in δ15N). Bayesian mixing models indicated that isotopic differences were caused by different assimilation of filamentous cyanobacteria and particles <2 μm and underlined the importance of species-specific sampling of smaller plankton compartments. 4. A main difference was that the filamentous cyanobacterium Arthrospira fusiformis, which frequently forms blooms in African soda lakes, was an important food source for the larger-sized B. plicatilis (48%), whereas it was hardly ingested by B. dimidiatus. Overall, A. fusiformis was, relative to its biomass, assimilated to small extents, demonstrating a high grazing resistance of this species. 5. In combination with high population densities, these results demonstrate a strong potential of rotifer blooms to shape phytoplankton communities and are the first in situ demonstration of a quantitatively important direct trophic link between rotifers and filamentous cyanobacteria.
Burian, Alfred; Kainz, Martin J; Schagerl, Michael; Yasindi, Andrew
2014-01-01
1. The analysis of functional groups with a resolution to the individual species level is a basic requirement to better understand complex interactions in aquatic food webs. Species-specific stable isotope analyses are currently applied to analyse the trophic role of large zooplankton or fish species, but technical constraints complicate their application to smaller-sized plankton. 2. We investigated rotifer food assimilation during a short-term microzooplankton bloom in the East African soda lake Nakuru by developing a method for species-specific sampling of rotifers. 3. The two dominant rotifers, Brachionus plicatilis and Brachionus dimidiatus, were separated to single-species samples (purity >95%) and significantly differed in their isotopic values (4.1‰ in δ13C and 1.5‰ in δ15N). Bayesian mixing models indicated that isotopic differences were caused by different assimilation of filamentous cyanobacteria and particles <2 μm and underlined the importance of species-specific sampling of smaller plankton compartments. 4. A main difference was that the filamentous cyanobacterium Arthrospira fusiformis, which frequently forms blooms in African soda lakes, was an important food source for the larger-sized B. plicatilis (48%), whereas it was hardly ingested by B. dimidiatus. Overall, A. fusiformis was, relative to its biomass, assimilated to small extents, demonstrating a high grazing resistance of this species. 5. In combination with high population densities, these results demonstrate a strong potential of rotifer blooms to shape phytoplankton communities and are the first in situ demonstration of a quantitatively important direct trophic link between rotifers and filamentous cyanobacteria. PMID:25866422
Meece, Jennifer K.; Anderson, Jennifer L.; Fisher, Matthew C.; Henk, Daniel A.; Sloss, Brian L.; Reed, Kurt D.
2011-01-01
Blastomyces dermatitidis, a thermally dimorphic fungus, is the etiologic agent of North American blastomycosis. Clinical presentation is varied, ranging from silent infections to fulminant respiratory disease and dissemination to skin and other sites. Exploration of the population genetic structure of B. dermatitidis would improve our knowledge regarding variation in virulence phenotypes, geographic distribution, and difference in host specificity. The objective of this study was to develop and test a panel of microsatellite markers to delineate the population genetic structure within a group of clinical and environmental isolates of B. dermatitidis. We developed 27 microsatellite markers and genotyped B. dermatitidis isolates from various hosts and environmental sources (n=112). Assembly of a neighbor-joining tree of allele-sharing distance revealed two genetically distinct groups, separated by a deep node. Bayesian admixture analysis showed that two populations were statistically supported. Principal coordinate analysis also reinforced support for two genetic groups, with the primary axis explaining 61.41% of the genetic variability. Group 1 isolates average 1.8 alleles/locus, whereas group 2 isolates are highly polymorphic, averaging 8.2 alleles/locus. In this data set, alleles at three loci are unshared between the two groups and appear diagnostic. The mating type of individual isolates was determined by PCR. Both mating type-specific genes, the HMG and α-box domains, were represented in each of the genetic groups, with slightly more isolates having the HMG allele. One interpretation of this study is that the species currently designated B. dermatitidis includes a cryptic subspecies or perhaps a separate species.
Speeded Reaching Movements around Invisible Obstacles
Hudson, Todd E.; Wolfe, Uta; Maloney, Laurence T.
2012-01-01
We analyze the problem of obstacle avoidance from a Bayesian decision-theoretic perspective using an experimental task in which reaches around a virtual obstacle were made toward targets on an upright monitor. Subjects received monetary rewards for touching the target and incurred losses for accidentally touching the intervening obstacle. The locations of target-obstacle pairs within the workspace were varied from trial to trial. We compared human performance to that of a Bayesian ideal movement planner (who chooses motor strategies maximizing expected gain) using the Dominance Test employed in Hudson et al. (2007). The ideal movement planner suffers from the same sources of noise as the human, but selects movement plans that maximize expected gain in the presence of that noise. We find good agreement between the predictions of the model and actual performance in most but not all experimental conditions. PMID:23028276
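A one-dimensional sketch can make the idea of an ideal movement planner concrete: with Gaussian endpoint noise, the planner picks the aim point that maximizes expected gain given a reward region and a penalty region. The geometry, noise level, and payoffs below are invented for illustration and are not the experimental values; the actual task involved reach trajectories around an obstacle, which this sketch does not model.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def expected_gain(aim, sd, target, obstacle, reward, penalty):
    """Expected gain when endpoints scatter as N(aim, sd^2).

    target and obstacle are (low, high) intervals; penalty is negative.
    """
    p_target = normal_cdf(target[1], aim, sd) - normal_cdf(target[0], aim, sd)
    p_obstacle = normal_cdf(obstacle[1], aim, sd) - normal_cdf(obstacle[0], aim, sd)
    return reward * p_target + penalty * p_obstacle

def best_aim_point(sd, target, obstacle, reward, penalty, lo=-1.0, hi=2.0, n=301):
    """Grid search for the aim point with maximum expected gain."""
    candidates = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(candidates,
               key=lambda aim: expected_gain(aim, sd, target, obstacle, reward, penalty))

target = (0.0, 1.0)     # hypothetical reward region
obstacle = (-1.0, 0.0)  # hypothetical penalty region adjacent to the target
no_penalty_aim = best_aim_point(0.3, target, obstacle, reward=1.0, penalty=0.0)
with_penalty_aim = best_aim_point(0.3, target, obstacle, reward=1.0, penalty=-5.0)
```

Without a penalty the best aim point is the target center; adding a penalty shifts the optimal aim away from the obstacle, trading a slightly lower hit probability for fewer costly collisions.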
Rifai Chai; Naik, Ganesh R; Tran, Yvonne; Sai Ho Ling; Craig, Ashley; Nguyen, Hung T
2015-08-01
An electroencephalography (EEG)-based countermeasure device could be used for fatigue detection during driving. This paper explores the classification of fatigue and alert states using power spectral density (PSD) as a feature extractor and fuzzy swarm based-artificial neural network (ANN) as a classifier. An independent component analysis of entropy rate bound minimization (ICA-ERBM) is investigated as a novel source separation technique for fatigue classification using EEG analysis. A comparison of the classification accuracy of source separator versus no source separator is presented. Classification performance based on 43 participants without the inclusion of the source separator resulted in an overall sensitivity of 71.67%, a specificity of 75.63% and an accuracy of 73.65%. However, these results were improved after the inclusion of a source separator module, resulting in an overall sensitivity of 78.16%, a specificity of 79.60% and an accuracy of 78.88% (p < 0.05).
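For readers unfamiliar with the three reported metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts. The counts are made up for illustration and are not the study's data; treating "fatigue" as the positive class is also an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: fatigue epochs classified as fatigue   fn: fatigue epochs classified as alert
    tn: alert epochs classified as alert       fp: alert epochs classified as fatigue
    """
    sensitivity = tp / (tp + fn)   # fraction of fatigue states detected
    specificity = tn / (tn + fp)   # fraction of alert states correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for 100 fatigue and 100 alert epochs.
sens, spec, acc = classification_metrics(tp=78, fn=22, tn=80, fp=20)
```

With balanced classes, accuracy is the average of sensitivity and specificity, which is consistent with the pattern of the percentages quoted in the abstract.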
Hong S. He; Daniel C. Dey; Xiuli Fan; Mevin B. Hooten; John M. Kabrick; Christopher K. Wikle; Zhaofei Fan
2007-01-01
In the Midwestern United States, the General Land Office (GLO) survey records provide the only reasonably accurate data source of forest composition and tree species distribution at the time of pre-European settlement (circa late 1800 to early 1850). However, GLO data have two fundamental limitations: coarse spatial resolutions (the square mile section and half mile...
Classification of Maize and Weeds by Bayesian Networks
NASA Astrophysics Data System (ADS)
Chapron, Michel; Oprea, Alina; Sultana, Bogdan; Assemat, Louis
2007-11-01
Precision Agriculture is concerned with all sorts of within-field variability, spatial and temporal, that reduces the efficacy of agronomic practices applied uniformly over the whole field. Because of these sources of heterogeneity, uniform management actions strongly reduce the efficiency of the resource inputs to the crop (e.g. fertilizer, water) and of the agrochemicals used for pest control (e.g. herbicide). Moreover, this low efficacy means high environmental cost (pollution) and reduced economic return for the farmer. Weed plants are one of these sources of variability for the crop, as they occur in patches in the field. Detecting the location, size and internal density of these patches, along with identifying the main weed species involved, opens the way to a site-specific weed control strategy, where only patches of weeds would receive the appropriate herbicide (type and dose). Herein, an automatic recognition method for plant species is described. First, the pixels are classified into two classes, soil and vegetation; then the vegetation part of the input image is segmented from the distance image using the watershed method; and finally the leaves of the vegetation are partitioned into two classes, maize and weeds, using two Bayesian networks.
Derbridge, Jonathan J.; Merkle, Jerod A.; Bucci, Melanie E.; Callahan, Peggy; Koprowski, John L.; Polfus, Jean L.; Krausman, Paul R.
2015-01-01
Stable isotope analysis of diet has become a common tool in conservation research. However, the multiple sources of uncertainty inherent in this analysis framework involve consequences that have not been thoroughly addressed. Uncertainty arises from the choice of trophic discrimination factors, and for Bayesian stable isotope mixing models (SIMMs), the specification of prior information; the combined effect of these aspects has not been explicitly tested. We used a captive feeding study of gray wolves (Canis lupus) to determine the first experimentally-derived trophic discrimination factors of C and N for this large carnivore of broad conservation interest. Using the estimated diet in our controlled system and data from a published study on wild wolves and their prey in Montana, USA, we then investigated the simultaneous effect of discrimination factors and prior information on diet reconstruction with Bayesian SIMMs. Discrimination factors for gray wolves and their prey were 1.97‰ for δ13C and 3.04‰ for δ15N. Specifying wolf discrimination factors, as opposed to the commonly used red fox (Vulpes vulpes) factors, made little practical difference to estimates of wolf diet, but prior information had a strong effect on bias, precision, and accuracy of posterior estimates. Without specifying prior information in our Bayesian SIMM, it was not possible to produce SIMM posteriors statistically similar to the estimated diet in our controlled study or the diet of wild wolves. Our study demonstrates the critical effect of prior information on estimates of animal diets using Bayesian SIMMs, and suggests species-specific trophic discrimination factors are of secondary importance. When using stable isotope analysis to inform conservation decisions, researchers should understand the limits of their data. It may be difficult to obtain useful information from SIMMs if informative priors are omitted and species-specific discrimination factors are unavailable. PMID:25803664
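The interplay of discrimination factor and prior described in this abstract can be illustrated with a deliberately reduced sketch: a two-source, one-isotope mixing model evaluated on a grid. This is not the authors' SIMM. The function name `mixing_posterior`, the source signatures, the Gaussian error with standard deviation `sigma`, and the example prior are all assumptions made for illustration; only the 1.97‰ δ13C discrimination factor is taken from the abstract.

```python
import math

def mixing_posterior(delta_consumer, delta_a, delta_b, tdf, sigma,
                     prior=None, n_grid=101):
    """Grid posterior for the diet proportion p of source A.

    Assumed model (illustrative only):
        delta_consumer ~ Normal(p*delta_a + (1-p)*delta_b + tdf, sigma)
    where tdf is the trophic discrimination factor.
    """
    if prior is None:
        prior = lambda p: 1.0  # flat prior on [0, 1]
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    weights = []
    for p in grid:
        expected = p * delta_a + (1.0 - p) * delta_b + tdf
        loglik = -0.5 * ((delta_consumer - expected) / sigma) ** 2
        weights.append(prior(p) * math.exp(loglik))
    total = sum(weights)
    post = [w / total for w in weights]          # normalized posterior
    mean = sum(p * w for p, w in zip(grid, post))  # posterior mean of p
    return grid, post, mean

# Hypothetical source signatures (per mil); consumer value chosen so that
# the flat-prior answer is an even mix of the two sources.
_, _, flat_mean = mixing_posterior(-21.03, -20.0, -26.0, 1.97, 1.0)
_, _, informed_mean = mixing_posterior(-21.03, -20.0, -26.0, 1.97, 1.0,
                                       prior=lambda p: p ** 4)
print(flat_mean, informed_mean)
```

With a single isotope and a broad likelihood, the prior pulls the posterior mean noticeably, which is the qualitative behavior the abstract reports for prior information.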
A fast Bayesian approach to discrete object detection in astronomical data sets - PowellSnakes I
NASA Astrophysics Data System (ADS)
Carvalho, Pedro; Rocha, Graça; Hobson, M. P.
2009-03-01
A new fast Bayesian approach is introduced for the detection of discrete objects immersed in a diffuse background. This new method, called PowellSnakes, speeds up traditional Bayesian techniques by (i) replacing the standard form of the likelihood for the parameters characterizing the discrete objects by an alternative exact form that is much quicker to evaluate; (ii) using a simultaneous multiple minimization code based on Powell's direction set algorithm to locate rapidly the local maxima in the posterior and (iii) deciding whether each located posterior peak corresponds to a real object by performing a Bayesian model selection using an approximate evidence value based on a local Gaussian approximation to the peak. The construction of this Gaussian approximation also provides the covariance matrix of the uncertainties in the derived parameter values for the object in question. This new approach provides a speed-up in performance by a factor of 100 compared with existing Bayesian source extraction methods that use Markov chain Monte Carlo to explore the parameter space, such as that presented by Hobson & McLachlan. The method can be implemented in either real or Fourier space. In the case of objects embedded in a homogeneous random field, working in Fourier space provides a further speed-up that takes advantage of the fact that the correlation matrix of the background is circulant. We illustrate the capabilities of the method by applying it to some simplified toy models. Furthermore, PowellSnakes has the advantage of consistently defining the threshold for acceptance/rejection based on priors, which cannot be said of the frequentist methods. We present here the first implementation of this technique (version I). Further improvements to this implementation are currently under investigation and will be published shortly. The application of the method to realistic simulated Planck observations will be presented in a forthcoming publication.
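Step (iii) above relies on a local Gaussian (Laplace) approximation to a posterior peak. A minimal one-parameter sketch of that idea follows; it is not the PowellSnakes code. The function name `laplace_log_evidence`, the finite-difference step `h`, and the assumption that the peak location is already known (PowellSnakes locates it with Powell's method) are illustrative choices.

```python
import math

def laplace_log_evidence(log_post, x_peak, h=1e-3):
    """Approximate log of the integral of exp(log_post(x)) over x by
    fitting a Gaussian at a known peak of the unnormalized log posterior.

    Returns (log_evidence, variance); the variance is the 1-D analogue
    of the parameter covariance matrix mentioned in the abstract.
    """
    # Curvature at the peak from a central finite difference.
    d2 = (log_post(x_peak + h) - 2.0 * log_post(x_peak)
          + log_post(x_peak - h)) / h ** 2
    if d2 >= 0.0:
        raise ValueError("x_peak is not a local maximum of log_post")
    variance = -1.0 / d2
    # Peak height times the width of the fitted Gaussian.
    log_evidence = log_post(x_peak) + 0.5 * math.log(2.0 * math.pi * variance)
    return log_evidence, variance

# Usage on a toy unnormalized log posterior (hypothetical numbers).
toy = lambda x: math.log(3.0) - 0.5 * ((x - 1.0) / 2.0) ** 2
print(laplace_log_evidence(toy, 1.0))
```

For an exactly Gaussian peak the approximation is exact, which makes the toy case a convenient check; for skewed or multimodal peaks it is only a local estimate.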
Almost but not quite 2D, Non-linear Bayesian Inversion of CSEM Data
NASA Astrophysics Data System (ADS)
Ray, A.; Key, K.; Bodin, T.
2013-12-01
The geophysical inverse problem can be elegantly stated in a Bayesian framework where a probability distribution can be viewed as a statement of information regarding a random variable. After all, the goal of geophysical inversion is to provide information on the random variables of interest - physical properties of the earth's subsurface. However, though it may be simple to postulate, a practical difficulty of fully non-linear Bayesian inversion is the computer time required to adequately sample the model space and extract the information we seek. As a consequence, in geophysical problems where evaluation of a full 2D/3D forward model is computationally expensive, such as marine controlled source electromagnetic (CSEM) mapping of the resistivity of seafloor oil and gas reservoirs, Bayesian studies have largely been conducted with 1D forward models. While the 1D approximation is indeed appropriate for exploration targets with planar geometry and geological stratification, it only provides a limited, site-specific idea of uncertainty in resistivity with depth. In this work, we extend our fully non-linear 1D Bayesian inversion to a 2D model framework, without requiring the usual regularization of model resistivities in the horizontal or vertical directions used to stabilize quasi-2D inversions. In our approach, we use the reversible jump Markov-chain Monte-Carlo (RJ-MCMC) or trans-dimensional method and parameterize the subsurface in a 2D plane with Voronoi cells. The method is trans-dimensional in that the number of cells required to parameterize the subsurface is variable, and the cells dynamically move around and multiply or combine as demanded by the data being inverted. This approach allows us to expand our uncertainty analysis of resistivity at depth to more than a single site location, allowing for interactions between model resistivities at different horizontal locations along a traverse over an exploration target. 
While the model is parameterized in 2D, we efficiently evaluate the forward response using 1D profiles extracted from the model at the common-midpoints of the EM source-receiver pairs. Since the 1D approximation is locally valid at different midpoint locations, the computation time is far lower than is required by a full 2D or 3D simulation. We have applied this method to both synthetic and real CSEM survey data from the Scarborough gas field on the Northwest shelf of Australia, resulting in a spatially variable quantification of resistivity and its uncertainty in 2D. This Bayesian approach results in a large database of 2D models that comprise a posterior probability distribution, which we can subset to test various hypotheses about the range of model structures compatible with the data. For example, we can subset the model distributions to examine the hypothesis that a resistive reservoir extends over a certain spatial extent. Depending on how this conditions other parts of the model space, light can be shed on the geological viability of the hypothesis. Since tackling spatially variable uncertainty and trade-offs in 2D and 3D is a challenging research problem, the insights gained from this work may prove valuable for subsequent full 2D and 3D Bayesian inversions.
Schirtzinger, Erin E.; Matsumoto, Tania; Eberhard, Jessica R.; Graves, Gary R.; Sanchez, Juan J.; Capelli, Sara; Müller, Heinrich; Scharpegge, Julia; Chambers, Geoffrey K.; Fleischer, Robert C.
2008-01-01
The question of when modern birds (Neornithes) first diversified has generated much debate among avian systematists. Fossil evidence generally supports a Tertiary diversification, whereas estimates based on molecular dating favor an earlier diversification in the Cretaceous period. In this study, we used an alternate approach, the inference of historical biogeographic patterns, to test the hypothesis that the initial radiation of the Order Psittaciformes (the parrots and cockatoos) originated on the Gondwana supercontinent during the Cretaceous. We utilized broad taxonomic sampling (representatives of 69 of the 82 extant genera and 8 outgroup taxa) and multilocus molecular character sampling (3,941 bp from mitochondrial DNA (mtDNA) genes cytochrome oxidase I and NADH dehydrogenase 2 and nuclear introns of rhodopsin intron 1, tropomyosin alpha-subunit intron 5, and transforming growth factor β-2) to generate phylogenetic hypotheses for the Psittaciformes. Analyses of the combined character partitions using maximum parsimony, maximum likelihood, and Bayesian criteria produced well-resolved and topologically similar trees in which the New Zealand taxa Strigops and Nestor (Psittacidae) were sister to all other psittaciforms and the cockatoo clade (Cacatuidae) was sister to a clade containing all remaining parrots (Psittacidae). Within this large clade of Psittacidae, some traditionally recognized tribes and subfamilies were monophyletic (e.g., Arini, Psittacini, and Loriinae), whereas several others were polyphyletic (e.g., Cyclopsittacini, Platycercini, Psittaculini, and Psittacinae). Ancestral area reconstructions using our Bayesian phylogenetic hypothesis and current distributions of genera supported the hypothesis of an Australasian origin for the Psittaciformes. 
Separate analyses of the timing of parrot diversification constructed with both Bayesian relaxed-clock and penalized likelihood approaches showed better agreement between geologic and diversification events in the chronograms based on a Cretaceous dating of the basal split within parrots than the chronograms based on a Tertiary dating of this split, although these data are more equivocal. Taken together, our results support a Cretaceous origin of Psittaciformes in Gondwana after the separation of Africa and the India/Madagascar block with subsequent diversification through both vicariance and dispersal. These well-resolved molecular phylogenies will be of value for comparative studies of behavior, ecology, and life history in parrots. PMID:18653733
An Anticipatory Model of Cavitation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Allgood, G.O.; Dress, W.B., Jr.; Hylton, J.O.
1999-04-05
The Anticipatory System (AS) formalism developed by Robert Rosen provides some insight into the problem of embedding intelligent behavior in machines. AS emulates the anticipatory behavior of biological systems. AS bases its behavior on its expectations about the near future and those expectations are modified as the system gains experience. The expectation is based on an internal model that is drawn from an appeal to physical reality. To be adaptive, the model must be able to update itself. To be practical, the model must run faster than real-time. The need for a physical model and the requirement that the model execute at extreme speeds have held back the application of AS to practical problems. Two recent advances make it possible to consider the use of AS for practical intelligent sensors. First, advances in transducer technology make it possible to obtain previously unavailable data from which a model can be derived. For example, acoustic emissions (AE) can be fed into a Bayesian system identifier that enables the separation of a weak characterizing signal, such as the signature of pump cavitation precursors, from a strong masking signal, such as a pump vibration feature. The second advance is the development of extremely fast, but inexpensive, digital signal processing hardware on which it is possible to run an adaptive Bayesian-derived model faster than real-time. This paper reports the investigation of an AS using a model of cavitation based on hydrodynamic principles and Bayesian analysis of data from high-performance AE sensors.
Calibrated birth-death phylogenetic time-tree priors for Bayesian inference.
Heled, Joseph; Drummond, Alexei J
2015-05-01
Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python.
Wiecki, Thomas V; Sofer, Imri; Frank, Michael J
2013-01-01
The diffusion model is a commonly used tool to infer latent psychological processes underlying decision-making, and to link them to neural mechanisms based on response times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of response time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model), which allows fast and flexible estimation of the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject/condition than non-hierarchical methods, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g., fMRI) influence decision-making parameters. This paper will first describe the theoretical background of the drift diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the χ²-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs/
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.
dos Reis, Mario; Yang, Ziheng
2011-07-01
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
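The effect of a parameter transform on a Taylor-approximated likelihood can be seen in a small sketch. This is not the authors' implementation: the helper `taylor_loglik`, the finite-difference derivatives, and the toy log-likelihood k·log(t) − n·t are assumptions chosen for illustration, and the logarithm is used as the transform because the abstract does not give the form of the arcsine-based one.

```python
import math

def taylor_loglik(loglik, theta_hat, theta, transform=None, inverse=None, h=1e-4):
    """Second-order Taylor approximation of loglik at theta, expanded
    around theta_hat in a (possibly transformed) parameter u = transform(theta).
    """
    if transform is None or inverse is None:
        transform = inverse = lambda x: x  # expand in the original parameter
    f = lambda u: loglik(inverse(u))       # log-likelihood as a function of u
    u_hat, u = transform(theta_hat), transform(theta)
    # Gradient and curvature at the expansion point by central differences.
    grad = (f(u_hat + h) - f(u_hat - h)) / (2.0 * h)
    hess = (f(u_hat + h) - 2.0 * f(u_hat) + f(u_hat - h)) / h ** 2
    du = u - u_hat
    return f(u_hat) + grad * du + 0.5 * hess * du ** 2

# Toy log-likelihood with its peak at t_hat = k / n (hypothetical numbers).
k, n = 10.0, 100.0
loglik = lambda t: k * math.log(t) - n * t
t_hat, t_far = k / n, 0.2

exact = loglik(t_far)
plain = taylor_loglik(loglik, t_hat, t_far)
logged = taylor_loglik(loglik, t_hat, t_far, transform=math.log, inverse=math.exp)
print(abs(plain - exact), abs(logged - exact))
```

In this toy case the expansion in the log-transformed parameter stays closer to the exact curve away from the peak, which is the kind of improvement the abstract attributes to parameter transforms.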
Bayesian networks for satellite payload testing
NASA Astrophysics Data System (ADS)
Przytula, Krzysztof W.; Hagen, Frank; Yung, Kar
1999-11-01
Satellite payloads are fast increasing in complexity, resulting in commensurate growth in the cost of manufacturing and operation. A need exists for a software tool that would assist engineers in the production and operation of satellite systems. We have designed and implemented a software tool that performs part of this task. The tool aids a test engineer in debugging satellite payloads during system testing. At this stage of satellite integration and testing, both the tested payload and the testing equipment represent complicated systems consisting of a very large number of components and devices. When an error is detected during execution of a test procedure, the tool presents to the engineer a ranked list of potential sources of the error and a list of recommended further tests. On this basis, the engineer decides whether to perform some of the recommended additional tests or to replace the suspect component. The tool has been installed in a payload testing facility. The tool is based on Bayesian networks, a graphical method of representing uncertainty in terms of probabilistic influences. The Bayesian network was configured using detailed flow diagrams of testing procedures and block diagrams of the payload and testing hardware. The conditional and prior probability values were initially obtained from experts and refined in later stages of design. The Bayesian network provided a very informative model of the payload and testing equipment and inspired many new ideas regarding the future test procedures and testing equipment configurations. The tool is the first step in developing a family of tools for various phases of satellite integration and operation.
NASA Astrophysics Data System (ADS)
Han, Feng; Zheng, Yi
2018-06-01
Significant input uncertainty is a major source of error in watershed water quality (WWQ) modeling. It remains challenging to address the input uncertainty in a rigorous Bayesian framework. This study develops the Bayesian Analysis of Input and Parametric Uncertainties (BAIPU), an approach for the joint analysis of input and parametric uncertainties through a tight coupling of Markov Chain Monte Carlo (MCMC) analysis and Bayesian Model Averaging (BMA). The formal likelihood function for this approach is derived considering a lag-1 autocorrelated, heteroscedastic, and Skew Exponential Power (SEP) distributed error model. A series of numerical experiments were performed based on a synthetic nitrate pollution case and on a real study case in the Newport Bay Watershed, California. The Soil and Water Assessment Tool (SWAT) and Differential Evolution Adaptive Metropolis (DREAM(ZS)) were used as the representative WWQ model and MCMC algorithm, respectively. The major findings include the following: (1) the BAIPU can be implemented and used to appropriately identify the uncertain parameters and characterize the predictive uncertainty; (2) the compensation effect between the input and parametric uncertainties can seriously mislead modeling-based management decisions if the input uncertainty is not explicitly accounted for; (3) the BAIPU accounts for the interaction between the input and parametric uncertainties and therefore provides more accurate calibration and uncertainty results than a sequential analysis of the uncertainties; and (4) the BAIPU quantifies the credibility of different input assumptions on a statistical basis and can be implemented as an effective inverse modeling approach to the joint inference of parameters and inputs.
A Bayesian approach to model structural error and input variability in groundwater modeling
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.; Lin, Y. F. F.; Liang, F.
2015-12-01
Effective water resource management typically relies on numerical models to analyze groundwater flow and solute transport processes. Model structural error (due to simplification and/or misrepresentation of the "true" environmental system) and input forcing variability (which commonly arises since some inputs are uncontrolled or estimated with high uncertainty) are ubiquitous in groundwater models. Calibration that overlooks errors in model structure and input data can lead to biased parameter estimates and compromised predictions. We present a fully Bayesian approach for a complete assessment of uncertainty for spatially distributed groundwater models. The approach explicitly recognizes stochastic input and uses data-driven error models based on nonparametric kernel methods to account for model structural error. We employ exploratory data analysis to assist in specifying informative prior for error models to improve identifiability. The inference is facilitated by an efficient sampling algorithm based on DREAM-ZS and a parameter subspace multiple-try strategy to reduce the required number of forward simulations of the groundwater model. We demonstrate the Bayesian approach through a synthetic case study of surface-ground water interaction under changing pumping conditions. It is found that explicit treatment of errors in model structure and input data (groundwater pumping rate) has substantial impact on the posterior distribution of groundwater model parameters. Using error models reduces predictive bias caused by parameter compensation. In addition, input variability increases parametric and predictive uncertainty. The Bayesian approach allows for a comparison among the contributions from various error sources, which could inform future model improvement and data collection efforts on how to best direct resources towards reducing predictive uncertainty.
Bayesian methods for uncertainty factor application for derivation of reference values.
Simon, Ted W; Zhu, Yiliang; Dourson, Michael L; Beck, Nancy B
2016-10-01
In 2014, the National Research Council (NRC) published Review of EPA's Integrated Risk Information System (IRIS) Process that considers methods EPA uses for developing toxicity criteria for non-carcinogens. These criteria are the Reference Dose (RfD) for oral exposure and Reference Concentration (RfC) for inhalation exposure. The NRC Review suggested using Bayesian methods for application of uncertainty factors (UFs) to adjust the point of departure dose or concentration to a level considered to be without adverse effects for the human population. The NRC foresaw Bayesian methods would be potentially useful for combining toxicity data from disparate sources (high-throughput assays, animal testing, and observational epidemiology). UFs represent five distinct areas for which both adjustment and consideration of uncertainty may be needed. NRC suggested UFs could be represented as Bayesian prior distributions, illustrated the use of a log-normal distribution to represent the composite UF, and combined this distribution with a log-normal distribution representing uncertainty in the point of departure (POD) to reflect the overall uncertainty. Here, we explore these suggestions and present a refinement of the methodology suggested by NRC that considers each individual UF as a distribution. From an examination of 24 evaluations from EPA's IRIS program, when individual UFs were represented using this approach, the geometric mean fold change in the value of the RfD or RfC increased from 3 to over 30, depending on the number of individual UFs used and the sophistication of the assessment. We present example calculations and recommendations for implementing the refined NRC methodology. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
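The idea of treating each uncertainty factor as its own log-normal distribution can be sketched with the standard result that a product of independent log-normal variables is again log-normal, with log-means and log-variances adding. The sketch below is not the paper's calculation: the helper names, the independence assumption, and the example geometric means and geometric standard deviations are hypothetical.

```python
import math

def combine_lognormal(factors):
    """Composite of independent log-normal factors.

    factors: iterable of (geometric_mean, geometric_sd) pairs.
    Returns (geometric_mean, geometric_sd) of their product.
    """
    factors = list(factors)
    mu = sum(math.log(gm) for gm, _ in factors)          # log-means add
    var = sum(math.log(gsd) ** 2 for _, gsd in factors)  # log-variances add
    return math.exp(mu), math.exp(math.sqrt(var))

def lognormal_quantile(gm, gsd, z):
    """Value of a log-normal at standard-normal score z (z=1.645 is ~95th percentile)."""
    return gm * gsd ** z

# Two hypothetical uncertainty factors, each given as (geometric mean, geometric SD).
gm, gsd = combine_lognormal([(3.0, 2.0), (10.0, 2.0)])
print(gm, gsd, lognormal_quantile(gm, gsd, 1.645))
```

Dividing a log-normal point of departure by the composite factor follows the same arithmetic, with the log-mean of the factor subtracted and the log-variances still added.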
Zhang, Xiang; Faries, Douglas E; Boytsov, Natalie; Stamey, James D; Seaman, John W
2016-09-01
Observational studies are frequently used to assess the effectiveness of medical interventions in routine clinical practice. However, the use of observational data for comparative effectiveness is challenged by selection bias and the potential of unmeasured confounding. This is especially problematic for analyses using a health care administrative database, in which key clinical measures are often not available. This paper provides an approach to conducting sensitivity analyses to investigate the impact of unmeasured confounding in observational studies. In a real-world osteoporosis comparative effectiveness study, the bone mineral density (BMD) score, an important predictor of fracture risk and a factor in the selection of osteoporosis treatments, is unavailable in the database, and lack of baseline BMD could potentially lead to significant selection bias. We implemented Bayesian twin-regression models, which simultaneously model both the observed outcome and the unobserved unmeasured confounder, using information from external sources. A sensitivity analysis was also conducted to assess the robustness of our conclusions to changes in such external data. The use of Bayesian modeling in this study suggests that the lack of baseline BMD did have a strong impact on the analysis, reversing the direction of the estimated effect (odds ratio of fracture incidence at 24 months: 0.40 vs. 1.36, with/without adjusting for unmeasured baseline BMD). The Bayesian twin-regression models provide a flexible sensitivity analysis tool to quantitatively assess the impact of unmeasured confounding in observational studies. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Ferguson, Henry
With the end of the Herschel mission and no immediate successor at far-infrared wavelengths, it is imperative to extract as much information as possible from the existing data. The difference between the theoretical noise limit and the confusion limit suggests that significant improvements can be made with a more sophisticated treatment of source confusion. This is possible because we have a lot of information about the Herschel deep fields from other wavelengths. The project will use existing already-reduced data from Herschel's deepest observations, which targeted the CANDELS fields. These fields have a wealth of observations from Hubble, Spitzer, Chandra and many other telescopes. The main work will be to develop and employ a new Bayesian technique that incorporates spectral-energy-distribution priors to constrain the range of likely far-infrared fluxes for each source that is detected by Hubble. The far-IR images are then segmented and the regions which are likely to suffer the most confusion are simultaneously fit, using the (broad) constraints on the likely far-IR fluxes as a Bayesian prior. The first pass of photometry will yield reliable photometry for sources at least a factor of two fainter than existing catalogs. Subsequent passes can yield full probability distributions for the ensemble far-IR SEDs of much fainter sources (overcoming some of the limitations of stacking in image space). We will use the improved and deeper FIR photometry to address two "crises" in reconciling galaxy evolution models with high-z galaxy observations: (1) the surprisingly young ages of most bright Lyman-break galaxies at redshift z=3 and (2) the surprisingly high star-formation rates and dust masses of high-redshift sub-mm and FIR-selected galaxies. The former could potentially be explained if many of the descendants of UV-bright galaxies at z=4 have too much dust by z=3 to be included in Lyman-break samples. 
The latter problem could be resolved if the fluxes of many FIR and sub-mm selected galaxies are affected by blending. The project will employ state-of-the-art semi-analytical models for galaxy evolution, both for guidance in developing flexible Bayesian priors, and for guidance on the interpretation of the results. As part of the work we plan to further test and improve the treatment of dust in these models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, Robert N.; Urban, Marie L.; Duchscherer, Samantha E.
Understanding building occupancy is critical to a wide array of applications including natural hazards loss analysis, green building technologies, and population distribution modeling. Due to the expense of directly monitoring buildings, scientists rely in addition on a wide and disparate array of ancillary and open source information including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods which refer to a loose collection of formal and informal techniques for fusing data together to create viable content for building occupancy estimation. In this paper, we add to the current state of the art by introducing the Population Data Tables (PDT), a Bayesian model and informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that retains in the final estimation uncertainty emerging from data, expert judgment, and model parameterization. PDT probabilistically estimates ambient occupancy in units of people/1000 ft² for over 50 building types at the national and sub-national level with the goal of providing global coverage. The challenge of global coverage led to the development of an interdisciplinary geospatial informatics system tool that provides the framework for capturing, storing, and managing open source data, handling subject matter expertise, carrying out Bayesian analytics as well as visualizing and exporting occupancy estimation results. We present the PDT project, situate the work within the larger community, and report on the progress of this multi-year project.
The Development of Bayesian Theory and Its Applications in Business and Bioinformatics
NASA Astrophysics Data System (ADS)
Zhang, Yifei
2018-03-01
Bayesian Theory originated in an essay by the British mathematician Thomas Bayes in 1763, and after its development in the 20th century, Bayesian Statistics has come to play a significant part in statistical study in all fields. Owing to recent breakthroughs in high-dimensional integration, Bayesian Statistics has been improved and refined, and it can now be used to solve problems that Classical Statistics failed to solve. This paper summarizes the history, concepts, and applications of Bayesian Statistics in five parts: the history of Bayesian Statistics, the weaknesses of Classical Statistics, and Bayesian Theory, its development, and its applications. The first two parts compare Bayesian Statistics and Classical Statistics at a broad level. The last three parts focus on Bayesian Theory specifically, from introducing particular Bayesian concepts to listing their development and finally their applications.
Single-channel mixed signal blind source separation algorithm based on multiple ICA processing
NASA Astrophysics Data System (ADS)
Cheng, Xiefeng; Li, Ji
2017-01-01
Taking the separation of the fetal heart sound signal from the mixed signal obtained with an electronic stethoscope as the research background, the paper puts forward a single-channel mixed-signal blind source separation algorithm based on multiple ICA processing. First, empirical mode decomposition (EMD) is applied to the single-channel mixed signal to obtain multiple orthogonal signal components, which are then processed by ICA. The resulting independent signal components are called the independent sub-components of the mixed signal. Then, by combining these independent sub-components with the single-channel mixed signal, the single-channel signal is expanded into a multichannel signal, which turns the under-determined blind source separation problem into a well-posed one. Next, an estimate of the source signal is obtained by ICA processing. Finally, if the separation is not satisfactory, the previous separation result is combined with the single-channel mixed signal and ICA processing is repeated until the desired estimate of the source signal is obtained. The simulation results show that the algorithm separates single-channel mixed physiological signals well.
Using multilevel spatial models to understand salamander site occupancy patterns after wildfire
Chelgren, Nathan; Adams, Michael J.; Bailey, Larissa L.; Bury, R. Bruce
2011-01-01
Studies of the distribution of elusive forest wildlife have suffered from the confounding of true presence with the uncertainty of detection. Occupancy modeling, which incorporates probabilities of species detection conditional on presence, is an emerging approach for reducing observation bias. However, the current likelihood modeling framework is restrictive for handling unexplained sources of variation in the response that may occur when there are dependence structures such as smaller sampling units that are nested within larger sampling units. We used multilevel Bayesian occupancy modeling to handle dependence structures and to partition sources of variation in occupancy of sites by terrestrial salamanders (family Plethodontidae) within and surrounding an earlier wildfire in western Oregon, USA. Comparison of model fit favored a spatial N-mixture model that accounted for variation in salamander abundance over models that were based on binary detection/non-detection data. Though catch per unit effort was higher in burned areas than unburned, there was strong support that this pattern was due to a higher probability of capture for individuals in burned plots. Within the burn, the odds of capturing an individual given it was present were 2.06 times the odds outside the burn, reflecting reduced complexity of ground cover in the burn. There was weak support that true occupancy was lower within the burned area. While the odds of occupancy in the burn were 0.49 times the odds outside the burn among the five species, the magnitude of variation attributed to the burn was small in comparison to variation attributed to other landscape variables and to unexplained, spatially autocorrelated random variation. 
While ordinary occupancy models may separate the biological pattern of interest from variation in detection probability when all sources of variation are known, the addition of random effects structures for unexplained sources of variation in occupancy and detection probability may often more appropriately represent levels of uncertainty. © 2011 by the Ecological Society of America.
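To make the odds ratios quoted in this abstract concrete, the following minimal Python sketch converts a baseline probability into the probability implied by a given odds ratio. Only the odds ratios 2.06 (capture) and 0.49 (occupancy) come from the abstract; the baseline probabilities outside the burn are hypothetical values chosen for illustration, and the function name is likewise an assumption.

```python
def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Return the probability whose odds equal odds_ratio times the baseline odds."""
    if not 0.0 < p_baseline < 1.0:
        raise ValueError("p_baseline must lie strictly between 0 and 1")
    baseline_odds = p_baseline / (1.0 - p_baseline)
    new_odds = odds_ratio * baseline_odds
    return new_odds / (1.0 + new_odds)


# Hypothetical baselines outside the burn (NOT reported in the abstract).
p_capture_outside = 0.30
p_occupancy_outside = 0.60

# Odds ratios taken from the abstract.
p_capture_burn = apply_odds_ratio(p_capture_outside, 2.06)
p_occupancy_burn = apply_odds_ratio(p_occupancy_outside, 0.49)

print(f"capture probability:   {p_capture_outside:.2f} outside, {p_capture_burn:.2f} in burn")
print(f"occupancy probability: {p_occupancy_outside:.2f} outside, {p_occupancy_burn:.2f} in burn")
```

The sketch shows why an odds ratio is not a ratio of probabilities: the same odds ratio shifts a probability by different amounts depending on the baseline.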
Distinguishing dark matter from unresolved point sources in the Inner Galaxy with photon statistics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Samuel K.; Lisanti, Mariangela; Safdi, Benjamin R.
2015-05-01
Data from the Fermi Large Area Telescope suggests that there is an extended excess of GeV gamma-ray photons in the Inner Galaxy. Identifying potential astrophysical sources that contribute to this excess is an important step in verifying whether the signal originates from annihilating dark matter. In this paper, we focus on the potential contribution of unresolved point sources, such as millisecond pulsars (MSPs). We propose that the statistics of the photons, in particular the flux probability density function (PDF) of the photon counts below the point-source detection threshold, can potentially distinguish between the dark-matter and point-source interpretations. We calculate the flux PDF via the method of generating functions for these two models of the excess. Working in the framework of Bayesian model comparison, we then demonstrate that the flux PDF can potentially provide evidence for an unresolved MSP-like point-source population.
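The idea that photon statistics can separate smooth emission from unresolved point sources can be illustrated with a toy model. The sketch below is not the authors' calculation: it assumes that smooth emission gives Poisson-distributed counts per pixel, and that a point-source population gives compound-Poisson counts (a Poisson number of sources per pixel, each contributing a Poisson number of photons with one fixed mean). The count distribution for the second case follows from its generating function via the standard Panjer recursion. All parameter values and the pixel counts are hypothetical, and the final comparison is a likelihood ratio at fixed parameters rather than a full Bayesian evidence ratio.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing k counts for a given mean (> 0)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def smooth_pmf(max_count, mean):
    """Count distribution for smooth emission: plain Poisson."""
    return [poisson_pmf(k, mean) for k in range(max_count + 1)]


def point_source_pmf(max_count, sources_per_pixel, photons_per_source):
    """Count distribution for a compound-Poisson point-source population.

    Uses the Panjer recursion p_k = (lam / k) * sum_j j * f_j * p_{k-j},
    with p_0 = exp(-lam * (1 - f_0)), where f_j is the per-source pmf.
    """
    f = smooth_pmf(max_count, photons_per_source)
    p = [math.exp(-sources_per_pixel * (1.0 - f[0]))]
    for k in range(1, max_count + 1):
        total = sum(j * f[j] * p[k - j] for j in range(1, k + 1))
        p.append(sources_per_pixel * total / k)
    return p


def log_likelihood(counts, pmf):
    """Sum of log-probabilities of independent pixel counts."""
    return sum(math.log(pmf[c]) for c in counts)


MAX_COUNT = 60
# Both toy models have the same mean of 2 photons per pixel (assumed values).
smooth = smooth_pmf(MAX_COUNT, 2.0)
clumpy = point_source_pmf(MAX_COUNT, sources_per_pixel=0.5, photons_per_source=4.0)

# Hypothetical pixel counts: mostly empty pixels plus a few bright ones.
pixel_counts = [0, 0, 0, 7, 0, 5, 0, 0, 4, 0]
log_ratio = log_likelihood(pixel_counts, clumpy) - log_likelihood(pixel_counts, smooth)
print(f"log likelihood ratio (point sources vs smooth): {log_ratio:.2f}")
```

Both toy models share the same mean, so only the shape of the count distribution (many empty pixels and a few bright ones) drives the comparison.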
Advances in audio source separation and multisource audio content retrieval
NASA Astrophysics Data System (ADS)
Vincent, Emmanuel
2012-06-01
Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in real-world challenging scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present a Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages compared to earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We subsequently present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, both at the training and the decoding stage. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.
Zhao, Yang; Zheng, Wei; Zhuo, Daisy Y; Lu, Yuefeng; Ma, Xiwen; Liu, Hengchang; Zeng, Zhen; Laird, Glen
2017-10-11
Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. A Bayesian approach is used to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fit. A Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings compared to existing methods. We also demonstrate an application of our method in a real clinical trial.
Household food waste separation behavior and the importance of convenience.
Bernstad, Anna
2014-07-01
Two different strategies aiming at increasing household source-separation of food waste were assessed through a case study in a Swedish residential area: (a) use of written information, distributed as leaflets amongst households, and (b) installation of equipment for source-segregation of waste with the aim of increasing the convenience of food waste sorting in kitchens. Weighings of separately collected food waste before and after distribution of written information suggest that this resulted in neither a significantly increased amount of separately collected food waste, nor an increased source-separation ratio. After installation of sorting equipment in households, both the amount of separately collected food waste and the source-separation ratio increased vastly. Long-term monitoring shows that the results were long-lasting. Results emphasize the importance of convenience and the existence of infrastructure necessary for source-segregation of waste as important factors for household waste recycling, but also highlight the need to address these aspects where waste is generated, i.e. already inside the household. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Bayesian Analysis of Scale-Invariant Processes
2012-01-01
Earth Grid (EASE-Grid). The NED raster elevation data of one arc-second resolution (30 m) over the continental US are derived from multiple satellites...instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send...empirical and ME distributions, yet ensuring computational efficiency. Instead of computing empirical histograms from large amount of data, only some
Bayesian Inference for Source Reconstruction: A Real-World Application
2014-09-25
deliberately or accidentally. Two examples of operational monitoring sensor networks are the deployment of biological sensor arrays by the Department of...remarkable paper, Cox [16] demonstrated that probability theory, when interpreted as logic, is the only calculus that conforms to a consistent theory...of inference. This demonstration provides the firm logical basis for asserting that probability calculus is the unique quantitative theory of
A New Mathematical Framework for Design Under Uncertainty
2016-05-05
blending multiple information sources via auto-regressive stochastic modeling. A computationally efficient machine learning framework is developed based on...sion and machine learning approaches; see Fig. 1. This will lead to a comprehensive description of system performance with less uncertainty than in the...Bayesian optimization of super-cavitating hydrofoils The goal of this study is to demonstrate the capabilities of statistical learning and
2015-07-01
undergraduate student coauthors Aashish Jindia, Parag Srivastava, and Jay Jin for help with the research. In addition, thank you to the numerous...A.1.1 Sacramento Data Set; A.1.2 RadMap and SUNS Data Sets...parameters in a joint hypothesis space. We develop scalable branch and bound and pruning mechanisms for searching (at multiple resolutions) over source
Harlin-Cognato, April D; Honeycutt, Rodney L
2006-01-01
Background: Dolphins of the genus Lagenorhynchus are anti-tropically distributed in temperate to cool waters. Phylogenetic analyses of cytochrome b sequences have suggested that the genus is polyphyletic; however, many relationships were poorly resolved. In this study, we present a combined-analysis phylogenetic hypothesis for Lagenorhynchus and members of the subfamily Lissodelphininae, which is derived from two nuclear and two mitochondrial data sets and the addition of 34 individuals representing 9 species. In addition, we characterize with parsimony and Bayesian analyses the phylogenetic utility and interaction of characters with statistical measures, including the utility of highly consistent (non-homoplasious) characters as a conservative measure of phylogenetic robustness. We also explore the effects of removing sources of character conflict on phylogenetic resolution. Results: Overall, our study provides strong support for the monophyly of the subfamily Lissodelphininae and the polyphyly of the genus Lagenorhynchus. In addition, the simultaneous parsimony analysis resolved and/or improved resolution for 12 nodes including: (1) L. albirostris, L. acutus; (2) L. obscurus and L. obliquidens; and (3) L. cruciger and L. australis. In addition, the Bayesian analysis supported the monophyly of the Cephalorhynchus, and resolved ambiguities regarding the relationship of L. australis/L. cruciger to other members of the genus Lagenorhynchus. The frequency of highly consistent characters varied among data partitions, but the rate of evolution was consistent within data partitions. Although the control region was the greatest source of character conflict, removal of this data partition impeded phylogenetic resolution. Conclusion: The simultaneous analysis approach produced a more robust phylogenetic hypothesis for Lagenorhynchus than previous studies, thus supporting a phylogenetic approach employing multiple data partitions that vary in overall rate of evolution. Even in cases where there was apparent conflict among characters, our data suggest a synergistic interaction in the simultaneous analysis, and speak against a priori exclusion of data because of potential conflicts, primarily because phylogenetic results can be less robust. For example, the removal of the control region, the putative source of character conflict, produced spurious results with inconsistencies among and within topologies from parsimony and Bayesian analyses. PMID:17078887
Bayesian Modeling of Exposure and Airflow Using Two-Zone Models
Zhang, Yufen; Banerjee, Sudipto; Yang, Rui; Lungu, Claudiu; Ramachandran, Gurumurthy
2009-01-01
Mathematical modeling is being increasingly used as a means for assessing occupational exposures. However, predicting exposure in real settings is constrained by lack of quantitative knowledge of exposure determinants. Validation of models in occupational settings is, therefore, a challenge. Not only do the model parameters need to be known, the models also need to predict the output with some degree of accuracy. In this paper, a Bayesian statistical framework is used for estimating model parameters and exposure concentrations for a two-zone model. The model predicts concentrations in a zone near the source and far away from the source as functions of the toluene generation rate, air ventilation rate through the chamber, and the airflow between near and far fields. The framework combines prior or expert information on the physical model along with the observed data. The framework is applied to simulated data as well as data obtained from the experiments conducted in a chamber. Toluene vapors are generated from a source under different conditions of airflow direction, the presence of a mannequin, and simulated body heat of the mannequin. The Bayesian framework accounts for uncertainty in measurement as well as in the unknown rate of airflow between the near and far fields. The results show that estimates of the interzonal airflow are always close to the estimated equilibrium solutions, which implies that the method works efficiently. The predictions of near-field concentration for both the simulated and real data show nice concordance with the true values, indicating that the two-zone model assumptions agree with the reality to a large extent and the model is suitable for predicting the contaminant concentration. Comparison of the estimated model and its margin of error with the experimental data thus enables validation of the physical model assumptions. 
The approach illustrates how exposure models and information on model parameters together with the knowledge of uncertainty and variability in these quantities can be used to not only provide better estimates of model outputs but also model parameters. PMID:19403840
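The abstract does not state the model equations, so the following Python sketch assumes the standard well-mixed near-field/far-field mass balance often used for two-zone exposure models: the source emits at rate G into the near field, air is exchanged between zones at rate beta, and the far field is ventilated at rate Q. Under that assumption the steady state is C_far = G/Q and C_near = G/Q + G/beta. All parameter values, units, and function names are hypothetical and serve only to check a forward-Euler simulation against the closed-form steady state.

```python
def simulate_two_zone(G, Q, beta, V_near, V_far, t_end, dt=0.001):
    """Integrate the assumed two-zone mass balance with forward Euler.

    V_near * dC_near/dt = G + beta * (C_far - C_near)
    V_far  * dC_far/dt  = beta * (C_near - C_far) - Q * C_far
    Both zones start at zero concentration.
    """
    c_near = 0.0
    c_far = 0.0
    for _ in range(int(round(t_end / dt))):
        d_near = (G + beta * (c_far - c_near)) / V_near
        d_far = (beta * (c_near - c_far) - Q * c_far) / V_far
        c_near += dt * d_near
        c_far += dt * d_far
    return c_near, c_far


def steady_state(G, Q, beta):
    """Closed-form steady state of the assumed mass balance."""
    c_far = G / Q
    c_near = c_far + G / beta
    return c_near, c_far


# Hypothetical parameters: G in mg/min, Q and beta in m^3/min, volumes in m^3.
params = dict(G=100.0, Q=10.0, beta=5.0)
c_near_sim, c_far_sim = simulate_two_zone(V_near=1.0, V_far=30.0, t_end=60.0, **params)
c_near_ss, c_far_ss = steady_state(**params)

print(f"simulated:    near {c_near_sim:.3f}, far {c_far_sim:.3f} mg/m^3")
print(f"steady state: near {c_near_ss:.3f}, far {c_far_ss:.3f} mg/m^3")
```

In a Bayesian treatment such as the one described, quantities like beta would carry priors and be estimated from measured concentrations; here they are fixed so that the forward model can be inspected on its own.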
Hopkins, Richard S; Cook, Robert L; Striley, Catherine W
2016-01-01
Background: Traditional influenza surveillance relies on influenza-like illness (ILI) syndrome that is reported by health care providers. It primarily captures individuals who seek medical care and misses those who do not. Recently, Web-based data sources have been studied for application to public health surveillance, as there is a growing number of people who search, post, and tweet about their illnesses before seeking medical care. Existing research has shown some promise of using data from Google, Twitter, and Wikipedia to complement traditional surveillance for ILI. However, past studies have evaluated these Web-based sources individually or dually without comparing all 3 of them, and it would be beneficial to know which of the Web-based sources performs best in order to be considered to complement traditional methods. Objective: The objective of this study is to comparatively analyze Google, Twitter, and Wikipedia by examining which best corresponds with Centers for Disease Control and Prevention (CDC) ILI data. It was hypothesized that Wikipedia will best correspond with CDC ILI data as previous research found it to be least influenced by high media coverage in comparison with Google and Twitter. Methods: Publicly available, deidentified data were collected from the CDC, Google Flu Trends, HealthTweets, and Wikipedia for the 2012-2015 influenza seasons. Bayesian change point analysis was used to detect seasonal changes, or change points, in each of the data sources. Change points in Google, Twitter, and Wikipedia that occurred during the exact week, 1 preceding week, or 1 week after the CDC’s change points were compared with the CDC data as the gold standard. All analyses were conducted using the R package “bcp” version 4.0.0 in RStudio version 0.99.484 (RStudio Inc). In addition, sensitivity and positive predictive values (PPV) were calculated for Google, Twitter, and Wikipedia. Results: During the 2012-2015 influenza seasons, a high sensitivity of 92% was found for Google, whereas the PPV for Google was 85%. A low sensitivity of 50% was calculated for Twitter; a low PPV of 43% was found for Twitter also. Wikipedia had the lowest sensitivity of 33% and lowest PPV of 40%. Conclusions: Of the 3 Web-based sources, Google had the best combination of sensitivity and PPV in detecting Bayesian change points in influenza-related data streams. Findings demonstrated that change points in Google, Twitter, and Wikipedia data occasionally aligned well with change points captured in CDC ILI data, yet these sources did not detect all changes in CDC data and should be further studied and developed. PMID:27765731
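The comparison against a gold standard described in this abstract can be sketched in a few lines of Python. The sketch below assumes one plausible counting rule, which may differ from the one used in the study: sensitivity is the share of reference (CDC) change points that have a source change point within ±1 week, and PPV is the share of source change points that fall within ±1 week of a reference change point. The week indices are hypothetical, and the change points themselves are taken as given rather than estimated with the "bcp" package.

```python
def score_change_points(reference, candidate, tolerance=1):
    """Return (sensitivity, ppv) of candidate change points against a reference.

    A pair of change points matches when the weeks differ by at most `tolerance`.
    """
    if not reference or not candidate:
        raise ValueError("both lists of change points must be non-empty")
    reference_hits = sum(
        1 for r in reference if any(abs(r - c) <= tolerance for c in candidate)
    )
    candidate_hits = sum(
        1 for c in candidate if any(abs(c - r) <= tolerance for r in reference)
    )
    sensitivity = reference_hits / len(reference)
    ppv = candidate_hits / len(candidate)
    return sensitivity, ppv


# Hypothetical week indices of detected change points (not data from the study).
cdc_weeks = [5, 12, 30, 44]
web_source_weeks = [6, 12, 20, 45]

sensitivity, ppv = score_change_points(cdc_weeks, web_source_weeks)
print(f"sensitivity = {sensitivity:.2f}, PPV = {ppv:.2f}")
```

With these illustrative inputs, three of the four reference change points are matched and three of the four source change points are confirmed, so both measures equal 0.75.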
Engelhardt, Benjamin; Kschischo, Maik; Fröhlich, Holger
2017-06-01
Ordinary differential equations (ODEs) are a popular approach to quantitatively model molecular networks based on biological knowledge. However, such knowledge is typically restricted. Wrongly modelled biological mechanisms as well as relevant external influence factors that are not included in the model are likely to manifest in major discrepancies between model predictions and experimental data. Finding the exact reasons for such observed discrepancies can be quite challenging in practice. In order to address this issue, we suggest a Bayesian approach to estimate hidden influences in ODE-based models. The method can distinguish between exogenous and endogenous hidden influences. Thus, we can detect wrongly specified as well as missed molecular interactions in the model. We demonstrate the performance of our Bayesian dynamic elastic-net with several ordinary differential equation models from the literature, such as human JAK-STAT signalling, information processing at the erythropoietin receptor, isomerization of liquid α-pinene, G protein cycling in yeast and UV-B triggered signalling in plants. Moreover, we investigate a set of commonly known network motifs and a gene-regulatory network. Altogether our method supports the modeller in an algorithmic manner to identify possible sources of errors in ODE-based models on the basis of experimental data. © 2017 The Author(s).
Robust Bayesian Experimental Design for Conceptual Model Discrimination
NASA Astrophysics Data System (ADS)
Pham, H. V.; Tsai, F. T. C.
2015-12-01
A robust Bayesian optimal experimental design under uncertainty is presented to provide firm information for model discrimination, given the least number of pumping wells and observation wells. Firm information is the maximum information about a system that can be guaranteed from an experimental design. The design is based on the Box-Hill expected entropy decrease (EED) before and after the experiment design and the Bayesian model averaging (BMA) framework. A max-min programming formulation is introduced to choose the robust design that maximizes the minimal Box-Hill EED subject to the constraint that the highest expected posterior model probability satisfies a desired probability threshold. The EED is calculated by Gauss-Hermite quadrature. The BMA method is used to predict future observations and to quantify future observation uncertainty arising from conceptual and parametric uncertainties in calculating EED. A Monte Carlo approach is adopted to quantify the uncertainty in the posterior model probabilities. The optimal experimental design is tested on a synthetic 5-layer anisotropic confined aquifer. Nine conceptual groundwater models are constructed due to uncertain geological architecture and boundary conditions. High-performance computing is used to enumerate all possible design solutions in order to identify the most plausible groundwater model. Results highlight the impacts of scedasticity in future observation data as well as uncertainty sources on potential pumping and observation locations.
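The max-min selection rule described in this abstract can be sketched as a small enumeration in Python. The sketch does not compute the Box-Hill EED or the posterior model probabilities; it assumes those have already been evaluated for each candidate design under several uncertainty scenarios, and it only shows the selection step. The design names, the scenario structure, and all numbers are hypothetical.

```python
def choose_robust_design(designs, prob_threshold):
    """Pick the design with the largest worst-case EED among feasible designs.

    A design is feasible when its highest expected posterior model probability
    is at least prob_threshold. Returns None if no design is feasible.
    """
    feasible = {
        name: info
        for name, info in designs.items()
        if info["max_posterior_prob"] >= prob_threshold
    }
    if not feasible:
        return None
    return max(feasible, key=lambda name: min(feasible[name]["eed_by_scenario"]))


# Hypothetical precomputed values for three candidate well layouts.
candidate_designs = {
    "layout_A": {"eed_by_scenario": [0.80, 0.30, 0.90], "max_posterior_prob": 0.95},
    "layout_B": {"eed_by_scenario": [0.60, 0.50, 0.55], "max_posterior_prob": 0.90},
    "layout_C": {"eed_by_scenario": [0.70, 0.65, 0.70], "max_posterior_prob": 0.70},
}

best = choose_robust_design(candidate_designs, prob_threshold=0.85)
print(f"robust design: {best}")
```

Here layout_C has the best worst-case EED but fails the probability threshold, and layout_A has the highest single EED but a poor worst case, so the rule selects layout_B. This is the sense in which a max-min design trades peak performance for a guaranteed minimum.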