bayesian non-linear large: Topics by Science.gov

Sample records for bayesian non-linear large

Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data.

PubMed

Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger

2017-01-01

Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

PubMed Central

Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

2012-01-01

In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

PubMed

Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

2012-12-01

In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Boosting Bayesian parameter inference of stochastic differential equation models with methods from statistical physics

NASA Astrophysics Data System (ADS)

Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

2016-04-01

Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and always never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to most sophisticated methods that have been developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automatized manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
Past and present cosmic structure in the SDSS DR7 main sample

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jasche, J.; Leclercq, F.; Wandelt, B.D., E-mail: jasche@iap.fr, E-mail: florent.leclercq@polytechnique.org, E-mail: wandelt@iap.fr

2015-01-01

We present a chrono-cosmography project, aiming at the inference of the four dimensional formation history of the observed large scale structure from its origin to the present epoch. To do so, we perform a full-scale Bayesian analysis of the northern galactic cap of the Sloan Digital Sky Survey (SDSS) Data Release 7 main galaxy sample, relying on a fully probabilistic, physical model of the non-linearly evolved density field. Besides inferring initial conditions from observations, our methodology naturally and accurately reconstructs non-linear features at the present epoch, such as walls and filaments, corresponding to high-order correlation functions generated by late-time structuremore » formation. Our inference framework self-consistently accounts for typical observational systematic and statistical uncertainties such as noise, survey geometry and selection effects. We further account for luminosity dependent galaxy biases and automatic noise calibration within a fully Bayesian approach. As a result, this analysis provides highly-detailed and accurate reconstructions of the present density field on scales larger than ∼ 3 Mpc/h, constrained by SDSS observations. This approach also leads to the first quantitative inference of plausible formation histories of the dynamic large scale structure underlying the observed galaxy distribution. The results described in this work constitute the first full Bayesian non-linear analysis of the cosmic large scale structure with the demonstrated capability of uncertainty quantification. Some of these results will be made publicly available along with this work. The level of detail of inferred results and the high degree of control on observational uncertainties pave the path towards high precision chrono-cosmography, the subject of simultaneously studying the dynamics and the morphology of the inhomogeneous Universe.« less
Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

PubMed Central

Zhao, Xin; Cheung, Leo Wang-Kit

2007-01-01

Background Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at genomic level. Although many machine learning methods have been developed and applied to the area of microarray gene expression data analysis, the majority of them are based on linear models, which however are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods usually also bring in false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large. This leads to problems of numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have a couple of critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is always preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potentials to achieve this goal. Results A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences. Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound not only in the case with a linear Bayesian classifier but also in the case with a very non-linear Bayesian classifier. This sheds light on its broader usability to microarray data analysis problems, especially to those that linear methods work awkwardly. The KIGP was also applied to four published microarray datasets, and the results showed that the KIGP performed better than or at least as well as any of the referred state-of-the-art methods did in all of these cases. Conclusion Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates the model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently. PMID:17328811
Bayesian Approach to the Joint Inversion of Gravity and Magnetic Data, with Application to the Ismenius Area of Mars

NASA Technical Reports Server (NTRS)

Jewell, Jeffrey B.; Raymond, C.; Smrekar, S.; Millbury, C.

2004-01-01

This viewgraph presentation reviews a Bayesian approach to the inversion of gravity and magnetic data with specific application to the Ismenius Area of Mars. Many inverse problems encountered in geophysics and planetary science are well known to be non-unique (i.e. inversion of gravity the density structure of a body). In hopes of reducing the non-uniqueness of solutions, there has been interest in the joint analysis of data. An example is the joint inversion of gravity and magnetic data, with the assumption that the same physical anomalies generate both the observed magnetic and gravitational anomalies. In this talk, we formulate the joint analysis of different types of data in a Bayesian framework and apply the formalism to the inference of the density and remanent magnetization structure for a local region in the Ismenius area of Mars. The Bayesian approach allows prior information or constraints in the solutions to be incorporated in the inversion, with the "best" solutions those whose forward predictions most closely match the data while remaining consistent with assumed constraints. The application of this framework to the inversion of gravity and magnetic data on Mars reveals two typical challenges - the forward predictions of the data have a linear dependence on some of the quantities of interest, and non-linear dependence on others (termed the "linear" and "non-linear" variables, respectively). For observations with Gaussian noise, a Bayesian approach to inversion for "linear" variables reduces to a linear filtering problem, with an explicitly computable "error" matrix. However, for models whose forward predictions have non-linear dependencies, inference is no longer given by such a simple linear problem, and moreover, the uncertainty in the solution is no longer completely specified by a computable "error matrix". It is therefore important to develop methods for sampling from the full Bayesian posterior to provide a complete and statistically consistent picture of model uncertainty, and what has been learned from observations. We will discuss advanced numerical techniques, including Monte Carlo Markov
Mixed linear-non-linear inversion of crustal deformation data: Bayesian inference of model, weighting and regularization parameters

NASA Astrophysics Data System (ADS)

Fukuda, Jun'ichi; Johnson, Kaj M.

2010-06-01

We present a unified theoretical framework and solution method for probabilistic, Bayesian inversions of crustal deformation data. The inversions involve multiple data sets with unknown relative weights, model parameters that are related linearly or non-linearly through theoretic models to observations, prior information on model parameters and regularization priors to stabilize underdetermined problems. To efficiently handle non-linear inversions in which some of the model parameters are linearly related to the observations, this method combines both analytical least-squares solutions and a Monte Carlo sampling technique. In this method, model parameters that are linearly and non-linearly related to observations, relative weights of multiple data sets and relative weights of prior information and regularization priors are determined in a unified Bayesian framework. In this paper, we define the mixed linear-non-linear inverse problem, outline the theoretical basis for the method, provide a step-by-step algorithm for the inversion, validate the inversion method using synthetic data and apply the method to two real data sets. We apply the method to inversions of multiple geodetic data sets with unknown relative data weights for interseismic fault slip and locking depth. We also apply the method to the problem of estimating the spatial distribution of coseismic slip on faults with unknown fault geometry, relative data weights and smoothing regularization weight.
Sequential Designs Based on Bayesian Uncertainty Quantification in Sparse Representation Surrogate Modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Ray -Bing; Wang, Weichung; Jeff Wu, C. F.

A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction ofmore » sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. As a result, numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.« less
Sequential Designs Based on Bayesian Uncertainty Quantification in Sparse Representation Surrogate Modeling

DOE PAGES

Chen, Ray -Bing; Wang, Weichung; Jeff Wu, C. F.

2017-04-12

A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction ofmore » sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. As a result, numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.« less
IMNN: Information Maximizing Neural Networks

NASA Astrophysics Data System (ADS)

Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

2018-04-01

This software trains artificial neural networks to find non-linear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). As compressing large data sets vastly simplifies both frequentist and Bayesian inference, important information may be inadvertently missed. Likelihood-free inference based on automatically derived IMNN summaries produces summaries that are good approximations to sufficient statistics. IMNNs are robustly capable of automatically finding optimal, non-linear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima.
Missing-value estimation using linear and non-linear regression with Bayesian gene selection.

PubMed

Zhou, Xiaobo; Wang, Xiaodong; Dougherty, Edward R

2003-11-22

Data from microarray experiments are usually in the form of large matrices of expression levels of genes under different experimental conditions. Owing to various reasons, there are frequently missing values. Estimating these missing values is important because they affect downstream analysis, such as clustering, classification and network design. Several methods of missing-value estimation are in use. The problem has two parts: (1) selection of genes for estimation and (2) design of an estimation rule. We propose Bayesian variable selection to obtain genes to be used for estimation, and employ both linear and nonlinear regression for the estimation rule itself. Fast implementation issues for these methods are discussed, including the use of QR decomposition for parameter estimation. The proposed methods are tested on data sets arising from hereditary breast cancer and small round blue-cell tumors. The results compare very favorably with currently used methods based on the normalized root-mean-square error. The appendix is available from http://gspsnap.tamu.edu/gspweb/zxb/missing_zxb/ (user: gspweb; passwd: gsplab).
Bayesian integration and non-linear feedback control in a full-body motor task.

PubMed

Stevenson, Ian H; Fernandes, Hugo L; Vilares, Iris; Wei, Kunlin; Körding, Konrad P

2009-12-01

A large number of experiments have asked to what degree human reaching movements can be understood as being close to optimal in a statistical sense. However, little is known about whether these principles are relevant for other classes of movements. Here we analyzed movement in a task that is similar to surfing or snowboarding. Human subjects stand on a force plate that measures their center of pressure. This center of pressure affects the acceleration of a cursor that is displayed in a noisy fashion (as a cloud of dots) on a projection screen while the subject is incentivized to keep the cursor close to a fixed position. We find that salient aspects of observed behavior are well-described by optimal control models where a Bayesian estimation model (Kalman filter) is combined with an optimal controller (either a Linear-Quadratic-Regulator or Bang-bang controller). We find evidence that subjects integrate information over time taking into account uncertainty. However, behavior in this continuous steering task appears to be a highly non-linear function of the visual feedback. While the nervous system appears to implement Bayes-like mechanisms for a full-body, dynamic task, it may additionally take into account the specific costs and constraints of the task.
Nonparametric Bayesian Multiple Imputation for Incomplete Categorical Variables in Large-Scale Assessment Surveys

ERIC Educational Resources Information Center

Si, Yajuan; Reiter, Jerome P.

2013-01-01

In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian,…
A Bayesian approach for estimating under-reported dengue incidence with a focus on non-linear associations between climate and dengue in Dhaka, Bangladesh.

PubMed

Sharmin, Sifat; Glass, Kathryn; Viennet, Elvina; Harley, David

2018-04-01

Determining the relation between climate and dengue incidence is challenging due to under-reporting of disease and consequent biased incidence estimates. Non-linear associations between climate and incidence compound this. Here, we introduce a modelling framework to estimate dengue incidence from passive surveillance data while incorporating non-linear climate effects. We estimated the true number of cases per month using a Bayesian generalised linear model, developed in stages to adjust for under-reporting. A semi-parametric thin-plate spline approach was used to quantify non-linear climate effects. The approach was applied to data collected from the national dengue surveillance system of Bangladesh. The model estimated that only 2.8% (95% credible interval 2.7-2.8) of all cases in the capital Dhaka were reported through passive case reporting. The optimal mean monthly temperature for dengue transmission is 29℃ and average monthly rainfall above 15 mm decreases transmission. Our approach provides an estimate of true incidence and an understanding of the effects of temperature and rainfall on dengue transmission in Dhaka, Bangladesh.
Efficient Bayesian mixed model analysis increases association power in large cohorts

PubMed Central

Loh, Po-Ru; Tucker, George; Bulik-Sullivan, Brendan K; Vilhjálmsson, Bjarni J; Finucane, Hilary K; Salem, Rany M; Chasman, Daniel I; Ridker, Paul M; Neale, Benjamin M; Berger, Bonnie; Patterson, Nick; Price, Alkes L

2014-01-01

Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts, and may not optimize power. All existing methods require time cost O(MN2) (where N = #samples and M = #SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here, we present a far more efficient mixed model association method, BOLT-LMM, which requires only a small number of O(MN)-time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to nine quantitative traits in 23,294 samples from the Women’s Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for GWAS in large cohorts. PMID:25642633
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.

PubMed

Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D

2018-05-30

NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
BROCCOLI: Software for fast fMRI analysis on many-core CPUs and GPUs

PubMed Central

Eklund, Anders; Dufort, Paul; Villani, Mattias; LaConte, Stephen

2014-01-01

Analysis of functional magnetic resonance imaging (fMRI) data is becoming ever more computationally demanding as temporal and spatial resolutions improve, and large, publicly available data sets proliferate. Moreover, methodological improvements in the neuroimaging pipeline, such as non-linear spatial normalization, non-parametric permutation tests and Bayesian Markov Chain Monte Carlo approaches, can dramatically increase the computational burden. Despite these challenges, there do not yet exist any fMRI software packages which leverage inexpensive and powerful graphics processing units (GPUs) to perform these analyses. Here, we therefore present BROCCOLI, a free software package written in OpenCL (Open Computing Language) that can be used for parallel analysis of fMRI data on a large variety of hardware configurations. BROCCOLI has, for example, been tested with an Intel CPU, an Nvidia GPU, and an AMD GPU. These tests show that parallel processing of fMRI data can lead to significantly faster analysis pipelines. This speedup can be achieved on relatively standard hardware, but further, dramatic speed improvements require only a modest investment in GPU hardware. BROCCOLI (running on a GPU) can perform non-linear spatial normalization to a 1 mm3 brain template in 4–6 s, and run a second level permutation test with 10,000 permutations in about a minute. These non-parametric tests are generally more robust than their parametric counterparts, and can also enable more sophisticated analyses by estimating complicated null distributions. Additionally, BROCCOLI includes support for Bayesian first-level fMRI analysis using a Gibbs sampler. The new software is freely available under GNU GPL3 and can be downloaded from github (https://github.com/wanderine/BROCCOLI/). PMID:24672471
Identification of transmissivity fields using a Bayesian strategy and perturbative approach

NASA Astrophysics Data System (ADS)

Zanini, Andrea; Tanda, Maria Giovanna; Woodbury, Allan D.

2017-10-01

The paper deals with the crucial problem of the groundwater parameter estimation that is the basis for efficient modeling and reclamation activities. A hierarchical Bayesian approach is developed: it uses the Akaike's Bayesian Information Criteria in order to estimate the hyperparameters (related to the covariance model chosen) and to quantify the unknown noise variance. The transmissivity identification proceeds in two steps: the first, called empirical Bayesian interpolation, uses Y* (Y = lnT) observations to interpolate Y values on a specified grid; the second, called empirical Bayesian update, improve the previous Y estimate through the addition of hydraulic head observations. The relationship between the head and the lnT has been linearized through a perturbative solution of the flow equation. In order to test the proposed approach, synthetic aquifers from literature have been considered. The aquifers in question contain a variety of boundary conditions (both Dirichelet and Neuman type) and scales of heterogeneities (σY2 = 1.0 and σY2 = 5.3). The estimated transmissivity fields were compared to the true one. The joint use of Y* and head measurements improves the estimation of Y considering both degrees of heterogeneity. Even if the variance of the strong transmissivity field can be considered high for the application of the perturbative approach, the results show the same order of approximation of the non-linear methods proposed in literature. The procedure allows to compute the posterior probability distribution of the target quantities and to quantify the uncertainty in the model prediction. Bayesian updating has advantages related both to the Monte-Carlo (MC) and non-MC approaches. In fact, as the MC methods, Bayesian updating allows computing the direct posterior probability distribution of the target quantities and as non-MC methods it has computational times in the order of seconds.
Bayesian analysis of the dynamic cosmic web in the SDSS galaxy survey

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leclercq, Florent; Wandelt, Benjamin; Jasche, Jens, E-mail: florent.leclercq@polytechnique.org, E-mail: jasche@iap.fr, E-mail: wandelt@iap.fr

Recent application of the Bayesian algorithm \\textsc(borg) to the Sloan Digital Sky Survey (SDSS) main sample galaxies resulted in the physical inference of the formation history of the observed large-scale structure from its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the cosmic web into four distinct components (voids, sheets, filaments, and clusters) on the basis of themore » tidal field. Our inference framework automatically and self-consistently propagates typical observational uncertainties to web-type classification. As a result, this study produces accurate cosmographic classification of large-scale structure elements in the SDSS volume. By also providing the history of these structure maps, the approach allows an analysis of the origin and growth of the early traces of the cosmic web present in the initial density field and of the evolution of global quantities such as the volume and mass filling fractions of different structures. For the problem of web-type classification, the results described in this work constitute the first connection between theory and observations at non-linear scales including a physical model of structure formation and the demonstrated capability of uncertainty quantification. A connection between cosmology and information theory using real data also naturally emerges from our probabilistic approach. Our results constitute quantitative chrono-cosmography of the complex web-like patterns underlying the observed galaxy distribution.« less

Almost but not quite 2D, Non-linear Bayesian Inversion of CSEM Data

NASA Astrophysics Data System (ADS)

Ray, A.; Key, K.; Bodin, T.

2013-12-01

The geophysical inverse problem can be elegantly stated in a Bayesian framework where a probability distribution can be viewed as a statement of information regarding a random variable. After all, the goal of geophysical inversion is to provide information on the random variables of interest - physical properties of the earth's subsurface. However, though it may be simple to postulate, a practical difficulty of fully non-linear Bayesian inversion is the computer time required to adequately sample the model space and extract the information we seek. As a consequence, in geophysical problems where evaluation of a full 2D/3D forward model is computationally expensive, such as marine controlled source electromagnetic (CSEM) mapping of the resistivity of seafloor oil and gas reservoirs, Bayesian studies have largely been conducted with 1D forward models. While the 1D approximation is indeed appropriate for exploration targets with planar geometry and geological stratification, it only provides a limited, site-specific idea of uncertainty in resistivity with depth. In this work, we extend our fully non-linear 1D Bayesian inversion to a 2D model framework, without requiring the usual regularization of model resistivities in the horizontal or vertical directions used to stabilize quasi-2D inversions. In our approach, we use the reversible jump Markov-chain Monte-Carlo (RJ-MCMC) or trans-dimensional method and parameterize the subsurface in a 2D plane with Voronoi cells. The method is trans-dimensional in that the number of cells required to parameterize the subsurface is variable, and the cells dynamically move around and multiply or combine as demanded by the data being inverted. This approach allows us to expand our uncertainty analysis of resistivity at depth to more than a single site location, allowing for interactions between model resistivities at different horizontal locations along a traverse over an exploration target. While the model is parameterized in 2D, we efficiently evaluate the forward response using 1D profiles extracted from the model at the common-midpoints of the EM source-receiver pairs. Since the 1D approximation is locally valid at different midpoint locations, the computation time is far lower than is required by a full 2D or 3D simulation. We have applied this method to both synthetic and real CSEM survey data from the Scarborough gas field on the Northwest shelf of Australia, resulting in a spatially variable quantification of resistivity and its uncertainty in 2D. This Bayesian approach results in a large database of 2D models that comprise a posterior probability distribution, which we can subset to test various hypotheses about the range of model structures compatible with the data. For example, we can subset the model distributions to examine the hypothesis that a resistive reservoir extends overs a certain spatial extent. Depending on how this conditions other parts of the model space, light can be shed on the geological viability of the hypothesis. Since tackling spatially variable uncertainty and trade-offs in 2D and 3D is a challenging research problem, the insights gained from this work may prove valuable for subsequent full 2D and 3D Bayesian inversions.
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.

PubMed

Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa

2017-01-01

TB is rated as one of the world's deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014.
SOMBI: Bayesian identification of parameter relations in unstructured cosmological data

NASA Astrophysics Data System (ADS)

Frank, Philipp; Jasche, Jens; Enßlin, Torsten A.

2016-11-01

This work describes the implementation and application of a correlation determination method based on self organizing maps and Bayesian inference (SOMBI). SOMBI aims to automatically identify relations between different observed parameters in unstructured cosmological or astrophysical surveys by automatically identifying data clusters in high-dimensional datasets via the self organizing map neural network algorithm. Parameter relations are then revealed by means of a Bayesian inference within respective identified data clusters. Specifically such relations are assumed to be parametrized as a polynomial of unknown order. The Bayesian approach results in a posterior probability distribution function for respective polynomial coefficients. To decide which polynomial order suffices to describe correlation structures in data, we include a method for model selection, the Bayesian information criterion, to the analysis. The performance of the SOMBI algorithm is tested with mock data. As illustration we also provide applications of our method to cosmological data. In particular, we present results of a correlation analysis between galaxy and active galactic nucleus (AGN) properties provided by the SDSS catalog with the cosmic large-scale-structure (LSS). The results indicate that the combined galaxy and LSS dataset indeed is clustered into several sub-samples of data with different average properties (for example different stellar masses or web-type classifications). The majority of data clusters appear to have a similar correlation structure between galaxy properties and the LSS. In particular we revealed a positive and linear dependency between the stellar mass, the absolute magnitude and the color of a galaxy with the corresponding cosmic density field. A remaining subset of data shows inverted correlations, which might be an artifact of non-linear redshift distortions.
Bayesian analysis of non-linear differential equation models with application to a gut microbial ecosystem.

PubMed

Lawson, Daniel J; Holtrop, Grietje; Flint, Harry

2011-07-01

Process models specified by non-linear dynamic differential equations contain many parameters, which often must be inferred from a limited amount of data. We discuss a hierarchical Bayesian approach combining data from multiple related experiments in a meaningful way, which permits more powerful inference than treating each experiment as independent. The approach is illustrated with a simulation study and example data from experiments replicating the aspects of the human gut microbial ecosystem. A predictive model is obtained that contains prediction uncertainty caused by uncertainty in the parameters, and we extend the model to capture situations of interest that cannot easily be studied experimentally. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Trans-dimensional Bayesian inversion of airborne electromagnetic data for 2D conductivity profiles

NASA Astrophysics Data System (ADS)

Hawkins, Rhys; Brodie, Ross C.; Sambridge, Malcolm

2018-02-01

This paper presents the application of a novel trans-dimensional sampling approach to a time domain airborne electromagnetic (AEM) inverse problem to solve for plausible conductivities of the subsurface. Geophysical inverse field problems, such as time domain AEM, are well known to have a large degree of non-uniqueness. Common least-squares optimisation approaches fail to take this into account and provide a single solution with linearised estimates of uncertainty that can result in overly optimistic appraisal of the conductivity of the subsurface. In this new non-linear approach, the spatial complexity of a 2D profile is controlled directly by the data. By examining an ensemble of proposed conductivity profiles it accommodates non-uniqueness and provides more robust estimates of uncertainties.
Revisiting Isotherm Analyses Using R: Comparison of Linear, Non-linear, and Bayesian Techniques

EPA Science Inventory

Extensive adsorption isotherm data exist for an array of chemicals of concern on a variety of engineered and natural sorbents. Several isotherm models exist that can accurately describe these data from which the resultant fitting parameters may subsequently be used in numerical ...
A comparative study of minimum norm inverse methods for MEG imaging

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leahy, R.M.; Mosher, J.C.; Phillips, J.W.

1996-07-01

The majority of MEG imaging techniques currently in use fall into the general class of (weighted) minimum norm methods. The minimization of a norm is used as the basis for choosing one from a generally infinite set of solutions that provide an equally good fit to the data. This ambiguity in the solution arises from the inherent non- uniqueness of the continuous inverse problem and is compounded by the imbalance between the relatively small number of measurements and the large number of source voxels. Here we present a unified view of the minimum norm methods and describe how we canmore » use Tikhonov regularization to avoid instabilities in the solutions due to noise. We then compare the performance of regularized versions of three well known linear minimum norm methods with the non-linear iteratively reweighted minimum norm method and a Bayesian approach.« less
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors

PubMed Central

Woldegerima, Woldegebriel Assefa

2017-01-01

TB is rated as one of the world’s deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014. PMID:28257437
An efficient method for model refinement in diffuse optical tomography

NASA Astrophysics Data System (ADS)

Zirak, A. R.; Khademi, M.

2007-11-01

Diffuse optical tomography (DOT) is a non-linear, ill-posed, boundary value and optimization problem which necessitates regularization. Also, Bayesian methods are suitable owing to measurements data are sparse and correlated. In such problems which are solved with iterative methods, for stabilization and better convergence, the solution space must be small. These constraints subject to extensive and overdetermined system of equations which model retrieving criteria specially total least squares (TLS) must to refine model error. Using TLS is limited to linear systems which is not achievable when applying traditional Bayesian methods. This paper presents an efficient method for model refinement using regularized total least squares (RTLS) for treating on linearized DOT problem, having maximum a posteriori (MAP) estimator and Tikhonov regulator. This is done with combination Bayesian and regularization tools as preconditioner matrices, applying them to equations and then using RTLS to the resulting linear equations. The preconditioning matrixes are guided by patient specific information as well as a priori knowledge gained from the training set. Simulation results illustrate that proposed method improves the image reconstruction performance and localize the abnormally well.
A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

NASA Astrophysics Data System (ADS)

Khawaja, Taimoor Saleem

A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data

PubMed Central

Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.

2013-01-01

Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
Learning oncogenetic networks by reducing to mixed integer linear programming.

PubMed

Shahrabi Farahani, Hossein; Lagergren, Jens

2013-01-01

Cancer can be a result of accumulation of different types of genetic mutations such as copy number aberrations. The data from tumors are cross-sectional and do not contain the temporal order of the genetic events. Finding the order in which the genetic events have occurred and progression pathways are of vital importance in understanding the disease. In order to model cancer progression, we propose Progression Networks, a special case of Bayesian networks, that are tailored to model disease progression. Progression networks have similarities with Conjunctive Bayesian Networks (CBNs) [1],a variation of Bayesian networks also proposed for modeling disease progression. We also describe a learning algorithm for learning Bayesian networks in general and progression networks in particular. We reduce the hard problem of learning the Bayesian and progression networks to Mixed Integer Linear Programming (MILP). MILP is a Non-deterministic Polynomial-time complete (NP-complete) problem for which very good heuristics exists. We tested our algorithm on synthetic and real cytogenetic data from renal cell carcinoma. We also compared our learned progression networks with the networks proposed in earlier publications. The software is available on the website https://bitbucket.org/farahani/diprog.
Mapping brucellosis increases relative to elk density using hierarchical Bayesian models

USGS Publications Warehouse

Cross, Paul C.; Heisey, Dennis M.; Scurlock, Brandon M.; Edwards, William H.; Brennan, Angela; Ebinger, Michael R.

2010-01-01

The relationship between host density and parasite transmission is central to the effectiveness of many disease management strategies. Few studies, however, have empirically estimated this relationship particularly in large mammals. We applied hierarchical Bayesian methods to a 19-year dataset of over 6400 brucellosis tests of adult female elk (Cervus elaphus) in northwestern Wyoming. Management captures that occurred from January to March were over two times more likely to be seropositive than hunted elk that were killed in September to December, while accounting for site and year effects. Areas with supplemental feeding grounds for elk had higher seroprevalence in 1991 than other regions, but by 2009 many areas distant from the feeding grounds were of comparable seroprevalence. The increases in brucellosis seroprevalence were correlated with elk densities at the elk management unit, or hunt area, scale (mean 2070 km2; range = [95–10237]). The data, however, could not differentiate among linear and non-linear effects of host density. Therefore, control efforts that focus on reducing elk densities at a broad spatial scale were only weakly supported. Additional research on how a few, large groups within a region may be driving disease dynamics is needed for more targeted and effective management interventions. Brucellosis appears to be expanding its range into new regions and elk populations, which is likely to further complicate the United States brucellosis eradication program. This study is an example of how the dynamics of host populations can affect their ability to serve as disease reservoirs.
A Bayesian account of quantum histories

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marlow, Thomas

2006-05-15

We investigate whether quantum history theories can be consistent with Bayesian reasoning and whether such an analysis helps clarify the interpretation of such theories. First, we summarise and extend recent work categorising two different approaches to formalising multi-time measurements in quantum theory. The standard approach consists of describing an ordered series of measurements in terms of history propositions with non-additive 'probabilities.' The non-standard approach consists of defining multi-time measurements to consist of sets of exclusive and exhaustive history propositions and recovering the single-time exclusivity of results when discussing single-time history propositions. We analyse whether such history propositions can be consistentmore » with Bayes' rule. We show that certain class of histories are given a natural Bayesian interpretation, namely, the linearly positive histories originally introduced by Goldstein and Page. Thus, we argue that this gives a certain amount of interpretational clarity to the non-standard approach. We also attempt a justification of our analysis using Cox's axioms of probability theory.« less
Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data.

PubMed

Kärkkäinen, Hanni P; Sillanpää, Mikko J

2013-09-04

Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
Fast Genomic Predictions via Bayesian G-BLUP and Multilocus Models of Threshold Traits Including Censored Gaussian Data

PubMed Central

Kärkkäinen, Hanni P.; Sillanpää, Mikko J.

2013-01-01

Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
A Non-linear Geodetic Data Inversion Using ABIC for Slip Distribution on a Fault With an Unknown dip Angle

NASA Astrophysics Data System (ADS)

Fukahata, Y.; Wright, T. J.

2006-12-01

We developed a method of geodetic data inversion for slip distribution on a fault with an unknown dip angle. When fault geometry is unknown, the problem of geodetic data inversion is non-linear. A common strategy for obtaining slip distribution is to first determine the fault geometry by minimizing the square misfit under the assumption of a uniform slip on a rectangular fault, and then apply the usual linear inversion technique to estimate a slip distribution on the determined fault. It is not guaranteed, however, that the fault determined under the assumption of a uniform slip gives the best fault geometry for a spatially variable slip distribution. In addition, in obtaining a uniform slip fault model, we have to simultaneously determine the values of the nine mutually dependent parameters, which is a highly non-linear, complicated process. Although the inverse problem is non-linear for cases with unknown fault geometries, the non-linearity of the problems is actually weak, when we can assume the fault surface to be flat. In particular, when a clear fault trace is observed on the EarthOs surface after an earthquake, we can precisely estimate the strike and the location of the fault. In this case only the dip angle has large ambiguity. In geodetic data inversion we usually need to introduce smoothness constraints in order to compromise reciprocal requirements for model resolution and estimation errors in a natural way. Strictly speaking, the inverse problem with smoothness constraints is also non-linear, even if the fault geometry is known. The non-linearity has been dissolved by introducing AkaikeOs Bayesian Information Criterion (ABIC), with which the optimal value of the relative weight of observed data to smoothness constraints is objectively determined. In this study, using ABIC in determining the optimal dip angle, we dissolved the non-linearity of the inverse problem. We applied the method to the InSAR data of the 1995 Dinar, Turkey earthquake and obtained a much shallower dip angle than before.
Bayesian Models for Astrophysical Data Using R, JAGS, Python, and Stan

NASA Astrophysics Data System (ADS)

Hilbe, Joseph M.; de Souza, Rafael S.; Ishida, Emille E. O.

2017-05-01

This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.
An introduction to Bayesian statistics in health psychology.

PubMed

Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske

2017-09-01

The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.
Bayesian dynamical systems modelling in the social sciences.

PubMed

Ranganathan, Shyam; Spaiser, Viktoria; Mann, Richard P; Sumpter, David J T

2014-01-01

Data arising from social systems is often highly complex, involving non-linear relationships between the macro-level variables that characterize these systems. We present a method for analyzing this type of longitudinal or panel data using differential equations. We identify the best non-linear functions that capture interactions between variables, employing Bayes factor to decide how many interaction terms should be included in the model. This method punishes overly complicated models and identifies models with the most explanatory power. We illustrate our approach on the classic example of relating democracy and economic growth, identifying non-linear relationships between these two variables. We show how multiple variables and variable lags can be accounted for and provide a toolbox in R to implement our approach.

Parameter and Structure Inference for Nonlinear Dynamical Systems

NASA Technical Reports Server (NTRS)

Morris, Robin D.; Smelyanskiy, Vadim N.; Millonas, Mark

2006-01-01

A great many systems can be modeled in the non-linear dynamical systems framework, as x = f(x) + xi(t), where f() is the potential function for the system, and xi is the excitation noise. Modeling the potential using a set of basis functions, we derive the posterior for the basis coefficients. A more challenging problem is to determine the set of basis functions that are required to model a particular system. We show that using the Bayesian Information Criteria (BIC) to rank models, and the beam search technique, that we can accurately determine the structure of simple non-linear dynamical system models, and the structure of the coupling between non-linear dynamical systems where the individual systems are known. This last case has important ecological applications.
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

EPA Science Inventory

We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
More green space is related to less antidepressant prescription rates in the Netherlands: A Bayesian geoadditive quantile regression approach.

PubMed

Helbich, Marco; Klein, Nadja; Roberts, Hannah; Hagedoorn, Paulien; Groenewegen, Peter P

2018-06-20

Exposure to green space seems to be beneficial for self-reported mental health. In this study we used an objective health indicator, namely antidepressant prescription rates. Current studies rely exclusively upon mean regression models assuming linear associations. It is, however, plausible that the presence of green space is non-linearly related with different quantiles of the outcome antidepressant prescription rates. These restrictions may contribute to inconsistent findings. Our aim was: a) to assess antidepressant prescription rates in relation to green space, and b) to analyze how the relationship varies non-linearly across different quantiles of antidepressant prescription rates. We used cross-sectional data for the year 2014 at a municipality level in the Netherlands. Ecological Bayesian geoadditive quantile regressions were fitted for the 15%, 50%, and 85% quantiles to estimate green space-prescription rate correlations, controlling for physical activity levels, socio-demographics, urbanicity, etc. RESULTS: The results suggested that green space was overall inversely and non-linearly associated with antidepressant prescription rates. More important, the associations differed across the quantiles, although the variation was modest. Significant non-linearities were apparent: The associations were slightly positive in the lower quantile and strongly negative in the upper one. Our findings imply that an increased availability of green space within a municipality may contribute to a reduction in the number of antidepressant prescriptions dispensed. Green space is thus a central health and community asset, whilst a minimum level of 28% needs to be established for health gains. The highest effectiveness occurred at a municipality surface percentage higher than 79%. This inverse dose-dependent relation has important implications for setting future community-level health and planning policies. Copyright © 2018 Elsevier Inc. All rights reserved.
The Healthy Worker Effect and Nuclear Industry Workers

PubMed Central

Fornalski, Krzysztof W.; Dobrzyński, Ludwik

2010-01-01

The linear no-threshold (LNT) dose-effect relationship has been consistently used by most radiation epidemiologists to estimate cancer mortality risk. The large scattering of data by International Agency for Research on Cancer, IARC (Vrijheid et al. 2007; Therry-Chef et al. 2007; Cardis et al. 2007), interpreted in accordance with LNT, has been previously demonstrated (Fornalski and Dobrzyński 2009). Using conventional and Bayesian methods the present paper demonstrates that the standard mortality ratios (SMRs), lower in the IARC cohort of exposed nuclear workers than in the non exposed group, should be considered as a hormetic effect, rather than a healthy worker effect (HWE) as claimed by the IARC group. PMID:20585442
Pretense, Counterfactuals, and Bayesian Causal Models: Why What Is Not Real Really Matters

ERIC Educational Resources Information Center

Weisberg, Deena S.; Gopnik, Alison

2013-01-01

Young children spend a large portion of their time pretending about non-real situations. Why? We answer this question by using the framework of Bayesian causal models to argue that pretending and counterfactual reasoning engage the same component cognitive abilities: disengaging with current reality, making inferences about an alternative…
A flexible Bayesian assessment for the expected impact of data on prediction confidence for optimal sampling designs

NASA Astrophysics Data System (ADS)

Leube, Philipp; Geiges, Andreas; Nowak, Wolfgang

2010-05-01

Incorporating hydrogeological data, such as head and tracer data, into stochastic models of subsurface flow and transport helps to reduce prediction uncertainty. Considering limited financial resources available for the data acquisition campaign, information needs towards the prediction goal should be satisfied in a efficient and task-specific manner. For finding the best one among a set of design candidates, an objective function is commonly evaluated, which measures the expected impact of data on prediction confidence, prior to their collection. An appropriate approach to this task should be stochastically rigorous, master non-linear dependencies between data, parameters and model predictions, and allow for a wide variety of different data types. Existing methods fail to fulfill all these requirements simultaneously. For this reason, we introduce a new method, denoted as CLUE (Cross-bred Likelihood Uncertainty Estimator), that derives the essential distributions and measures of data utility within a generalized, flexible and accurate framework. The method makes use of Bayesian GLUE (Generalized Likelihood Uncertainty Estimator) and extends it to an optimal design method by marginalizing over the yet unknown data values. Operating in a purely Bayesian Monte-Carlo framework, CLUE is a strictly formal information processing scheme free of linearizations. It provides full flexibility associated with the type of measurements (linear, non-linear, direct, indirect) and accounts for almost arbitrary sources of uncertainty (e.g. heterogeneity, geostatistical assumptions, boundary conditions, model concepts) via stochastic simulation and Bayesian model averaging. This helps to minimize the strength and impact of possible subjective prior assumptions, that would be hard to defend prior to data collection. Our study focuses on evaluating two different uncertainty measures: (i) expected conditional variance and (ii) expected relative entropy of a given prediction goal. The applicability and advantages are shown in a synthetic example. Therefor, we consider a contaminant source, posing a threat on a drinking water well in an aquifer. Furthermore, we assume uncertainty in geostatistical parameters, boundary conditions and hydraulic gradient. The two mentioned measures evaluate the sensitivity of (1) general prediction confidence and (2) exceedance probability of a legal regulatory threshold value on sampling locations.
Non-stationary hydrologic frequency analysis using B-spline quantile regression

NASA Astrophysics Data System (ADS)

Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.

2017-11-01

Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information on planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, it is possible that the assumption of stationarity, which is prerequisite for traditional frequency analysis and hence, the results of conventional analysis would become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-Spline quantile regression which allows to model data in the presence of non-stationarity and/or dependence on covariates with linear and non-linear dependence. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and Bayesian information criterion (BIC) for quantile regression are used in order to select the best model, i.e. for each quantile, we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for an annual maximum and minimum discharge with high annual non-exceedance probabilities.
Non-Linear Modeling of Growth Prerequisites in a Finnish Polytechnic Institution of Higher Education

ERIC Educational Resources Information Center

Nokelainen, Petri; Ruohotie, Pekka

2009-01-01

Purpose: This study aims to examine the factors of growth-oriented atmosphere in a Finnish polytechnic institution of higher education with categorical exploratory factor analysis, multidimensional scaling and Bayesian unsupervised model-based visualization. Design/methodology/approach: This study was designed to examine employee perceptions of…
Enhancements of Bayesian Blocks; Application to Large Light Curve Databases

NASA Technical Reports Server (NTRS)

Scargle, Jeff

2015-01-01

Bayesian Blocks are optimal piecewise linear representations (step function fits) of light-curves. The simple algorithm implementing this idea, using dynamic programming, has been extended to include more data modes and fitness metrics, multivariate analysis, and data on the circle (Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations, Scargle, Norris, Jackson and Chiang 2013, ApJ, 764, 167), as well as new results on background subtraction and refinement of the procedure for precise timing of transient events in sparse data. Example demonstrations will include exploratory analysis of the Kepler light curve archive in a search for "star-tickling" signals from extraterrestrial civilizations. (The Cepheid Galactic Internet, Learned, Kudritzki, Pakvasa1, and Zee, 2008, arXiv: 0809.0339; Walkowicz et al., in progress).
Rigorous Approach in Investigation of Seismic Structure and Source Characteristicsin Northeast Asia: Hierarchical and Trans-dimensional Bayesian Inversion

NASA Astrophysics Data System (ADS)

Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.

2015-12-01

Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.
Pathway analysis of high-throughput biological data within a Bayesian network framework.

PubMed

Isci, Senol; Ozturk, Cengizhan; Jones, Jon; Otu, Hasan H

2011-06-15

Most current approaches to high-throughput biological data (HTBD) analysis either perform individual gene/protein analysis or, gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Proposed method takes into account the connectivity and relatedness between nodes of the pathway through factoring pathway topology in its model. Our simulations using synthetic data demonstrated robustness of our approach. We tested proposed method, Bayesian Pathway Analysis (BPA), on human microarray data regarding renal cell carcinoma (RCC) and compared our results with gene set enrichment analysis. BPA was able to find broader and more specific pathways related to RCC. Accompanying BPA software (BPAS) package is freely available for academic use at http://bumil.boun.edu.tr/bpa.
Sparsity-promoting and edge-preserving maximum a posteriori estimators in non-parametric Bayesian inverse problems

NASA Astrophysics Data System (ADS)

Agapiou, Sergios; Burger, Martin; Dashti, Masoumeh; Helin, Tapio

2018-04-01

We consider the inverse problem of recovering an unknown functional parameter u in a separable Banach space, from a noisy observation vector y of its image through a known possibly non-linear map {{\\mathcal G}} . We adopt a Bayesian approach to the problem and consider Besov space priors (see Lassas et al (2009 Inverse Problems Imaging 3 87-122)), which are well-known for their edge-preserving and sparsity-promoting properties and have recently attracted wide attention especially in the medical imaging community. Our key result is to show that in this non-parametric setup the maximum a posteriori (MAP) estimates are characterized by the minimizers of a generalized Onsager-Machlup functional of the posterior. This is done independently for the so-called weak and strong MAP estimates, which as we show coincide in our context. In addition, we prove a form of weak consistency for the MAP estimators in the infinitely informative data limit. Our results are remarkable for two reasons: first, the prior distribution is non-Gaussian and does not meet the smoothness conditions required in previous research on non-parametric MAP estimates. Second, the result analytically justifies existing uses of the MAP estimate in finite but high dimensional discretizations of Bayesian inverse problems with the considered Besov priors.
Bayesian selection of misspecified models is overconfident and may cause spurious posterior probabilities for phylogenetic trees.

PubMed

Yang, Ziheng; Zhu, Tianqi

2018-02-20

The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
Stochastic static fault slip inversion from geodetic data with non-negativity and bound constraints

NASA Astrophysics Data System (ADS)

Nocquet, J.-M.

2018-07-01

Despite surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains a ill-posed inverse problem. A widely used approach to circumvent the illness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach of inverse problems provides a rigorous framework where the a priori information about the searched parameters is combined with the observations in order to derive posterior probabilities of the unkown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip or double truncation to impose positivity and upper bounds on slip for interseismic modelling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a truncated multivariate normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulae for the single, 2-D or n-D marginal pdf. The semi-analytical formula involves the product of a Gaussian by an integral term that can be evaluated using recent developments in TMVN probabilities calculations. Posterior mean and covariance can also be efficiently derived. I show that the maximum posterior (MAP) can be obtained using a non-negative least-squares algorithm for the single truncated case or using the bounded-variable least-squares algorithm for the double truncated case. I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Monte Carlo Markov chain (MCMC) sampling is shown for a synthetic example and a real case for interseismic modelling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need of computer power is largely reduced. Second, unlike Bayesian MCMC-based approach, marginal pdf, mean, variance or covariance are obtained independently one from each other. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the MAP is extremely fast.
Bayesian evidence computation for model selection in non-linear geoacoustic inference problems.

PubMed

Dettmer, Jan; Dosso, Stan E; Osler, John C

2010-12-01

This paper applies a general Bayesian inference approach, based on Bayesian evidence computation, to geoacoustic inversion of interface-wave dispersion data. Quantitative model selection is carried out by computing the evidence (normalizing constants) for several model parameterizations using annealed importance sampling. The resulting posterior probability density estimate is compared to estimates obtained from Metropolis-Hastings sampling to ensure consistent results. The approach is applied to invert interface-wave dispersion data collected on the Scotian Shelf, off the east coast of Canada for the sediment shear-wave velocity profile. Results are consistent with previous work on these data but extend the analysis to a rigorous approach including model selection and uncertainty analysis. The results are also consistent with core samples and seismic reflection measurements carried out in the area.
A Sparse Bayesian Learning Algorithm for White Matter Parameter Estimation from Compressed Multi-shell Diffusion MRI.

PubMed

Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe

2017-09-01

We propose a sparse Bayesian learning algorithm for improved estimation of white matter fiber parameters from compressed (under-sampled q-space) multi-shell diffusion MRI data. The multi-shell data is represented in a dictionary form using a non-monoexponential decay model of diffusion, based on continuous gamma distribution of diffusivities. The fiber volume fractions with predefined orientations, which are the unknown parameters, form the dictionary weights. These unknown parameters are estimated with a linear un-mixing framework, using a sparse Bayesian learning algorithm. A localized learning of hyperparameters at each voxel and for each possible fiber orientations improves the parameter estimation. Our experiments using synthetic data from the ISBI 2012 HARDI reconstruction challenge and in-vivo data from the Human Connectome Project demonstrate the improvements.
Bayesian whole-genome prediction and genome-wide association analysis with missing genotypes using variable selection

USDA-ARS?s Scientific Manuscript database

Single-step Genomic Best Linear Unbiased Predictor (ssGBLUP) has become increasingly popular for whole-genome prediction (WGP) modeling as it utilizes any available pedigree and phenotypes on both genotyped and non-genotyped individuals. The WGP accuracy of ssGBLUP has been demonstrated to be greate...
A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.

PubMed

Houseman, E Andres; Virji, M Abbas

2017-08-01

Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the auto-regressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some of the contemporary methods. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
Bayesian estimation of self-similarity exponent

NASA Astrophysics Data System (ADS)

Makarava, Natallia; Benmehdi, Sabah; Holschneider, Matthias

2011-08-01

In this study we propose a Bayesian approach to the estimation of the Hurst exponent in terms of linear mixed models. Even for unevenly sampled signals and signals with gaps, our method is applicable. We test our method by using artificial fractional Brownian motion of different length and compare it with the detrended fluctuation analysis technique. The estimation of the Hurst exponent of a Rosenblatt process is shown as an example of an H-self-similar process with non-Gaussian dimensional distribution. Additionally, we perform an analysis with real data, the Dow-Jones Industrial Average closing values, and analyze its temporal variation of the Hurst exponent.
Bayesian just-so stories in psychology and neuroscience.

PubMed

Bowers, Jeffrey S; Davis, Colin J

2012-05-01

According to Bayesian theories in psychology and neuroscience, minds and brains are (near) optimal in solving a wide range of tasks. We challenge this view and argue that more traditional, non-Bayesian approaches are more promising. We make 3 main arguments. First, we show that the empirical evidence for Bayesian theories in psychology is weak. This weakness relates to the many arbitrary ways that priors, likelihoods, and utility functions can be altered in order to account for the data that are obtained, making the models unfalsifiable. It further relates to the fact that Bayesian theories are rarely better at predicting data compared with alternative (and simpler) non-Bayesian theories. Second, we show that the empirical evidence for Bayesian theories in neuroscience is weaker still. There are impressive mathematical analyses showing how populations of neurons could compute in a Bayesian manner but little or no evidence that they do. Third, we challenge the general scientific approach that characterizes Bayesian theorizing in cognitive science. A common premise is that theories in psychology should largely be constrained by a rational analysis of what the mind ought to do. We question this claim and argue that many of the important constraints come from biological, evolutionary, and processing (algorithmic) considerations that have no adaptive relevance to the problem per se. In our view, these factors have contributed to the development of many Bayesian "just so" stories in psychology and neuroscience; that is, mathematical analyses of cognition that can be used to explain almost any behavior as optimal. 2012 APA, all rights reserved.

On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang

2015-02-01

The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of linear model of Coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesianmore » inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and multiphase flow computer experiment in a full scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.« less
Protein construct storage: Bayesian variable selection and prediction with mixtures.

PubMed

Clyde, M A; Parmigiani, G

1998-07-01

Determining optimal conditions for protein storage while maintaining a high level of protein activity is an important question in pharmaceutical research. A designed experiment based on a space-filling design was conducted to understand the effects of factors affecting protein storage and to establish optimal storage conditions. Different model-selection strategies to identify important factors may lead to very different answers about optimal conditions. Uncertainty about which factors are important, or model uncertainty, can be a critical issue in decision-making. We use Bayesian variable selection methods for linear models to identify important variables in the protein storage data, while accounting for model uncertainty. We also use the Bayesian framework to build predictions based on a large family of models, rather than an individual model, and to evaluate the probability that certain candidate storage conditions are optimal.
Genomic prediction using an iterative conditional expectation algorithm for a fast BayesC-like model.

PubMed

Dong, Linsong; Wang, Zhiyong

2018-06-11

Genomic prediction is feasible for estimating genomic breeding values because of dense genome-wide markers and credible statistical methods, such as Genomic Best Linear Unbiased Prediction (GBLUP) and various Bayesian methods. Compared with GBLUP, Bayesian methods propose more flexible assumptions for the distributions of SNP effects. However, most Bayesian methods are performed based on Markov chain Monte Carlo (MCMC) algorithms, leading to computational efficiency challenges. Hence, some fast Bayesian approaches, such as fast BayesB (fBayesB), were proposed to speed up the calculation. This study proposed another fast Bayesian method termed fast BayesC (fBayesC). The prior distribution of fBayesC assumes that a SNP with probability γ has a non-zero effect which comes from a normal density with a common variance. The simulated data from QTLMAS XII workshop and actual data on large yellow croaker were used to compare the predictive results of fBayesB, fBayesC and (MCMC-based) BayesC. The results showed that when γ was set as a small value, such as 0.01 in the simulated data or 0.001 in the actual data, fBayesB and fBayesC yielded lower prediction accuracies (abilities) than BayesC. In the actual data, fBayesC could yield very similar predictive abilities as BayesC when γ ≥ 0.01. When γ = 0.01, fBayesB could also yield similar results as fBayesC and BayesC. However, fBayesB could not yield an explicit result when γ ≥ 0.1, but a similar situation was not observed for fBayesC. Moreover, the computational speed of fBayesC was significantly faster than that of BayesC, making fBayesC a promising method for genomic prediction.
Bayesian Inference for Generalized Linear Models for Spiking Neurons

PubMed Central

Gerwinn, Sebastian; Macke, Jakob H.; Bethge, Matthias

2010-01-01

Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing an Laplace prior together with the posterior mean estimate. PMID:20577627
Non-linear Parameter Estimates from Non-stationary MEG Data

PubMed Central

Martínez-Vargas, Juan D.; López, Jose D.; Baker, Adam; Castellanos-Dominguez, German; Woolrich, Mark W.; Barnes, Gareth

2016-01-01

We demonstrate a method to estimate key electrophysiological parameters from resting state data. In this paper, we focus on the estimation of head-position parameters. The recovery of these parameters is especially challenging as they are non-linearly related to the measured field. In order to do this we use an empirical Bayesian scheme to estimate the cortical current distribution due to a range of laterally shifted head-models. We compare different methods of approaching this problem from the division of M/EEG data into stationary sections and performing separate source inversions, to explaining all of the M/EEG data with a single inversion. We demonstrate this through estimation of head position in both simulated and empirical resting state MEG data collected using a head-cast. PMID:27597815
Stochastic static fault slip inversion from geodetic data with non-negativity and bounds constraints

NASA Astrophysics Data System (ADS)

Nocquet, J.-M.

2018-04-01

Despite surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains a ill-posed inverse problem. A widely used approach to circumvent the illness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach of inverse problems (Tarantola & Valette 1982; Tarantola 2005) provides a rigorous framework where the a priori information about the searched parameters is combined with the observations in order to derive posterior probabilities of the unkown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip or double truncation to impose positivity and upper bounds on slip for interseismic modeling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a Truncated Multi-Variate Normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulas for the single, two-dimensional or n-dimensional marginal pdf. The semi-analytical formula involves the product of a Gaussian by an integral term that can be evaluated using recent developments in TMVN probabilities calculations (e.g. Genz & Bretz 2009). Posterior mean and covariance can also be efficiently derived. I show that the Maximum Posterior (MAP) can be obtained using a Non-Negative Least-Squares algorithm (Lawson & Hanson 1974) for the single truncated case or using the Bounded-Variable Least-Squares algorithm (Stark & Parker 1995) for the double truncated case. I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Monte Carlo Markov Chain (MCMC) sampling is shown for a synthetic example and a real case for interseismic modeling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need of computer power is largely reduced. Second, unlike Bayesian MCMC based approach, marginal pdf, mean, variance or covariance are obtained independently one from each other. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the Maximum Posterior (MAP) is extremely fast.
Inferring the most probable maps of underground utilities using Bayesian mapping model

NASA Astrophysics Data System (ADS)

Bilal, Muhammad; Khan, Wasiq; Muggleton, Jennifer; Rustighi, Emiliano; Jenks, Hugo; Pennock, Steve R.; Atkins, Phil R.; Cohn, Anthony

2018-03-01

Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing social, environmental and economic consequences raised from the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of MTU device is to locate different types of buried assets in real time with the use of automated data processing techniques and statutory records. The statutory records, even though typically being inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, the integration of information from multiple sensors (raw data) with these qualitative maps and their visualization is challenging and requires the implementation of robust machine learning/data fusion approaches. An approach for automated creation of revised maps was developed as a Bayesian Mapping model in this paper by integrating the knowledge extracted from sensors raw data and available statutory records. The combination of statutory records with the hypotheses from sensors was for initial estimation of what might be found underground and roughly where. The maps were (re)constructed using automated image segmentation techniques for hypotheses extraction and Bayesian classification techniques for segment-manhole connections. The model consisting of image segmentation algorithm and various Bayesian classification techniques (segment recognition and expectation maximization (EM) algorithm) provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps.
Discriminative Bayesian Dictionary Learning for Classification.

PubMed

Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

2016-12-01

We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition; and object and scene-category classification using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.
Bayesian modelling of the emission spectrum of the Joint European Torus Lithium Beam Emission Spectroscopy system.

PubMed

Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C

2016-02-01

A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of Li line without doing a separate background subtraction through modulation of the Li beam.
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.

PubMed

Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi

2015-02-01

We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
Comparison of Classifiers for Decoding Sensory and Cognitive Information from Prefrontal Neuronal Populations

PubMed Central

Astrand, Elaine; Enel, Pierre; Ibos, Guilhem; Dominey, Peter Ford; Baraduc, Pierre; Ben Hamed, Suliann

2014-01-01

Decoding neuronal information is important in neuroscience, both as a basic means to understand how neuronal activity is related to cerebral function and as a processing stage in driving neuroprosthetic effectors. Here, we compare the readout performance of six commonly used classifiers at decoding two different variables encoded by the spiking activity of the non-human primate frontal eye fields (FEF): the spatial position of a visual cue, and the instructed orientation of the animal's attention. While the first variable is exogenously driven by the environment, the second variable corresponds to the interpretation of the instruction conveyed by the cue; it is endogenously driven and corresponds to the output of internal cognitive operations performed on the visual attributes of the cue. These two variables were decoded using either a regularized optimal linear estimator in its explicit formulation, an optimal linear artificial neural network estimator, a non-linear artificial neural network estimator, a non-linear naïve Bayesian estimator, a non-linear Reservoir recurrent network classifier or a non-linear Support Vector Machine classifier. Our results suggest that endogenous information such as the orientation of attention can be decoded from the FEF with the same accuracy as exogenous visual information. All classifiers did not behave equally in the face of population size and heterogeneity, the available training and testing trials, the subject's behavior and the temporal structure of the variable of interest. In most situations, the regularized optimal linear estimator and the non-linear Support Vector Machine classifiers outperformed the other tested decoders. PMID:24466019
A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction

PubMed Central

Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R.; Buenrostro-Mariscal, Raymundo

2017-01-01

There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. PMID:28391241
A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction.

PubMed

Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R; Buenrostro-Mariscal, Raymundo

2017-06-07

There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. Copyright © 2017 Montesinos-López et al.
Validation of Bayesian analysis of compartmental kinetic models in medical imaging.

PubMed

Sitek, Arkadiusz; Li, Quanzheng; El Fakhri, Georges; Alpert, Nathaniel M

2016-10-01

Kinetic compartmental analysis is frequently used to compute physiologically relevant quantitative values from time series of images. In this paper, a new approach based on Bayesian analysis to obtain information about these parameters is presented and validated. The closed-form of the posterior distribution of kinetic parameters is derived with a hierarchical prior to model the standard deviation of normally distributed noise. Markov chain Monte Carlo methods are used for numerical estimation of the posterior distribution. Computer simulations of the kinetics of F18-fluorodeoxyglucose (FDG) are used to demonstrate drawing statistical inferences about kinetic parameters and to validate the theory and implementation. Additionally, point estimates of kinetic parameters and covariance of those estimates are determined using the classical non-linear least squares approach. Posteriors obtained using methods proposed in this work are accurate as no significant deviation from the expected shape of the posterior was found (one-sided P>0.08). It is demonstrated that the results obtained by the standard non-linear least-square methods fail to provide accurate estimation of uncertainty for the same data set (P<0.0001). The results of this work validate new methods for a computer simulations of FDG kinetics. Results show that in situations where the classical approach fails in accurate estimation of uncertainty, Bayesian estimation provides an accurate information about the uncertainties in the parameters. Although a particular example of FDG kinetics was used in the paper, the methods can be extended for different pharmaceuticals and imaging modalities. Copyright © 2016 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Tmax Determined Using a Bayesian Estimation Deconvolution Algorithm Applied to Bolus Tracking Perfusion Imaging: A Digital Phantom Validation Study.

PubMed

Uwano, Ikuko; Sasaki, Makoto; Kudo, Kohsuke; Boutelier, Timothé; Kameda, Hiroyuki; Mori, Futoshi; Yamashita, Fumio

2017-01-10

The Bayesian estimation algorithm improves the precision of bolus tracking perfusion imaging. However, this algorithm cannot directly calculate Tmax, the time scale widely used to identify ischemic penumbra, because Tmax is a non-physiological, artificial index that reflects the tracer arrival delay (TD) and other parameters. We calculated Tmax from the TD and mean transit time (MTT) obtained by the Bayesian algorithm and determined its accuracy in comparison with Tmax obtained by singular value decomposition (SVD) algorithms. The TD and MTT maps were generated by the Bayesian algorithm applied to digital phantoms with time-concentration curves that reflected a range of values for various perfusion metrics using a global arterial input function. Tmax was calculated from the TD and MTT using constants obtained by a linear least-squares fit to Tmax obtained from the two SVD algorithms that showed the best benchmarks in a previous study. Correlations between the Tmax values obtained by the Bayesian and SVD methods were examined. The Bayesian algorithm yielded accurate TD and MTT values relative to the true values of the digital phantom. Tmax calculated from the TD and MTT values with the least-squares fit constants showed excellent correlation (Pearson's correlation coefficient = 0.99) and agreement (intraclass correlation coefficient = 0.99) with Tmax obtained from SVD algorithms. Quantitative analyses of Tmax values calculated from Bayesian-estimation algorithm-derived TD and MTT from a digital phantom correlated and agreed well with Tmax values determined using SVD algorithms.
Bayesian Travel Time Inversion adopting Gaussian Process Regression

NASA Astrophysics Data System (ADS)

Mauerberger, S.; Holschneider, M.

2017-12-01

A major application in seismology is the determination of seismic velocity models. Travel time measurements are putting an integral constraint on the velocity between source and receiver. We provide insight into travel time inversion from a correlation-based Bayesian point of view. Therefore, the concept of Gaussian process regression is adopted to estimate a velocity model. The non-linear travel time integral is approximated by a 1st order Taylor expansion. A heuristic covariance describes correlations amongst observations and a priori model. That approach enables us to assess a proxy of the Bayesian posterior distribution at ordinary computational costs. No multi dimensional numeric integration nor excessive sampling is necessary. Instead of stacking the data, we suggest to progressively build the posterior distribution. Incorporating only a single evidence at a time accounts for the deficit of linearization. As a result, the most probable model is given by the posterior mean whereas uncertainties are described by the posterior covariance.As a proof of concept, a synthetic purely 1d model is addressed. Therefore a single source accompanied by multiple receivers is considered on top of a model comprising a discontinuity. We consider travel times of both phases - direct and reflected wave - corrupted by noise. Left and right of the interface are assumed independent where the squared exponential kernel serves as covariance.
Detection and recognition of simple spatial forms

NASA Technical Reports Server (NTRS)

Watson, A. B.

1983-01-01

A model of human visual sensitivity to spatial patterns is constructed. The model predicts the visibility and discriminability of arbitrary two-dimensional monochrome images. The image is analyzed by a large array of linear feature sensors, which differ in spatial frequency, phase, orientation, and position in the visual field. All sensors have one octave frequency bandwidths, and increase in size linearly with eccentricity. Sensor responses are processed by an ideal Bayesian classifier, subject to uncertainty. The performance of the model is compared to that of the human observer in detecting and discriminating some simple images.
Unsupervised Bayesian linear unmixing of gene expression microarrays.

PubMed

Bazot, Cécile; Dobigeon, Nicolas; Tourneret, Jean-Yves; Zaas, Aimee K; Ginsburg, Geoffrey S; Hero, Alfred O

2013-03-19

This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Firstly, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Secondly, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals have been inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real data sets considered here. The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
Bayesian tomography and integrated data analysis in fusion diagnostics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Dong, E-mail: lid@swip.ac.cn; Dong, Y. B.; Deng, Wei

2016-11-15

In this article, a Bayesian tomography method using non-stationary Gaussian process for a prior has been introduced. The Bayesian formalism allows quantities which bear uncertainty to be expressed in the probabilistic form so that the uncertainty of a final solution can be fully resolved from the confidence interval of a posterior probability. Moreover, a consistency check of that solution can be performed by checking whether the misfits between predicted and measured data are reasonably within an assumed data error. In particular, the accuracy of reconstructions is significantly improved by using the non-stationary Gaussian process that can adapt to the varyingmore » smoothness of emission distribution. The implementation of this method to a soft X-ray diagnostics on HL-2A has been used to explore relevant physics in equilibrium and MHD instability modes. This project is carried out within a large size inference framework, aiming at an integrated analysis of heterogeneous diagnostics.« less
Simultaneous fitting of genomic-BLUP and Bayes-C components in a genomic prediction model.

PubMed

Iheshiulor, Oscar O M; Woolliams, John A; Svendsen, Morten; Solberg, Trygve; Meuwissen, Theo H E

2017-08-24

The rapid adoption of genomic selection is due to two key factors: availability of both high-throughput dense genotyping and statistical methods to estimate and predict breeding values. The development of such methods is still ongoing and, so far, there is no consensus on the best approach. Currently, the linear and non-linear methods for genomic prediction (GP) are treated as distinct approaches. The aim of this study was to evaluate the implementation of an iterative method (called GBC) that incorporates aspects of both linear [genomic-best linear unbiased prediction (G-BLUP)] and non-linear (Bayes-C) methods for GP. The iterative nature of GBC makes it less computationally demanding similar to other non-Markov chain Monte Carlo (MCMC) approaches. However, as a Bayesian method, GBC differs from both MCMC- and non-MCMC-based methods by combining some aspects of G-BLUP and Bayes-C methods for GP. Its relative performance was compared to those of G-BLUP and Bayes-C. We used an imputed 50 K single-nucleotide polymorphism (SNP) dataset based on the Illumina Bovine50K BeadChip, which included 48,249 SNPs and 3244 records. Daughter yield deviations for somatic cell count, fat yield, milk yield, and protein yield were used as response variables. GBC was frequently (marginally) superior to G-BLUP and Bayes-C in terms of prediction accuracy and was significantly better than G-BLUP only for fat yield. On average across the four traits, GBC yielded a 0.009 and 0.006 increase in prediction accuracy over G-BLUP and Bayes-C, respectively. Computationally, GBC was very much faster than Bayes-C and similar to G-BLUP. Our results show that incorporating some aspects of G-BLUP and Bayes-C in a single model can improve accuracy of GP over the commonly used method: G-BLUP. Generally, GBC did not statistically perform better than G-BLUP and Bayes-C, probably due to the close relationships between reference and validation individuals. Nevertheless, it is a flexible tool, in the sense, that it simultaneously incorporates some aspects of linear and non-linear models for GP, thereby exploiting family relationships while also accounting for linkage disequilibrium between SNPs and genes with large effects. The application of GBC in GP merits further exploration.

The Variability and Interpretation of Earthquake Source Mechanisms in The Geysers Geothermal Field From a Bayesian Standpoint Based on the Choice of a Noise Model

NASA Astrophysics Data System (ADS)

Mustać, Marija; Tkalčić, Hrvoje; Burky, Alexander L.

2018-01-01

Moment tensor (MT) inversion studies of events in The Geysers geothermal field mostly focused on microseismicity and found a large number of earthquakes with significant non-double-couple (non-DC) seismic radiation. Here we concentrate on the largest events in the area in recent years using a hierarchical Bayesian MT inversion. Initially, we show that the non-DC components of the MT can be reliably retrieved using regional waveform data from a small number of stations. Subsequently, we present results for a number of events and show that accounting for noise correlations can lead to retrieval of a lower isotropic (ISO) component and significantly different focal mechanisms. We compute the Bayesian evidence to compare solutions obtained with different assumptions of the noise covariance matrix. Although a diagonal covariance matrix produces a better waveform fit, inversions that account for noise correlations via an empirically estimated noise covariance matrix account for interdependences of data errors and are preferred from a Bayesian point of view. This implies that improper treatment of data noise in waveform inversions can result in fitting the noise and misinterpreting the non-DC components. Finally, one of the analyzed events is characterized as predominantly DC, while the others still have significant non-DC components, probably as a result of crack opening, which is a reasonable hypothesis for The Geysers geothermal field geological setting.
Advances in the microrheology of complex fluids

NASA Astrophysics Data System (ADS)

Waigh, Thomas Andrew

2016-07-01

New developments in the microrheology of complex fluids are considered. Firstly the requirements for a simple modern particle tracking microrheology experiment are introduced, the error analysis methods associated with it and the mathematical techniques required to calculate the linear viscoelasticity. Progress in microrheology instrumentation is then described with respect to detectors, light sources, colloidal probes, magnetic tweezers, optical tweezers, diffusing wave spectroscopy, optical coherence tomography, fluorescence correlation spectroscopy, elastic- and quasi-elastic scattering techniques, 3D tracking, single molecule methods, modern microscopy methods and microfluidics. New theoretical techniques are also reviewed such as Bayesian analysis, oversampling, inversion techniques, alternative statistical tools for tracks (angular correlations, first passage probabilities, the kurtosis, motor protein step segmentation etc), issues in micro/macro rheological agreement and two particle methodologies. Applications where microrheology has begun to make some impact are also considered including semi-flexible polymers, gels, microorganism biofilms, intracellular methods, high frequency viscoelasticity, comb polymers, active motile fluids, blood clots, colloids, granular materials, polymers, liquid crystals and foods. Two large emergent areas of microrheology, non-linear microrheology and surface microrheology are also discussed.
Flood quantile estimation at ungauged sites by Bayesian networks

NASA Astrophysics Data System (ADS)

Mediero, L.; Santillán, D.; Garrote, L.

2012-04-01

Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. The most common technique used is the multiple regression analysis, which relates physical and climatic basin characteristic to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take the measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals in the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence, which have been widely and successfully applied to many scientific fields like medicine and informatics, but application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as they capture non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows taking the different sources of estimation uncertainty into account, as they give a probability distribution as result. A homogeneous region in the Tagus Basin was selected as case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a huge enough data set. As observational data are reduced, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as basis. The learnt Bayesian network was validated by the reliability diagram, the Brier Score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. Summarising, the flood quantile estimations through Bayesian networks supply information about the prediction uncertainty as a probability distribution function of discharges is given as result. Therefore, the Bayesian network model has application as a decision support for water resources and planning management.
a Novel Discrete Optimal Transport Method for Bayesian Inverse Problems

NASA Astrophysics Data System (ADS)

Bui-Thanh, T.; Myers, A.; Wang, K.; Thiery, A.

2017-12-01

We present the Augmented Ensemble Transform (AET) method for generating approximate samples from a high-dimensional posterior distribution as a solution to Bayesian inverse problems. Solving large-scale inverse problems is critical for some of the most relevant and impactful scientific endeavors of our time. Therefore, constructing novel methods for solving the Bayesian inverse problem in more computationally efficient ways can have a profound impact on the science community. This research derives the novel AET method for exploring a posterior by solving a sequence of linear programming problems, resulting in a series of transport maps which map prior samples to posterior samples, allowing for the computation of moments of the posterior. We show both theoretical and numerical results, indicating this method can offer superior computational efficiency when compared to other SMC methods. Most of this efficiency is derived from matrix scaling methods to solve the linear programming problem and derivative-free optimization for particle movement. We use this method to determine inter-well connectivity in a reservoir and the associated uncertainty related to certain parameters. The attached file shows the difference between the true parameter and the AET parameter in an example 3D reservoir problem. The error is within the Morozov discrepancy allowance with lower computational cost than other particle methods.
Bayesian assessment of the expected data impact on prediction confidence in optimal sampling design

NASA Astrophysics Data System (ADS)

Leube, P. C.; Geiges, A.; Nowak, W.

2012-02-01

Incorporating hydro(geo)logical data, such as head and tracer data, into stochastic models of (subsurface) flow and transport helps to reduce prediction uncertainty. Because of financial limitations for investigation campaigns, information needs toward modeling or prediction goals should be satisfied efficiently and rationally. Optimal design techniques find the best one among a set of investigation strategies. They optimize the expected impact of data on prediction confidence or related objectives prior to data collection. We introduce a new optimal design method, called PreDIA(gnosis) (Preposterior Data Impact Assessor). PreDIA derives the relevant probability distributions and measures of data utility within a fully Bayesian, generalized, flexible, and accurate framework. It extends the bootstrap filter (BF) and related frameworks to optimal design by marginalizing utility measures over the yet unknown data values. PreDIA is a strictly formal information-processing scheme free of linearizations. It works with arbitrary simulation tools, provides full flexibility concerning measurement types (linear, nonlinear, direct, indirect), allows for any desired task-driven formulations, and can account for various sources of uncertainty (e.g., heterogeneity, geostatistical assumptions, boundary conditions, measurement values, model structure uncertainty, a large class of model errors) via Bayesian geostatistics and model averaging. Existing methods fail to simultaneously provide these crucial advantages, which our method buys at relatively higher-computational costs. We demonstrate the applicability and advantages of PreDIA over conventional linearized methods in a synthetic example of subsurface transport. In the example, we show that informative data is often invisible for linearized methods that confuse zero correlation with statistical independence. Hence, PreDIA will often lead to substantially better sampling designs. Finally, we extend our example to specifically highlight the consideration of conceptual model uncertainty.
Spatio-temporal Bayesian model selection for disease mapping

PubMed Central

Carroll, R; Lawson, AB; Faes, C; Kirby, RS; Aregay, M; Watjou, K

2016-01-01

Spatio-temporal analysis of small area health data often involves choosing a fixed set of predictors prior to the final model fit. In this paper, we propose a spatio-temporal approach of Bayesian model selection to implement model selection for certain areas of the study region as well as certain years in the study time line. Here, we examine the usefulness of this approach by way of a large-scale simulation study accompanied by a case study. Our results suggest that a special case of the model selection methods, a mixture model allowing a weight parameter to indicate if the appropriate linear predictor is spatial, spatio-temporal, or a mixture of the two, offers the best option to fitting these spatio-temporal models. In addition, the case study illustrates the effectiveness of this mixture model within the model selection setting by easily accommodating lifestyle, socio-economic, and physical environmental variables to select a predominantly spatio-temporal linear predictor. PMID:28070156
Bayesian SEM for Specification Search Problems in Testing Factorial Invariance.

PubMed

Shi, Dexin; Song, Hairong; Liao, Xiaolan; Terry, Robert; Snyder, Lori A

2017-01-01

Specification search problems refer to two important but under-addressed issues in testing for factorial invariance: how to select proper reference indicators and how to locate specific non-invariant parameters. In this study, we propose a two-step procedure to solve these issues. Step 1 is to identify a proper reference indicator using the Bayesian structural equation modeling approach. An item is selected if it is associated with the highest likelihood to be invariant across groups. Step 2 is to locate specific non-invariant parameters, given that a proper reference indicator has already been selected in Step 1. A series of simulation analyses show that the proposed method performs well under a variety of data conditions, and optimal performance is observed under conditions of large magnitude of non-invariance, low proportion of non-invariance, and large sample sizes. We also provide an empirical example to demonstrate the specific procedures to implement the proposed method in applied research. The importance and influences are discussed regarding the choices of informative priors with zero mean and small variances. Extensions and limitations are also pointed out.
Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method.

PubMed

Norris, Peter M; da Silva, Arlindo M

2016-07-01

A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 1: Method

NASA Technical Reports Server (NTRS)

Norris, Peter M.; Da Silva, Arlindo M.

2016-01-01

A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method

PubMed Central

Norris, Peter M.; da Silva, Arlindo M.

2018-01-01

A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847
Detecting a Change in School Performance: A Bayesian Analysis for a Multilevel Join Point Problem. CSE Technical Report 542.

ERIC Educational Resources Information Center

Thum, Yeow Meng; Bhattacharya, Suman Kumar

To better describe individual behavior within a system, this paper uses a sample of longitudinal test scores from a large urban school system to consider hierarchical Bayes estimation of a multilevel linear regression model in which each individual regression slope of test score on time switches at some unknown point in time, "kj."…
Bayesian inference of interaction properties of noisy dynamical systems with time-varying coupling: capabilities and limitations

NASA Astrophysics Data System (ADS)

Wilting, Jens; Lehnertz, Klaus

2015-08-01

We investigate a recently published analysis framework based on Bayesian inference for the time-resolved characterization of interaction properties of noisy, coupled dynamical systems. It promises wide applicability and a better time resolution than well-established methods. At the example of representative model systems, we show that the analysis framework has the same weaknesses as previous methods, particularly when investigating interacting, structurally different non-linear oscillators. We also inspect the tracking of time-varying interaction properties and propose a further modification of the algorithm, which improves the reliability of obtained results. We exemplarily investigate the suitability of this algorithm to infer strength and direction of interactions between various regions of the human brain during an epileptic seizure. Within the limitations of the applicability of this analysis tool, we show that the modified algorithm indeed allows a better time resolution through Bayesian inference when compared to previous methods based on least square fits.
Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks

NASA Astrophysics Data System (ADS)

Sun, Wei; Chang, K. C.

2005-05-01

Probabilistic inference for Bayesian networks is in general NP-hard using either exact algorithms or approximate methods. However, for very complex networks, only the approximate methods such as stochastic sampling could be used to provide a solution given any time constraint. There are several simulation methods currently available. They include logic sampling (the first proposed stochastic method for Bayesian networks, the likelihood weighting algorithm) the most commonly used simulation method because of its simplicity and efficiency, the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these available simulation methods, then we propose an improved importance sampling algorithm called linear Gaussian importance sampling algorithm for general hybrid model (LGIS). LGIS is aimed for hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses linear function and Gaussian additive noise to approximate the true conditional probability distribution for continuous variable given both its parents and evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal important function from the previous samples. We test the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. The performance comparison with other well-known methods such as Junction tree (JT) and likelihood weighting (LW) shows that LGIS-GHM is very promising.
Experimental and statistical study on fracture boundary of non-irradiated Zircaloy-4 cladding tube under LOCA conditions

NASA Astrophysics Data System (ADS)

Narukawa, Takafumi; Yamaguchi, Akira; Jang, Sunghyon; Amaya, Masaki

2018-02-01

For estimating fracture probability of fuel cladding tube under loss-of-coolant accident conditions of light-water-reactors, laboratory-scale integral thermal shock tests were conducted on non-irradiated Zircaloy-4 cladding tube specimens. Then, the obtained binary data with respect to fracture or non-fracture of the cladding tube specimen were analyzed statistically. A method to obtain the fracture probability curve as a function of equivalent cladding reacted (ECR) was proposed using Bayesian inference for generalized linear models: probit, logit, and log-probit models. Then, model selection was performed in terms of physical characteristics and information criteria, a widely applicable information criterion and a widely applicable Bayesian information criterion. As a result, it was clarified that the log-probit model was the best among the three models to estimate the fracture probability in terms of the degree of prediction accuracy for both next data to be obtained and the true model. Using the log-probit model, it was shown that 20% ECR corresponded to a 5% probability level with a 95% confidence of fracture of the cladding tube specimens.
Comparing Families of Dynamic Causal Models

PubMed Central

Penny, Will D.; Stephan, Klaas E.; Daunizeau, Jean; Rosa, Maria J.; Friston, Karl J.; Schofield, Thomas M.; Leff, Alex P.

2010-01-01

Mathematical models of scientific data can be formally compared using Bayesian model evidence. Previous applications in the biological sciences have mainly focussed on model selection in which one first selects the model with the highest evidence and then makes inferences based on the parameters of that model. This “best model” approach is very useful but can become brittle if there are a large number of models to compare, and if different subjects use different models. To overcome this shortcoming we propose the combination of two further approaches: (i) family level inference and (ii) Bayesian model averaging within families. Family level inference removes uncertainty about aspects of model structure other than the characteristic of interest. For example: What are the inputs to the system? Is processing serial or parallel? Is it linear or nonlinear? Is it mediated by a single, crucial connection? We apply Bayesian model averaging within families to provide inferences about parameters that are independent of further assumptions about model structure. We illustrate the methods using Dynamic Causal Models of brain imaging data. PMID:20300649
The clustering of galaxies in the completed SDSS-III Baryon Oscillation Spectroscopic Survey: cosmic flows and cosmic web from luminous red galaxies

NASA Astrophysics Data System (ADS)

Ata, Metin; Kitaura, Francisco-Shu; Chuang, Chia-Hsun; Rodríguez-Torres, Sergio; Angulo, Raul E.; Ferraro, Simone; Gil-Marín, Hector; McDonald, Patrick; Hernández Monteagudo, Carlos; Müller, Volker; Yepes, Gustavo; Autefage, Mathieu; Baumgarten, Falk; Beutler, Florian; Brownstein, Joel R.; Burden, Angela; Eisenstein, Daniel J.; Guo, Hong; Ho, Shirley; McBride, Cameron; Neyrinck, Mark; Olmstead, Matthew D.; Padmanabhan, Nikhil; Percival, Will J.; Prada, Francisco; Rossi, Graziano; Sánchez, Ariel G.; Schlegel, David; Schneider, Donald P.; Seo, Hee-Jong; Streblyanska, Alina; Tinker, Jeremy; Tojeiro, Rita; Vargas-Magana, Mariana

2017-06-01

We present a Bayesian phase-space reconstruction of the cosmic large-scale matter density and velocity fields from the Sloan Digital Sky Survey-III Baryon Oscillations Spectroscopic Survey Data Release 12 CMASS galaxy clustering catalogue. We rely on a given Λ cold dark matter cosmology, a mesh resolution in the range of 6-10 h-1 Mpc, and a lognormal-Poisson model with a redshift-dependent non-linear bias. The bias parameters are derived from the data and a general renormalized perturbation theory approach. We use combined Gibbs and Hamiltonian sampling, implemented in the argo code, to iteratively reconstruct the dark matter density field and the coherent peculiar velocities of individual galaxies, correcting hereby for coherent redshift space distortions. Our tests relying on accurate N-body-based mock galaxy catalogues show unbiased real space power spectra of the non-linear density field up to k ˜ 0.2 h Mpc-1, and vanishing quadrupoles down to r ˜ 20 h-1 Mpc. We also demonstrate that the non-linear cosmic web can be obtained from the tidal field tensor based on the Gaussian component of the reconstructed density field. We find that the reconstructed velocities have a statistical correlation coefficient compared to the true velocities of each individual light-cone mock galaxy of r ˜ 0.68 including about 10 per cent of satellite galaxies with virial motions (about r = 0.75 without satellites). The power spectra of the velocity divergence agree well with theoretical predictions up to k ˜ 0.2 h Mpc-1. This work will be especially useful to improve, for example, baryon acoustic oscillation reconstructions, kinematic Sunyaev-Zeldovich, integrated Sachs-Wolfe measurements or environmental studies.
Analytic posteriors for Pearson's correlation coefficient.

PubMed

Ly, Alexander; Marsman, Maarten; Wagenmakers, Eric-Jan

2018-02-01

Pearson's correlation is one of the most common measures of linear dependence. Recently, Bernardo (11th International Workshop on Objective Bayes Methodology, 2015) introduced a flexible class of priors to study this measure in a Bayesian setting. For this large class of priors, we show that the (marginal) posterior for Pearson's correlation coefficient and all of the posterior moments are analytic. Our results are available in the open-source software package JASP.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems

NASA Technical Reports Server (NTRS)

He, Yuning; Davies, Misty Dawn

2014-01-01

The analysis of a safety-critical system often requires detailed knowledge of safe regions and their highdimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
Incorporation of diet information derived from Bayesian stable isotope mixing models into mass-balanced marine ecosystem models: A case study from the Marennes-Oleron Estuary, France

EPA Science Inventory

We investigated the use of output from Bayesian stable isotope mixing models as constraints for a linear inverse food web model of a temperate intertidal seagrass system in the Marennes-Oléron Bay, France. Linear inverse modeling (LIM) is a technique that estimates a complete net...
Design considerations and analysis planning of a phase 2a proof of concept study in rheumatoid arthritis in the presence of possible non-monotonicity.

PubMed

Liu, Feng; Walters, Stephen J; Julious, Steven A

2017-10-02

It is important to quantify the dose response for a drug in phase 2a clinical trials so the optimal doses can then be selected for subsequent late phase trials. In a phase 2a clinical trial of new lead drug being developed for the treatment of rheumatoid arthritis (RA), a U-shaped dose response curve was observed. In the light of this result further research was undertaken to design an efficient phase 2a proof of concept (PoC) trial for a follow-on compound using the lessons learnt from the lead compound. The planned analysis for the Phase 2a trial for GSK123456 was a Bayesian Emax model which assumes the dose-response relationship follows a monotonic sigmoid "S" shaped curve. This model was found to be suboptimal to model the U-shaped dose response observed in the data from this trial and alternatives approaches were needed to be considered for the next compound for which a Normal dynamic linear model (NDLM) is proposed. This paper compares the statistical properties of the Bayesian Emax model and NDLM model and both models are evaluated using simulation in the context of adaptive Phase 2a PoC design under a variety of assumed dose response curves: linear, Emax model, U-shaped model, and flat response. It is shown that the NDLM method is flexible and can handle a wide variety of dose-responses, including monotonic and non-monotonic relationships. In comparison to the NDLM model the Emax model excelled with higher probability of selecting ED90 and smaller average sample size, when the true dose response followed Emax like curve. In addition, the type I error, probability of incorrectly concluding a drug may work when it does not, is inflated with the Bayesian NDLM model in all scenarios which would represent a development risk to pharmaceutical company. The bias, which is the difference between the estimated effect from the Emax and NDLM models and the simulated value, is comparable if the true dose response follows a placebo like curve, an Emax like curve, or log linear shape curve under fixed dose allocation, no adaptive allocation, half adaptive and adaptive scenarios. The bias though is significantly increased for the Emax model if the true dose response follows a U-shaped curve. In most cases the Bayesian Emax model works effectively and efficiently, with low bias and good probability of success in case of monotonic dose response. However, if there is a belief that the dose response could be non-monotonic then the NDLM is the superior model to assess the dose response.

Technical Topic 3.2.2.d Bayesian and Non-Parametric Statistics: Integration of Neural Networks with Bayesian Networks for Data Fusion and Predictive Modeling

DTIC Science & Technology

2016-05-31

and included explosives such as TATP, HMTD, RDX, RDX, ammonium nitrate , potassium perchlorate, potassium nitrate , sugar, and TNT. The approach...Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2. d Bayesian and Non- parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2. d Bayesian and Non-parametric Statistics: Integration of Neural
Spatial variability of the effect of air pollution on term birth weight: evaluating influential factors using Bayesian hierarchical models.

PubMed

Li, Lianfa; Laurent, Olivier; Wu, Jun

2016-02-05

Epidemiological studies suggest that air pollution is adversely associated with pregnancy outcomes. Such associations may be modified by spatially-varying factors including socio-demographic characteristics, land-use patterns and unaccounted exposures. Yet, few studies have systematically investigated the impact of these factors on spatial variability of the air pollution's effects. This study aimed to examine spatial variability of the effects of air pollution on term birth weight across Census tracts and the influence of tract-level factors on such variability. We obtained over 900,000 birth records from 2001 to 2008 in Los Angeles County, California, USA. Air pollution exposure was modeled at individual level for nitrogen dioxide (NO2) and nitrogen oxides (NOx) using spatiotemporal models. Two-stage Bayesian hierarchical non-linear models were developed to (1) quantify the associations between air pollution exposure and term birth weight within each tract; and (2) examine the socio-demographic, land-use, and exposure-related factors contributing to the between-tract variability of the associations between air pollution and term birth weight. Higher air pollution exposure was associated with lower term birth weight (average posterior effects: -14.7 (95 % CI: -19.8, -9.7) g per 10 ppb increment in NO2 and -6.9 (95 % CI: -12.9, -0.9) g per 10 ppb increment in NOx). The variation of the association across Census tracts was significantly influenced by the tract-level socio-demographic, exposure-related and land-use factors. Our models captured the complex non-linear relationship between these factors and the associations between air pollution and term birth weight: we observed the thresholds from which the influence of the tract-level factors was markedly exacerbated or attenuated. Exacerbating factors might reflect additional exposure to environmental insults or lower socio-economic status with higher vulnerability, whereas attenuating factors might indicate reduced exposure or higher socioeconomic status with lower vulnerability. Our Bayesian models effectively combined a priori knowledge with training data to infer the posterior association of air pollution with term birth weight and to evaluate the influence of the tract-level factors on spatial variability of such association. This study contributes new findings about non-linear influences of socio-demographic factors, land-use patterns, and unaccounted exposures on spatial variability of the effects of air pollution.
Multi-objective calibration and uncertainty analysis of hydrologic models; A comparative study between formal and informal methods

NASA Astrophysics Data System (ADS)

Shafii, M.; Tolson, B.; Matott, L. S.

2012-04-01

Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in building of higher levels of complexity into hydrologic models, which eventually makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically-based uncertainty analysis routines such as Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, the Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims at exploring a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality which quantifies the parameter uncertainty using the Pareto solutions, (ii) DDS-AU which uses the weighted sum of objective functions to derive the prediction limits, and (iii) GLUE which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden, and predictive capacity, which are evaluated based on multiple comparative measures. The measures for comparison are calculated both for calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model, called HYMOD.
ESS++: a C++ objected-oriented algorithm for Bayesian stochastic search model exploration

PubMed Central

Bottolo, Leonardo; Langley, Sarah R.; Petretto, Enrico; Tiret, Laurence; Tregouet, David; Richardson, Sylvia

2011-01-01

Summary: ESS++ is a C++ implementation of a fully Bayesian variable selection approach for single and multiple response linear regression. ESS++ works well both when the number of observations is larger than the number of predictors and in the ‘large p, small n’ case. In the current version, ESS++ can handle several hundred observations, thousands of predictors and a few responses simultaneously. The core engine of ESS++ for the selection of relevant predictors is based on Evolutionary Monte Carlo. Our implementation is open source, allowing community-based alterations and improvements. Availability: C++ source code and documentation including compilation instructions are available under GNU licence at http://bgx.org.uk/software/ESS.html. Contact: l.bottolo@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21233165
Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Baichuan; Choudhury, Sutanay; Al-Hasan, Mohammad

2016-02-01

Estimating the confidence for a link is a critical task for Knowledge Graph construction. Link prediction, or predicting the likelihood of a link in a knowledge graph based on prior state is a key research direction within this area. We propose a Latent Feature Embedding based link recommendation model for prediction task and utilize Bayesian Personalized Ranking based optimization technique for learning models for each predicate. Experimental results on large-scale knowledge bases such as YAGO2 show that our approach achieves substantially higher performance than several state-of-art approaches. Furthermore, we also study the performance of the link prediction algorithm in termsmore » of topological properties of the Knowledge Graph and present a linear regression model to reason about its expected level of accuracy.« less
Efficient reconstruction method for ground layer adaptive optics with mixed natural and laser guide stars.

PubMed

Wagner, Roland; Helin, Tapio; Obereder, Andreas; Ramlau, Ronny

2016-02-20

The imaging quality of modern ground-based telescopes such as the planned European Extremely Large Telescope is affected by atmospheric turbulence. In consequence, they heavily depend on stable and high-performance adaptive optics (AO) systems. Using measurements of incoming light from guide stars, an AO system compensates for the effects of turbulence by adjusting so-called deformable mirror(s) (DMs) in real time. In this paper, we introduce a novel reconstruction method for ground layer adaptive optics. In the literature, a common approach to this problem is to use Bayesian inference in order to model the specific noise structure appearing due to spot elongation. This approach leads to large coupled systems with high computational effort. Recently, fast solvers of linear order, i.e., with computational complexity O(n), where n is the number of DM actuators, have emerged. However, the quality of such methods typically degrades in low flux conditions. Our key contribution is to achieve the high quality of the standard Bayesian approach while at the same time maintaining the linear order speed of the recent solvers. Our method is based on performing a separate preprocessing step before applying the cumulative reconstructor (CuReD). The efficiency and performance of the new reconstructor are demonstrated using the OCTOPUS, the official end-to-end simulation environment of the ESO for extremely large telescopes. For more specific simulations we also use the MOST toolbox.
Comparing interval estimates for small sample ordinal CFA models

PubMed Central

Natesan, Prathiba

2015-01-01

Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002
Comparing interval estimates for small sample ordinal CFA models.

PubMed

Natesan, Prathiba

2015-01-01

Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.
Model-based Bayesian inference for ROC data analysis

NASA Astrophysics Data System (ADS)

Lei, Tianhu; Bae, K. Ty

2013-03-01

This paper presents a study of model-based Bayesian inference to Receiver Operating Characteristics (ROC) data. The model is a simple version of general non-linear regression model. Different from Dorfman model, it uses a probit link function with a covariate variable having zero-one two values to express binormal distributions in a single formula. Model also includes a scale parameter. Bayesian inference is implemented by Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). Contrast to the classical statistical theory, Bayesian approach considers model parameters as random variables characterized by prior distributions. With substantial amount of simulated samples generated by sampling algorithm, posterior distributions of parameters as well as parameters themselves can be accurately estimated. MCMC-based BUGS adopts Adaptive Rejection Sampling (ARS) protocol which requires the probability density function (pdf) which samples are drawing from be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior which is conjugate and possesses many analytic and computational advantages is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for MCMC method are assessed by CODA package. Models and methods by using continuous Gaussian prior and discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using MCMC method.
The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection.

PubMed

Tang, Zaixiang; Shen, Yueping; Zhang, Xinyan; Yi, Nengjun

2017-01-01

Large-scale "omics" data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, there are considerable challenges in analyzing high-dimensional molecular data, including the large number of potential molecular predictors, limited number of samples, and small effect of each predictor. We propose new Bayesian hierarchical generalized linear models, called spike-and-slab lasso GLMs, for prognostic prediction and detection of associated genes using large-scale molecular data. The proposed model employs a spike-and-slab mixture double-exponential prior for coefficients that can induce weak shrinkage on large coefficients, and strong shrinkage on irrelevant coefficients. We have developed a fast and stable algorithm to fit large-scale hierarchal GLMs by incorporating expectation-maximization (EM) steps into the fast cyclic coordinate descent algorithm. The proposed approach integrates nice features of two popular methods, i.e., penalized lasso and Bayesian spike-and-slab variable selection. The performance of the proposed method is assessed via extensive simulation studies. The results show that the proposed approach can provide not only more accurate estimates of the parameters, but also better prediction. We demonstrate the proposed procedure on two cancer data sets: a well-known breast cancer data set consisting of 295 tumors, and expression data of 4919 genes; and the ovarian cancer data set from TCGA with 362 tumors, and expression data of 5336 genes. Our analyses show that the proposed procedure can generate powerful models for predicting outcomes and detecting associated genes. The methods have been implemented in a freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/). Copyright © 2017 by the Genetics Society of America.
Joint probabilistic determination of earthquake location and velocity structure: application to local and regional events

NASA Astrophysics Data System (ADS)

Beucler, E.; Haugmard, M.; Mocquet, A.

2016-12-01

The most widely used inversion schemes to locate earthquakes are based on iterative linearized least-squares algorithms and using an a priori knowledge of the propagation medium. When a small amount of observations is available for moderate events for instance, these methods may lead to large trade-offs between outputs and both the velocity model and the initial set of hypocentral parameters. We present a joint structure-source determination approach using Bayesian inferences. Monte-Carlo continuous samplings, using Markov chains, generate models within a broad range of parameters, distributed according to the unknown posterior distributions. The non-linear exploration of both the seismic structure (velocity and thickness) and the source parameters relies on a fast forward problem using 1-D travel time computations. The a posteriori covariances between parameters (hypocentre depth, origin time and seismic structure among others) are computed and explicitly documented. This method manages to decrease the influence of the surrounding seismic network geometry (sparse and/or azimuthally inhomogeneous) and a too constrained velocity structure by inferring realistic distributions on hypocentral parameters. Our algorithm is successfully used to accurately locate events of the Armorican Massif (western France), which is characterized by moderate and apparently diffuse local seismicity.
Testing non-minimally coupled inflation with CMB data: a Bayesian analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Campista, Marcela; Benetti, Micol; Alcaniz, Jailson, E-mail: campista@on.br, E-mail: micolbenetti@on.br, E-mail: alcaniz@on.br

2017-09-01

We use the most recent cosmic microwave background (CMB) data to perform a Bayesian statistical analysis and discuss the observational viability of inflationary models with a non-minimal coupling, ξ, between the inflaton field and the Ricci scalar. We particularize our analysis to two examples of small and large field inflationary models, namely, the Coleman-Weinberg and the chaotic quartic potentials. We find that ( i ) the ξ parameter is closely correlated with the primordial amplitude ; ( ii ) although improving the agreement with the CMB data in the r − n {sub s} plane, where r is the tensor-to-scalarmore » ratio and n {sub s} the primordial spectral index, a non-null coupling is strongly disfavoured with respect to the minimally coupled standard ΛCDM model, since the upper bounds of the Bayes factor (odds) for ξ parameter are greater than 150:1.« less
Sparse-grid, reduced-basis Bayesian inversion: Nonaffine-parametric nonlinear equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Peng, E-mail: peng@ices.utexas.edu; Schwab, Christoph, E-mail: christoph.schwab@sam.math.ethz.ch

2016-07-01

We extend the reduced basis (RB) accelerated Bayesian inversion methods for affine-parametric, linear operator equations which are considered in [16,17] to non-affine, nonlinear parametric operator equations. We generalize the analysis of sparsity of parametric forward solution maps in [20] and of Bayesian inversion in [48,49] to the fully discrete setting, including Petrov–Galerkin high-fidelity (“HiFi”) discretization of the forward maps. We develop adaptive, stochastic collocation based reduction methods for the efficient computation of reduced bases on the parametric solution manifold. The nonaffinity and nonlinearity with respect to (w.r.t.) the distributed, uncertain parameters and the unknown solution is collocated; specifically, by themore » so-called Empirical Interpolation Method (EIM). For the corresponding Bayesian inversion problems, computational efficiency is enhanced in two ways: first, expectations w.r.t. the posterior are computed by adaptive quadratures with dimension-independent convergence rates proposed in [49]; the present work generalizes [49] to account for the impact of the PG discretization in the forward maps on the convergence rates of the Quantities of Interest (QoI for short). Second, we propose to perform the Bayesian estimation only w.r.t. a parsimonious, RB approximation of the posterior density. Based on the approximation results in [49], the infinite-dimensional parametric, deterministic forward map and operator admit N-term RB and EIM approximations which converge at rates which depend only on the sparsity of the parametric forward map. In several numerical experiments, the proposed algorithms exhibit dimension-independent convergence rates which equal, at least, the currently known rate estimates for N-term approximation. We propose to accelerate Bayesian estimation by first offline construction of reduced basis surrogates of the Bayesian posterior density. The parsimonious surrogates can then be employed for online data assimilation and for Bayesian estimation. They also open a perspective for optimal experimental design.« less
Slip Distribution of the 2008 Iwate-Miyagi Nairiku, Japan, Earthquake Inverted from PALSAR Data

NASA Astrophysics Data System (ADS)

Fukahata, Y.; Fukushima, Y.; Arimoto, M.

2008-12-01

On 14 June 2008, the Iwate-Miyagi Nairiku earthquake struck northeast Japan, where active seismicity has been observed under east-west compressional stress fields. According to the Japan Meteorological Agency, the magnitude and the hypocenter depth of the earthquake are 7.2 and 8 km, respectively. The earthquake is considered to have occurred on a west dipping reverse fault with a roughly north-south strike. The earthquake caused significant surface displacements, which were detected by PALSAR, a Synthetic Aperture Radar (SAR) onboard the Advanced Land Observing Satellite (ALOS) employed by the Japan Aerospace Exploration Agency (JAXA). Several pairs of PALSAR images are available to measure the coseismic displacements. InSAR data show up to 1 m of line-of-sight displacements both for ascending and descending paths. The pixel matching method was also used to obtain range and azimuth offset data around the epicentral region, where displacements were too large for the interferometric technique (see Fukushima (this meeting) in detail). We inverted the obtained SAR interferometric and pixel matching data to estimate slip distribution on the fault. Since the geometry of the fault are not well known, the inverse problem is non-linear. If the fault surface is assumed to be a flat plane, however, the non-linearity is weak. Following the method of Fukahata & Wright (2008), we resolved the weak non-linearity based on ABIC (Akaike"fs Bayesian Information Criterion). That is to say, the fault parameters (e.g. strike, dip and location) as well as the weight of smoothing parameter were objectively determined by minimizing ABIC. We first estimated slip distribution by assuming a pure dip slip for simplicity, since it has been reported that the dip slip component is dominant. Then, the optimal fault geometry was dip 26 and strike 203 degrees with the location passing through (140.90E, 38.97N). The maximum slip was more than 8 m and most slips concentrated at shallow depths (< 4 km). Without fixing the rake, a large slip area with the maximum slip of about 8 m concentrated in the shallow region was obtained again.
Marginally specified priors for non-parametric Bayesian estimation

PubMed Central

Kessler, David C.; Hoff, Peter D.; Dunson, David B.

2014-01-01

Summary Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813
Receptive Field Inference with Localized Priors

PubMed Central

Park, Mijung; Pillow, Jonathan W.

2011-01-01

The linear receptive field describes a mapping from sensory stimuli to a one-dimensional variable governing a neuron's spike response. However, traditional receptive field estimators such as the spike-triggered average converge slowly and often require large amounts of data. Bayesian methods seek to overcome this problem by biasing estimates towards solutions that are more likely a priori, typically those with small, smooth, or sparse coefficients. Here we introduce a novel Bayesian receptive field estimator designed to incorporate locality, a powerful form of prior information about receptive field structure. The key to our approach is a hierarchical receptive field model that flexibly adapts to localized structure in both spacetime and spatiotemporal frequency, using an inference method known as empirical Bayes. We refer to our method as automatic locality determination (ALD), and show that it can accurately recover various types of smooth, sparse, and localized receptive fields. We apply ALD to neural data from retinal ganglion cells and V1 simple cells, and find it achieves error rates several times lower than standard estimators. Thus, estimates of comparable accuracy can be achieved with substantially less data. Finally, we introduce a computationally efficient Markov Chain Monte Carlo (MCMC) algorithm for fully Bayesian inference under the ALD prior, yielding accurate Bayesian confidence intervals for small or noisy datasets. PMID:22046110
Monte Carlo Bayesian Inference on a Statistical Model of Sub-gridcolumn Moisture Variability Using High-resolution Cloud Observations . Part II; Sensitivity Tests and Results

NASA Technical Reports Server (NTRS)

da Silva, Arlindo M.; Norris, Peter M.

2013-01-01

Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Ramman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.
Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability using High-Resolution Cloud Observations

NASA Astrophysics Data System (ADS)

Norris, P. M.; da Silva, A. M., Jr.

2016-12-01

Norris and da Silva recently published a method to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation (CDA). The gridcolumn model includes assumed-PDF intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used are MODIS cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. The new approach not only significantly reduces mean and standard deviation biases with respect to the assimilated observables, but also improves the simulated rotational-Ramman scattering cloud optical centroid pressure against independent (non-assimilated) retrievals from the OMI instrument. One obvious difficulty for the method, and other CDA methods, is the lack of information content in passive cloud observables on cloud vertical structure, beyond cloud-top and thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard is helpful, better honoring inversion structures in the background state.
Algorithm for ion beam figuring of low-gradient mirrors.

PubMed

Jiao, Changjun; Li, Shengyi; Xie, Xuhui

2009-07-20

Ion beam figuring technology for low-gradient mirrors is discussed. Ion beam figuring is a noncontact machining technique in which a beam of high-energy ions is directed toward a target workpiece to remove material in a predetermined and controlled fashion. Owing to this noncontact mode of material removal, problems associated with tool wear and edge effects, which are common in conventional contact polishing processes, are avoided. Based on the Bayesian principle, an iterative dwell time algorithm for planar mirrors is deduced from the computer-controlled optical surfacing (CCOS) principle. With the properties of the removal function, the shaping process of low-gradient mirrors can be approximated by the linear model for planar mirrors. With these discussions, the error surface figuring technology for low-gradient mirrors with a linear path is set up. With the near-Gaussian property of the removal function, the figuring process with a spiral path can be described by the conventional linear CCOS principle, and a Bayesian-based iterative algorithm can be used to deconvolute the dwell time. Moreover, the selection criterion of the spiral parameter is given. Ion beam figuring technology with a spiral scan path based on these methods can be used to figure mirrors with non-axis-symmetrical errors. Experiments on SiC chemical vapor deposition planar and Zerodur paraboloid samples are made, and the final surface errors are all below 1/100 lambda.
Statistical modelling of networked human-automation performance using working memory capacity.

PubMed

Ahmed, Nisar; de Visser, Ewart; Shaw, Tyler; Mohamed-Ameen, Amira; Campbell, Mark; Parasuraman, Raja

2014-01-01

This study examines the challenging problem of modelling the interaction between individual attentional limitations and decision-making performance in networked human-automation system tasks. Analysis of real experimental data from a task involving networked supervision of multiple unmanned aerial vehicles by human participants shows that both task load and network message quality affect performance, but that these effects are modulated by individual differences in working memory (WM) capacity. These insights were used to assess three statistical approaches for modelling and making predictions with real experimental networked supervisory performance data: classical linear regression, non-parametric Gaussian processes and probabilistic Bayesian networks. It is shown that each of these approaches can help designers of networked human-automated systems cope with various uncertainties in order to accommodate future users by linking expected operating conditions and performance from real experimental data to observable cognitive traits like WM capacity. Practitioner Summary: Working memory (WM) capacity helps account for inter-individual variability in operator performance in networked unmanned aerial vehicle supervisory tasks. This is useful for reliable performance prediction near experimental conditions via linear models; robust statistical prediction beyond experimental conditions via Gaussian process models and probabilistic inference about unknown task conditions/WM capacities via Bayesian network models.

Extreme-Scale Bayesian Inference for Uncertainty Quantification of Complex Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Biros, George

Uncertainty quantification (UQ)—that is, quantifying uncertainties in complex mathematical models and their large-scale computational implementations—is widely viewed as one of the outstanding challenges facing the field of CS&E over the coming decade. The EUREKA project set to address the most difficult class of UQ problems: those for which both the underlying PDE model as well as the uncertain parameters are of extreme scale. In the project we worked on these extreme-scale challenges in the following four areas: 1. Scalable parallel algorithms for sampling and characterizing the posterior distribution that exploit the structure of the underlying PDEs and parameter-to-observable map. Thesemore » include structure-exploiting versions of the randomized maximum likelihood method, which aims to overcome the intractability of employing conventional MCMC methods for solving extreme-scale Bayesian inversion problems by appealing to and adapting ideas from large-scale PDE-constrained optimization, which have been very successful at exploring high-dimensional spaces. 2. Scalable parallel algorithms for construction of prior and likelihood functions based on learning methods and non-parametric density estimation. Constructing problem-specific priors remains a critical challenge in Bayesian inference, and more so in high dimensions. Another challenge is construction of likelihood functions that capture unmodeled couplings between observations and parameters. We will create parallel algorithms for non-parametric density estimation using high dimensional N-body methods and combine them with supervised learning techniques for the construction of priors and likelihood functions. 3. Bayesian inadequacy models, which augment physics models with stochastic models that represent their imperfections. The success of the Bayesian inference framework depends on the ability to represent the uncertainty due to imperfections of the mathematical model of the phenomena of interest. This is a central challenge in UQ, especially for large-scale models. We propose to develop the mathematical tools to address these challenges in the context of extreme-scale problems. 4. Parallel scalable algorithms for Bayesian optimal experimental design (OED). Bayesian inversion yields quantified uncertainties in the model parameters, which can be propagated forward through the model to yield uncertainty in outputs of interest. This opens the way for designing new experiments to reduce the uncertainties in the model parameters and model predictions. Such experimental design problems have been intractable for large-scale problems using conventional methods; we will create OED algorithms that exploit the structure of the PDE model and the parameter-to-output map to overcome these challenges. Parallel algorithms for these four problems were created, analyzed, prototyped, implemented, tuned, and scaled up for leading-edge supercomputers, including UT-Austin’s own 10 petaflops Stampede system, ANL’s Mira system, and ORNL’s Titan system. While our focus is on fundamental mathematical/computational methods and algorithms, we will assess our methods on model problems derived from several DOE mission applications, including multiscale mechanics and ice sheet dynamics.« less
Population-level differences in disease transmission: A Bayesian analysis of multiple smallpox epidemics

PubMed Central

Elderd, Bret D.; Dwyer, Greg; Dukic, Vanja

2013-01-01

Estimates of a disease’s basic reproductive rate R0 play a central role in understanding outbreaks and planning intervention strategies. In many calculations of R0, a simplifying assumption is that different host populations have effectively identical transmission rates. This assumption can lead to an underestimate of the overall uncertainty associated with R0, which, due to the non-linearity of epidemic processes, may result in a mis-estimate of epidemic intensity and miscalculated expenditures associated with public-health interventions. In this paper, we utilize a Bayesian method for quantifying the overall uncertainty arising from differences in population-specific basic reproductive rates. Using this method, we fit spatial and non-spatial susceptible-exposed-infected-recovered (SEIR) models to a series of 13 smallpox outbreaks. Five outbreaks occurred in populations that had been previously exposed to smallpox, while the remaining eight occurred in Native-American populations that were naïve to the disease at the time. The Native-American outbreaks were close in a spatial and temporal sense. Using Bayesian Information Criterion (BIC), we show that the best model includes population-specific R0 values. These differences in R0 values may, in part, be due to differences in genetic background, social structure, or food and water availability. As a result of these inter-population differences, the overall uncertainty associated with the “population average” value of smallpox R0 is larger, a finding that can have important consequences for controlling epidemics. In general, Bayesian hierarchical models are able to properly account for the uncertainty associated with multiple epidemics, provide a clearer understanding of variability in epidemic dynamics, and yield a better assessment of the range of potential risks and consequences that decision makers face. PMID:24021521
Appraisal of jump distributions in ensemble-based sampling algorithms

NASA Astrophysics Data System (ADS)

Dejanic, Sanda; Scheidegger, Andreas; Rieckermann, Jörg; Albert, Carlo

2017-04-01

Sampling Bayesian posteriors of model parameters is often required for making model-based probabilistic predictions. For complex environmental models, standard Monte Carlo Markov Chain (MCMC) methods are often infeasible because they require too many sequential model runs. Therefore, we focused on ensemble methods that use many Markov chains in parallel, since they can be run on modern cluster architectures. Little is known about how to choose the best performing sampler, for a given application. A poor choice can lead to an inappropriate representation of posterior knowledge. We assessed two different jump moves, the stretch and the differential evolution move, underlying, respectively, the software packages EMCEE and DREAM, which are popular in different scientific communities. For the assessment, we used analytical posteriors with features as they often occur in real posteriors, namely high dimensionality, strong non-linear correlations or multimodality. For posteriors with non-linear features, standard convergence diagnostics based on sample means can be insufficient. Therefore, we resorted to an entropy-based convergence measure. We assessed the samplers by means of their convergence speed, robustness and effective sample sizes. For posteriors with strongly non-linear features, we found that the stretch move outperforms the differential evolution move, w.r.t. all three aspects.
Spectral likelihood expansions for Bayesian inference

NASA Astrophysics Data System (ADS)

Nagel, Joseph B.; Sudret, Bruno

2016-03-01

A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
Predicting BCI subject performance using probabilistic spatio-temporal filters.

PubMed

Suk, Heung-Il; Fazli, Siamac; Mehnert, Jan; Müller, Klaus-Robert; Lee, Seong-Whan

2014-01-01

Recently, spatio-temporal filtering to enhance decoding for Brain-Computer-Interfacing (BCI) has become increasingly popular. In this work, we discuss a novel, fully Bayesian-and thereby probabilistic-framework, called Bayesian Spatio-Spectral Filter Optimization (BSSFO) and apply it to a large data set of 80 non-invasive EEG-based BCI experiments. Across the full frequency range, the BSSFO framework allows to analyze which spatio-spectral parameters are common and which ones differ across the subject population. As expected, large variability of brain rhythms is observed between subjects. We have clustered subjects according to similarities in their corresponding spectral characteristics from the BSSFO model, which is found to reflect their BCI performances well. In BCI, a considerable percentage of subjects is unable to use a BCI for communication, due to their missing ability to modulate their brain rhythms-a phenomenon sometimes denoted as BCI-illiteracy or inability. Predicting individual subjects' performance preceding the actual, time-consuming BCI-experiment enhances the usage of BCIs, e.g., by detecting users with BCI inability. This work additionally contributes by using the novel BSSFO method to predict the BCI-performance using only 2 minutes and 3 channels of resting-state EEG data recorded before the actual BCI-experiment. Specifically, by grouping the individual frequency characteristics we have nicely classified them into the subject 'prototypes' (like μ - or β -rhythm type subjects) or users without ability to communicate with a BCI, and then by further building a linear regression model based on the grouping we could predict subjects' performance with the maximum correlation coefficient of 0.581 with the performance later seen in the actual BCI session.
Bayesian Inference on the Radio-quietness of Gamma-ray Pulsars

NASA Astrophysics Data System (ADS)

Yu, Hoi-Fung; Hui, Chung Yue; Kong, Albert K. H.; Takata, Jumpei

2018-04-01

For the first time we demonstrate using a robust Bayesian approach to analyze the populations of radio-quiet (RQ) and radio-loud (RL) gamma-ray pulsars. We quantify their differences and obtain their distributions of the radio-cone opening half-angle δ and the magnetic inclination angle α by Bayesian inference. In contrast to the conventional frequentist point estimations that might be non-representative when the distribution is highly skewed or multi-modal, which is often the case when data points are scarce, Bayesian statistics displays the complete posterior distribution that the uncertainties can be readily obtained regardless of the skewness and modality. We found that the spin period, the magnetic field strength at the light cylinder, the spin-down power, the gamma-ray-to-X-ray flux ratio, and the spectral curvature significance of the two groups of pulsars exhibit significant differences at the 99% level. Using Bayesian inference, we are able to infer the values and uncertainties of δ and α from the distribution of RQ and RL pulsars. We found that δ is between 10° and 35° and the distribution of α is skewed toward large values.
Spatio-temporal interpolation of precipitation during monsoon periods in Pakistan

NASA Astrophysics Data System (ADS)

Hussain, Ijaz; Spöck, Gunter; Pilz, Jürgen; Yu, Hwa-Lung

2010-08-01

Spatio-temporal estimation of precipitation over a region is essential to the modeling of hydrologic processes for water resources management. The changes of magnitude and space-time heterogeneity of rainfall observations make space-time estimation of precipitation a challenging task. In this paper we propose a Box-Cox transformed hierarchical Bayesian multivariate spatio-temporal interpolation method for the skewed response variable. The proposed method is applied to estimate space-time monthly precipitation in the monsoon periods during 1974-2000, and 27-year monthly average precipitation data are obtained from 51 stations in Pakistan. The results of transformed hierarchical Bayesian multivariate spatio-temporal interpolation are compared to those of non-transformed hierarchical Bayesian interpolation by using cross-validation. The software developed by [11] is used for Bayesian non-stationary multivariate space-time interpolation. It is observed that the transformed hierarchical Bayesian method provides more accuracy than the non-transformed hierarchical Bayesian method.
A Bayesian approach to earthquake source studies

NASA Astrophysics Data System (ADS)

Minson, Sarah

Bayesian sampling has several advantages over conventional optimization approaches to solving inverse problems. It produces the distribution of all possible models sampled proportionally to how much each model is consistent with the data and the specified prior information, and thus images the entire solution space, revealing the uncertainties and trade-offs in the model. Bayesian sampling is applicable to both linear and non-linear modeling, and the values of the model parameters being sampled can be constrained based on the physics of the process being studied and do not have to be regularized. However, these methods are computationally challenging for high-dimensional problems. Until now the computational expense of Bayesian sampling has been too great for it to be practicable for most geophysical problems. I present a new parallel sampling algorithm called CATMIP for Cascading Adaptive Tempered Metropolis In Parallel. This technique, based on Transitional Markov chain Monte Carlo, makes it possible to sample distributions in many hundreds of dimensions, if the forward model is fast, or to sample computationally expensive forward models in smaller numbers of dimensions. The design of the algorithm is independent of the model being sampled, so CATMIP can be applied to many areas of research. I use CATMIP to produce a finite fault source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. Surface displacements from the earthquake were recorded by six interferograms and twelve local high-rate GPS stations. Because of the wealth of near-fault data, the source process is well-constrained. I find that the near-field high-rate GPS data have significant resolving power above and beyond the slip distribution determined from static displacements. The location and magnitude of the maximum displacement are resolved. The rupture almost certainly propagated at sub-shear velocities. The full posterior distribution can be used not only to calculate source parameters but also to determine their uncertainties. So while kinematic source modeling and the estimation of source parameters is not new, with CATMIP I am able to use Bayesian sampling to determine which parts of the source process are well-constrained and which are not.
Non-Bayesian Optical Inference Machines

NASA Astrophysics Data System (ADS)

Kadar, Ivan; Eichmann, George

1987-01-01

In a recent paper, Eichmann and Caulfield) presented a preliminary exposition of optical learning machines suited for use in expert systems. In this paper, we extend the previous ideas by introducing learning as a means of reinforcement by information gathering and reasoning with uncertainty in a non-Bayesian framework2. More specifically, the non-Bayesian approach allows the representation of total ignorance (not knowing) as opposed to assuming equally likely prior distributions.
A geometric approach to non-linear correlations with intrinsic scatter

NASA Astrophysics Data System (ADS)

Pihajoki, Pauli

2017-12-01

We propose a new mathematical model for n - k-dimensional non-linear correlations with intrinsic scatter in n-dimensional data. The model is based on Riemannian geometry and is naturally symmetric with respect to the measured variables and invariant under coordinate transformations. We combine the model with a Bayesian approach for estimating the parameters of the correlation relation and the intrinsic scatter. A side benefit of the approach is that censored and truncated data sets and independent, arbitrary measurement errors can be incorporated. We also derive analytic likelihoods for the typical astrophysical use case of linear relations in n-dimensional Euclidean space. We pay particular attention to the case of linear regression in two dimensions and compare our results to existing methods. Finally, we apply our methodology to the well-known MBH-σ correlation between the mass of a supermassive black hole in the centre of a galactic bulge and the corresponding bulge velocity dispersion. The main result of our analysis is that the most likely slope of this correlation is ∼6 for the data sets used, rather than the values in the range of ∼4-5 typically quoted in the literature for these data.
Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 2: Sensitivity tests and results

PubMed Central

Norris, Peter M.; da Silva, Arlindo M.

2018-01-01

Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational–Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by better honouring inversion structures in the background state. PMID:29618848
Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 2: Sensitivity Tests and Results

NASA Technical Reports Server (NTRS)

Norris, Peter M.; da Silva, Arlindo M.

2016-01-01

Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by better honouring inversion structures in the background state.
Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 2: Sensitivity tests and results.

PubMed

Norris, Peter M; da Silva, Arlindo M

2016-07-01

Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by better honouring inversion structures in the background state.
The network adjustment aimed for the campaigned gravity survey using a Bayesian approach: methodology and model test

NASA Astrophysics Data System (ADS)

Chen, Shi; Liao, Xu; Ma, Hongsheng; Zhou, Longquan; Wang, Xingzhou; Zhuang, Jiancang

2017-04-01

The relative gravimeter, which generally uses zero-length springs as the gravity senor, is still as the first choice in the field of terrestrial gravity measurement because of its efficiency and low-cost. Because the drift rate of instrument can be changed with the time and meter, it is necessary for estimating the drift rate to back to the base or known gravity value stations for repeated measurement at regular hour's interval during the practical survey. However, the campaigned gravity survey for the large-scale region, which the distance of stations is far away from serval or tens kilometers, the frequent back to close measurement will highly reduce the gravity survey efficiency and extremely time-consuming. In this paper, we proposed a new gravity data adjustment method for estimating the meter drift by means of Bayesian statistical interference. In our approach, we assumed the change of drift rate is a smooth function depend on the time-lapse. The trade-off parameters were be used to control the fitting residuals. We employed the Akaike's Bayesian Information Criterion (ABIC) for the estimated these trade-off parameters. The comparison and analysis of simulated data between the classical and Bayesian adjustment show that our method is robust and has self-adaptive ability for facing to the unregularly non-linear meter drift. At last, we used this novel approach to process the realistic campaigned gravity data at the North China. Our adjustment method is suitable to recover the time-varied drift rate function of each meter, and also to detect the meter abnormal drift during the gravity survey. We also defined an alternative error estimation for the inversed gravity value at the each station on the basis of the marginal distribution theory. Acknowledgment: This research is supported by Science Foundation Institute of Geophysics, CEA from the Ministry of Science and Technology of China (Nos. DQJB16A05; DQJB16B07), China National Special Fund for Earthquake Scientific Research in Public Interest (Nos. 201508006; 201508009).
Public opinion by a poll process: model study and Bayesian view

NASA Astrophysics Data System (ADS)

Lee, Hyun Keun; Kim, Yong Woon

2018-05-01

We study the formation of public opinion in a poll process where the current score is open to the public. The voters are assumed to vote probabilistically for or against their own preference considering the group opinion collected up to then in the score. The poll-score probability is found to follow the beta distribution in the large polls limit. We demonstrate that various poll results, even those contradictory to the population preference, are possible with non-zero probability density and that such deviations are readily triggered by initial bias. It is mentioned that our poll model can be understood in the Bayesian viewpoint.
Improving Bayesian credibility intervals for classifier error rates using maximum entropy empirical priors.

PubMed

Gustafsson, Mats G; Wallman, Mikael; Wickenberg Bolin, Ulrika; Göransson, Hanna; Fryknäs, M; Andersson, Claes R; Isaksson, Anders

2010-06-01

Successful use of classifiers that learn to make decisions from a set of patient examples require robust methods for performance estimation. Recently many promising approaches for determination of an upper bound for the error rate of a single classifier have been reported but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real world applications where the test set sizes are less than a few hundred. The source of this problem is that fact that the CI is determined exclusively by the result on the test examples. In other words, there is no information at all provided by the uniform prior density distribution employed which reflects complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to study a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice. It is demonstrated how a refined non-uniform prior density distribution can be obtained by means of the ME principle using empirical results from a few designs and tests using non-overlapping sets of examples. Experimental results show that ME based priors improve the CIs when employed to four quite different simulated and two real world data sets. An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier. Copyright 2010 Elsevier B.V. All rights reserved.
A fully Bayesian before-after analysis of permeable friction course (PFC) pavement wet weather safety.

PubMed

Buddhavarapu, Prasad; Smit, Andre F; Prozzi, Jorge A

2015-07-01

Permeable friction course (PFC), a porous hot-mix asphalt, is typically applied to improve wet weather safety on high-speed roadways in Texas. In order to warrant expensive PFC construction, a statistical evaluation of its safety benefits is essential. Generally, the literature on the effectiveness of porous mixes in reducing wet-weather crashes is limited and often inconclusive. In this study, the safety effectiveness of PFC was evaluated using a fully Bayesian before-after safety analysis. First, two groups of road segments overlaid with PFC and non-PFC material were identified across Texas; the non-PFC or reference road segments selected were similar to their PFC counterparts in terms of site specific features. Second, a negative binomial data generating process was assumed to model the underlying distribution of crash counts of PFC and reference road segments to perform Bayesian inference on the safety effectiveness. A data-augmentation based computationally efficient algorithm was employed for a fully Bayesian estimation. The statistical analysis shows that PFC is not effective in reducing wet weather crashes. It should be noted that the findings of this study are in agreement with the existing literature, although these studies were not based on a fully Bayesian statistical analysis. Our study suggests that the safety effectiveness of PFC road surfaces, or any other safety infrastructure, largely relies on its interrelationship with the road user. The results suggest that the safety infrastructure must be properly used to reap the benefits of the substantial investments. Copyright © 2015 Elsevier Ltd. All rights reserved.
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.

PubMed

Duarte, Belmiro P M; Wong, Weng Kee

2015-08-01

This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach

PubMed Central

Duarte, Belmiro P. M.; Wong, Weng Kee

2014-01-01

Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159
Effective connectivity between superior temporal gyrus and Heschl's gyrus during white noise listening: linear versus non-linear models.

PubMed

Hamid, Ka; Yusoff, An; Rahman, Mza; Mohamad, M; Hamid, Aia

2012-04-01

This fMRI study is about modelling the effective connectivity between Heschl's gyrus (HG) and the superior temporal gyrus (STG) in human primary auditory cortices. MATERIALS #ENTITYSTARTX00026; Ten healthy male participants were required to listen to white noise stimuli during functional magnetic resonance imaging (fMRI) scans. Statistical parametric mapping (SPM) was used to generate individual and group brain activation maps. For input region determination, two intrinsic connectivity models comprising bilateral HG and STG were constructed using dynamic causal modelling (DCM). The models were estimated and inferred using DCM while Bayesian Model Selection (BMS) for group studies was used for model comparison and selection. Based on the winning model, six linear and six non-linear causal models were derived and were again estimated, inferred, and compared to obtain a model that best represents the effective connectivity between HG and the STG, balancing accuracy and complexity. Group results indicated significant asymmetrical activation (p(uncorr) < 0.001) in bilateral HG and STG. Model comparison results showed strong evidence of STG as the input centre. The winning model is preferred by 6 out of 10 participants. The results were supported by BMS results for group studies with the expected posterior probability, r = 0.7830 and exceedance probability, ϕ = 0.9823. One-sample t-tests performed on connection values obtained from the winning model indicated that the valid connections for the winning model are the unidirectional parallel connections from STG to bilateral HG (p < 0.05). Subsequent model comparison between linear and non-linear models using BMS prefers non-linear connection (r = 0.9160, ϕ = 1.000) from which the connectivity between STG and the ipsi- and contralateral HG is gated by the activity in STG itself. We are able to demonstrate that the effective connectivity between HG and STG while listening to white noise for the respective participants can be explained by a non-linear dynamic causal model with the activity in STG influencing the STG-HG connectivity non-linearly.

Massive parallelization of serial inference algorithms for a complex generalized linear model

PubMed Central

Suchard, Marc A.; Simpson, Shawn E.; Zorych, Ivan; Ryan, Patrick; Madigan, David

2014-01-01

Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety. PMID:25328363
The NIFTy way of Bayesian signal inference

NASA Astrophysics Data System (ADS)

Selig, Marco

2014-12-01

We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
An introduction to using Bayesian linear regression with clinical data.

PubMed

Baldwin, Scott A; Larson, Michael J

2017-11-01

Statistical training psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well has how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Bayes linear Bayes method for estimation of correlated event rates.

PubMed

Quigley, John; Wilson, Kevin J; Walls, Lesley; Bedford, Tim

2013-12-01

Typically, full Bayesian estimation of correlated event rates can be computationally challenging since estimators are intractable. When estimation of event rates represents one activity within a larger modeling process, there is an incentive to develop more efficient inference than provided by a full Bayesian model. We develop a new subjective inference method for correlated event rates based on a Bayes linear Bayes model under the assumption that events are generated from a homogeneous Poisson process. To reduce the elicitation burden we introduce homogenization factors to the model and, as an alternative to a subjective prior, an empirical method using the method of moments is developed. Inference under the new method is compared against estimates obtained under a full Bayesian model, which takes a multivariate gamma prior, where the predictive and posterior distributions are derived in terms of well-known functions. The mathematical properties of both models are presented. A simulation study shows that the Bayes linear Bayes inference method and the full Bayesian model provide equally reliable estimates. An illustrative example, motivated by a problem of estimating correlated event rates across different users in a simple supply chain, shows how ignoring the correlation leads to biased estimation of event rates. © 2013 Society for Risk Analysis.
BATSE gamma-ray burst line search. 2: Bayesian consistency methodology

NASA Technical Reports Server (NTRS)

Band, D. L.; Ford, L. A.; Matteson, J. L.; Briggs, M.; Paciesas, W.; Pendleton, G.; Preece, R.; Palmer, D.; Teegarden, B.; Schaefer, B.

1994-01-01

We describe a Bayesian methodology to evaluate the consistency between the reported Ginga and Burst and Transient Source Experiment (BATSE) detections of absorption features in gamma-ray burst spectra. Currently no features have been detected by BATSE, but this methodology will still be applicable if and when such features are discovered. The Bayesian methodology permits the comparison of hypotheses regarding the two detectors' observations and makes explicit the subjective aspects of our analysis (e.g., the quantification of our confidence in detector performance). We also present non-Bayesian consistency statistics. Based on preliminary calculations of line detectability, we find that both the Bayesian and non-Bayesian techniques show that the BATSE and Ginga observations are consistent given our understanding of these detectors.
Modelling lactation curve for milk fat to protein ratio in Iranian buffaloes (Bubalus bubalis) using non-linear mixed models.

PubMed

Hossein-Zadeh, Navid Ghavi

2016-08-01

The aim of this study was to compare seven non-linear mathematical models (Brody, Wood, Dhanoa, Sikka, Nelder, Rook and Dijkstra) to examine their efficiency in describing the lactation curves for milk fat to protein ratio (FPR) in Iranian buffaloes. Data were 43 818 test-day records for FPR from the first three lactations of Iranian buffaloes which were collected on 523 dairy herds in the period from 1996 to 2012 by the Animal Breeding Center of Iran. Each model was fitted to monthly FPR records of buffaloes using the non-linear mixed model procedure (PROC NLMIXED) in SAS and the parameters were estimated. The models were tested for goodness of fit using Akaike's information criterion (AIC), Bayesian information criterion (BIC) and log maximum likelihood (-2 Log L). The Nelder and Sikka mixed models provided the best fit of lactation curve for FPR in the first and second lactations of Iranian buffaloes, respectively. However, Wood, Dhanoa and Sikka mixed models provided the best fit of lactation curve for FPR in the third parity buffaloes. Evaluation of first, second and third lactation features showed that all models, except for Dijkstra model in the third lactation, under-predicted test time at which daily FPR was minimum. On the other hand, minimum FPR was over-predicted by all equations. Evaluation of the different models used in this study indicated that non-linear mixed models were sufficient for fitting test-day FPR records of Iranian buffaloes.
Approximate Bayesian computation for forward modeling in cosmology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Akeret, Joël; Refregier, Alexandre; Amara, Adam

Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to themore » posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.« less
Robust cue integration: a Bayesian model and evidence from cue-conflict studies with stereoscopic and figure cues to slant.

PubMed

Knill, David C

2007-05-23

Most research on depth cue integration has focused on stimulus regimes in which stimuli contain the small cue conflicts that one might expect to normally arise from sensory noise. In these regimes, linear models for cue integration provide a good approximation to system performance. This article focuses on situations in which large cue conflicts can naturally occur in stimuli. We describe a Bayesian model for nonlinear cue integration that makes rational inferences about scenes across the entire range of possible cue conflicts. The model derives from the simple intuition that multiple properties of scenes or causal factors give rise to the image information associated with most cues. To make perceptual inferences about one property of a scene, an ideal observer must necessarily take into account the possible contribution of these other factors to the information provided by a cue. In the context of classical depth cues, large cue conflicts most commonly arise when one or another cue is generated by an object or scene that violates the strongest form of constraint that makes the cue informative. For example, when binocularly viewing a slanted trapezoid, the slant interpretation of the figure derived by assuming that the figure is rectangular may conflict greatly with the slant suggested by stereoscopic disparities. An optimal Bayesian estimator incorporates the possibility that different constraints might apply to objects in the world and robustly integrates cues with large conflicts by effectively switching between different internal models of the prior constraints underlying one or both cues. We performed two experiments to test the predictions of the model when applied to estimating surface slant from binocular disparities and the compression cue (the aspect ratio of figures in an image). The apparent weight that subjects gave to the compression cue decreased smoothly as a function of the conflict between the cues but did not shrink to zero; that is, subjects did not fully veto the compression cue at large cue conflicts. A Bayesian model that assumes a mixed prior distribution of figure shapes in the world, with a large proportion being very regular and a smaller proportion having random shapes, provides a good quantitative fit for subjects' performance. The best fitting model parameters are consistent with the sensory noise to be expected in measurements of figure shape, further supporting the Bayesian model as an account of robust cue integration.
Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models

NASA Astrophysics Data System (ADS)

Boudineau, Mégane; Carfantan, Hervé; Bourguignon, Sébastien; Bazot, Michael

2016-06-01

We address the sparse approximation problem in the case where the data are approximated by the linear combination of a small number of elementary signals, each of these signals depending non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov Chain Monte-Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build an hybrid Hastings-within-Gibbs algorithm in order to account for the nonlinear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase of the computational cost per iteration, consequently reducing the global cost of the estimation procedure.
Non-linear effects of soda taxes on consumption and weight outcomes.

PubMed

Fletcher, Jason M; Frisvold, David E; Tefft, Nathan

2015-05-01

The potential health impacts of imposing large taxes on soda to improve population health have been of interest for over a decade. As estimates of the effects of existing soda taxes with low rates suggest little health improvements, recent proposals suggest that large taxes may be effective in reducing weight because of non-linear consumption responses or threshold effects. This paper tests this hypothesis in two ways. First, we estimate non-linear effects of taxes using the range of current rates. Second, we leverage the sudden, relatively large soda tax increase in two states during the early 1990s combined with new synthetic control methods useful for comparative case studies. Our findings suggest virtually no evidence of non-linear or threshold effects. Copyright © 2014 John Wiley & Sons, Ltd.
Functional Multi-Locus QTL Mapping of Temporal Trends in Scots Pine Wood Traits

PubMed Central

Li, Zitong; Hallingbäck, Henrik R.; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J.; García-Gil, M. Rosario

2014-01-01

Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. PMID:25305041
Functional multi-locus QTL mapping of temporal trends in Scots pine wood traits.

PubMed

Li, Zitong; Hallingbäck, Henrik R; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J; García-Gil, M Rosario

2014-10-09

Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. Copyright © 2014 Li et al.
Fast Bayesian experimental design: Laplace-based importance sampling for the expected information gain

NASA Astrophysics Data System (ADS)

Beck, Joakim; Dia, Ben Mansour; Espath, Luis F. R.; Long, Quan; Tempone, Raúl

2018-06-01

In calculating expected information gain in optimal Bayesian experimental design, the computation of the inner loop in the classical double-loop Monte Carlo requires a large number of samples and suffers from underflow if the number of samples is small. These drawbacks can be avoided by using an importance sampling approach. We present a computationally efficient method for optimal Bayesian experimental design that introduces importance sampling based on the Laplace method to the inner loop. We derive the optimal values for the method parameters in which the average computational cost is minimized according to the desired error tolerance. We use three numerical examples to demonstrate the computational efficiency of our method compared with the classical double-loop Monte Carlo, and a more recent single-loop Monte Carlo method that uses the Laplace method as an approximation of the return value of the inner loop. The first example is a scalar problem that is linear in the uncertain parameter. The second example is a nonlinear scalar problem. The third example deals with the optimal sensor placement for an electrical impedance tomography experiment to recover the fiber orientation in laminate composites.
Bayesian inversion of refraction seismic traveltime data

NASA Astrophysics Data System (ADS)

Ryberg, T.; Haberland, Ch

2018-03-01

We apply a Bayesian Markov chain Monte Carlo (McMC) formalism to the inversion of refraction seismic, traveltime data sets to derive 2-D velocity models below linear arrays (i.e. profiles) of sources and seismic receivers. Typical refraction data sets, especially when using the far-offset observations, are known as having experimental geometries which are very poor, highly ill-posed and far from being ideal. As a consequence, the structural resolution quickly degrades with depth. Conventional inversion techniques, based on regularization, potentially suffer from the choice of appropriate inversion parameters (i.e. number and distribution of cells, starting velocity models, damping and smoothing constraints, data noise level, etc.) and only local model space exploration. McMC techniques are used for exhaustive sampling of the model space without the need of prior knowledge (or assumptions) of inversion parameters, resulting in a large number of models fitting the observations. Statistical analysis of these models allows to derive an average (reference) solution and its standard deviation, thus providing uncertainty estimates of the inversion result. The highly non-linear character of the inversion problem, mainly caused by the experiment geometry, does not allow to derive a reference solution and error map by a simply averaging procedure. We present a modified averaging technique, which excludes parts of the prior distribution in the posterior values due to poor ray coverage, thus providing reliable estimates of inversion model properties even in those parts of the models. The model is discretized by a set of Voronoi polygons (with constant slowness cells) or a triangulated mesh (with interpolation within the triangles). Forward traveltime calculations are performed by a fast, finite-difference-based eikonal solver. The method is applied to a data set from a refraction seismic survey from Northern Namibia and compared to conventional tomography. An inversion test for a synthetic data set from a known model is also presented.
The moving-window Bayesian maximum entropy framework: estimation of PM(2.5) yearly average concentration across the contiguous United States.

PubMed

Akita, Yasuyuki; Chen, Jiu-Chiuan; Serre, Marc L

2012-09-01

Geostatistical methods are widely used in estimating long-term exposures for epidemiological studies on air pollution, despite their limited capabilities to handle spatial non-stationarity over large geographic domains and the uncertainty associated with missing monitoring data. We developed a moving-window (MW) Bayesian maximum entropy (BME) method and applied this framework to estimate fine particulate matter (PM(2.5)) yearly average concentrations over the contiguous US. The MW approach accounts for the spatial non-stationarity, while the BME method rigorously processes the uncertainty associated with data missingness in the air-monitoring system. In the cross-validation analyses conducted on a set of randomly selected complete PM(2.5) data in 2003 and on simulated data with different degrees of missing data, we demonstrate that the MW approach alone leads to at least 17.8% reduction in mean square error (MSE) in estimating the yearly PM(2.5). Moreover, the MWBME method further reduces the MSE by 8.4-43.7%, with the proportion of incomplete data increased from 18.3% to 82.0%. The MWBME approach leads to significant reductions in estimation error and thus is recommended for epidemiological studies investigating the effect of long-term exposure to PM(2.5) across large geographical domains with expected spatial non-stationarity.
New evidence and impact of electron transport non-linearities based on new perturbative inter-modulation analysis

NASA Astrophysics Data System (ADS)

van Berkel, M.; Kobayashi, T.; Igami, H.; Vandersteen, G.; Hogeweij, G. M. D.; Tanaka, K.; Tamura, N.; Zwart, H. J.; Kubo, S.; Ito, S.; Tsuchiya, H.; de Baar, M. R.; LHD Experiment Group

2017-12-01

A new methodology to analyze non-linear components in perturbative transport experiments is introduced. The methodology has been experimentally validated in the Large Helical Device for the electron heat transport channel. Electron cyclotron resonance heating with different modulation frequencies by two gyrotrons has been used to directly quantify the amplitude of the non-linear component at the inter-modulation frequencies. The measurements show significant quadratic non-linear contributions and also the absence of cubic and higher order components. The non-linear component is analyzed using the Volterra series, which is the non-linear generalization of transfer functions. This allows us to study the radial distribution of the non-linearity of the plasma and to reconstruct linear profiles where the measurements were not distorted by non-linearities. The reconstructed linear profiles are significantly different from the measured profiles, demonstrating the significant impact that non-linearity can have.
Bayesian inversion of the global present-day GIA signal uncertainty from RSL data

NASA Astrophysics Data System (ADS)

Caron, Lambert; Ivins, Erik R.; Adhikari, Surendra; Larour, Eric

2017-04-01

Various geophysical signals measured in the process of studying the present-day climate change (such as changes in the Earth gravitational potential, ocean altimery or GPS data) include a secular Glacial Isostatic Adjustment contribution that has to be corrected for. Yet, one of the current major challenges that Glacial Isostatic Adjustment modelling is currently struggling with is to accurately determine the uncertainty of the predicted present-day GIA signal. This is especially true at the global scale, where coupling between ice history and mantle rheology greatly contributes to the non-uniqueness of the solutions. Here we propose to use more than 11000 paleo sea level records to constrain a set of GIA Bayesian inversions and thoroughly explore its parameters space. We include two linearly relaxing models to represent the mantle rheology and couple them with a scalable ice history model in order to better assess the non-uniqueness of the solutions. From the resulting estimates of the Probability Density Function, we then extract maps of uncertainty affecting the present-day vertical land motion and geoid due to GIA at the global scale, and their associated expectation of the signal.
On uncertainty quantification in hydrogeology and hydrogeophysics

NASA Astrophysics Data System (ADS)

Linde, Niklas; Ginsbourger, David; Irving, James; Nobile, Fabio; Doucet, Arnaud

2017-12-01

Recent advances in sensor technologies, field methodologies, numerical modeling, and inversion approaches have contributed to unprecedented imaging of hydrogeological properties and detailed predictions at multiple temporal and spatial scales. Nevertheless, imaging results and predictions will always remain imprecise, which calls for appropriate uncertainty quantification (UQ). In this paper, we outline selected methodological developments together with pioneering UQ applications in hydrogeology and hydrogeophysics. The applied mathematics and statistics literature is not easy to penetrate and this review aims at helping hydrogeologists and hydrogeophysicists to identify suitable approaches for UQ that can be applied and further developed to their specific needs. To bypass the tremendous computational costs associated with forward UQ based on full-physics simulations, we discuss proxy-modeling strategies and multi-resolution (Multi-level Monte Carlo) methods. We consider Bayesian inversion for non-linear and non-Gaussian state-space problems and discuss how Sequential Monte Carlo may become a practical alternative. We also describe strategies to account for forward modeling errors in Bayesian inversion. Finally, we consider hydrogeophysical inversion, where petrophysical uncertainty is often ignored leading to overconfident parameter estimation. The high parameter and data dimensions encountered in hydrogeological and geophysical problems make UQ a complicated and important challenge that has only been partially addressed to date.
Using Discrete Loss Functions and Weighted Kappa for Classification: An Illustration Based on Bayesian Network Analysis

ERIC Educational Resources Information Center

Zwick, Rebecca; Lenaburg, Lubella

2009-01-01

In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
Monitoring Human Development Goals: A Straightforward (Bayesian) Methodology for Cross-National Indices

ERIC Educational Resources Information Center

Abayomi, Kobi; Pizarro, Gonzalo

2013-01-01

We offer a straightforward framework for measurement of progress, across many dimensions, using cross-national social indices, which we classify as linear combinations of multivariate country level data onto a univariate score. We suggest a Bayesian approach which yields probabilistic (confidence type) intervals for the point estimates of country…

Inference on the Genetic Basis of Eye and Skin Color in an Admixed Population via Bayesian Linear Mixed Models.

PubMed

Lloyd-Jones, Luke R; Robinson, Matthew R; Moser, Gerhard; Zeng, Jian; Beleza, Sandra; Barsh, Gregory S; Tang, Hua; Visscher, Peter M

2017-06-01

Genetic association studies in admixed populations are underrepresented in the genomics literature, with a key concern for researchers being the adequate control of spurious associations due to population structure. Linear mixed models (LMMs) are well suited for genome-wide association studies (GWAS) because they account for both population stratification and cryptic relatedness and achieve increased statistical power by jointly modeling all genotyped markers. Additionally, Bayesian LMMs allow for more flexible assumptions about the underlying distribution of genetic effects, and can concurrently estimate the proportion of phenotypic variance explained by genetic markers. Using three recently published Bayesian LMMs, Bayes R, BSLMM, and BOLT-LMM, we investigate an existing data set on eye ( n = 625) and skin ( n = 684) color from Cape Verde, an island nation off West Africa that is home to individuals with a broad range of phenotypic values for eye and skin color due to the mix of West African and European ancestry. We use simulations to demonstrate the utility of Bayesian LMMs for mapping loci and studying the genetic architecture of quantitative traits in admixed populations. The Bayesian LMMs provide evidence for two new pigmentation loci: one for eye color ( AHRR ) and one for skin color ( DDB1 ). Copyright © 2017 by the Genetics Society of America.
Unveiling saturation effects from nuclear structure function measurements at the EIC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marquet, Cyrille; Moldes, Manoel R.; Zurita, Pia

Here, we analyze the possibility of extracting a clear signal of non-linear parton saturation effects from future measurements of nuclear structure functions at the Electron–Ion Collider (EIC), in the small-x region. Our approach consists in generating pseudodata for electron-gold collisions, using the running-coupling Balitsky–Kovchegov evolution equation, and in assessing the compatibility of these saturated pseudodata with existing sets of nuclear parton distribution functions (nPDFs), extrapolated if necessary. The level of disagreement between the two is quantified by applying a Bayesian reweighting technique. This allows to infer the parton distributions needed in order to describe the pseudodata, which we find quitemore » different from the actual distributions, especially for sea quarks and gluons. This tension suggests that, should saturation effects impact the future nuclear structure function data as predicted, a successful refitting of the nPDFs may not be achievable, which would unambiguously signal the presence of non-linear effects.« less
Unveiling saturation effects from nuclear structure function measurements at the EIC

DOE PAGES

Marquet, Cyrille; Moldes, Manoel R.; Zurita, Pia

2017-07-21

Here, we analyze the possibility of extracting a clear signal of non-linear parton saturation effects from future measurements of nuclear structure functions at the Electron–Ion Collider (EIC), in the small-x region. Our approach consists in generating pseudodata for electron-gold collisions, using the running-coupling Balitsky–Kovchegov evolution equation, and in assessing the compatibility of these saturated pseudodata with existing sets of nuclear parton distribution functions (nPDFs), extrapolated if necessary. The level of disagreement between the two is quantified by applying a Bayesian reweighting technique. This allows to infer the parton distributions needed in order to describe the pseudodata, which we find quitemore » different from the actual distributions, especially for sea quarks and gluons. This tension suggests that, should saturation effects impact the future nuclear structure function data as predicted, a successful refitting of the nPDFs may not be achievable, which would unambiguously signal the presence of non-linear effects.« less
Bayesian linearized amplitude-versus-frequency inversion for quality factor and its application

NASA Astrophysics Data System (ADS)

Yang, Xinchao; Teng, Long; Li, Jingnan; Cheng, Jiubing

2018-06-01

We propose a straightforward attenuation inversion method by utilizing the amplitude-versus-frequency (AVF) characteristics of seismic data. A new linearized approximation equation of the angle and frequency dependent reflectivity in viscoelastic media is derived. We then use the presented equation to implement the Bayesian linear AVF inversion. The inversion result includes not only P-wave and S-wave velocities, and densities, but also P-wave and S-wave quality factors. Synthetic tests show that the AVF inversion surpasses the AVA inversion for quality factor estimation. However, a higher signal noise ratio (SNR) of data is necessary for the AVF inversion. To show its feasibility, we apply both the new Bayesian AVF inversion and conventional AVA inversion to a tight gas reservoir data in Sichuan Basin in China. Considering the SNR of the field data, a combination of AVF inversion for attenuation parameters and AVA inversion for elastic parameters is recommended. The result reveals that attenuation estimations could serve as a useful complement in combination with the AVA inversion results for the detection of tight gas reservoirs.
Bayesian model reduction and empirical Bayes for group (DCM) studies

PubMed Central

Friston, Karl J.; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E.; van Wijk, Bernadette C.M.; Ziegler, Gabriel; Zeidman, Peter

2016-01-01

This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level – e.g., dynamic causal models – and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
New Insights into Handling Missing Values in Environmental Epidemiological Studies

PubMed Central

Roda, Célina; Nicolis, Ioannis; Momas, Isabelle; Guihenneuc, Chantal

2014-01-01

Missing data are unavoidable in environmental epidemiologic surveys. The aim of this study was to compare methods for handling large amounts of missing values: omission of missing values, single and multiple imputations (through linear regression or partial least squares regression), and a fully Bayesian approach. These methods were applied to the PARIS birth cohort, where indoor domestic pollutant measurements were performed in a random sample of babies' dwellings. A simulation study was conducted to assess performances of different approaches with a high proportion of missing values (from 50% to 95%). Different simulation scenarios were carried out, controlling the true value of the association (odds ratio of 1.0, 1.2, and 1.4), and varying the health outcome prevalence. When a large amount of data is missing, omitting these missing data reduced statistical power and inflated standard errors, which affected the significance of the association. Single imputation underestimated the variability, and considerably increased risk of type I error. All approaches were conservative, except the Bayesian joint model. In the case of a common health outcome, the fully Bayesian approach is the most efficient approach (low root mean square error, reasonable type I error, and high statistical power). Nevertheless for a less prevalent event, the type I error is increased and the statistical power is reduced. The estimated posterior distribution of the OR is useful to refine the conclusion. Among the methods handling missing values, no approach is absolutely the best but when usual approaches (e.g. single imputation) are not sufficient, joint modelling approach of missing process and health association is more efficient when large amounts of data are missing. PMID:25226278
Prior elicitation and Bayesian analysis of the Steroids for Corneal Ulcers Trial.

PubMed

See, Craig W; Srinivasan, Muthiah; Saravanan, Somu; Oldenburg, Catherine E; Esterberg, Elizabeth J; Ray, Kathryn J; Glaser, Tanya S; Tu, Elmer Y; Zegans, Michael E; McLeod, Stephen D; Acharya, Nisha R; Lietman, Thomas M

2012-12-01

To elicit expert opinion on the use of adjunctive corticosteroid therapy in bacterial corneal ulcers. To perform a Bayesian analysis of the Steroids for Corneal Ulcers Trial (SCUT), using expert opinion as a prior probability. The SCUT was a placebo-controlled trial assessing visual outcomes in patients receiving topical corticosteroids or placebo as adjunctive therapy for bacterial keratitis. Questionnaires were conducted at scientific meetings in India and North America to gauge expert consensus on the perceived benefit of corticosteroids as adjunct treatment. Bayesian analysis, using the questionnaire data as a prior probability and the primary outcome of SCUT as a likelihood, was performed. For comparison, an additional Bayesian analysis was performed using the results of the SCUT pilot study as a prior distribution. Indian respondents believed there to be a 1.21 Snellen line improvement, and North American respondents believed there to be a 1.24 line improvement with corticosteroid therapy. The SCUT primary outcome found a non-significant 0.09 Snellen line benefit with corticosteroid treatment. The results of the Bayesian analysis estimated a slightly greater benefit than did the SCUT primary analysis (0.19 lines verses 0.09 lines). Indian and North American experts had similar expectations on the effectiveness of corticosteroids in bacterial corneal ulcers; that corticosteroids would markedly improve visual outcomes. Bayesian analysis produced results very similar to those produced by the SCUT primary analysis. The similarity in result is likely due to the large sample size of SCUT and helps validate the results of SCUT.
Final Report, DOE Early Career Award: Predictive modeling of complex physical systems: new tools for statistical inference, uncertainty quantification, and experimental design

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marzouk, Youssef

Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in the support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computional expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesianmore » inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale {\\em sequential} data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.« less
Artificial and Bayesian Neural Networks

PubMed

Korhani Kangi, Azam; Bahrampour, Abbas

2018-02-26

Introduction and purpose: In recent years the use of neural networks without any premises for investigation of prognosis in analyzing survival data has increased. Artificial neural networks (ANN) use small processors with a continuous network to solve problems inspired by the human brain. Bayesian neural networks (BNN) constitute a neural-based approach to modeling and non-linearization of complex issues using special algorithms and statistical methods. Gastric cancer incidence is the first and third ranking for men and women in Iran, respectively. The aim of the present study was to assess the value of an artificial neural network and a Bayesian neural network for modeling and predicting of probability of gastric cancer patient death. Materials and Methods: In this study, we used information on 339 patients aged from 20 to 90 years old with positive gastric cancer, referred to Afzalipoor and Shahid Bahonar Hospitals in Kerman City from 2001 to 2015. The three layers perceptron neural network (ANN) and the Bayesian neural network (BNN) were used for predicting the probability of mortality using the available data. To investigate differences between the models, sensitivity, specificity, accuracy and the area under receiver operating characteristic curves (AUROCs) were generated. Results: In this study, the sensitivity and specificity of the artificial neural network and Bayesian neural network models were 0.882, 0.903 and 0.954, 0.909, respectively. Prediction accuracy and the area under curve ROC for the two models were 0.891, 0.944 and 0.935, 0.961. The age at diagnosis of gastric cancer was most important for predicting survival, followed by tumor grade, morphology, gender, smoking history, opium consumption, receiving chemotherapy, presence of metastasis, tumor stage, receiving radiotherapy, and being resident in a village. Conclusion: The findings of the present study indicated that the Bayesian neural network is preferable to an artificial neural network for predicting survival of gastric cancer patients in Iran. Creative Commons Attribution License
Estimating mono- and bi-phasic regression parameters using a mixture piecewise linear Bayesian hierarchical model

PubMed Central

Zhao, Rui; Catalano, Paul; DeGruttola, Victor G.; Michor, Franziska

2017-01-01

The dynamics of tumor burden, secreted proteins or other biomarkers over time, is often used to evaluate the effectiveness of therapy and to predict outcomes for patients. Many methods have been proposed to investigate longitudinal trends to better characterize patients and to understand disease progression. However, most approaches assume a homogeneous patient population and a uniform response trajectory over time and across patients. Here, we present a mixture piecewise linear Bayesian hierarchical model, which takes into account both population heterogeneity and nonlinear relationships between biomarkers and time. Simulation results show that our method was able to classify subjects according to their patterns of treatment response with greater than 80% accuracy in the three scenarios tested. We then applied our model to a large randomized controlled phase III clinical trial of multiple myeloma patients. Analysis results suggest that the longitudinal tumor burden trajectories in multiple myeloma patients are heterogeneous and nonlinear, even among patients assigned to the same treatment cohort. In addition, between cohorts, there are distinct differences in terms of the regression parameters and the distributions among categories in the mixture. Those results imply that longitudinal data from clinical trials may harbor unobserved subgroups and nonlinear relationships; accounting for both may be important for analyzing longitudinal data. PMID:28723910
Bayesian multivariate hierarchical transformation models for ROC analysis.

PubMed

O'Malley, A James; Zou, Kelly H

2006-02-15

A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
Bayesian multivariate hierarchical transformation models for ROC analysis

PubMed Central

O'Malley, A. James; Zou, Kelly H.

2006-01-01

SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836
Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola

PubMed Central

Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope

2010-01-01

The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
Bayesian GGE biplot models applied to maize multi-environments trials.

PubMed

de Oliveira, L A; da Silva, C P; Nuvunga, J J; da Silva, A Q; Balestre, M

2016-06-17

The additive main effects and multiplicative interaction (AMMI) and the genotype main effects and genotype x environment interaction (GGE) models stand out among the linear-bilinear models used in genotype x environment interaction studies. Despite the advantages of their use to describe genotype x environment (AMMI) or genotype and genotype x environment (GGE) interactions, these methods have known limitations that are inherent to fixed effects models, including difficulty in treating variance heterogeneity and missing data. Traditional biplots include no measure of uncertainty regarding the principal components. The present study aimed to apply the Bayesian approach to GGE biplot models and assess the implications for selecting stable and adapted genotypes. Our results demonstrated that the Bayesian approach applied to GGE models with non-informative priors was consistent with the traditional GGE biplot analysis, although the credible region incorporated into the biplot enabled distinguishing, based on probability, the performance of genotypes, and their relationships with the environments in the biplot. Those regions also enabled the identification of groups of genotypes and environments with similar effects in terms of adaptability and stability. The relative position of genotypes and environments in biplots is highly affected by the experimental accuracy. Thus, incorporation of uncertainty in biplots is a key tool for breeders to make decisions regarding stability selection and adaptability and the definition of mega-environments.
A Development of Nonstationary Regional Frequency Analysis Model with Large-scale Climate Information: Its Application to Korean Watershed

NASA Astrophysics Data System (ADS)

Kim, Jin-Young; Kwon, Hyun-Han; Kim, Hung-Soo

2015-04-01

The existing regional frequency analysis has disadvantages in that it is difficult to consider geographical characteristics in estimating areal rainfall. In this regard, this study aims to develop a hierarchical Bayesian model based nonstationary regional frequency analysis in that spatial patterns of the design rainfall with geographical information (e.g. latitude, longitude and altitude) are explicitly incorporated. This study assumes that the parameters of Gumbel (or GEV distribution) are a function of geographical characteristics within a general linear regression framework. Posterior distribution of the regression parameters are estimated by Bayesian Markov Chain Monte Carlo (MCMC) method, and the identified functional relationship is used to spatially interpolate the parameters of the distributions by using digital elevation models (DEM) as inputs. The proposed model is applied to derive design rainfalls over the entire Han-river watershed. It was found that the proposed Bayesian regional frequency analysis model showed similar results compared to L-moment based regional frequency analysis. In addition, the model showed an advantage in terms of quantifying uncertainty of the design rainfall and estimating the area rainfall considering geographical information. Finally, comprehensive discussion on design rainfall in the context of nonstationary will be presented. KEYWORDS: Regional frequency analysis, Nonstationary, Spatial information, Bayesian Acknowledgement This research was supported by a grant (14AWMP-B082564-01) from Advanced Water Management Research Program funded by Ministry of Land, Infrastructure and Transport of Korean government.
Combination of dynamic Bayesian network classifiers for the recognition of degraded characters

NASA Astrophysics Data System (ADS)

Likforman-Sulem, Laurence; Sigelle, Marc

2009-01-01

We investigate in this paper the combination of DBN (Dynamic Bayesian Network) classifiers, either independent or coupled, for the recognition of degraded characters. The independent classifiers are a vertical HMM and a horizontal HMM whose observable outputs are the image columns and the image rows respectively. The coupled classifiers, presented in a previous study, associate the vertical and horizontal observation streams into single DBNs. The scores of the independent and coupled classifiers are then combined linearly at the decision level. We compare the different classifiers -independent, coupled or linearly combined- on two tasks: the recognition of artificially degraded handwritten digits and the recognition of real degraded old printed characters. Our results show that coupled DBNs perform better on degraded characters than the linear combination of independent HMM scores. Our results also show that the best classifier is obtained by linearly combining the scores of the best coupled DBN and the best independent HMM.
Iterative updating of model error for Bayesian inversion

NASA Astrophysics Data System (ADS)

Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew

2018-02-01

In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.
Bayesian Just-So Stories in Psychology and Neuroscience

ERIC Educational Resources Information Center

Bowers, Jeffrey S.; Davis, Colin J.

2012-01-01

According to Bayesian theories in psychology and neuroscience, minds and brains are (near) optimal in solving a wide range of tasks. We challenge this view and argue that more traditional, non-Bayesian approaches are more promising. We make 3 main arguments. First, we show that the empirical evidence for Bayesian theories in psychology is weak.…
Teaching Bayesian Statistics in a Health Research Methodology Program

ERIC Educational Resources Information Center

Pullenayegum, Eleanor M.; Thabane, Lehana

2009-01-01

Despite the appeal of Bayesian methods in health research, they are not widely used. This is partly due to a lack of courses in Bayesian methods at an appropriate level for non-statisticians in health research. Teaching such a course can be challenging because most statisticians have been taught Bayesian methods using a mathematical approach, and…
Digital receiver study and implementation

NASA Technical Reports Server (NTRS)

Fogle, D. A.; Lee, G. M.; Massey, J. C.

1972-01-01

Computer software was developed which makes it possible to use any general purpose computer with A/D conversion capability as a PSK receiver for low data rate telemetry processing. Carrier tracking, bit synchronization, and matched filter detection are all performed digitally. To aid in the implementation of optimum computer processors, a study of general digital processing techniques was performed which emphasized various techniques for digitizing general analog systems. In particular, the phase-locked loop was extensively analyzed as a typical non-linear communication element. Bayesian estimation techniques for PSK demodulation were studied. A hardware implementation of the digital Costas loop was developed.

Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks

NASA Astrophysics Data System (ADS)

Zhu, Shijia; Wang, Yadong

2015-12-01

Dynamic Bayesian Networks (DBN) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Its standard assumption is ‘stationarity’, and therefore, several research efforts have been recently proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces searching space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more stably high prediction accuracy and significantly improved computation efficiency, even with no prior knowledge and parameter settings.
A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression.

PubMed

Liu, Fang; Eugenio, Evercita C

2018-04-01

Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
Quantum-Like Representation of Non-Bayesian Inference

NASA Astrophysics Data System (ADS)

Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

2013-01-01

This research is related to the problem of "irrational decision making or inference" that have been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like coginitive models of decision making was proposed. Our previous work represented in a natural way the classical Bayesian inference in the frame work of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
The moving-window Bayesian Maximum Entropy framework: Estimation of PM2.5 yearly average concentration across the contiguous United States

PubMed Central

Akita, Yasuyuki; Chen, Jiu-Chiuan; Serre, Marc L.

2013-01-01

Geostatistical methods are widely used in estimating long-term exposures for air pollution epidemiological studies, despite their limited capabilities to handle spatial non-stationarity over large geographic domains and uncertainty associated with missing monitoring data. We developed a moving-window (MW) Bayesian Maximum Entropy (BME) method and applied this framework to estimate fine particulate matter (PM2.5) yearly average concentrations over the contiguous U.S. The MW approach accounts for the spatial non-stationarity, while the BME method rigorously processes the uncertainty associated with data missingnees in the air monitoring system. In the cross-validation analyses conducted on a set of randomly selected complete PM2.5 data in 2003 and on simulated data with different degrees of missing data, we demonstrate that the MW approach alone leads to at least 17.8% reduction in mean square error (MSE) in estimating the yearly PM2.5. Moreover, the MWBME method further reduces the MSE by 8.4% to 43.7% with the proportion of incomplete data increased from 18.3% to 82.0%. The MWBME approach leads to significant reductions in estimation error and thus is recommended for epidemiological studies investigating the effect of long-term exposure to PM2.5 across large geographical domains with expected spatial non-stationarity. PMID:22739679
Bayesian inference based on stationary Fokker-Planck sampling.

PubMed

Berrones, Arturo

2010-06-01

A novel formalism for bayesian learning in the context of complex inference models is proposed. The method is based on the use of the stationary Fokker-Planck (SFP) approach to sample from the posterior density. Stationary Fokker-Planck sampling generalizes the Gibbs sampler algorithm for arbitrary and unknown conditional densities. By the SFP procedure, approximate analytical expressions for the conditionals and marginals of the posterior can be constructed. At each stage of SFP, the approximate conditionals are used to define a Gibbs sampling process, which is convergent to the full joint posterior. By the analytical marginals efficient learning methods in the context of artificial neural networks are outlined. Offline and incremental bayesian inference and maximum likelihood estimation from the posterior are performed in classification and regression examples. A comparison of SFP with other Monte Carlo strategies in the general problem of sampling from arbitrary densities is also presented. It is shown that SFP is able to jump large low-probability regions without the need of a careful tuning of any step-size parameter. In fact, the SFP method requires only a small set of meaningful parameters that can be selected following clear, problem-independent guidelines. The computation cost of SFP, measured in terms of loss function evaluations, grows linearly with the given model's dimension.
Bayesian model reduction and empirical Bayes for group (DCM) studies.

PubMed

Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter

2016-03-01

This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Nonlinear and non-Gaussian Bayesian based handwriting beautification

NASA Astrophysics Data System (ADS)

Shi, Cao; Xiao, Jianguo; Xu, Canhui; Jia, Wenhua

2013-03-01

A framework is proposed in this paper to effectively and efficiently beautify handwriting by means of a novel nonlinear and non-Gaussian Bayesian algorithm. In the proposed framework, format and size of handwriting image are firstly normalized, and then typeface in computer system is applied to optimize vision effect of handwriting. The Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters to translate, rotate and scale typeface in computer system are controlled by state equation, and the matching optimization between handwriting and transformed typeface is employed by measurement equation. Finally, the new typeface, which is transformed from the original one and gains the best nonlinear and non-Gaussian optimization, is the beautification result of handwriting. Experimental results demonstrate the proposed framework provides a creative handwriting beautification methodology to improve visual acceptance.
TYPE Ia SUPERNOVA COLORS AND EJECTA VELOCITIES: HIERARCHICAL BAYESIAN REGRESSION WITH NON-GAUSSIAN DISTRIBUTIONS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mandel, Kaisey S.; Kirshner, Robert P.; Foley, Ryan J., E-mail: kmandel@cfa.harvard.edu

2014-12-20

We investigate the statistical dependence of the peak intrinsic colors of Type Ia supernovae (SNe Ia) on their expansion velocities at maximum light, measured from the Si II λ6355 spectral feature. We construct a new hierarchical Bayesian regression model, accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust, and implement a Gibbs sampler and deviance information criteria to estimate the correlation. The method is applied to the apparent colors from BVRI light curves and Si II velocity data for 79 nearby SNe Ia. The apparent color distributions of high-velocity (HV) and normal velocitymore » (NV) supernovae exhibit significant discrepancies for B – V and B – R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B band, rather than dust reddening. The mean intrinsic B – V and B – R color differences between HV and NV groups are 0.06 ± 0.02 and 0.09 ± 0.02 mag, respectively. A linear model finds significant slopes of –0.021 ± 0.006 and –0.030 ± 0.009 mag (10{sup 3} km s{sup –1}){sup –1} for intrinsic B – V and B – R colors versus velocity, respectively. Because the ejecta velocity distribution is skewed toward high velocities, these effects imply non-Gaussian intrinsic color distributions with skewness up to +0.3. Accounting for the intrinsic-color-velocity correlation results in corrections to A{sub V} extinction estimates as large as –0.12 mag for HV SNe Ia and +0.06 mag for NV events. Velocity measurements from SN Ia spectra have the potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances.« less
astroABC : An Approximate Bayesian Computation Sequential Monte Carlo sampler for cosmological parameter estimation

NASA Astrophysics Data System (ADS)

Jennings, E.; Madigan, M.

2017-04-01

Given the complexity of modern cosmological parameter inference where we are faced with non-Gaussian data and noise, correlated systematics and multi-probe correlated datasets,the Approximate Bayesian Computation (ABC) method is a promising alternative to traditional Markov Chain Monte Carlo approaches in the case where the Likelihood is intractable or unknown. The ABC method is called "Likelihood free" as it avoids explicit evaluation of the Likelihood by using a forward model simulation of the data which can include systematics. We introduce astroABC, an open source ABC Sequential Monte Carlo (SMC) sampler for parameter estimation. A key challenge in astrophysics is the efficient use of large multi-probe datasets to constrain high dimensional, possibly correlated parameter spaces. With this in mind astroABC allows for massive parallelization using MPI, a framework that handles spawning of processes across multiple nodes. A key new feature of astroABC is the ability to create MPI groups with different communicators, one for the sampler and several others for the forward model simulation, which speeds up sampling time considerably. For smaller jobs the Python multiprocessing option is also available. Other key features of this new sampler include: a Sequential Monte Carlo sampler; a method for iteratively adapting tolerance levels; local covariance estimate using scikit-learn's KDTree; modules for specifying optimal covariance matrix for a component-wise or multivariate normal perturbation kernel and a weighted covariance metric; restart files output frequently so an interrupted sampling run can be resumed at any iteration; output and restart files are backed up at every iteration; user defined distance metric and simulation methods; a module for specifying heterogeneous parameter priors including non-standard prior PDFs; a module for specifying a constant, linear, log or exponential tolerance level; well-documented examples and sample scripts. This code is hosted online at https://github.com/EliseJ/astroABC.
Population response to climate change: linear vs. non-linear modeling approaches.

PubMed

Ellis, Alicia M; Post, Eric

2004-03-31

Research on the ecological consequences of global climate change has elicited a growing interest in the use of time series analysis to investigate population dynamics in a changing climate. Here, we compare linear and non-linear models describing the contribution of climate to the density fluctuations of the population of wolves on Isle Royale, Michigan from 1959 to 1999. The non-linear self excitatory threshold autoregressive (SETAR) model revealed that, due to differences in the strength and nature of density dependence, relatively small and large populations may be differentially affected by future changes in climate. Both linear and non-linear models predict a decrease in the population of wolves with predicted changes in climate. Because specific predictions differed between linear and non-linear models, our study highlights the importance of using non-linear methods that allow the detection of non-linearity in the strength and nature of density dependence. Failure to adopt a non-linear approach to modelling population response to climate change, either exclusively or in addition to linear approaches, may compromise efforts to quantify ecological consequences of future warming.
Bayesian estimation inherent in a Mexican-hat-type neural network

NASA Astrophysics Data System (ADS)

Takiyama, Ken

2016-05-01

Brain functions, such as perception, motor control and learning, and decision making, have been explained based on a Bayesian framework, i.e., to decrease the effects of noise inherent in the human nervous system or external environment, our brain integrates sensory and a priori information in a Bayesian optimal manner. However, it remains unclear how Bayesian computations are implemented in the brain. Herein, I address this issue by analyzing a Mexican-hat-type neural network, which was used as a model of the visual cortex, motor cortex, and prefrontal cortex. I analytically demonstrate that the dynamics of an order parameter in the model corresponds exactly to a variational inference of a linear Gaussian state-space model, a Bayesian estimation, when the strength of recurrent synaptic connectivity is appropriately stronger than that of an external stimulus, a plausible condition in the brain. This exact correspondence can reveal the relationship between the parameters in the Bayesian estimation and those in the neural network, providing insight for understanding brain functions.
Fast and local non-linear evolution of steep wave-groups on deep water: A comparison of approximate models to fully non-linear simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Adcock, T. A. A.; Taylor, P. H.

2016-01-15

The non-linear Schrödinger equation and its higher order extensions are routinely used for analysis of extreme ocean waves. This paper compares the evolution of individual wave-packets modelled using non-linear Schrödinger type equations with packets modelled using fully non-linear potential flow models. The modified non-linear Schrödinger Equation accurately models the relatively large scale non-linear changes to the shape of wave-groups, with a dramatic contraction of the group along the mean propagation direction and a corresponding extension of the width of the wave-crests. In addition, as extreme wave form, there is a local non-linear contraction of the wave-group around the crest whichmore » leads to a localised broadening of the wave spectrum which the bandwidth limited non-linear Schrödinger Equations struggle to capture. This limitation occurs for waves of moderate steepness and a narrow underlying spectrum.« less
Prior Elicitation and Bayesian Analysis of the Steroids for Corneal Ulcers Trial

PubMed Central

See, Craig W.; Srinivasan, Muthiah; Saravanan, Somu; Oldenburg, Catherine E.; Esterberg, Elizabeth J.; Ray, Kathryn J.; Glaser, Tanya S.; Tu, Elmer Y.; Zegans, Michael E.; McLeod, Stephen D.; Acharya, Nisha R.; Lietman, Thomas M.

2013-01-01

Purpose To elicit expert opinion on the use of adjunctive corticosteroid therapy in bacterial corneal ulcers. To perform a Bayesian analysis of the Steroids for Corneal Ulcers Trial (SCUT), using expert opinion as a prior probability. Methods The SCUT was a placebo-controlled trial assessing visual outcomes in patients receiving topical corticosteroids or placebo as adjunctive therapy for bacterial keratitis. Questionnaires were conducted at scientific meetings in India and North America to gauge expert consensus on the perceived benefit of corticosteroids as adjunct treatment. Bayesian analysis, using the questionnaire data as a prior probability and the primary outcome of SCUT as a likelihood, was performed. For comparison, an additional Bayesian analysis was performed using the results of the SCUT pilot study as a prior distribution. Results Indian respondents believed there to be a 1.21 Snellen line improvement, and North American respondents believed there to be a 1.24 line improvement with corticosteroid therapy. The SCUT primary outcome found a non-significant 0.09 Snellen line benefit with corticosteroid treatment. The results of the Bayesian analysis estimated a slightly greater benefit than did the SCUT primary analysis (0.19 lines verses 0.09 lines). Conclusion Indian and North American experts had similar expectations on the effectiveness of corticosteroids in bacterial corneal ulcers; that corticosteroids would markedly improve visual outcomes. Bayesian analysis produced results very similar to those produced by the SCUT primary analysis. The similarity in result is likely due to the large sample size of SCUT and helps validate the results of SCUT. PMID:23171211
HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python.

PubMed

Wiecki, Thomas V; Sofer, Imri; Frank, Michael J

2013-01-01

The diffusion model is a commonly used tool to infer latent psychological processes underlying decision-making, and to link them to neural mechanisms based on response times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of response time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model), which allows fast and flexible estimation of the the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject/condition than non-hierarchical methods, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g., fMRI) influence decision-making parameters. This paper will first describe the theoretical background of the drift diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the χ(2)-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs/
Multiplicative Forests for Continuous-Time Processes

PubMed Central

Weiss, Jeremy C.; Natarajan, Sriraam; Page, David

2013-01-01

Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability. PMID:25284967
Multiplicative Forests for Continuous-Time Processes.

PubMed

Weiss, Jeremy C; Natarajan, Sriraam; Page, David

2012-01-01

Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability.
Accurate Biomass Estimation via Bayesian Adaptive Sampling

NASA Technical Reports Server (NTRS)

Wheeler, Kevin R.; Knuth, Kevin H.; Castle, Joseph P.; Lvov, Nikolay

2005-01-01

The following concepts were introduced: a) Bayesian adaptive sampling for solving biomass estimation; b) Characterization of MISR Rahman model parameters conditioned upon MODIS landcover. c) Rigorous non-parametric Bayesian approach to analytic mixture model determination. d) Unique U.S. asset for science product validation and verification.
Bayesian Statistics for Biological Data: Pedigree Analysis

ERIC Educational Resources Information Center

Stanfield, William D.; Carlton, Matthew A.

2004-01-01

The use of Bayes' formula is applied to the biological problem of pedigree analysis to show that the Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First year college students of biology can be introduced to the Bayesian statistics.
Predicting the multi-domain progression of Parkinson's disease: a Bayesian multivariate generalized linear mixed-effect model.

PubMed

Wang, Ming; Li, Zheng; Lee, Eun Young; Lewis, Mechelle M; Zhang, Lijun; Sterling, Nicholas W; Wagner, Daymond; Eslinger, Paul; Du, Guangwei; Huang, Xuemei

2017-09-25

It is challenging for current statistical models to predict clinical progression of Parkinson's disease (PD) because of the involvement of multi-domains and longitudinal data. Past univariate longitudinal or multivariate analyses from cross-sectional trials have limited power to predict individual outcomes or a single moment. The multivariate generalized linear mixed-effect model (GLMM) under the Bayesian framework was proposed to study multi-domain longitudinal outcomes obtained at baseline, 18-, and 36-month. The outcomes included motor, non-motor, and postural instability scores from the MDS-UPDRS, and demographic and standardized clinical data were utilized as covariates. The dynamic prediction was performed for both internal and external subjects using the samples from the posterior distributions of the parameter estimates and random effects, and also the predictive accuracy was evaluated based on the root of mean square error (RMSE), absolute bias (AB) and the area under the receiver operating characteristic (ROC) curve. First, our prediction model identified clinical data that were differentially associated with motor, non-motor, and postural stability scores. Second, the predictive accuracy of our model for the training data was assessed, and improved prediction was gained in particularly for non-motor (RMSE and AB: 2.89 and 2.20) compared to univariate analysis (RMSE and AB: 3.04 and 2.35). Third, the individual-level predictions of longitudinal trajectories for the testing data were performed, with ~80% observed values falling within the 95% credible intervals. Multivariate general mixed models hold promise to predict clinical progression of individual outcomes in PD. The data was obtained from Dr. Xuemei Huang's NIH grant R01 NS060722 , part of NINDS PD Biomarker Program (PDBP). All data was entered within 24 h of collection to the Data Management Repository (DMR), which is publically available ( https://pdbp.ninds.nih.gov/data-management ).
The ambient dose equivalent at flight altitudes: a fit to a large set of data using a Bayesian approach.

PubMed

Wissmann, F; Reginatto, M; Möller, T

2010-09-01

The problem of finding a simple, generally applicable description of worldwide measured ambient dose equivalent rates at aviation altitudes between 8 and 12 km is difficult to solve due to the large variety of functional forms and parametrisations that are possible. We present an approach that uses Bayesian statistics and Monte Carlo methods to fit mathematical models to a large set of data and to compare the different models. About 2500 data points measured in the periods 1997-1999 and 2003-2006 were used. Since the data cover wide ranges of barometric altitude, vertical cut-off rigidity and phases in the solar cycle 23, we developed functions which depend on these three variables. Whereas the dependence on the vertical cut-off rigidity is described by an exponential, the dependences on barometric altitude and solar activity may be approximated by linear functions in the ranges under consideration. Therefore, a simple Taylor expansion was used to define different models and to investigate the relevance of the different expansion coefficients. With the method presented here, it is possible to obtain probability distributions for each expansion coefficient and thus to extract reliable uncertainties even for the dose rate evaluated. The resulting function agrees well with new measurements made at fixed geographic positions and during long haul flights covering a wide range of latitudes.

A Novel Approach of Understanding and Incorporating Error of Chemical Transport Models into a Geostatistical Framework

NASA Astrophysics Data System (ADS)

Reyes, J.; Vizuete, W.; Serre, M. L.; Xu, Y.

2015-12-01

The EPA employs a vast monitoring network to measure ambient PM2.5 concentrations across the United States with one of its goals being to quantify exposure within the population. However, there are several areas of the country with sparse monitoring spatially and temporally. One means to fill in these monitoring gaps is to use PM2.5 modeled estimates from Chemical Transport Models (CTMs) specifically the Community Multi-scale Air Quality (CMAQ) model. CMAQ is able to provide complete spatial coverage but is subject to systematic and random error due to model uncertainty. Due to the deterministic nature of CMAQ, often these uncertainties are not quantified. Much effort is employed to quantify the efficacy of these models through different metrics of model performance. Currently evaluation is specific to only locations with observed data. Multiyear studies across the United States are challenging because the error and model performance of CMAQ are not uniform over such large space/time domains. Error changes regionally and temporally. Because of the complex mix of species that constitute PM2.5, CMAQ error is also a function of increasing PM2.5 concentration. To address this issue we introduce a model performance evaluation for PM2.5 CMAQ that is regionalized and non-linear. This model performance evaluation leads to error quantification for each CMAQ grid. Areas and time periods of error being better qualified. The regionalized error correction approach is non-linear and is therefore more flexible at characterizing model performance than approaches that rely on linearity assumptions and assume homoscedasticity of CMAQ predictions errors. Corrected CMAQ data are then incorporated into the modern geostatistical framework of Bayesian Maximum Entropy (BME). Through cross validation it is shown that incorporating error-corrected CMAQ data leads to more accurate estimates than just using observed data by themselves.
Bayesian models: A statistical primer for ecologists

USGS Publications Warehouse

Hobbs, N. Thompson; Hooten, Mevin B.

2015-01-01

Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
A three-step Maximum-A-Posterior probability method for InSAR data inversion of coseismic rupture with application to four recent large earthquakes in Asia

NASA Astrophysics Data System (ADS)

Sun, J.; Shen, Z.; Burgmann, R.; Liang, F.

2012-12-01

We develop a three-step Maximum-A-Posterior probability (MAP) method for coseismic rupture inversion, which aims at maximizing the a posterior probability density function (PDF) of elastic solutions of earthquake rupture. The method originates from the Fully Bayesian Inversion (FBI) and the Mixed linear-nonlinear Bayesian inversion (MBI) methods , shares the same a posterior PDF with them and keeps most of their merits, while overcoming its convergence difficulty when large numbers of low quality data are used and improving the convergence rate greatly using optimization procedures. A highly efficient global optimization algorithm, Adaptive Simulated Annealing (ASA), is used to search for the maximum posterior probability in the first step. The non-slip parameters are determined by the global optimization method, and the slip parameters are inverted for using the least squares method without positivity constraint initially, and then damped to physically reasonable range. This step MAP inversion brings the inversion close to 'true' solution quickly and jumps over local maximum regions in high-dimensional parameter space. The second step inversion approaches the 'true' solution further with positivity constraints subsequently applied on slip parameters using the Monte Carlo Inversion (MCI) technique, with all parameters obtained from step one as the initial solution. Then the slip artifacts are eliminated from slip models in the third step MAP inversion with fault geometry parameters fixed. We first used a designed model with 45 degree dipping angle and oblique slip, and corresponding synthetic InSAR data sets to validate the efficiency and accuracy of method. We then applied the method on four recent large earthquakes in Asia, namely the 2010 Yushu, China earthquake, the 2011 Burma earthquake, the 2011 New Zealand earthquake and the 2008 Qinghai, China earthquake, and compared our results with those results from other groups. Our results show the effectiveness of the method in earthquake studies and a number of advantages of it over other methods. The details will be reported on the meeting.
Large-scale dynamo action precedes turbulence in shearing box simulations of the magnetorotational instability

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhat, Pallavi; Ebrahimi, Fatima; Blackman, Eric G.

Here, we study the dynamo generation (exponential growth) of large-scale (planar averaged) fields in unstratified shearing box simulations of the magnetorotational instability (MRI). In contrast to previous studies restricted to horizontal (x–y) averaging, we also demonstrate the presence of large-scale fields when vertical (y–z) averaging is employed instead. By computing space–time planar averaged fields and power spectra, we find large-scale dynamo action in the early MRI growth phase – a previously unidentified feature. Non-axisymmetric linear MRI modes with low horizontal wavenumbers and vertical wavenumbers near that of expected maximal growth, amplify the large-scale fields exponentially before turbulence and high wavenumbermore » fluctuations arise. Thus the large-scale dynamo requires only linear fluctuations but not non-linear turbulence (as defined by mode–mode coupling). Vertical averaging also allows for monitoring the evolution of the large-scale vertical field and we find that a feedback from horizontal low wavenumber MRI modes provides a clue as to why the large-scale vertical field sustains against turbulent diffusion in the non-linear saturation regime. We compute the terms in the mean field equations to identify the individual contributions to large-scale field growth for both types of averaging. The large-scale fields obtained from vertical averaging are found to compare well with global simulations and quasi-linear analytical analysis from a previous study by Ebrahimi & Blackman. We discuss the potential implications of these new results for understanding the large-scale MRI dynamo saturation and turbulence.« less
Large-scale dynamo action precedes turbulence in shearing box simulations of the magnetorotational instability

DOE PAGES

Bhat, Pallavi; Ebrahimi, Fatima; Blackman, Eric G.

2016-07-06

Here, we study the dynamo generation (exponential growth) of large-scale (planar averaged) fields in unstratified shearing box simulations of the magnetorotational instability (MRI). In contrast to previous studies restricted to horizontal (x–y) averaging, we also demonstrate the presence of large-scale fields when vertical (y–z) averaging is employed instead. By computing space–time planar averaged fields and power spectra, we find large-scale dynamo action in the early MRI growth phase – a previously unidentified feature. Non-axisymmetric linear MRI modes with low horizontal wavenumbers and vertical wavenumbers near that of expected maximal growth, amplify the large-scale fields exponentially before turbulence and high wavenumbermore » fluctuations arise. Thus the large-scale dynamo requires only linear fluctuations but not non-linear turbulence (as defined by mode–mode coupling). Vertical averaging also allows for monitoring the evolution of the large-scale vertical field and we find that a feedback from horizontal low wavenumber MRI modes provides a clue as to why the large-scale vertical field sustains against turbulent diffusion in the non-linear saturation regime. We compute the terms in the mean field equations to identify the individual contributions to large-scale field growth for both types of averaging. The large-scale fields obtained from vertical averaging are found to compare well with global simulations and quasi-linear analytical analysis from a previous study by Ebrahimi & Blackman. We discuss the potential implications of these new results for understanding the large-scale MRI dynamo saturation and turbulence.« less
Non-perturbative aspects of particle acceleration in non-linear electrodynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burton, David A.; Flood, Stephen P.; Wen, Haibao

2015-04-15

We undertake an investigation of particle acceleration in the context of non-linear electrodynamics. We deduce the maximum energy that an electron can gain in a non-linear density wave in a magnetised plasma, and we show that an electron can “surf” a sufficiently intense Born-Infeld electromagnetic plane wave and be strongly accelerated by the wave. The first result is valid for a large class of physically reasonable modifications of the linear Maxwell equations, whilst the second result exploits the special mathematical structure of Born-Infeld theory.
Analysis of statistical and standard algorithms for detecting muscle onset with surface electromyography.

PubMed

Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A

2017-01-01

The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60-90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms perform equally well when the time series has multiple bursts of muscle activity.
Improving satellite-based PM2.5 estimates in China using Gaussian processes modeling in a Bayesian hierarchical setting.

PubMed

Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun

2017-08-01

Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM 2.5 is a promising way to fill the areas that are not covered by ground PM 2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM 2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function is specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM 2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R 2 = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM 2.5 estimates.
Statistical analysis of textural features for improved classification of oral histopathological images.

PubMed

Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K

2012-04-01

The objective of this paper is to provide an improved technique, which can assist oncopathologists in correct screening of oral precancerous conditions specially oral submucous fibrosis (OSF) with significant accuracy on the basis of collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibres segmentation, its textural feature extraction and selection, screening perfomance enhancement under Gaussian transformation and finally classification. In this study, collagen fibres are segmented on R,G,B color channels using back-probagation neural network from 60 normal and 59 OSF histological images followed by histogram specification for reducing the stain intensity variation. Henceforth, textural features of collgen area are extracted using fractal approaches viz., differential box counting and brownian motion curve . Feature selection is done using Kullback-Leibler (KL) divergence criterion and the screening performance is evaluated based on various statistical tests to conform Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using hybrid distribution. Moreover, the routine screening is designed based on two statistical classifiers viz., Bayesian classification and support vector machines (SVM) to classify normal and OSF. It is observed that SVM with linear kernel function provides better classification accuracy (91.64%) as compared to Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves Bayesian classifier's performance from 80.69% to 90.75%. Results are here studied and discussed.
Global strength assessment in oblique waves of a large gas carrier ship, based on a non-linear iterative method

NASA Astrophysics Data System (ADS)

Domnisoru, L.; Modiga, A.; Gasparotti, C.

2016-08-01

At the ship's design, the first step of the hull structural assessment is based on the longitudinal strength analysis, with head wave equivalent loads by the ships' classification societies’ rules. This paper presents an enhancement of the longitudinal strength analysis, considering the general case of the oblique quasi-static equivalent waves, based on the own non-linear iterative procedure and in-house program. The numerical approach is developed for the mono-hull ships, without restrictions on 3D-hull offset lines non-linearities, and involves three interlinked iterative cycles on floating, pitch and roll trim equilibrium conditions. Besides the ship-wave equilibrium parameters, the ship's girder wave induced loads are obtained. As numerical study case we have considered a large LPG liquefied petroleum gas carrier. The numerical results of the large LPG are compared with the statistical design values from several ships' classification societies’ rules. This study makes possible to obtain the oblique wave conditions that are inducing the maximum loads into the large LPG ship's girder. The numerical results of this study are pointing out that the non-linear iterative approach is necessary for the computation of the extreme loads induced by the oblique waves, ensuring better accuracy of the large LPG ship's longitudinal strength assessment.
Bayesian ensemble refinement by replica simulations and reweighting.

PubMed

Hummer, Gerhard; Köfinger, Jürgen

2015-12-28

We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
Bayesian ensemble refinement by replica simulations and reweighting

NASA Astrophysics Data System (ADS)

Hummer, Gerhard; Köfinger, Jürgen

2015-12-01

We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
Bayesian non-parametric inference for stochastic epidemic models using Gaussian Processes.

PubMed

Xu, Xiaoguang; Kypraios, Theodore; O'Neill, Philip D

2016-10-01

This paper considers novel Bayesian non-parametric methods for stochastic epidemic models. Many standard modeling and data analysis methods use underlying assumptions (e.g. concerning the rate at which new cases of disease will occur) which are rarely challenged or tested in practice. To relax these assumptions, we develop a Bayesian non-parametric approach using Gaussian Processes, specifically to estimate the infection process. The methods are illustrated with both simulated and real data sets, the former illustrating that the methods can recover the true infection process quite well in practice, and the latter illustrating that the methods can be successfully applied in different settings. © The Author 2016. Published by Oxford University Press.
Practical differences among probabilities, possibilities, and credibilities

NASA Astrophysics Data System (ADS)

Grandin, Jean-Francois; Moulin, Caroline

2002-03-01

This paper presents some important differences that exist between theories, which allow the uncertainty management in data fusion. The main comparative results illustrated in this paper are the followings: Incompatibility between decisions got from probabilities and credibilities is highlighted. In the dynamic frame, as remarked in [19] or [17], belief and plausibility of Dempster-Shafer model do not frame the Bayesian probability. This framing can however be obtained by the Modified Dempster-Shafer approach. It also can be obtained in the Bayesian framework either by simulation techniques, or with a studentization. The uncommitted in the Dempster-Shafer way, e.g. the mass accorded to the ignorance, gives a mechanism similar to the reliability in the Bayesian model. Uncommitted mass in Dempster-Shafer theory or reliability in Bayes theory act like a filter that weakens extracted information, and improves robustness to outliners. So, it is logical to observe on examples like the one presented particularly by D.M. Buede, a faster convergence of a Bayesian method that doesn't take into account the reliability, in front of Dempster-Shafer method which uses uncommitted mass. But, on Bayesian masses, if reliability is taken into account, at the same level that the uncommited, e.g. F=1-m, we observe an equivalent rate for convergence. When Dempster-Shafer and Bayes operator are informed by uncertainty, faster or lower convergence can be exhibited on non Bayesian masses. This is due to positive or negative synergy between information delivered by sensors. This effect is a direct consequence of non additivity when considering non Bayesian masses. Unknowledge of the prior in bayesian techniques can be quickly compensated by information accumulated as time goes on by a set of sensors. All these results are presented on simple examples, and developed when necessary.
Development of a Bayesian response-adaptive trial design for the Dexamethasone for Excessive Menstruation study.

PubMed

Holm Hansen, Christian; Warner, Pamela; Parker, Richard A; Walker, Brian R; Critchley, Hilary Od; Weir, Christopher J

2017-12-01

It is often unclear what specific adaptive trial design features lead to an efficient design which is also feasible to implement. This article describes the preparatory simulation study for a Bayesian response-adaptive dose-finding trial design. Dexamethasone for Excessive Menstruation aims to assess the efficacy of Dexamethasone in reducing excessive menstrual bleeding and to determine the best dose for further study. To maximise learning about the dose response, patients receive placebo or an active dose with randomisation probabilities adapting based on evidence from patients already recruited. The dose-response relationship is estimated using a flexible Bayesian Normal Dynamic Linear Model. Several competing design options were considered including: number of doses, proportion assigned to placebo, adaptation criterion, and number and timing of adaptations. We performed a fractional factorial study using SAS software to simulate virtual trial data for candidate adaptive designs under a variety of scenarios and to invoke WinBUGS for Bayesian model estimation. We analysed the simulated trial results using Normal linear models to estimate the effects of each design feature on empirical type I error and statistical power. Our readily-implemented approach using widely available statistical software identified a final design which performed robustly across a range of potential trial scenarios.
Hybrid ICA-Bayesian network approach reveals distinct effective connectivity differences in schizophrenia.

PubMed

Kim, D; Burge, J; Lane, T; Pearlson, G D; Kiehl, K A; Calhoun, V D

2008-10-01

We utilized a discrete dynamic Bayesian network (dDBN) approach (Burge, J., Lane, T., Link, H., Qiu, S., Clark, V.P., 2007. Discrete dynamic Bayesian network analysis of fMRI data. Hum Brain Mapp.) to determine differences in brain regions between patients with schizophrenia and healthy controls on a measure of effective connectivity, termed the approximate conditional likelihood score (ACL) (Burge, J., Lane, T., 2005. Learning Class-Discriminative Dynamic Bayesian Networks. Proceedings of the International Conference on Machine Learning, Bonn, Germany, pp. 97-104.). The ACL score represents a class-discriminative measure of effective connectivity by measuring the relative likelihood of the correlation between brain regions in one group versus another. The algorithm is capable of finding non-linear relationships between brain regions because it uses discrete rather than continuous values and attempts to model temporal relationships with a first-order Markov and stationary assumption constraint (Papoulis, A., 1991. Probability, random variables, and stochastic processes. McGraw-Hill, New York.). Since Bayesian networks are overly sensitive to noisy data, we introduced an independent component analysis (ICA) filtering approach that attempted to reduce the noise found in fMRI data by unmixing the raw datasets into a set of independent spatial component maps. Components that represented noise were removed and the remaining components reconstructed into the dimensions of the original fMRI datasets. We applied the dDBN algorithm to a group of 35 patients with schizophrenia and 35 matched healthy controls using an ICA filtered and unfiltered approach. We determined that filtering the data significantly improved the magnitude of the ACL score. Patients showed the greatest ACL scores in several regions, most markedly the cerebellar vermis and hemispheres. Our findings suggest that schizophrenia patients exhibit weaker connectivity than healthy controls in multiple regions, including bilateral temporal, frontal, and cerebellar regions during an auditory paradigm.
Evaluating and improving the representation of heteroscedastic errors in hydrological models

NASA Astrophysics Data System (ADS)

McInerney, D. J.; Thyer, M. A.; Kavetski, D.; Kuczera, G. A.

2013-12-01

Appropriate representation of residual errors in hydrological modelling is essential for accurate and reliable probabilistic predictions. In particular, residual errors of hydrological models are often heteroscedastic, with large errors associated with high rainfall and runoff events. Recent studies have shown that using a weighted least squares (WLS) approach - where the magnitude of residuals are assumed to be linearly proportional to the magnitude of the flow - captures some of this heteroscedasticity. In this study we explore a range of Bayesian approaches for improving the representation of heteroscedasticity in residual errors. We compare several improved formulations of the WLS approach, the well-known Box-Cox transformation and the more recent log-sinh transformation. Our results confirm that these approaches are able to stabilize the residual error variance, and that it is possible to improve the representation of heteroscedasticity compared with the linear WLS approach. We also find generally good performance of the Box-Cox and log-sinh transformations, although as indicated in earlier publications, the Box-Cox transform sometimes produces unrealistically large prediction limits. Our work explores the trade-offs between these different uncertainty characterization approaches, investigates how their performance varies across diverse catchments and models, and recommends practical approaches suitable for large-scale applications.
Bayesian state space models for dynamic genetic network construction across multiple tissues.

PubMed

Liang, Yulan; Kelemen, Arpad

2016-08-01

Construction of gene-gene interaction networks and potential pathways is a challenging and important problem in genomic research for complex diseases while estimating the dynamic changes of the temporal correlations and non-stationarity are the keys in this process. In this paper, we develop dynamic state space models with hierarchical Bayesian settings to tackle this challenge for inferring the dynamic profiles and genetic networks associated with disease treatments. We treat both the stochastic transition matrix and the observation matrix time-variant and include temporal correlation structures in the covariance matrix estimations in the multivariate Bayesian state space models. The unevenly spaced short time courses with unseen time points are treated as hidden state variables. Hierarchical Bayesian approaches with various prior and hyper-prior models with Monte Carlo Markov Chain and Gibbs sampling algorithms are used to estimate the model parameters and the hidden state variables. We apply the proposed Hierarchical Bayesian state space models to multiple tissues (liver, skeletal muscle, and kidney) Affymetrix time course data sets following corticosteroid (CS) drug administration. Both simulation and real data analysis results show that the genomic changes over time and gene-gene interaction in response to CS treatment can be well captured by the proposed models. The proposed dynamic Hierarchical Bayesian state space modeling approaches could be expanded and applied to other large scale genomic data, such as next generation sequence (NGS) combined with real time and time varying electronic health record (EHR) for more comprehensive and robust systematic and network based analysis in order to transform big biomedical data into predictions and diagnostics for precision medicine and personalized healthcare with better decision making and patient outcomes.
Quantifying temporal trends in fisheries abundance using Bayesian dynamic linear models: A case study of riverine Smallmouth Bass populations

USGS Publications Warehouse

Schall, Megan K.; Blazer, Vicki S.; Lorantas, Robert M.; Smith, Geoffrey; Mullican, John E.; Keplinger, Brandon J.; Wagner, Tyler

2018-01-01

Detecting temporal changes in fish abundance is an essential component of fisheries management. Because of the need to understand short‐term and nonlinear changes in fish abundance, traditional linear models may not provide adequate information for management decisions. This study highlights the utility of Bayesian dynamic linear models (DLMs) as a tool for quantifying temporal dynamics in fish abundance. To achieve this goal, we quantified temporal trends of Smallmouth Bass Micropterus dolomieu catch per effort (CPE) from rivers in the mid‐Atlantic states, and we calculated annual probabilities of decline from the posterior distributions of annual rates of change in CPE. We were interested in annual declines because of recent concerns about fish health in portions of the study area. In general, periods of decline were greatest within the Susquehanna River basin, Pennsylvania. The declines in CPE began in the late 1990s—prior to observations of fish health problems—and began to stabilize toward the end of the time series (2011). In contrast, many of the other rivers investigated did not have the same magnitude or duration of decline in CPE. Bayesian DLMs provide information about annual changes in abundance that can inform management and are easily communicated with managers and stakeholders.
Introduction to Bayesian statistical approaches to compositional analyses of transgenic crops 1. Model validation and setting the stage.

PubMed

Harrison, Jay M; Breeze, Matthew L; Harrigan, George G

2011-08-01

Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.

BFLCRM: A BAYESIAN FUNCTIONAL LINEAR COX REGRESSION MODEL FOR PREDICTING TIME TO CONVERSION TO ALZHEIMER’S DISEASE*

PubMed Central

Lee, Eunjee; Zhu, Hongtu; Kong, Dehan; Wang, Yalin; Giovanello, Kelly Sullivan; Ibrahim, Joseph G

2015-01-01

The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer’s disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months. The functional linear Cox regression model was used to establish that functional covariates including hippocampus surface morphology and scalar covariates including brain MRI volumes, cognitive performance (ADAS-Cog), and APOE status can accurately predict time to onset of AD. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFLCRM. PMID:26900412
Employment of CB models for non-linear dynamic analysis

NASA Technical Reports Server (NTRS)

Klein, M. R. M.; Deloo, P.; Fournier-Sicre, A.

1990-01-01

The non-linear dynamic analysis of large structures is always very time, effort and CPU consuming. Whenever possible the reduction of the size of the mathematical model involved is of main importance to speed up the computational procedures. Such reduction can be performed for the part of the structure which perform linearly. Most of the time, the classical Guyan reduction process is used. For non-linear dynamic process where the non-linearity is present at interfaces between different structures, Craig-Bampton models can provide a very rich information, and allow easy selection of the relevant modes with respect to the phenomenon driving the non-linearity. The paper presents the employment of Craig-Bampton models combined with Newmark direct integration for solving non-linear friction problems appearing at the interface between the Hubble Space Telescope and its solar arrays during in-orbit maneuvers. Theory, implementation in the FEM code ASKA, and practical results are shown.
Multi-decadal trend and space-time variability of sea level over the Indian Ocean since the 1950s: impact of decadal climate modes

NASA Astrophysics Data System (ADS)

Han, W.; Stammer, D.; Meehl, G. A.; Hu, A.; Sienz, F.

2016-12-01

Sea level varies on decadal and multi-decadal timescales over the Indian Ocean. The variations are not spatially uniform, and can deviate considerably from the global mean sea level rise (SLR) due to various geophysical processes. One of these processes is the change of ocean circulation, which can be partly attributed to natural internal modes of climate variability. Over the Indian Ocean, the most influential climate modes on decadal and multi-decadal timescales are the Interdecadal Pacific Oscillation (IPO) and decadal variability of the Indian Ocean dipole (IOD). Here, we first analyze observational datasets to investigate the impacts of IPO and IOD on spatial patterns of decadal and interdecadal (hereafter decal) sea level variability & multi-decadal trend over the Indian Ocean since the 1950s, using a new statistical approach of Bayesian Dynamical Linear regression Model (DLM). The Bayesian DLM overcomes the limitation of "time-constant (static)" regression coefficients in conventional multiple linear regression model, by allowing the coefficients to vary with time and therefore measuring "time-evolving (dynamical)" relationship between climate modes and sea level. For the multi-decadal sea level trend since the 1950s, our results show that climate modes and non-climate modes (the part that cannot be explained by climate modes) have comparable contributions in magnitudes but with different spatial patterns, with each dominating different regions of the Indian Ocean. For decadal variability, climate modes are the major contributors for sea level variations over most region of the tropical Indian Ocean. The relative importance of IPO and decadal variability of IOD, however, varies spatially. For example, while IOD decadal variability dominates IPO in the eastern equatorial basin (85E-100E, 5S-5N), IPO dominates IOD in causing sea level variations in the tropical southwest Indian Ocean (45E-65E, 12S-2S). To help decipher the possible contribution of external forcing to the multi-decadal sea level trend and decadal variability, we also analyze the model outputs from NCAR's Community Earth System Model (CESM) Large Ensemble Experiments, and compare the results with our observational analyses.
Bayesian approach for assessing non-inferiority in a three-arm trial with pre-specified margin.

PubMed

Ghosh, Samiran; Ghosh, Santu; Tiwari, Ram C

2016-02-28

Non-inferiority trials are becoming increasingly popular for comparative effectiveness research. However, inclusion of the placebo arm, whenever possible, gives rise to a three-arm trial which has lesser burdensome assumptions than a standard two-arm non-inferiority trial. Most of the past developments in a three-arm trial consider defining a pre-specified fraction of unknown effect size of reference drug, that is, without directly specifying a fixed non-inferiority margin. However, in some recent developments, a more direct approach is being considered with pre-specified fixed margin albeit in the frequentist setup. Bayesian paradigm provides a natural path to integrate historical and current trials' information via sequential learning. In this paper, we propose a Bayesian approach for simultaneous testing of non-inferiority and assay sensitivity in a three-arm trial with normal responses. For the experimental arm, in absence of historical information, non-informative priors are assumed under two situations, namely when (i) variance is known and (ii) variance is unknown. A Bayesian decision criteria is derived and compared with the frequentist method using simulation studies. Finally, several published clinical trial examples are reanalyzed to demonstrate the benefit of the proposed procedure. Copyright © 2015 John Wiley & Sons, Ltd.
An interactive Bayesian model for prediction of lymph node ratio and survival in pancreatic cancer patients.

PubMed

Smith, Brian J; Mezhir, James J

2014-10-01

Regional lymph node status has long been used as a dichotomous predictor of clinical outcomes in cancer patients. More recently, interest has turned to the prognostic utility of lymph node ratio (LNR), quantified as the proportion of positive nodes examined. However, statistical tools for the joint modeling of LNR and its effect on cancer survival are lacking. Data were obtained from the NCI SEER cancer registry on 6400 patients diagnosed with pancreatic ductal adenocarcinoma from 2004 to 2010 and who underwent radical oncologic resection. A novel Bayesian statistical approach was developed and applied to model simultaneously patients' true, but unobservable, LNR statuses and overall survival. New web development tools were then employed to create an interactive web application for individualized patient prediction. Histologic grade and T and M stages were important predictors of LNR status. Significant predictors of survival included age, gender, marital status, grade, histology, T and M stages, tumor size, and radiation therapy. LNR was found to have a highly significant, non-linear effect on survival. Furthermore, predictive performance of the survival model compared favorably to those from studies with more homogeneous patients and individualized predictors. We provide a new approach and tool set for the prediction of LNR and survival that are generally applicable to a host of cancer types, including breast, colon, melanoma, and stomach. Our methods are illustrated with the development of a validated model and web applications for the prediction of survival in a large set of pancreatic cancer patients. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Simultaneous Optimization of Decisions Using a Linear Utility Function.

ERIC Educational Resources Information Center

Vos, Hans J.

1990-01-01

An approach is presented to simultaneously optimize decision rules for combinations of elementary decisions through a framework derived from Bayesian decision theory. The developed linear utility model for selection-mastery decisions was applied to a sample of 43 first year medical students to illustrate the procedure. (SLD)
Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination.

PubMed

Zhao, Qibin; Zhang, Liqing; Cichocki, Andrzej

2015-09-01

CANDECOMP/PARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified, however, the determination of tensor rank remains a challenging problem especially for CP rank . In addition, existing approaches do not take into account uncertainty information of latent factors, as well as missing entries. To address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. To learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth of CP rank and prevent the overfitting problem, even when a large amount of entries are missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance.
Tales from Two Cores: Bayesian Re-Analyses of the Summit Lake and Blue Lakes Pollen Cores

NASA Astrophysics Data System (ADS)

Hall, M.

2016-12-01

Pollen cores from Summit Lake and Blue Lakes in Humboldt Co., Nevada provide palaeoclimatic information for the last 2000 yearsin the NW Great Basin. Summit Lake is in the northern Black Rock Range (41.5 N -119.1 W) and is at an elevation of 1780 m. The Blue Lakes sit at an elevation of 2434 m in the southern Pine Forest Range (41.6 N -118.6 W). The distance between the two lakes is 33.5 km. The cores were originally taken to reconstruct the fire history in the NW Great Basin. In this study, stochastic climate histories are created using a Bayesian methodology as implemented in the Bclim program. This Bayesian approach takes: 1) a multivariate approach based on modern pollen analogs, 2) accounts for the non-linear and non-Gaussian relationship between the climate and the pollen proxy, and 3) accounts for the uncertainties in the radiocarbon record and climate histories. For both cores, the following climatic variables are reported for the last 2 kya: Mean Temperature of the Coldest month (MTCO), Growing Degree Days above 5 Centigrade (GDD5), the ratio of Actual to Potential Evapotranspiration (AET/PET). Because it was sequentially sampled,the Artemesia/Chenopodiaceae ratio (A/C), an indicator of wetness, and the Grasses/Shrubs (G/S) ratio, an indicator of thevegetation communities, is calculated for each section of the Summit Lake core. Bayesian changepoint analyses of the Summit Lake core indicates that there is no significant difference in the mean or variance of the A/C ratio for the last 2 kya cal BP, but there is a significant decrease in G/S ratio dating to circa 700 ya cal BP. At Summit Lake, a statistically significant decrease in the GDD5 occurs at 1.4-1.5 kya cal BP, and a significant increase in the GDD5 occurs for the last 200 ya cal BP. The GDD5 and MTCO for Blue Lakes has a significant increase at 600 ya cal BP, and afterwards decreases in the next century. The regional archaeological record will be discussed in light of these changes.
Behavioural and neurobiological implications of linear and non-linear features in larynx phonations of horseshoe bats

PubMed Central

Kobayasi, Kohta I.; Hage, Steffen R.; Berquist, Sean; Feng, Jiang; Zhang, Shuyi; Metzner, Walter

2012-01-01

Mammalian vocalizations exhibit large variations in their spectrotemporal features, although it is still largely unknown which result from intrinsic biomechanical properties of the larynx and which are under direct neuromuscular control. Here we show that mere changes in laryngeal air flow yield several non-linear effects on sound production, in an isolated larynx preparation from horseshoe bats. Most notably, there are sudden jumps between two frequency bands used for either echolocation or communication in natural vocalizations. These jumps resemble changes in “registers” as in yodelling. In contrast, simulated contractions of the main larynx muscle produce linear frequency changes, but are limited to echolocation or communication frequencies. Only by combining non-linear and linear properties can this larynx therefore produce sounds covering the entire frequency range of natural calls. This may give behavioural meaning to yodelling-like vocal behaviour and reshape our thinking about how the brain controls the multitude of spectral vocal features in mammals. PMID:23149729
Bayesian analysis of non-homogeneous Markov chains: application to mental health data.

PubMed

Sung, Minje; Soyer, Refik; Nhan, Nguyen

2007-07-10

In this paper we present a formal treatment of non-homogeneous Markov chains by introducing a hierarchical Bayesian framework. Our work is motivated by the analysis of correlated categorical data which arise in assessment of psychiatric treatment programs. In our development, we introduce a Markovian structure to describe the non-homogeneity of transition patterns. In doing so, we introduce a logistic regression set-up for Markov chains and incorporate covariates in our model. We present a Bayesian model using Markov chain Monte Carlo methods and develop inference procedures to address issues encountered in the analyses of data from psychiatric treatment programs. Our model and inference procedures are implemented to some real data from a psychiatric treatment study. Copyright 2006 John Wiley & Sons, Ltd.
Invited commentary: Lost in estimation--searching for alternatives to markov chains to fit complex Bayesian models.

PubMed

Molitor, John

2012-03-01

Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
Characterizing driver-response relationships in marine pelagic ecosystems for improved ocean management.

PubMed

Hunsicker, Mary E; Kappel, Carrie V; Selkoe, Kimberly A; Halpern, Benjamin S; Scarborough, Courtney; Mease, Lindley; Amrhein, Alisan

2016-04-01

Scientists and resource managers often use methods and tools that assume ecosystem components respond linearly to environmental drivers and human stressors. However, a growing body of literature demonstrates that many relationships are-non-linear, where small changes in a driver prompt a disproportionately large ecological response. We aim to provide a comprehensive assessment of the relationships between drivers and ecosystem components to identify where and when non-linearities are likely to occur. We focused our analyses on one of the best-studied marine systems, pelagic ecosystems, which allowed us to apply robust statistical techniques on a large pool of previously published studies. In this synthesis, we (1) conduct a wide literature review on single driver-response relationships in pelagic systems, (2) use statistical models to identify the degree of non-linearity in these relationships, and (3) assess whether general patterns exist in the strengths and shapes of non-linear relationships across drivers. Overall we found that non-linearities are common in pelagic ecosystems, comprising at least 52% of all driver-response relation- ships. This is likely an underestimate, as papers with higher quality data and analytical approaches reported non-linear relationships at a higher frequency (on average 11% more). Consequently, in the absence of evidence for a linear relationship, it is safer to assume a relationship is non-linear. Strong non-linearities can lead to greater ecological and socioeconomic consequences if they are unknown (and/or unanticipated), but if known they may provide clear thresholds to inform management targets. In pelagic systems, strongly non-linear relationships are often driven by climate and trophodynamic variables but are also associated with local stressors, such as overfishing and pollution, that can be more easily controlled by managers. Even when marine resource managers cannot influence ecosystem change, they can use information about threshold responses to guide how other stressors are managed and to adapt to new ocean conditions. As methods to detect and reduce uncertainty around threshold values improve, managers will be able to better understand and account for ubiquitous non-linear relationships.
Bayesian hierarchical piecewise regression models: a tool to detect trajectory divergence between groups in long-term observational studies.

PubMed

Buscot, Marie-Jeanne; Wotherspoon, Simon S; Magnussen, Costan G; Juonala, Markus; Sabin, Matthew A; Burgner, David P; Lehtimäki, Terho; Viikari, Jorma S A; Hutri-Kähönen, Nina; Raitakari, Olli T; Thomson, Russell J

2017-06-06

Bayesian hierarchical piecewise regression (BHPR) modeling has not been previously formulated to detect and characterise the mechanism of trajectory divergence between groups of participants that have longitudinal responses with distinct developmental phases. These models are useful when participants in a prospective cohort study are grouped according to a distal dichotomous health outcome. Indeed, a refined understanding of how deleterious risk factor profiles develop across the life-course may help inform early-life interventions. Previous techniques to determine between-group differences in risk factors at each age may result in biased estimate of the age at divergence. We demonstrate the use of Bayesian hierarchical piecewise regression (BHPR) to generate a point estimate and credible interval for the age at which trajectories diverge between groups for continuous outcome measures that exhibit non-linear within-person response profiles over time. We illustrate our approach by modeling the divergence in childhood-to-adulthood body mass index (BMI) trajectories between two groups of adults with/without type 2 diabetes mellitus (T2DM) in the Cardiovascular Risk in Young Finns Study (YFS). Using the proposed BHPR approach, we estimated the BMI profiles of participants with T2DM diverged from healthy participants at age 16 years for males (95% credible interval (CI):13.5-18 years) and 21 years for females (95% CI: 19.5-23 years). These data suggest that a critical window for weight management intervention in preventing T2DM might exist before the age when BMI growth rate is naturally expected to decrease. Simulation showed that when using pairwise comparison of least-square means from categorical mixed models, smaller sample sizes tended to conclude a later age of divergence. In contrast, the point estimate of the divergence time is not biased by sample size when using the proposed BHPR method. BHPR is a powerful analytic tool to model long-term non-linear longitudinal outcomes, enabling the identification of the age at which risk factor trajectories diverge between groups of participants. The method is suitable for the analysis of unbalanced longitudinal data, with only a limited number of repeated measures per participants and where the time-related outcome is typically marked by transitional changes or by distinct phases of change over time.
Stochastic determination of matrix determinants

NASA Astrophysics Data System (ADS)

Dorn, Sebastian; Enßlin, Torsten A.

2015-07-01

Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations—matrices—acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
Stochastic determination of matrix determinants.

PubMed

Dorn, Sebastian; Ensslin, Torsten A

2015-07-01

Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations-matrices-acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
Analysis of statistical and standard algorithms for detecting muscle onset with surface electromyography

PubMed Central

Tweedell, Andrew J.; Haynes, Courtney A.

2017-01-01

The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60–90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms perform equally well when the time series has multiple bursts of muscle activity. PMID:28489897
Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data

PubMed Central

Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.

2017-01-01

Summary Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564
Towards quantifying uncertainty in Greenland's contribution to 21st century sea-level rise

NASA Astrophysics Data System (ADS)

Perego, M.; Tezaur, I.; Price, S. F.; Jakeman, J.; Eldred, M.; Salinger, A.; Hoffman, M. J.

2015-12-01

We present recent work towards developing a methodology for quantifying uncertainty in Greenland's 21st century contribution to sea-level rise. While we focus on uncertainties associated with the optimization and calibration of the basal sliding parameter field, the methodology is largely generic and could be applied to other (or multiple) sets of uncertain model parameter fields. The first step in the workflow is the solution of a large-scale, deterministic inverse problem, which minimizes the mismatch between observed and computed surface velocities by optimizing the two-dimensional coefficient field in a linear-friction sliding law. We then expand the deviation in this coefficient field from its estimated "mean" state using a reduced basis of Karhunen-Loeve Expansion (KLE) vectors. A Bayesian calibration is used to determine the optimal coefficient values for this expansion. The prior for the Bayesian calibration can be computed using the Hessian of the deterministic inversion or using an exponential covariance kernel. The posterior distribution is then obtained using Markov Chain Monte Carlo run on an emulator of the forward model. Finally, the uncertainty in the modeled sea-level rise is obtained by performing an ensemble of forward propagation runs. We present and discuss preliminary results obtained using a moderate-resolution model of the Greenland Ice sheet. As demonstrated in previous work, the primary difficulty in applying the complete workflow to realistic, high-resolution problems is that the effective dimension of the parameter space is very large.
Probabilistic Volcanic Multi-Hazard Assessment at Somma-Vesuvius (Italy): coupling Bayesian Belief Networks with a physical model for lahar propagation

NASA Astrophysics Data System (ADS)

Tierz, Pablo; Woodhouse, Mark; Phillips, Jeremy; Sandri, Laura; Selva, Jacopo; Marzocchi, Warner; Odbert, Henry

2017-04-01

Volcanoes are extremely complex physico-chemical systems where magma formed at depth breaks into the planet's surface resulting in major hazards from local to global scales. Volcano physics are dominated by non-linearities, and complicated spatio-temporal interrelationships which make volcanic hazards stochastic (i.e. not deterministic) by nature. In this context, probabilistic assessments are required to quantify the large uncertainties related to volcanic hazards. Moreover, volcanoes are typically multi-hazard environments where different hazardous processes can occur whether simultaneously or in succession. In particular, explosive volcanoes are able to accumulate, through tephra fallout and Pyroclastic Density Currents (PDCs), large amounts of pyroclastic material into the drainage basins surrounding the volcano. This addition of fresh particulate material alters the local/regional hydrogeological equilibrium and increases the frequency and magnitude of sediment-rich aqueous flows, commonly known as lahars. The initiation and volume of rain-triggered lahars may depend on: rainfall intensity and duration; antecedent rainfall; terrain slope; thickness, permeability and hydraulic diffusivity of the tephra deposit; etc. Quantifying these complex interrelationships (and their uncertainties), in a tractable manner, requires a structured but flexible probabilistic approach. A Bayesian Belief Network (BBN) is a directed acyclic graph that allows the representation of the joint probability distribution for a set of uncertain variables in a compact and efficient way, by exploiting unconditional and conditional independences between these variables. Once constructed and parametrized, the BBN uses Bayesian inference to perform causal (e.g. forecast) and/or evidential reasoning (e.g. explanation) about query variables, given some evidence. In this work, we illustrate how BBNs can be used to model the influence of several variables on the generation of rain-triggered lahars and, finally, assess the probability of occurrence of lahars of different volumes. The information utilized to parametrize the BBNs includes: (1) datasets of lahar observations; (2) numerical modelling of tephra fallout and PDCs; and (3) literature data. The BBN framework provides an opportunity to quantitatively combine these different types of evidence and use them to derive a rational approach to lahar forecasting. Lastly, we couple the BBN assessments with a shallow-water physical model for lahar propagation in order to attach probabilities to the simulated hazard footprints. We develop our methodology at Somma-Vesuvius (Italy), an explosive volcano prone to rain-triggered lahars or debris flows whether right after an eruption or during inter-eruptive periods. Accounting for the variability in tephra-fallout and dense-PDC propagation and the main geomorphological features of the catchments around Somma-Vesuvius, the areas most likely of forming medium-large lahars are the flanks of the volcano and the Sarno mountains towards the east.
Developing Large-Scale Bayesian Networks by Composition: Fault Diagnosis of Electrical Power Systems in Aircraft and Spacecraft

NASA Technical Reports Server (NTRS)

Mengshoel, Ole Jakob; Poll, Scott; Kurtoglu, Tolga

2009-01-01

This CD contains files that support the talk (see CASI ID 20100021404). There are 24 models that relate to the ADAPT system and 1 Excel worksheet. In the paper an investigation into the use of Bayesian networks to construct large-scale diagnostic systems is described. The high-level specifications, Bayesian networks, clique trees, and arithmetic circuits representing 24 different electrical power systems are described in the talk. The data in the CD are the models of the 24 different power systems.

Identifiability of large-scale non-linear dynamic network models applied to the ADM1-case study.

PubMed

Nimmegeers, Philippe; Lauwers, Joost; Telen, Dries; Logist, Filip; Impe, Jan Van

2017-06-01

In this work, both the structural and practical identifiability of the Anaerobic Digestion Model no. 1 (ADM1) is investigated, which serves as a relevant case study of large non-linear dynamic network models. The structural identifiability is investigated using the probabilistic algorithm, adapted to deal with the specifics of the case study (i.e., a large-scale non-linear dynamic system of differential and algebraic equations). The practical identifiability is analyzed using a Monte Carlo parameter estimation procedure for a 'non-informative' and 'informative' experiment, which are heuristically designed. The model structure of ADM1 has been modified by replacing parameters by parameter combinations, to provide a generally locally structurally identifiable version of ADM1. This means that in an idealized theoretical situation, the parameters can be estimated accurately. Furthermore, the generally positive structural identifiability results can be explained from the large number of interconnections between the states in the network structure. This interconnectivity, however, is also observed in the parameter estimates, making uncorrelated parameter estimations in practice difficult. Copyright © 2017. Published by Elsevier Inc.
A novel framework to simulating non-stationary, non-linear, non-Normal hydrological time series using Markov Switching Autoregressive Models

NASA Astrophysics Data System (ADS)

Birkel, C.; Paroli, R.; Spezia, L.; Tetzlaff, D.; Soulsby, C.

2012-12-01

In this paper we present a novel model framework using the class of Markov Switching Autoregressive Models (MSARMs) to examine catchments as complex stochastic systems that exhibit non-stationary, non-linear and non-Normal rainfall-runoff and solute dynamics. Hereby, MSARMs are pairs of stochastic processes, one observed and one unobserved, or hidden. We model the unobserved process as a finite state Markov chain and assume that the observed process, given the hidden Markov chain, is conditionally autoregressive, which means that the current observation depends on its recent past (system memory). The model is fully embedded in a Bayesian analysis based on Markov Chain Monte Carlo (MCMC) algorithms for model selection and uncertainty assessment. Hereby, the autoregressive order and the dimension of the hidden Markov chain state-space are essentially self-selected. The hidden states of the Markov chain represent unobserved levels of variability in the observed process that may result from complex interactions of hydroclimatic variability on the one hand and catchment characteristics affecting water and solute storage on the other. To deal with non-stationarity, additional meteorological and hydrological time series along with a periodic component can be included in the MSARMs as covariates. This extension allows identification of potential underlying drivers of temporal rainfall-runoff and solute dynamics. We applied the MSAR model framework to streamflow and conservative tracer (deuterium and oxygen-18) time series from an intensively monitored 2.3 km2 experimental catchment in eastern Scotland. Statistical time series analysis, in the form of MSARMs, suggested that the streamflow and isotope tracer time series are not controlled by simple linear rules. MSARMs showed that the dependence of current observations on past inputs observed by transport models often in form of the long-tailing of travel time and residence time distributions can be efficiently explained by non-stationarity either of the system input (climatic variability) and/or the complexity of catchment storage characteristics. The statistical model is also capable of reproducing short (event) and longer-term (inter-event) and wet and dry dynamical "hydrological states". These reflect the non-linear transport mechanisms of flow pathways induced by transient climatic and hydrological variables and modified by catchment characteristics. We conclude that MSARMs are a powerful tool to analyze the temporal dynamics of hydrological data, allowing for explicit integration of non-stationary, non-linear and non-Normal characteristics.
Bayesian hierarchical model for large-scale covariance matrix estimation.

PubMed

Zhu, Dongxiao; Hero, Alfred O

2007-12-01

Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
Large Spatial and Temporal Separations of Cause and Effect in Policy Making - Dealing with Non-linear Effects

NASA Astrophysics Data System (ADS)

McCaskill, John

There can be large spatial and temporal separation of cause and effect in policy making. Determining the correct linkage between policy inputs and outcomes can be highly impractical in the complex environments faced by policy makers. In attempting to see and plan for the probable outcomes, standard linear models often overlook, ignore, or are unable to predict catastrophic events that only seem improbable due to the issue of multiple feedback loops. There are several issues with the makeup and behaviors of complex systems that explain the difficulty many mathematical models (factor analysis/structural equation modeling) have in dealing with non-linear effects in complex systems. This chapter highlights those problem issues and offers insights to the usefulness of ABM in dealing with non-linear effects in complex policy making environments.
Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests

ERIC Educational Resources Information Center

Veldkamp, Bernard P.

2010-01-01

Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…
A Bayesian partition modelling approach to resolve spatial variability in climate records from borehole temperature inversion

NASA Astrophysics Data System (ADS)

Hopcroft, Peter O.; Gallagher, Kerry; Pain, Christopher C.

2009-08-01

Collections of suitably chosen borehole profiles can be used to infer large-scale trends in ground-surface temperature (GST) histories for the past few hundred years. These reconstructions are based on a large database of carefully selected borehole temperature measurements from around the globe. Since non-climatic thermal influences are difficult to identify, representative temperature histories are derived by averaging individual reconstructions to minimize the influence of these perturbing factors. This may lead to three potentially important drawbacks: the net signal of non-climatic factors may not be zero, meaning that the average does not reflect the best estimate of past climate; the averaging over large areas restricts the useful amount of more local climate change information available; and the inversion methods used to reconstruct the past temperatures at each site must be mathematically identical and are therefore not necessarily best suited to all data sets. In this work, we avoid these issues by using a Bayesian partition model (BPM), which is computed using a trans-dimensional form of a Markov chain Monte Carlo algorithm. This then allows the number and spatial distribution of different GST histories to be inferred from a given set of borehole data by partitioning the geographical area into discrete partitions. Profiles that are heavily influenced by non-climatic factors will be partitioned separately. Conversely, profiles with climatic information, which is consistent with neighbouring profiles, will then be inferred to lie in the same partition. The geographical extent of these partitions then leads to information on the regional extent of the climatic signal. In this study, three case studies are described using synthetic and real data. The first demonstrates that the Bayesian partition model method is able to correctly partition a suite of synthetic profiles according to the inferred GST history. In the second, more realistic case, a series of temperature profiles are calculated using surface air temperatures of a global climate model simulation. In the final case, 23 real boreholes from the United Kingdom, previously used for climatic reconstructions, are examined and the results compared with a local instrumental temperature series and the previous estimate derived from the same borehole data. The results indicate that the majority (17) of the 23 boreholes are unsuitable for climatic reconstruction purposes, at least without including other thermal processes in the forward model.
A Spatio-Temporally Explicit Random Encounter Model for Large-Scale Population Surveys

PubMed Central

Jousimo, Jussi; Ovaskainen, Otso

2016-01-01

Random encounter models can be used to estimate population abundance from indirect data collected by non-invasive sampling methods, such as track counts or camera-trap data. The classical Formozov–Malyshev–Pereleshin (FMP) estimator converts track counts into an estimate of mean population density, assuming that data on the daily movement distances of the animals are available. We utilize generalized linear models with spatio-temporal error structures to extend the FMP estimator into a flexible Bayesian modelling approach that estimates not only total population size, but also spatio-temporal variation in population density. We also introduce a weighting scheme to estimate density on habitats that are not covered by survey transects, assuming that movement data on a subset of individuals is available. We test the performance of spatio-temporal and temporal approaches by a simulation study mimicking the Finnish winter track count survey. The results illustrate how the spatio-temporal modelling approach is able to borrow information from observations made on neighboring locations and times when estimating population density, and that spatio-temporal and temporal smoothing models can provide improved estimates of total population size compared to the FMP method. PMID:27611683
On the null distribution of Bayes factors in linear regression

USDA-ARS?s Scientific Manuscript database

We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Next Steps in Bayesian Structural Equation Models: Comments on, Variations of, and Extensions to Muthen and Asparouhov (2012)

ERIC Educational Resources Information Center

Rindskopf, David

2012-01-01

Muthen and Asparouhov (2012) made a strong case for the advantages of Bayesian methodology in factor analysis and structural equation models. I show additional extensions and adaptations of their methods and show how non-Bayesians can take advantage of many (though not all) of these advantages by using interval restrictions on parameters. By…
Careful with Those Priors: A Note on Bayesian Estimation in Two-Parameter Logistic Item Response Theory Models

ERIC Educational Resources Information Center

Marcoulides, Katerina M.

2018-01-01

This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods were examined. Overall results showed that…
Non-Bayesian Noun Generalization in 3-to 5-Year-Old Children: Probing the Role of Prior Knowledge in the Suspicious Coincidence Effect

ERIC Educational Resources Information Center

Jenkins, Gavin W.; Samuelson, Larissa K.; Smith, Jodi R.; Spencer, John P.

2015-01-01

It is unclear how children learn labels for multiple overlapping categories such as "Labrador," "dog," and "animal." Xu and Tenenbaum (2007a) suggested that learners infer correct meanings with the help of Bayesian inference. They instantiated these claims in a Bayesian model, which they tested with preschoolers and…
Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

NASA Astrophysics Data System (ADS)

Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

2018-05-01

Bayesian method is a method that can be used to estimate the parameters of multivariate multiple regression model. Bayesian method has two distributions, there are prior and posterior distributions. Posterior distribution is influenced by the selection of prior distribution. Jeffreys’ prior distribution is a kind of Non-informative prior distribution. This prior is used when the information about parameter not available. Non-informative Jeffreys’ prior distribution is combined with the sample information resulting the posterior distribution. Posterior distribution is used to estimate the parameter. The purposes of this research is to estimate the parameters of multivariate regression model using Bayesian method with Non-informative Jeffreys’ prior distribution. Based on the results and discussion, parameter estimation of β and Σ which were obtained from expected value of random variable of marginal posterior distribution function. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, in calculation of the expected value involving integral of a function which difficult to determine the value. Therefore, approach is needed by generating of random samples according to the posterior distribution characteristics of each parameter using Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
Singularity-sensitive gauge-based radar rainfall adjustment methods for urban hydrological applications

NASA Astrophysics Data System (ADS)

Wang, L.-P.; Ochoa-Rodríguez, S.; Onof, C.; Willems, P.

2015-09-01

Gauge-based radar rainfall adjustment techniques have been widely used to improve the applicability of radar rainfall estimates to large-scale hydrological modelling. However, their use for urban hydrological applications is limited as they were mostly developed based upon Gaussian approximations and therefore tend to smooth off so-called "singularities" (features of a non-Gaussian field) that can be observed in the fine-scale rainfall structure. Overlooking the singularities could be critical, given that their distribution is highly consistent with that of local extreme magnitudes. This deficiency may cause large errors in the subsequent urban hydrological modelling. To address this limitation and improve the applicability of adjustment techniques at urban scales, a method is proposed herein which incorporates a local singularity analysis into existing adjustment techniques and allows the preservation of the singularity structures throughout the adjustment process. In this paper the proposed singularity analysis is incorporated into the Bayesian merging technique and the performance of the resulting singularity-sensitive method is compared with that of the original Bayesian (non singularity-sensitive) technique and the commonly used mean field bias adjustment. This test is conducted using as case study four storm events observed in the Portobello catchment (53 km2) (Edinburgh, UK) during 2011 and for which radar estimates, dense rain gauge and sewer flow records, as well as a recently calibrated urban drainage model were available. The results suggest that, in general, the proposed singularity-sensitive method can effectively preserve the non-normality in local rainfall structure, while retaining the ability of the original adjustment techniques to generate nearly unbiased estimates. Moreover, the ability of the singularity-sensitive technique to preserve the non-normality in rainfall estimates often leads to better reproduction of the urban drainage system's dynamics, particularly of peak runoff flows.
Update on Bayesian Blocks: Segmented Models for Sequential Data

NASA Technical Reports Server (NTRS)

Scargle, Jeff

2017-01-01

The Bayesian Block algorithm, in wide use in astronomy and other areas, has been improved in several ways. The model for block shape has been generalized to include other than constant signal rate - e.g., linear, exponential, or other parametric models. In addition the computational efficiency has been improved, so that instead of O(N**2) the basic algorithm is O(N) in most cases. Other improvements in the theory and application of segmented representations will be described.
Evolution of diffraction and self-diffraction phenomena in thin films of Gelite Bloom/Hibiscus Sabdariffa

NASA Astrophysics Data System (ADS)

Cano-Lara, Miroslava; Severiano-Carrillo, Israel; Trejo-Durán, Mónica; Alvarado-Méndez, Edgar

2017-09-01

In this work, we present a study of non-linear optical response in thin films elaborated with Gelite Bloom and extract of Hibiscus Sabdariffa. Non-linear refraction and absorption effects were studied experimentally (Z-scan technique) and numerically, by considering the transmittance as non-linear absorption and refraction contribution. We observe large phase shifts to far field, and diffraction due to self-phase modulation of the sample. Diffraction and self-diffraction effects were observed as time function. The aim of studying non-linear optical properties in thin films is to eliminate thermal vortex effects that occur in liquids. This is desirable in applications such as non-linear phase contrast, optical limiting, optics switches, etc. Finally, we find good agreement between experimental and theoretical results.
Bayesian least squares deconvolution

NASA Astrophysics Data System (ADS)

Asensio Ramos, A.; Petit, P.

2015-11-01

Aims: We develop a fully Bayesian least squares deconvolution (LSD) that can be applied to the reliable detection of magnetic signals in noise-limited stellar spectropolarimetric observations using multiline techniques. Methods: We consider LSD under the Bayesian framework and we introduce a flexible Gaussian process (GP) prior for the LSD profile. This prior allows the result to automatically adapt to the presence of signal. We exploit several linear algebra identities to accelerate the calculations. The final algorithm can deal with thousands of spectral lines in a few seconds. Results: We demonstrate the reliability of the method with synthetic experiments and we apply it to real spectropolarimetric observations of magnetic stars. We are able to recover the magnetic signals using a small number of spectral lines, together with the uncertainty at each velocity bin. This allows the user to consider if the detected signal is reliable. The code to compute the Bayesian LSD profile is freely available.
Bayesian Group Bridge for Bi-level Variable Selection.

PubMed

Mallick, Himel; Yi, Nengjun

2017-06-01

A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.
Asteroid orbital error analysis: Theory and application

NASA Technical Reports Server (NTRS)

Muinonen, K.; Bowell, Edward

1992-01-01

We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation does give the same results as Fisherian estimation when no priori information is assumed (Lehtinen 1988, and reference therein).
Two Step Acceleration Process of Electrons in the Outer Van Allen Radiation Belt by Time Domain Electric Field Bursts and Large Amplitude Chorus Waves

NASA Astrophysics Data System (ADS)

Agapitov, O. V.; Mozer, F.; Artemyev, A.; Krasnoselskikh, V.; Lejosne, S.

2014-12-01

A huge number of different non-linear structures (double layers, electron holes, non-linear whistlers, etc) have been observed by the electric field experiment on the Van Allen Probes in conjunction with relativistic electron acceleration in the Earth's outer radiation belt. These structures, found as short duration (~0.1 msec) quasi-periodic bursts of electric field in the high time resolution electric field waveform, have been called Time Domain Structures (TDS). They can quite effectively interact with radiation belt electrons. Due to the trapping of electrons into these non-linear structures, they are accelerated up to ~10 keV and their pitch angles are changed, especially for low energies (˜1 keV). Large amplitude electric field perturbations cause non-linear resonant trapping of electrons into the effective potential of the TDS and these electrons are then accelerated in the non-homogeneous magnetic field. These locally accelerated electrons create the "seed population" of several keV electrons that can be accelerated by coherent, large amplitude, upper band whistler waves to MeV energies in this two step acceleration process. All the elements of this chain acceleration mechanism have been observed by the Van Allen Probes.
Developing Large-Scale Bayesian Networks by Composition: Fault Diagnosis of Electrical Power Systems in Aircraft and Spacecraft

NASA Technical Reports Server (NTRS)

Mengshoel, Ole Jakob; Poll, Scott; Kurtoglu, Tolga

2009-01-01

In this paper, we investigate the use of Bayesian networks to construct large-scale diagnostic systems. In particular, we consider the development of large-scale Bayesian networks by composition. This compositional approach reflects how (often redundant) subsystems are architected to form systems such as electrical power systems. We develop high-level specifications, Bayesian networks, clique trees, and arithmetic circuits representing 24 different electrical power systems. The largest among these 24 Bayesian networks contains over 1,000 random variables. Another BN represents the real-world electrical power system ADAPT, which is representative of electrical power systems deployed in aerospace vehicles. In addition to demonstrating the scalability of the compositional approach, we briefly report on experimental results from the diagnostic competition DXC, where the ProADAPT team, using techniques discussed here, obtained the highest scores in both Tier 1 (among 9 international competitors) and Tier 2 (among 6 international competitors) of the industrial track. While we consider diagnosis of power systems specifically, we believe this work is relevant to other system health management problems, in particular in dependable systems such as aircraft and spacecraft. (See CASI ID 20100021910 for supplemental data disk.)

Quantifying sleep architecture dynamics and individual differences using big data and Bayesian networks

PubMed Central

Shelton, Christian; Mednick, Sara C.

2018-01-01

The pattern of sleep stages across a night (sleep architecture) is influenced by biological, behavioral, and clinical variables. However, traditional measures of sleep architecture such as stage proportions, fail to capture sleep dynamics. Here we quantify the impact of individual differences on the dynamics of sleep architecture and determine which factors or set of factors best predict the next sleep stage from current stage information. We investigated the influence of age, sex, body mass index, time of day, and sleep time on static (e.g. minutes in stage, sleep efficiency) and dynamic measures of sleep architecture (e.g. transition probabilities and stage duration distributions) using a large dataset of 3202 nights from a non-clinical population. Multi-level regressions show that sex effects duration of all Non-Rapid Eye Movement (NREM) stages, and age has a curvilinear relationship for Wake After Sleep Onset (WASO) and slow wave sleep (SWS) minutes. Bayesian network modeling reveals sleep architecture depends on time of day, total sleep time, age and sex, but not BMI. Older adults, and particularly males, have shorter bouts (more fragmentation) of Stage 2, SWS, and they transition less frequently to these stages. Additionally, we showed that the next sleep stage and its duration can be optimally predicted by the prior 2 stages and age. Our results demonstrate the potential benefit of big data and Bayesian network approaches in quantifying static and dynamic architecture of normal sleep. PMID:29641599
Quantifying sleep architecture dynamics and individual differences using big data and Bayesian networks.

PubMed

Yetton, Benjamin D; McDevitt, Elizabeth A; Cellini, Nicola; Shelton, Christian; Mednick, Sara C

2018-01-01

The pattern of sleep stages across a night (sleep architecture) is influenced by biological, behavioral, and clinical variables. However, traditional measures of sleep architecture such as stage proportions, fail to capture sleep dynamics. Here we quantify the impact of individual differences on the dynamics of sleep architecture and determine which factors or set of factors best predict the next sleep stage from current stage information. We investigated the influence of age, sex, body mass index, time of day, and sleep time on static (e.g. minutes in stage, sleep efficiency) and dynamic measures of sleep architecture (e.g. transition probabilities and stage duration distributions) using a large dataset of 3202 nights from a non-clinical population. Multi-level regressions show that sex effects duration of all Non-Rapid Eye Movement (NREM) stages, and age has a curvilinear relationship for Wake After Sleep Onset (WASO) and slow wave sleep (SWS) minutes. Bayesian network modeling reveals sleep architecture depends on time of day, total sleep time, age and sex, but not BMI. Older adults, and particularly males, have shorter bouts (more fragmentation) of Stage 2, SWS, and they transition less frequently to these stages. Additionally, we showed that the next sleep stage and its duration can be optimally predicted by the prior 2 stages and age. Our results demonstrate the potential benefit of big data and Bayesian network approaches in quantifying static and dynamic architecture of normal sleep.
Scale Mixture Models with Applications to Bayesian Inference

NASA Astrophysics Data System (ADS)

Qin, Zhaohui S.; Damien, Paul; Walker, Stephen

2003-11-01

Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.
Non-linear Frequency Shifts, Mode Couplings, and Decay Instability of Plasma Waves

NASA Astrophysics Data System (ADS)

Affolter, Mathew; Anderegg, F.; Driscoll, C. F.; Valentini, F.

2015-11-01

We present experiments and theory for non-linear plasma wave decay to longer wavelengths, in both the oscillatory coupling and exponential decay regimes. The experiments are conducted on non-neutral plasmas in cylindrical Penning-Malmberg traps, θ-symmetric standing plasma waves have near acoustic dispersion ω (kz) ~kz - αkz2 , discretized by kz =mz (π /Lp) . Large amplitude waves exhibit non-linear frequency shifts δf / f ~A2 and Fourier harmonic content, both of which are increased as the plasma dispersion is reduced. Non-linear coupling rates are measured between large amplitude mz = 2 waves and small amplitude mz = 1 waves, which have a small detuning Δω = 2ω1 -ω2 . At small excitation amplitudes, this detuning causes the mz = 1 mode amplitude to ``bounce'' at rate Δω , with amplitude excursions ΔA1 ~ δn2 /n0 consistent with cold fluid theory and Vlasov simulations. At larger excitation amplitudes, where the non-linear coupling exceeds the dispersion, phase-locked exponential growth of the mz = 1 mode is observed, in qualitative agreement with simple 3-wave instability theory. However, significant variations are observed experimentally, and N-wave theory gives stunningly divergent predictions that depend sensitively on the dispersion-moderated harmonic content. Measurements on higher temperature Langmuir waves and the unusual ``EAW'' (KEEN) waves are being conducted to investigate the effects of wave-particle kinetics on the non-linear coupling rates. Department of Energy Grants DE-SC0002451and DE-SC0008693.
Waveform Design for Wireless Power Transfer

NASA Astrophysics Data System (ADS)

Clerckx, Bruno; Bayguzina, Ekaterina

2016-12-01

Far-field Wireless Power Transfer (WPT) has attracted significant attention in recent years. Despite the rapid progress, the emphasis of the research community in the last decade has remained largely concentrated on improving the design of energy harvester (so-called rectenna) and has left aside the effect of transmitter design. In this paper, we study the design of transmit waveform so as to enhance the DC power at the output of the rectenna. We derive a tractable model of the non-linearity of the rectenna and compare with a linear model conventionally used in the literature. We then use those models to design novel multisine waveforms that are adaptive to the channel state information (CSI). Interestingly, while the linear model favours narrowband transmission with all the power allocated to a single frequency, the non-linear model favours a power allocation over multiple frequencies. Through realistic simulations, waveforms designed based on the non-linear model are shown to provide significant gains (in terms of harvested DC power) over those designed based on the linear model and over non-adaptive waveforms. We also compute analytically the theoretical scaling laws of the harvested energy for various waveforms as a function of the number of sinewaves and transmit antennas. Those scaling laws highlight the benefits of CSI knowledge at the transmitter in WPT and of a WPT design based on a non-linear rectenna model over a linear model. Results also motivate the study of a promising architecture relying on large-scale multisine multi-antenna waveforms for WPT. As a final note, results stress the importance of modeling and accounting for the non-linearity of the rectenna in any system design involving wireless power.
Bayesian design criteria: computation, comparison, and application to a pharmacokinetic and a pharmacodynamic model.

PubMed

Merlé, Y; Mentré, F

1995-02-01

In this paper 3 criteria to design experiments for Bayesian estimation of the parameters of nonlinear models with respect to their parameters, when a prior distribution is available, are presented: the determinant of the Bayesian information matrix, the determinant of the pre-posterior covariance matrix, and the expected information provided by an experiment. A procedure to simplify the computation of these criteria is proposed in the case of continuous prior distributions and is compared with the criterion obtained from a linearization of the model about the mean of the prior distribution for the parameters. This procedure is applied to two models commonly encountered in the area of pharmacokinetics and pharmacodynamics: the one-compartment open model with bolus intravenous single-dose injection and the Emax model. They both involve two parameters. Additive as well as multiplicative gaussian measurement errors are considered with normal prior distributions. Various combinations of the variances of the prior distribution and of the measurement error are studied. Our attention is restricted to designs with limited numbers of measurements (1 or 2 measurements). This situation often occurs in practice when Bayesian estimation is performed. The optimal Bayesian designs that result vary with the variances of the parameter distribution and with the measurement error. The two-point optimal designs sometimes differ from the D-optimal designs for the mean of the prior distribution and may consist of replicating measurements. For the studied cases, the determinant of the Bayesian information matrix and its linearized form lead to the same optimal designs. In some cases, the pre-posterior covariance matrix can be far from its lower bound, namely, the inverse of the Bayesian information matrix, especially for the Emax model and a multiplicative measurement error. The expected information provided by the experiment and the determinant of the pre-posterior covariance matrix generally lead to the same designs except for the Emax model and the multiplicative measurement error. Results show that these criteria can be easily computed and that they could be incorporated in modules for designing experiments.
Tau-REx: A new look at the retrieval of exoplanetary atmospheres

NASA Astrophysics Data System (ADS)

Waldmann, Ingo

2014-11-01

The field of exoplanetary spectroscopy is as fast moving as it is new. With an increasing amount of space and ground based instruments obtaining data on a large set of extrasolar planets we are indeed entering the era of exoplanetary characterisation. Permanently at the edge of instrument feasibility, it is as important as it is difficult to find the most optimal and objective methodologies to analysing and interpreting current data. This is particularly true for smaller and fainter Earth and Super-Earth type planets.For low to mid signal to noise (SNR) observations, we are prone to two sources of biases: 1) Prior selection in the data reduction and analysis; 2) Prior constraints on the spectral retrieval. In Waldmann et al. (2013), Morello et al. (2014) and Waldmann (2012, 2014) we have shown a prior-free approach to data analysis based on non-parametric machine learning techniques. Following these approaches we will present a new take on the spectral retrieval of extrasolar planets. Tau-REx (tau-retrieval of exoplanets) is a new line-by-line, atmospheric retrieval framework. In the past the decision on what opacity sources go into an atmospheric model were usually user defined. Manual input can lead to model biases and poor convergence of the atmospheric model to the data. In Tau-REx we have set out to solve this. Through custom built pattern recognition software, Tau-REx is able to rapidly identify the most likely atmospheric opacities from a large number of possible absorbers/emitters (ExoMol or HiTran data bases) and non-parametrically constrain the prior space for the Bayesian retrieval. Unlike other (MCMC based) techniques, Tau-REx is able to fully integrate high-dimensional log-likelihood spaces and to calculate the full Bayesian Evidence of the atmospheric models. We achieve this through a combination of Nested Sampling and a high degree of code parallelisation. This allows for an exact and unbiased Bayesian model selection and a fully mapping of potential model-data degeneracies. Together with non-parametric data de-trending of exoplanetary spectra, we can reach an un- precedented level of objectivity in our atmospheric characterisation of these foreign worlds.
MEG and fMRI Fusion for Non-Linear Estimation of Neural and BOLD Signal Changes

PubMed Central

Plis, Sergey M.; Calhoun, Vince D.; Weisend, Michael P.; Eichele, Tom; Lane, Terran

2010-01-01

The combined analysis of magnetoencephalography (MEG)/electroencephalography and functional magnetic resonance imaging (fMRI) measurements can lead to improvement in the description of the dynamical and spatial properties of brain activity. In this paper we empirically demonstrate this improvement using simulated and recorded task related MEG and fMRI activity. Neural activity estimates were derived using a dynamic Bayesian network with continuous real valued parameters by means of a sequential Monte Carlo technique. In synthetic data, we show that MEG and fMRI fusion improves estimation of the indirectly observed neural activity and smooths tracking of the blood oxygenation level dependent (BOLD) response. In recordings of task related neural activity the combination of MEG and fMRI produces a result with greater signal-to-noise ratio, that confirms the expectation arising from the nature of the experiment. The highly non-linear model of the BOLD response poses a difficult inference problem for neural activity estimation; computational requirements are also high due to the time and space complexity. We show that joint analysis of the data improves the system's behavior by stabilizing the differential equations system and by requiring fewer computational resources. PMID:21120141
Bayesian reconstruction of projection reconstruction NMR (PR-NMR).

PubMed

Yoon, Ji Won

2014-11-01

Projection reconstruction nuclear magnetic resonance (PR-NMR) is a technique for generating multidimensional NMR spectra. A small number of projections from lower-dimensional NMR spectra are used to reconstruct the multidimensional NMR spectra. In our previous work, it was shown that multidimensional NMR spectra are efficiently reconstructed using peak-by-peak based reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. We propose an extended and generalized RJMCMC algorithm replacing a simple linear model with a linear mixed model to reconstruct close NMR spectra into true spectra. This statistical method generates samples in a Bayesian scheme. Our proposed algorithm is tested on a set of six projections derived from the three-dimensional 700 MHz HNCO spectrum of a protein HasA. Copyright © 2014 Elsevier Ltd. All rights reserved.
Bayesian Correction for Misclassification in Multilevel Count Data Models.

PubMed

Nelson, Tyler; Song, Joon Jin; Chin, Yoo-Mi; Stamey, James D

2018-01-01

Covariate misclassification is well known to yield biased estimates in single level regression models. The impact on hierarchical count models has been less studied. A fully Bayesian approach to modeling both the misclassified covariate and the hierarchical response is proposed. Models with a single diagnostic test and with multiple diagnostic tests are considered. Simulation studies show the ability of the proposed model to appropriately account for the misclassification by reducing bias and improving performance of interval estimators. A real data example further demonstrated the consequences of ignoring the misclassification. Ignoring misclassification yielded a model that indicated there was a significant, positive impact on the number of children of females who observed spousal abuse between their parents. When the misclassification was accounted for, the relationship switched to negative, but not significant. Ignoring misclassification in standard linear and generalized linear models is well known to lead to biased results. We provide an approach to extend misclassification modeling to the important area of hierarchical generalized linear models.
Bayesian inference of radiation belt loss timescales.

NASA Astrophysics Data System (ADS)

Camporeale, E.; Chandorkar, M.

2017-12-01

Electron fluxes in the Earth's radiation belts are routinely studied using the classical quasi-linear radial diffusion model. Although this simplified linear equation has proven to be an indispensable tool in understanding the dynamics of the radiation belt, it requires specification of quantities such as the diffusion coefficient and electron loss timescales that are never directly measured. Researchers have so far assumed a-priori parameterisations for radiation belt quantities and derived the best fit using satellite data. The state of the art in this domain lacks a coherent formulation of this problem in a probabilistic framework. We present some recent progress that we have made in performing Bayesian inference of radial diffusion parameters. We achieve this by making extensive use of the theory connecting Gaussian Processes and linear partial differential equations, and performing Markov Chain Monte Carlo sampling of radial diffusion parameters. These results are important for understanding the role and the propagation of uncertainties in radiation belt simulations and, eventually, for providing a probabilistic forecast of energetic electron fluxes in a Space Weather context.
Brain white matter fiber estimation and tractography using Q-ball imaging and Bayesian MODEL.

PubMed

Lu, Meng

2015-01-01

Diffusion tensor imaging allows for the non-invasive in vivo mapping of the brain tractography. However, fiber bundles have complex structures such as fiber crossings, fiber branchings and fibers with large curvatures that tensor imaging (DTI) cannot accurately handle. This study presents a novel brain white matter tractography method using Q-ball imaging as the data source instead of DTI, because QBI can provide accurate information about multiple fiber crossings and branchings in a single voxel using an orientation distribution function (ODF). The presented method also uses graph theory to construct the Bayesian model-based graph, so that the fiber tracking between two voxels can be represented as the shortest path in a graph. Our experiment showed that our new method can accurately handle brain white matter fiber crossings and branchings, and reconstruct brain tractograhpy both in phantom data and real brain data.
Annealed Importance Sampling for Neural Mass Models

PubMed Central

Penny, Will; Sengupta, Biswa

2016-01-01

Neural Mass Models provide a compact description of the dynamical activity of cell populations in neocortical regions. Moreover, models of regional activity can be connected together into networks, and inferences made about the strength of connections, using M/EEG data and Bayesian inference. To date, however, Bayesian methods have been largely restricted to the Variational Laplace (VL) algorithm which assumes that the posterior distribution is Gaussian and finds model parameters that are only locally optimal. This paper explores the use of Annealed Importance Sampling (AIS) to address these restrictions. We implement AIS using proposals derived from Langevin Monte Carlo (LMC) which uses local gradient and curvature information for efficient exploration of parameter space. In terms of the estimation of Bayes factors, VL and AIS agree about which model is best but report different degrees of belief. Additionally, AIS finds better model parameters and we find evidence of non-Gaussianity in their posterior distribution. PMID:26942606
Hierarchical Bayesian sparse image reconstruction with application to MRFM.

PubMed

Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves

2009-09-01

This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.
BAYESIAN ESTIMATION OF THERMONUCLEAR REACTION RATES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Iliadis, C.; Anderson, K. S.; Coc, A.

The problem of estimating non-resonant astrophysical S -factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied to this problem in the past, almost all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extrasolar planets, gravitational waves, and Type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We presentmore » astrophysical S -factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the reactions d(p, γ ){sup 3}He, {sup 3}He({sup 3}He,2p){sup 4}He, and {sup 3}He( α , γ ){sup 7}Be, important for deuterium burning, solar neutrinos, and Big Bang nucleosynthesis.« less
Accessing the uncertainties of seismic velocity and anisotropy structure of Northern Great Plains using a transdimensional Bayesian approach

NASA Astrophysics Data System (ADS)

Gao, C.; Lekic, V.

2017-12-01

Seismic imaging utilizing complementary seismic data provides unique insight on the formation, evolution and current structure of continental lithosphere. While numerous efforts have improved the resolution of seismic structure, the quantification of uncertainties remains challenging due to the non-linearity and the non-uniqueness of geophysical inverse problem. In this project, we use a reverse jump Markov chain Monte Carlo (rjMcMC) algorithm to incorporate seismic observables including Rayleigh and Love wave dispersion, Ps and Sp receiver function to invert for shear velocity (Vs), compressional velocity (Vp), density, and radial anisotropy of the lithospheric structure. The Bayesian nature and the transdimensionality of this approach allow the quantification of the model parameter uncertainties while keeping the models parsimonious. Both synthetic test and inversion of actual data for Ps and Sp receiver functions are performed. We quantify the information gained in different inversions by calculating the Kullback-Leibler divergence. Furthermore, we explore the ability of Rayleigh and Love wave dispersion data to constrain radial anisotropy. We show that when multiple types of model parameters (Vsv, Vsh, and Vp) are inverted simultaneously, the constraints on radial anisotropy are limited by relatively large data uncertainties and trade-off strongly with Vp. We then perform joint inversion of the surface wave dispersion (SWD) and Ps, Sp receiver functions, and show that the constraints on both isotropic Vs and radial anisotropy are significantly improved. To achieve faster convergence of the rjMcMC, we propose a progressive inclusion scheme, and invert SWD measurements and receiver functions from about 400 USArray stations in the Northern Great Plains. We start by only using SWD data due to its fast convergence rate. We then use the average of the ensemble as a starting model for the joint inversion, which is able to resolve distinct seismic signatures of geological structures including the trans-Hudson orogen, Wyoming craton and Yellowstone hotspot. Various analyses are done to access the uncertainties of the seismic velocities and Moho depths. We also address the importance of careful data processing of receiver functions by illustrating artifacts due to unmodelled sediment reverberations.
A small-area ecologic study of myocardial infarction, neighborhood deprivation, and sex: a Bayesian modeling approach.

PubMed

Deguen, Séverine; Lalloue, Benoît; Bard, Denis; Havard, Sabrina; Arveiler, Dominique; Zmirou-Navier, Denis

2010-07-01

Socioeconomic inequalities in the risk of coronary heart disease (CHD) are well documented for men and women. CHD incidence is greater for men but its association with socioeconomic status is usually found to be stronger among women. We explored the sex-specific association between neighborhood deprivation level and the risk of myocardial infarction (MI) at a small-area scale. We studied 1193 myocardial infarction events in people aged 35-74 years in the Strasbourg metropolitan area, France (2000-2003). We used a deprivation index to assess the neighborhood deprivation level. To take into account spatial dependence and the variability of MI rates due to the small number of events, we used a hierarchical Bayesian modeling approach. We fitted hierarchical Bayesian models to estimate sex-specific relative and absolute MI risks across deprivation categories. We tested departure from additive joint effects of deprivation and sex. The risk of MI increased with the deprivation level for both sexes, but was higher for men for all deprivation classes. Relative rates increased along the deprivation scale more steadily for women and followed a different pattern: linear for men and nonlinear for women. Our data provide evidence of effect modification, with departure from an additive joint effect of deprivation and sex. We document sex differences in the socioeconomic gradient of MI risk in Strasbourg. Women appear more susceptible at levels of extreme deprivation; this result is not a chance finding, given the large difference in event rates between men and women.
On a full Bayesian inference for force reconstruction problems

NASA Astrophysics Data System (ADS)

Aucejo, M.; De Smet, O.

2018-05-01

In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
Revisiting the 2004 Sumatra-Andaman earthquake in a Bayesian framework

NASA Astrophysics Data System (ADS)

Bletery, Q.; Sladen, A.; Jiang, J.; Simons, M.

2015-12-01

The 2004 Mw 9.25 Sumatra-Andaman earthquake is the largest seismic event of the modern instrumental era. Despite considerable effort to analyze the characteristics of its rupture, the different available observations have proven difficult to simultaneously integrate jointly into a finite-fault slip model. In particular, the critical near-field geodetic records contain variable and significant post-seismic signal (between 2 weeks and 2 months) while the satellite altimetry records of the associated tsunami are affected by various sources of uncertainties (e.g. source rupture velocity, meso-scale oceanic currents). In this study, we investigate the quasi-static slip distribution of the Sumatra-Andaman earthquake by carefully accounting for the different sources of uncertainties in the joint inversion of an extended set of geodetic and tsunami data. To do so, we use non-diagonal covariance matrices reflecting both data and model uncertainties in a fully Bayesian inversion framework. As model errors are particularly large for mega-earthquakes, we also rely on advanced simulation codes (normal mode theory on a layered spherical Earth for the static displacement field and non-hydrostatic equations for the tsunami) and account for the 3D curvature of the megathrust interface to reduce the associated epistemic uncertainties. The fully Bayesian inversion framework then enables us to derive the families of possible models compatible with the unevenly distributed and sometimes ambiguous measurements. We find two regions of high slip at latitudes 3°-4°N and 7°-8°N with amplitudes that probably reached values as large as 40 m and possibly larger. Such amounts of slip were not proposed by previous studies, which might have been biased by smoothing regularizations. We also find significant slip (around 20 m) offshore Andaman islands absent in earlier studies. Furthermore, we find that the rupture very likely involved shallow slip, with the possibility of reaching the trench.
Mean Field Variational Bayesian Data Assimilation

NASA Astrophysics Data System (ADS)

Vrettas, M.; Cornford, D.; Opper, M.

2012-04-01

Current data assimilation schemes propose a range of approximate solutions to the classical data assimilation problem, particularly state estimation. Broadly there are three main active research areas: ensemble Kalman filter methods which rely on statistical linearization of the model evolution equations, particle filters which provide a discrete point representation of the posterior filtering or smoothing distribution and 4DVAR methods which seek the most likely posterior smoothing solution. In this paper we present a recent extension to our variational Bayesian algorithm which seeks the most probably posterior distribution over the states, within the family of non-stationary Gaussian processes. Our original work on variational Bayesian approaches to data assimilation sought the best approximating time varying Gaussian process to the posterior smoothing distribution for stochastic dynamical systems. This approach was based on minimising the Kullback-Leibler divergence between the true posterior over paths, and our Gaussian process approximation. So long as the observation density was sufficiently high to bring the posterior smoothing density close to Gaussian the algorithm proved very effective, on lower dimensional systems. However for higher dimensional systems, the algorithm was computationally very demanding. We have been developing a mean field version of the algorithm which treats the state variables at a given time as being independent in the posterior approximation, but still accounts for their relationships between each other in the mean solution arising from the original dynamical system. In this work we present the new mean field variational Bayesian approach, illustrating its performance on a range of classical data assimilation problems. We discuss the potential and limitations of the new approach. We emphasise that the variational Bayesian approach we adopt, in contrast to other variational approaches, provides a bound on the marginal likelihood of the observations given parameters in the model which also allows inference of parameters such as observation errors, and parameters in the model and model error representation, particularly if this is written as a deterministic form with small additive noise. We stress that our approach can address very long time window and weak constraint settings. However like traditional variational approaches our Bayesian variational method has the benefit of being posed as an optimisation problem. We finish with a sketch of the future directions for our approach.

Uncertainty Estimates of Psychoacoustic Thresholds Obtained from Group Tests

NASA Technical Reports Server (NTRS)

Rathsam, Jonathan; Christian, Andrew

2016-01-01

Adaptive psychoacoustic test methods, in which the next signal level depends on the response to the previous signal, are the most efficient for determining psychoacoustic thresholds of individual subjects. In many tests conducted in the NASA psychoacoustic labs, the goal is to determine thresholds representative of the general population. To do this economically, non-adaptive testing methods are used in which three or four subjects are tested at the same time with predetermined signal levels. This approach requires us to identify techniques for assessing the uncertainty in resulting group-average psychoacoustic thresholds. In this presentation we examine the Delta Method of frequentist statistics, the Generalized Linear Model (GLM), the Nonparametric Bootstrap, a frequentist method, and Markov Chain Monte Carlo Posterior Estimation and a Bayesian approach. Each technique is exercised on a manufactured, theoretical dataset and then on datasets from two psychoacoustics facilities at NASA. The Delta Method is the simplest to implement and accurate for the cases studied. The GLM is found to be the least robust, and the Bootstrap takes the longest to calculate. The Bayesian Posterior Estimate is the most versatile technique examined because it allows the inclusion of prior information.
Artificial intelligence applied to the automatic analysis of absorption spectra. Objective measurement of the fine structure constant

NASA Astrophysics Data System (ADS)

Bainbridge, Matthew B.; Webb, John K.

2017-06-01

A new and automated method is presented for the analysis of high-resolution absorption spectra. Three established numerical methods are unified into one `artificial intelligence' process: a genetic algorithm (Genetic Voigt Profile FIT, gvpfit); non-linear least-squares with parameter constraints (vpfit); and Bayesian model averaging (BMA). The method has broad application but here we apply it specifically to the problem of measuring the fine structure constant at high redshift. For this we need objectivity and reproducibility. gvpfit is also motivated by the importance of obtaining a large statistical sample of measurements of Δα/α. Interactive analyses are both time consuming and complex and automation makes obtaining a large sample feasible. In contrast to previous methodologies, we use BMA to derive results using a large set of models and show that this procedure is more robust than a human picking a single preferred model since BMA avoids the systematic uncertainties associated with model choice. Numerical simulations provide stringent tests of the whole process and we show using both real and simulated spectra that the unified automated fitting procedure out-performs a human interactive analysis. The method should be invaluable in the context of future instrumentation like ESPRESSO on the VLT and indeed future ELTs. We apply the method to the zabs = 1.8389 absorber towards the zem = 2.145 quasar J110325-264515. The derived constraint of Δα/α = 3.3 ± 2.9 × 10-6 is consistent with no variation and also consistent with the tentative spatial variation reported in Webb et al. and King et al.
Resolution analysis of marine seismic full waveform data by Bayesian inversion

NASA Astrophysics Data System (ADS)

Ray, A.; Sekar, A.; Hoversten, G. M.; Albertin, U.

2015-12-01

The Bayesian posterior density function (PDF) of earth models that fit full waveform seismic data convey information on the uncertainty with which the elastic model parameters are resolved. In this work, we apply the trans-dimensional reversible jump Markov Chain Monte Carlo method (RJ-MCMC) for the 1D inversion of noisy synthetic full-waveform seismic data in the frequency-wavenumber domain. While seismic full waveform inversion (FWI) is a powerful method for characterizing subsurface elastic parameters, the uncertainty in the inverted models has remained poorly known, if at all and is highly initial model dependent. The Bayesian method we use is trans-dimensional in that the number of model layers is not fixed, and flexible such that the layer boundaries are free to move around. The resulting parameterization does not require regularization to stabilize the inversion. Depth resolution is traded off with the number of layers, providing an estimate of uncertainty in elastic parameters (compressional and shear velocities Vp and Vs as well as density) with depth. We find that in the absence of additional constraints, Bayesian inversion can result in a wide range of posterior PDFs on Vp, Vs and density. These PDFs range from being clustered around the true model, to those that contain little resolution of any particular features other than those in the near surface, depending on the particular data and target geometry. We present results for a suite of different frequencies and offset ranges, examining the differences in the posterior model densities thus derived. Though these results are for a 1D earth, they are applicable to areas with simple, layered geology and provide valuable insight into the resolving capabilities of FWI, as well as highlight the challenges in solving a highly non-linear problem. The RJ-MCMC method also presents a tantalizing possibility for extension to 2D and 3D Bayesian inversion of full waveform seismic data in the future, as it objectively tackles the problem of model selection (i.e., the number of layers or cells for parameterization), which could ease the computational burden of evaluating forward models with many parameters.
A computationally efficient Bayesian sequential simulation approach for the assimilation of vast and diverse hydrogeophysical datasets

NASA Astrophysics Data System (ADS)

Nussbaumer, Raphaël; Gloaguen, Erwan; Mariéthoz, Grégoire; Holliger, Klaus

2016-04-01

Bayesian sequential simulation (BSS) is a powerful geostatistical technique, which notably has shown significant potential for the assimilation of datasets that are diverse with regard to the spatial resolution and their relationship. However, these types of applications of BSS require a large number of realizations to adequately explore the solution space and to assess the corresponding uncertainties. Moreover, such simulations generally need to be performed on very fine grids in order to adequately exploit the technique's potential for characterizing heterogeneous environments. Correspondingly, the computational cost of BSS algorithms in their classical form is very high, which so far has limited an effective application of this method to large models and/or vast datasets. In this context, it is also important to note that the inherent assumption regarding the independence of the considered datasets is generally regarded as being too strong in the context of sequential simulation. To alleviate these problems, we have revisited the classical implementation of BSS and incorporated two key features to increase the computational efficiency. The first feature is a combined quadrant spiral - superblock search, which targets run-time savings on large grids and adds flexibility with regard to the selection of neighboring points using equal directional sampling and treating hard data and previously simulated points separately. The second feature is a constant path of simulation, which enhances the efficiency for multiple realizations. We have also modified the aggregation operator to be more flexible with regard to the assumption of independence of the considered datasets. This is achieved through log-linear pooling, which essentially allows for attributing weights to the various data components. Finally, a multi-grid simulating path was created to enforce large-scale variance and to allow for adapting parameters, such as, for example, the log-linear weights or the type of simulation path at various scales. The newly implemented search method for kriging reduces the computational cost from an exponential dependence with regard to the grid size in the original algorithm to a linear relationship, as each neighboring search becomes independent from the grid size. For the considered examples, our results show a sevenfold reduction in run time for each additional realization when a constant simulation path is used. The traditional criticism that constant path techniques introduce a bias to the simulations was explored and our findings do indeed reveal a minor reduction in the diversity of the simulations. This bias can, however, be largely eliminated by changing the path type at different scales through the use of the multi-grid approach. Finally, we show that adapting the aggregation weight at each scale considered in our multi-grid approach allows for reproducing both the variogram and histogram, and the spatial trend of the underlying data.
Measuring the Acoustic Release of a Chemotherapeutic Agent from Folate-Targeted Polymeric Micelles.

PubMed

Abusara, Ayah; Abdel-Hafez, Mamoun; Husseini, Ghaleb

2018-08-01

In this paper, we compare the use of Bayesian filters for the estimation of release and re-encapsulation rates of a chemotherapeutic agent (namely Doxorubicin) from nanocarriers in an acoustically activated drug release system. The study is implemented using an advanced kinetic model that takes into account cavitation events causing the antineoplastic agent's release from polymeric micelles upon exposure to ultrasound. This model is an improvement over the previous representations of acoustic release that used simple zero-, first- and second-order release and re-encapsulation kinetics to study acoustically triggered drug release from polymeric micelles. The new model incorporates drug release and micellar reassembly events caused by cavitation allowing for the controlled release of chemotherapeutics specially and temporally. Different Bayesian estimators are tested for this purpose including Kalman filters (KF), Extended Kalman filters (EKF), Particle filters (PF), and multi-model KF and EKF. Simulated and experimental results are used to verify the performance of the above-mentioned estimators. The proposed methods demonstrate the utility and high-accuracy of using estimation methods in modeling this drug delivery technique. The results show that, in both cases (linear and non-linear dynamics), the modeling errors are expensive but can be minimized using a multi-model approach. In addition, particle filters are more flexible filters that perform reasonably well compared to the other two filters. The study improved the accuracy of the kinetic models used to capture acoustically activated drug release from polymeric micelles, which may in turn help in designing hardware and software capable of precisely controlling the delivered amount of chemotherapeutics to cancerous tissue.
3-D linear inversion of gravity data: method and application to Basse-Terre volcanic island, Guadeloupe, Lesser Antilles

NASA Astrophysics Data System (ADS)

Barnoud, Anne; Coutant, Olivier; Bouligand, Claire; Gunawan, Hendra; Deroussi, Sébastien

2016-04-01

We use a Bayesian formalism combined with a grid node discretization for the linear inversion of gravimetric data in terms of 3-D density distribution. The forward modelling and the inversion method are derived from seismological inversion techniques in order to facilitate joint inversion or interpretation of density and seismic velocity models. The Bayesian formulation introduces covariance matrices on model parameters to regularize the ill-posed problem and reduce the non-uniqueness of the solution. This formalism favours smooth solutions and allows us to specify a spatial correlation length and to perform inversions at multiple scales. We also extract resolution parameters from the resolution matrix to discuss how well our density models are resolved. This method is applied to the inversion of data from the volcanic island of Basse-Terre in Guadeloupe, Lesser Antilles. A series of synthetic tests are performed to investigate advantages and limitations of the methodology in this context. This study results in the first 3-D density models of the island of Basse-Terre for which we identify: (i) a southward decrease of densities parallel to the migration of volcanic activity within the island, (ii) three dense anomalies beneath Petite Plaine Valley, Beaugendre Valley and the Grande-Découverte-Carmichaël-Soufrière Complex that may reflect the trace of former major volcanic feeding systems, (iii) shallow low-density anomalies in the southern part of Basse-Terre, especially around La Soufrière active volcano, Piton de Bouillante edifice and along the western coast, reflecting the presence of hydrothermal systems and fractured and altered rocks.
Large linear magnetoresistance in topological crystalline insulator Pb{sub 0.6}Sn{sub 0.4}Te

DOE Office of Scientific and Technical Information (OSTI.GOV)

Roychowdhury, Subhajit; Ghara, Somnath; Guin, Satya N.

2016-01-15

Classical magnetoresistance generally follows the quadratic dependence of the magnetic field at lower field and finally saturates when field is larger. Here, we report the large positive non-saturating linear magnetoresistance in topological crystalline insulator, Pb{sub 0.6}Sn{sub 0.4}Te, at different temperatures between 3 K and 300 K in magnetic field up to 9 T. Magnetoresistance value as high as ∼200% was achieved at 3 K at magnetic field of 9 T. Linear magnetoresistance observed in Pb{sub 0.6}Sn{sub 0.4}Te is mainly governed by the spatial fluctuation carrier mobility due to distortions in the current paths in inhomogeneous conductor. - Graphical abstract: Largemore » non-saturating linear magnetoresistance has been evidenced in topological crystalline insulator, Pb{sub 0.6}Sn{sub 0.4}Te, at different temperatures between 3 K and 300 K in magnetic field up to 9 T. - Highlights: • Large non-saturating linear magnetoresistance was achieved in the topological crystalline insulator, Pb{sub 0.6}Sn{sub 0.4}Te. • Highest magnetoresistance value as high as ~200% was achieved at 3 K at magnetic field of 9 T. • Linear magnetoresistance in Pb{sub 0.6}Sn{sub 0.4}Te is mainly governed by the spatial fluctuation of the carrier mobility.« less
Propagation of population pharmacokinetic information using a Bayesian approach: comparison with meta-analysis.

PubMed

Dokoumetzidis, Aristides; Aarons, Leon

2005-08-01

We investigated the propagation of population pharmacokinetic information across clinical studies by applying Bayesian techniques. The aim was to summarize the population pharmacokinetic estimates of a study in appropriate statistical distributions in order to use them as Bayesian priors in consequent population pharmacokinetic analyses. Various data sets of simulated and real clinical data were fitted with WinBUGS, with and without informative priors. The posterior estimates of fittings with non-informative priors were used to build parametric informative priors and the whole procedure was carried on in a consecutive manner. The posterior distributions of the fittings with informative priors where compared to those of the meta-analysis fittings of the respective combinations of data sets. Good agreement was found, for the simulated and experimental datasets when the populations were exchangeable, with the posterior distribution from the fittings with the prior to be nearly identical to the ones estimated with meta-analysis. However, when populations were not exchangeble an alternative parametric form for the prior, the natural conjugate prior, had to be used in order to have consistent results. In conclusion, the results of a population pharmacokinetic analysis may be summarized in Bayesian prior distributions that can be used consecutively with other analyses. The procedure is an alternative to meta-analysis and gives comparable results. It has the advantage that it is faster than the meta-analysis, due to the large datasets used with the latter and can be performed when the data included in the prior are not actually available.
An efficient Bayesian meta-analysis approach for studying cross-phenotype genetic associations

PubMed Central

Majumdar, Arunabha; Haldar, Tanushree; Bhattacharya, Sourabh; Witte, John S.

2018-01-01

Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package ‘CPBayes’ implementing the proposed method. PMID:29432419
Bayesian data analysis in population ecology: motivations, methods, and benefits

USGS Publications Warehouse

Dorazio, Robert

2016-01-01

During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
A class of non-linear exposure-response models suitable for health impact assessment applicable to large cohort studies of ambient air pollution.

PubMed

Nasari, Masoud M; Szyszkowicz, Mieczysław; Chen, Hong; Crouse, Daniel; Turner, Michelle C; Jerrett, Michael; Pope, C Arden; Hubbell, Bryan; Fann, Neal; Cohen, Aaron; Gapstur, Susan M; Diver, W Ryan; Stieb, David; Forouzanfar, Mohammad H; Kim, Sun-Young; Olives, Casey; Krewski, Daniel; Burnett, Richard T

2016-01-01

The effectiveness of regulatory actions designed to improve air quality is often assessed by predicting changes in public health resulting from their implementation. Risk of premature mortality from long-term exposure to ambient air pollution is the single most important contributor to such assessments and is estimated from observational studies generally assuming a log-linear, no-threshold association between ambient concentrations and death. There has been only limited assessment of this assumption in part because of a lack of methods to estimate the shape of the exposure-response function in very large study populations. In this paper, we propose a new class of variable coefficient risk functions capable of capturing a variety of potentially non-linear associations which are suitable for health impact assessment. We construct the class by defining transformations of concentration as the product of either a linear or log-linear function of concentration multiplied by a logistic weighting function. These risk functions can be estimated using hazard regression survival models with currently available computer software and can accommodate large population-based cohorts which are increasingly being used for this purpose. We illustrate our modeling approach with two large cohort studies of long-term concentrations of ambient air pollution and mortality: the American Cancer Society Cancer Prevention Study II (CPS II) cohort and the Canadian Census Health and Environment Cohort (CanCHEC). We then estimate the number of deaths attributable to changes in fine particulate matter concentrations over the 2000 to 2010 time period in both Canada and the USA using both linear and non-linear hazard function models.
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States

NASA Astrophysics Data System (ADS)

Yang, J.; Astitha, M.; Schwartz, C. S.

2017-12-01

Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
Bayesian inversion of marine CSEM data from the Scarborough gas field using a transdimensional 2-D parametrization

NASA Astrophysics Data System (ADS)

Ray, Anandaroop; Key, Kerry; Bodin, Thomas; Myer, David; Constable, Steven

2014-12-01

We apply a reversible-jump Markov chain Monte Carlo method to sample the Bayesian posterior model probability density function of 2-D seafloor resistivity as constrained by marine controlled source electromagnetic data. This density function of earth models conveys information on which parts of the model space are illuminated by the data. Whereas conventional gradient-based inversion approaches require subjective regularization choices to stabilize this highly non-linear and non-unique inverse problem and provide only a single solution with no model uncertainty information, the method we use entirely avoids model regularization. The result of our approach is an ensemble of models that can be visualized and queried to provide meaningful information about the sensitivity of the data to the subsurface, and the level of resolution of model parameters. We represent models in 2-D using a Voronoi cell parametrization. To make the 2-D problem practical, we use a source-receiver common midpoint approximation with 1-D forward modelling. Our algorithm is transdimensional and self-parametrizing where the number of resistivity cells within a 2-D depth section is variable, as are their positions and geometries. Two synthetic studies demonstrate the algorithm's use in the appraisal of a thin, segmented, resistive reservoir which makes for a challenging exploration target. As a demonstration example, we apply our method to survey data collected over the Scarborough gas field on the Northwest Australian shelf.
Applying Bayesian statistics to the study of psychological trauma: A suggestion for future research.

PubMed

Yalch, Matthew M

2016-03-01

Several contemporary researchers have noted the virtues of Bayesian methods of data analysis. Although debates continue about whether conventional or Bayesian statistics is the "better" approach for researchers in general, there are reasons why Bayesian methods may be well suited to the study of psychological trauma in particular. This article describes how Bayesian statistics offers practical solutions to the problems of data non-normality, small sample size, and missing data common in research on psychological trauma. After a discussion of these problems and the effects they have on trauma research, this article explains the basic philosophical and statistical foundations of Bayesian statistics and how it provides solutions to these problems using an applied example. Results of the literature review and the accompanying example indicates the utility of Bayesian statistics in addressing problems common in trauma research. Bayesian statistics provides a set of methodological tools and a broader philosophical framework that is useful for trauma researchers. Methodological resources are also provided so that interested readers can learn more. (c) 2016 APA, all rights reserved).
Macroweather Predictions and Climate Projections using Scaling and Historical Observations

NASA Astrophysics Data System (ADS)

Hébert, R.; Lovejoy, S.; Del Rio Amador, L.

2017-12-01

There are two fundamental time scales that are pertinent to decadal forecasts and multidecadal projections. The first is the lifetime of planetary scale structures, about 10 days (equal to the deterministic predictability limit), and the second is - in the anthropocene - the scale at which the forced anthropogenic variability exceeds the internal variability (around 16 - 18 years). These two time scales define three regimes of variability: weather, macroweather and climate that are respectively characterized by increasing, decreasing and then increasing varibility with scale.We discuss how macroweather temperature variability can be skilfully predicted to its theoretical stochastic predictability limits by exploiting its long-range memory with the Stochastic Seasonal and Interannual Prediction System (StocSIPS). At multi-decadal timescales, the temperature response to forcing is approximately linear and this can be exploited to make projections with a Green's function, or Climate Response Function (CRF). To make the problem tractable, we exploit the temporal scaling symmetry and restrict our attention to global mean forcing and temperature response using a scaling CRF characterized by the scaling exponent H and an inner scale of linearity τ. An aerosol linear scaling factor α and a non-linear volcanic damping exponent ν were introduced to account for the large uncertainty in these forcings. We estimate the model and forcing parameters by Bayesian inference using historical data and these allow us to analytically calculate a median (and likely 66% range) for the transient climate response, and for the equilibrium climate sensitivity: 1.6K ([1.5,1.8]K) and 2.4K ([1.9,3.4]K) respectively. Aerosol forcing typically has large uncertainty and we find a modern (2005) forcing very likely range (90%) of [-1.0, -0.3] Wm-2 with median at -0.7 Wm-2. Projecting to 2100, we find that to keep the warming below 1.5 K, future emissions must undergo cuts similar to Representative Concentration Pathway (RCP) 2.6 for which the probability to remain under 1.5 K is 48%. RCP 4.5 and RCP 8.5-like futures overshoot with very high probability. This underscores that over the next century, the state of the environment will be strongly influenced by past, present and future economical policies.
Bayesian screening for active compounds in high-dimensional chemical spaces combining property descriptors and molecular fingerprints.

PubMed

Vogt, Martin; Bajorath, Jürgen

2008-01-01

Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
Bayesian aggregation versus majority vote in the characterization of non-specific arm pain based on quantitative needle electromyography

PubMed Central

2010-01-01

Background Methods for the calculation and application of quantitative electromyographic (EMG) statistics for the characterization of EMG data detected from forearm muscles of individuals with and without pain associated with repetitive strain injury are presented. Methods A classification procedure using a multi-stage application of Bayesian inference is presented that characterizes a set of motor unit potentials acquired using needle electromyography. The utility of this technique in characterizing EMG data obtained from both normal individuals and those presenting with symptoms of "non-specific arm pain" is explored and validated. The efficacy of the Bayesian technique is compared with simple voting methods. Results The aggregate Bayesian classifier presented is found to perform with accuracy equivalent to that of majority voting on the test data, with an overall accuracy greater than 0.85. Theoretical foundations of the technique are discussed, and are related to the observations found. Conclusions Aggregation of motor unit potential conditional probability distributions estimated using quantitative electromyographic analysis, may be successfully used to perform electrodiagnostic characterization of "non-specific arm pain." It is expected that these techniques will also be able to be applied to other types of electrodiagnostic data. PMID:20156353
Model verification of large structural systems. [space shuttle model response

NASA Technical Reports Server (NTRS)

Lee, L. T.; Hasselman, T. K.

1978-01-01

A computer program for the application of parameter identification on the structural dynamic models of space shuttle and other large models with hundreds of degrees of freedom is described. Finite element, dynamic, analytic, and modal models are used to represent the structural system. The interface with math models is such that output from any structural analysis program applied to any structural configuration can be used directly. Processed data from either sine-sweep tests or resonant dwell tests are directly usable. The program uses measured modal data to condition the prior analystic model so as to improve the frequency match between model and test. A Bayesian estimator generates an improved analytical model and a linear estimator is used in an iterative fashion on highly nonlinear equations. Mass and stiffness scaling parameters are generated for an improved finite element model, and the optimum set of parameters is obtained in one step.
Bivariate Left-Censored Bayesian Model for Predicting Exposure: Preliminary Analysis of Worker Exposure during the Deepwater Horizon Oil Spill.

PubMed

Groth, Caroline; Banerjee, Sudipto; Ramachandran, Gurumurthy; Stenzel, Mark R; Sandler, Dale P; Blair, Aaron; Engel, Lawrence S; Kwok, Richard K; Stewart, Patricia A

2017-01-01

In April 2010, the Deepwater Horizon oil rig caught fire and exploded, releasing almost 5 million barrels of oil into the Gulf of Mexico over the ensuing 3 months. Thousands of oil spill workers participated in the spill response and clean-up efforts. The GuLF STUDY being conducted by the National Institute of Environmental Health Sciences is an epidemiological study to investigate potential adverse health effects among these oil spill clean-up workers. Many volatile chemicals were released from the oil into the air, including total hydrocarbons (THC), which is a composite of the volatile components of oil including benzene, toluene, ethylbenzene, xylene, and hexane (BTEXH). Our goal is to estimate exposure levels to these toxic chemicals for groups of oil spill workers in the study (hereafter called exposure groups, EGs) with likely comparable exposure distributions. A large number of air measurements were collected, but many EGs are characterized by datasets with a large percentage of censored measurements (below the analytic methods' limits of detection) and/or a limited number of measurements. We use THC for which there was less censoring to develop predictive linear models for specific BTEXH air exposures with higher degrees of censoring. We present a novel Bayesian hierarchical linear model that allows us to predict, for different EGs simultaneously, exposure levels of a second chemical while accounting for censoring in both THC and the chemical of interest. We illustrate the methodology by estimating exposure levels for several EGs on the Development Driller III, a rig vessel charged with drilling one of the relief wells. The model provided credible estimates in this example for geometric means, arithmetic means, variances, correlations, and regression coefficients for each group. This approach should be considered when estimating exposures in situations when multiple chemicals are correlated and have varying degrees of censoring. © The Author 2017. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
Sparse Bayesian Learning for Identifying Imaging Biomarkers in AD Prediction

PubMed Central

Shen, Li; Qi, Yuan; Kim, Sungeun; Nho, Kwangsik; Wan, Jing; Risacher, Shannon L.; Saykin, Andrew J.

2010-01-01

We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer’s disease (AD) classification to make accurate prediction and identify critical imaging markers relevant to AD at the same time. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method, and provides sparse models that is easy to interpret. PARD selects the model with the best estimate of the predictive performance instead of choosing the one with the largest marginal model likelihood. Comparative study with support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. Additional comparison with surface-based general linear model (GLM) analysis shows that regions with strongest signals are identified by both GLM and ARD/PARD. While GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures. PMID:20879451

Bayesian Analysis of the Association between Family-Level Factors and Siblings' Dental Caries.

PubMed

Wen, A; Weyant, R J; McNeil, D W; Crout, R J; Neiswanger, K; Marazita, M L; Foxman, B

2017-07-01

We conducted a Bayesian analysis of the association between family-level socioeconomic status and smoking and the prevalence of dental caries among siblings (children from infant to 14 y) among children living in rural and urban Northern Appalachia using data from the Center for Oral Health Research in Appalachia (COHRA). The observed proportion of siblings sharing caries was significantly different from predicted assuming siblings' caries status was independent. Using a Bayesian hierarchical model, we found the inclusion of a household factor significantly improved the goodness of fit. Other findings showed an inverse association between parental education and siblings' caries and a positive association between households with smokers and siblings' caries. Our study strengthens existing evidence suggesting that increased parental education and decreased parental cigarette smoking are associated with reduced childhood caries in the household. Our results also demonstrate the value of a Bayesian approach, which allows us to include household as a random effect, thereby providing more accurate estimates than obtained using generalized linear mixed models.
Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

PubMed

Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

2018-03-01

Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.
Inferring climate variability from skewed proxy records

NASA Astrophysics Data System (ADS)

Emile-Geay, J.; Tingley, M.

2013-12-01

Many paleoclimate analyses assume a linear relationship between the proxy and the target climate variable, and that both the climate quantity and the errors follow normal distributions. An ever-increasing number of proxy records, however, are better modeled using distributions that are heavy-tailed, skewed, or otherwise non-normal, on account of the proxies reflecting non-normally distributed climate variables, or having non-linear relationships with a normally distributed climate variable. The analysis of such proxies requires a different set of tools, and this work serves as a cautionary tale on the danger of making conclusions about the underlying climate from applications of classic statistical procedures to heavily skewed proxy records. Inspired by runoff proxies, we consider an idealized proxy characterized by a nonlinear, thresholded relationship with climate, and describe three approaches to using such a record to infer past climate: (i) applying standard methods commonly used in the paleoclimate literature, without considering the non-linearities inherent to the proxy record; (ii) applying a power transform prior to using these standard methods; (iii) constructing a Bayesian model to invert the mechanistic relationship between the climate and the proxy. We find that neglecting the skewness in the proxy leads to erroneous conclusions and often exaggerates changes in climate variability between different time intervals. In contrast, an explicit treatment of the skewness, using either power transforms or a Bayesian inversion of the mechanistic model for the proxy, yields significantly better estimates of past climate variations. We apply these insights in two paleoclimate settings: (1) a classical sedimentary record from Laguna Pallcacocha, Ecuador (Moy et al., 2002). Our results agree with the qualitative aspects of previous analyses of this record, but quantitative departures are evident and hold implications for how such records are interpreted, and compared to other proxy records. (2) a multiproxy reconstruction of temperature over the Common Era (Mann et al., 2009), where we find that about one third of the records display significant departures from normality. Accordingly, accounting for skewness in proxy predictors has a notable influence on both reconstructed global mean and spatial patterns of temperature change. Inferring climate variability from skewed proxy records thus requires cares, but can be done with relatively simple tools. References - Mann, M. E., Z. Zhang, S. Rutherford, R. S. Bradley, M. K. Hughes, D. Shindell, C. Ammann, G. Faluvegi, and F. Ni (2009), Global signatures and dynamical origins of the little ice age and medieval climate anomaly, Science, 326(5957), 1256-1260, doi:10.1126/science.1177303. - Moy, C., G. Seltzer, D. Rodbell, and D. Anderson (2002), Variability of El Niño/Southern Oscillation activ- ity at millennial timescales during the Holocene epoch, Nature, 420(6912), 162-165.
Variable selection models for genomic selection using whole-genome sequence data and singular value decomposition.

PubMed

Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen

2017-12-27

Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability [Formula: see text] and no effect with probability (1 - [Formula: see text]). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction. For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
Tree Biomass Estimation of Chinese fir (Cunninghamia lanceolata) Based on Bayesian Method

PubMed Central

Zhang, Jianguo

2013-01-01

Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass. PMID:24278198
Tree biomass estimation of Chinese fir (Cunninghamia lanceolata) based on Bayesian method.

PubMed

Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo

2013-01-01

Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation W = a(D2H)b was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass.
Hybrid ICA-Bayesian Network approach reveals distinct effective connectivity differences in schizophrenia

PubMed Central

Kim, D.; Burge, J.; Lane, T.; Pearlson, G. D; Kiehl, K. A; Calhoun, V. D.

2008-01-01

We utilized a discrete dynamic Bayesian network (dDBN) approach (Burge et al., 2007) to determine differences in brain regions between patients with schizophrenia and healthy controls on a measure of effective connectivity, termed the approximate conditional likelihood score (ACL) (Burge and Lane, 2005). The ACL score represents a class-discriminative measure of effective connectivity by measuring the relative likelihood of the correlation between brain regions in one group versus another. The algorithm is capable of finding non-linear relationships between brain regions because it uses discrete rather than continuous values and attempts to model temporal relationships with a first-order Markov and stationary assumption constraint (Papoulis, 1991). Since Bayesian networks are overly sensitive to noisy data, we introduced an independent component analysis (ICA) filtering approach that attempted to reduce the noise found in fMRI data by unmixing the raw datasets into a set of independent spatial component maps. Components that represented noise were removed and the remaining components reconstructed into the dimensions of the original fMRI datasets. We applied the dDBN algorithm to a group of 35 patients with schizophrenia and 35 matched healthy controls using an ICA filtered and unfiltered approach. We determined that filtering the data significantly improved the magnitude of the ACL score. Patients showed the greatest ACL scores in several regions, most markedly the cerebellar vermis and hemispheres. Our findings suggest that schizophrenia patients exhibit weaker connectivity than healthy controls in multiple regions, including bilateral temporal and frontal cortices, plus cerebellum during an auditory paradigm. PMID:18602482
Combining peptide recognition specificity and context information for the prediction of the 14-3-3-mediated interactome in S. cerevisiae and H. sapiens.

PubMed

Panni, Simona; Montecchi-Palazzi, Luisa; Kiemer, Lars; Cabibbo, Andrea; Paoluzi, Serena; Santonico, Elena; Landgraf, Christiane; Volkmer-Engert, Rudolf; Bachi, Angela; Castagnoli, Luisa; Cesareni, Gianni

2011-01-01

Large-scale interaction studies contribute the largest fraction of protein interactions information in databases. However, co-purification of non-specific or indirect ligands, often results in data sets that are affected by a considerable number of false positives. For the fraction of interactions mediated by short linear peptides, we present here a combined experimental and computational strategy for ranking the reliability of the inferred partners. We apply this strategy to the family of 14-3-3 domains. We have first characterized the recognition specificity of this domain family, largely confirming the results of previous analyses, while revealing new features of the preferred sequence context of 14-3-3 phospho-peptide partners. Notably, a proline next to the carboxy side of the phospho-amino acid functions as a potent inhibitor of 14-3-3 binding. The position-specific information about residue preference was encoded in a scoring matrix and two regular expressions. The integration of these three features in a single predictive model outperforms publicly available prediction tools. Next we have combined, by a naïve Bayesian approach, these "peptide features" with "protein features", such as protein co-expression and co-localization. Our approach provides an orthogonal reliability assessment and maps with high confidence the 14-3-3 peptide target on the partner proteins. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Electron acceleration via magnetic island coalescence

NASA Astrophysics Data System (ADS)

Shinohara, I.; Yumura, T.; Tanaka, K. G.; Fujimoto, M.

2009-06-01

Electron acceleration via fast magnetic island coalescence that happens as quick magnetic reconnection triggering (QMRT) proceeds has been studied. We have carried out a three-dimensional full kinetic simulation of the Harris current sheet with a large enough simulation run for two magnetic islands coalescence. Due to the strong inductive electric field associated with the non-linear evolution of the lower-hybrid-drift instability and the magnetic island coalescence process observed in the non-linear stage of the collisionless tearing mode, electrons are significantly accelerated at around the neutral sheet and the subsequent X-line. The accelerated meandering electrons generated by the non-linear evolution of the lower-hybrid-drift instability are resulted in QMRT, and QMRT leads to fast magnetic island coalescence. As a whole, the reconnection triggering and its transition to large-scale structure work as an effective electron accelerator.
Lifting primordial non-Gaussianity above the noise

DOE Office of Scientific and Technical Information (OSTI.GOV)

Welling, Yvette; Woude, Drian van der; Pajer, Enrico, E-mail: welling@strw.leidenuniv.nl, E-mail: D.C.vanderWoude@uu.nl, E-mail: enrico.pajer@gmail.com

2016-08-01

Primordial non-Gaussianity (PNG) in Large Scale Structures is obfuscated by the many additional sources of non-linearity. Within the Effective Field Theory approach to Standard Perturbation Theory, we show that matter non-linearities in the bispectrum can be modeled sufficiently well to strengthen current bounds with near future surveys, such as Euclid. We find that the EFT corrections are crucial to this improvement in sensitivity. Yet, our understanding of non-linearities is still insufficient to reach important theoretical benchmarks for equilateral PNG, while, for local PNG, our forecast is more optimistic. We consistently account for the theoretical error intrinsic to the perturbative approachmore » and discuss the details of its implementation in Fisher forecasts.« less
Stiffness optimization of non-linear elastic structures

DOE PAGES

Wallin, Mathias; Ivarsson, Niklas; Tortorelli, Daniel

2017-11-13

Our paper revisits stiffness optimization of non-linear elastic structures. Due to the non-linearity, several possible stiffness measures can be identified and in this work conventional compliance, i.e. secant stiffness designs are compared to tangent stiffness designs. The optimization problem is solved by the method of moving asymptotes and the sensitivities are calculated using the adjoint method. And for the tangent cost function it is shown that although the objective involves the third derivative of the strain energy an efficient formulation for calculating the sensitivity can be obtained. Loss of convergence due to large deformations in void regions is addressed bymore » using a fictitious strain energy such that small strain linear elasticity is approached in the void regions. We formulate a well-posed topology optimization problem by using restriction which is achieved via a Helmholtz type filter. The numerical examples provided show that for low load levels, the designs obtained from the different stiffness measures coincide whereas for large deformations significant differences are observed.« less
Stiffness optimization of non-linear elastic structures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wallin, Mathias; Ivarsson, Niklas; Tortorelli, Daniel

Our paper revisits stiffness optimization of non-linear elastic structures. Due to the non-linearity, several possible stiffness measures can be identified and in this work conventional compliance, i.e. secant stiffness designs are compared to tangent stiffness designs. The optimization problem is solved by the method of moving asymptotes and the sensitivities are calculated using the adjoint method. And for the tangent cost function it is shown that although the objective involves the third derivative of the strain energy an efficient formulation for calculating the sensitivity can be obtained. Loss of convergence due to large deformations in void regions is addressed bymore » using a fictitious strain energy such that small strain linear elasticity is approached in the void regions. We formulate a well-posed topology optimization problem by using restriction which is achieved via a Helmholtz type filter. The numerical examples provided show that for low load levels, the designs obtained from the different stiffness measures coincide whereas for large deformations significant differences are observed.« less
Inferring source attribution from a multiyear multisource data set of Salmonella in Minnesota.

PubMed

Ahlstrom, C; Muellner, P; Spencer, S E F; Hong, S; Saupe, A; Rovira, A; Hedberg, C; Perez, A; Muellner, U; Alvarez, J

2017-12-01

Salmonella enterica is a global health concern because of its widespread association with foodborne illness. Bayesian models have been developed to attribute the burden of human salmonellosis to specific sources with the ultimate objective of prioritizing intervention strategies. Important considerations of source attribution models include the evaluation of the quality of input data, assessment of whether attribution results logically reflect the data trends and identification of patterns within the data that might explain the detailed contribution of different sources to the disease burden. Here, more than 12,000 non-typhoidal Salmonella isolates from human, bovine, porcine, chicken and turkey sources that originated in Minnesota were analysed. A modified Bayesian source attribution model (available in a dedicated R package), accounting for non-sampled sources of infection, attributed 4,672 human cases to sources assessed here. Most (60%) cases were attributed to chicken, although there was a spike in cases attributed to a non-sampled source in the second half of the study period. Molecular epidemiological analysis methods were used to supplement risk modelling, and a visual attribution application was developed to facilitate data exploration and comprehension of the large multiyear data set assessed here. A large amount of within-source diversity and low similarity between sources was observed, and visual exploration of data provided clues into variations driving the attribution modelling results. Results from this pillared approach provided first attribution estimates for Salmonella in Minnesota and offer an understanding of current data gaps as well as key pathogen population features, such as serotype frequency, similarity and diversity across the sources. Results here will be used to inform policy and management strategies ultimately intended to prevent and control Salmonella infection in the state. © 2017 Blackwell Verlag GmbH.
Prediction of road accidents: A Bayesian hierarchical approach.

PubMed

Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H

2013-03-01

In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any road network provided that the required data are available. Copyright © 2012 Elsevier Ltd. All rights reserved.
An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling

NASA Astrophysics Data System (ADS)

Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton

2013-10-01

Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with numerous model executions required by exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for the PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating high-dimensional posteriori distribution. The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.
Non-Linear Dynamics and Emergence in Laboratory Fusion Plasmas

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hnat, B.

2011-09-22

Turbulent behaviour of laboratory fusion plasma system is modelled using extended Hasegawa-Wakatani equations. The model is solved numerically using finite difference techniques. We discuss non-linear effects in such a system in the presence of the micro-instabilities, specifically a drift wave instability. We explore particle dynamics in different range of parameters and show that the transport changes from diffusive to non-diffusive when large directional flows are developed.
Bayesian-information-gap decision theory with an application to CO 2 sequestration

DOE PAGES

O'Malley, D.; Vesselinov, V. V.

2015-09-04

Decisions related to subsurface engineering problems such as groundwater management, fossil fuel production, and geologic carbon sequestration are frequently challenging because of an overabundance of uncertainties (related to conceptualizations, parameters, observations, etc.). Because of the importance of these problems to agriculture, energy, and the climate (respectively), good decisions that are scientifically defensible must be made despite the uncertainties. We describe a general approach to making decisions for challenging problems such as these in the presence of severe uncertainties that combines probabilistic and non-probabilistic methods. The approach uses Bayesian sampling to assess parametric uncertainty and Information-Gap Decision Theory (IGDT) to addressmore » model inadequacy. The combined approach also resolves an issue that frequently arises when applying Bayesian methods to real-world engineering problems related to the enumeration of possible outcomes. In the case of zero non-probabilistic uncertainty, the method reduces to a Bayesian method. Lastly, to illustrate the approach, we apply it to a site-selection decision for geologic CO 2 sequestration.« less
Non-Gaussian probabilistic MEG source localisation based on kernel density estimation☆

PubMed Central

Mohseni, Hamid R.; Kringelbach, Morten L.; Woolrich, Mark W.; Baker, Adam; Aziz, Tipu Z.; Probert-Smith, Penny

2014-01-01

There is strong evidence to suggest that data recorded from magnetoencephalography (MEG) follows a non-Gaussian distribution. However, existing standard methods for source localisation model the data using only second order statistics, and therefore use the inherent assumption of a Gaussian distribution. In this paper, we present a new general method for non-Gaussian source estimation of stationary signals for localising brain activity from MEG data. By providing a Bayesian formulation for MEG source localisation, we show that the source probability density function (pdf), which is not necessarily Gaussian, can be estimated using multivariate kernel density estimators. In the case of Gaussian data, the solution of the method is equivalent to that of widely used linearly constrained minimum variance (LCMV) beamformer. The method is also extended to handle data with highly correlated sources using the marginal distribution of the estimated joint distribution, which, in the case of Gaussian measurements, corresponds to the null-beamformer. The proposed non-Gaussian source localisation approach is shown to give better spatial estimates than the LCMV beamformer, both in simulations incorporating non-Gaussian signals, and in real MEG measurements of auditory and visual evoked responses, where the highly correlated sources are known to be difficult to estimate. PMID:24055702
Influence of Climate Change on Flood Hazard using Climate Informed Bayesian Hierarchical Model in Johnson Creek River

NASA Astrophysics Data System (ADS)

Zarekarizi, M.; Moradkhani, H.

2015-12-01

Extreme events are proven to be affected by climate change, influencing hydrologic simulations for which stationarity is usually a main assumption. Studies have discussed that this assumption would lead to large bias in model estimations and higher flood hazard consequently. Getting inspired by the importance of non-stationarity, we determined how the exceedance probabilities have changed over time in Johnson Creek River, Oregon. This could help estimate the probability of failure of a structure that was primarily designed to resist less likely floods according to common practice. Therefore, we built a climate informed Bayesian hierarchical model and non-stationarity was considered in modeling framework. Principle component analysis shows that North Atlantic Oscillation (NAO), Western Pacific Index (WPI) and Eastern Asia (EA) are mostly affecting stream flow in this river. We modeled flood extremes using peaks over threshold (POT) method rather than conventional annual maximum flood (AMF) mainly because it is possible to base the model on more information. We used available threshold selection methods to select a suitable threshold for the study area. Accounting for non-stationarity, model parameters vary through time with climate indices. We developed a couple of model scenarios and chose one which could best explain the variation in data based on performance measures. We also estimated return periods under non-stationarity condition. Results show that ignoring stationarity could increase the flood hazard up to four times which could increase the probability of an in-stream structure being overtopped.
Astrophysical data analysis with information field theory

NASA Astrophysics Data System (ADS)

Enßlin, Torsten

2014-12-01

Non-parametric imaging and data analysis in astrophysics and cosmology can be addressed by information field theory (IFT), a means of Bayesian, data based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms. It exploits spatial correlations of the signal fields even for nonlinear and non-Gaussian signal inference problems. The alleviation of a perception threshold for recovering signals of unknown correlation structure by using IFT will be discussed in particular as well as a novel improvement on instrumental self-calibration schemes. IFT can be applied to many areas. Here, applications in in cosmology (cosmic microwave background, large-scale structure) and astrophysics (galactic magnetism, radio interferometry) are presented.

The current state of Bayesian methods in medical product development: survey results and recommendations from the DIA Bayesian Scientific Working Group.

PubMed

Natanegara, Fanni; Neuenschwander, Beat; Seaman, John W; Kinnersley, Nelson; Heilmann, Cory R; Ohlssen, David; Rochester, George

2014-01-01

Bayesian applications in medical product development have recently gained popularity. Despite many advances in Bayesian methodology and computations, increase in application across the various areas of medical product development has been modest. The DIA Bayesian Scientific Working Group (BSWG), which includes representatives from industry, regulatory agencies, and academia, has adopted the vision to ensure Bayesian methods are well understood, accepted more broadly, and appropriately utilized to improve decision making and enhance patient outcomes. As Bayesian applications in medical product development are wide ranging, several sub-teams were formed to focus on various topics such as patient safety, non-inferiority, prior specification, comparative effectiveness, joint modeling, program-wide decision making, analytical tools, and education. The focus of this paper is on the recent effort of the BSWG Education sub-team to administer a Bayesian survey to statisticians across 17 organizations involved in medical product development. We summarize results of this survey, from which we provide recommendations on how to accelerate progress in Bayesian applications throughout medical product development. The survey results support findings from the literature and provide additional insight on regulatory acceptance of Bayesian methods and information on the need for a Bayesian infrastructure within an organization. The survey findings support the claim that only modest progress in areas of education and implementation has been made recently, despite substantial progress in Bayesian statistical research and software availability. Copyright © 2013 John Wiley & Sons, Ltd.
Internal Medicine residents use heuristics to estimate disease probability.

PubMed

Phang, Sen Han; Ravani, Pietro; Schaefer, Jeffrey; Wright, Bruce; McLaughlin, Kevin

2015-01-01

Training in Bayesian reasoning may have limited impact on accuracy of probability estimates. In this study, our goal was to explore whether residents previously exposed to Bayesian reasoning use heuristics rather than Bayesian reasoning to estimate disease probabilities. We predicted that if residents use heuristics then post-test probability estimates would be increased by non-discriminating clinical features or a high anchor for a target condition. We randomized 55 Internal Medicine residents to different versions of four clinical vignettes and asked them to estimate probabilities of target conditions. We manipulated the clinical data for each vignette to be consistent with either 1) using a representative heuristic, by adding non-discriminating prototypical clinical features of the target condition, or 2) using anchoring with adjustment heuristic, by providing a high or low anchor for the target condition. When presented with additional non-discriminating data the odds of diagnosing the target condition were increased (odds ratio (OR) 2.83, 95% confidence interval [1.30, 6.15], p = 0.009). Similarly, the odds of diagnosing the target condition were increased when a high anchor preceded the vignette (OR 2.04, [1.09, 3.81], p = 0.025). Our findings suggest that despite previous exposure to the use of Bayesian reasoning, residents use heuristics, such as the representative heuristic and anchoring with adjustment, to estimate probabilities. Potential reasons for attribute substitution include the relative cognitive ease of heuristics vs. Bayesian reasoning or perhaps residents in their clinical practice use gist traces rather than precise probability estimates when diagnosing.
Artificial Intelligence (AI) Center of Excellence at the University of Pennsylvania

DTIC Science & Technology

1995-07-01

that controls impact forces. Robust Location Estimation for MLR and Non-MLR Distributions (Dissertation Proposal) Gerda L. Kamberova MS-CIS-92-28...Bayesian Approach To Computer Vision Problems Gerda L. Kamberova MS-CIS-92-29 GRASP LAB 310 The object of our study is the Bayesian approach in...Estimation for MLR and Non-MLR Distributions (Dissertation) Gerda L. Kamberova MS-CIS-92-93 GRASP LAB 340 We study the problem of estimating an unknown
Bayesian deconvolution and quantification of metabolites in complex 1D NMR spectra using BATMAN.

PubMed

Hao, Jie; Liebeke, Manuel; Astle, William; De Iorio, Maria; Bundy, Jacob G; Ebbels, Timothy M D

2014-01-01

Data processing for 1D NMR spectra is a key bottleneck for metabolomic and other complex-mixture studies, particularly where quantitative data on individual metabolites are required. We present a protocol for automated metabolite deconvolution and quantification from complex NMR spectra by using the Bayesian automated metabolite analyzer for NMR (BATMAN) R package. BATMAN models resonances on the basis of a user-controllable set of templates, each of which specifies the chemical shifts, J-couplings and relative peak intensities for a single metabolite. Peaks are allowed to shift position slightly between spectra, and peak widths are allowed to vary by user-specified amounts. NMR signals not captured by the templates are modeled non-parametrically by using wavelets. The protocol covers setting up user template libraries, optimizing algorithmic input parameters, improving prior information on peak positions, quality control and evaluation of outputs. The outputs include relative concentration estimates for named metabolites together with associated Bayesian uncertainty estimates, as well as the fit of the remainder of the spectrum using wavelets. Graphical diagnostics allow the user to examine the quality of the fit for multiple spectra simultaneously. This approach offers a workflow to analyze large numbers of spectra and is expected to be useful in a wide range of metabolomics studies.
Analysing the natural population growth of a large marine mammal after a depletive harvest.

PubMed

Romero, M A; Grandi, M F; Koen-Alonso, M; Svendsen, G; Ocampo Reinaldo, M; García, N A; Dans, S L; González, R; Crespo, E A

2017-07-13

An understanding of the underlying processes and comprehensive history of population growth after a harvest-driven depletion is necessary when assessing the long-term effectiveness of management and conservation strategies. The South American sea lion (SASL), Otaria flavescens, is the most conspicuous marine mammal along the South American coasts, where it has been heavily exploited. As a consequence of this exploitation, many of its populations were decimated during the early 20th century but currently show a clear recovery. The aim of this study was to assess SASL population recovery by applying a Bayesian state-space modelling framework. We were particularly interested in understanding how the population responds at low densities, how human-induced mortality interplays with natural mechanisms, and how density-dependence may regulate population growth. The observed population trajectory of SASL shows a non-linear relationship with density, recovering with a maximum increase rate of 0.055. However, 50 years after hunting cessation, the population still represents only 40% of its pre-exploitation abundance. Considering that the SASL population in this region represents approximately 72% of the species abundance within the Atlantic Ocean, the present analysis provides insights into the potential mechanisms regulating the dynamics of SASL populations across the global distributional range of the species.
Fully probabilistic earthquake source inversion on teleseismic scales

NASA Astrophysics Data System (ADS)

Stähler, Simon; Sigloch, Karin

2017-04-01

Seismic source inversion is a non-linear problem in seismology where not just the earthquake parameters but also estimates of their uncertainties are of great practical importance. We have developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. These unknowns are parameterised efficiently by harnessing as prior knowledge solutions from a large number of non-Bayesian inversions. The source time function is expressed as a weighted sum of a small number of empirical orthogonal functions, which were derived from a catalogue of >1000 source time functions (STFs) by a principal component analysis. We use a likelihood model based on the cross-correlation misfit between observed and predicted waveforms. The resulting ensemble of solutions provides full uncertainty and covariance information for the source parameters, and permits propagating these source uncertainties into travel time estimates used for seismic tomography. The computational effort is such that routine, global estimation of earthquake mechanisms and source time functions from teleseismic broadband waveforms is feasible. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, more commonly used as sample-by-sample measures of misfit based on distances between individual time samples. From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on signal decorrelation D of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. References: Stähler, S. C. and Sigloch, K.: Fully probabilistic seismic source inversion - Part 1: Efficient parameterisation, Solid Earth, 5, 1055-1069, doi:10.5194/se-5-1055-2014, 2014. Stähler, S. C. and Sigloch, K.: Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances, Solid Earth, 7, 1521-1536, doi:10.5194/se-7-1521-2016, 2016.
Construction of monitoring model and algorithm design on passenger security during shipping based on improved Bayesian network.

PubMed

Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

2014-01-01

A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping.
Construction of Monitoring Model and Algorithm Design on Passenger Security during Shipping Based on Improved Bayesian Network

PubMed Central

Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

2014-01-01

A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping. PMID:25254227
Estimating relative risks in multicenter studies with a small number of centers - which methods to use? A simulation study.

PubMed

Pedroza, Claudia; Truong, Van Thi Thanh

2017-11-02

Analyses of multicenter studies often need to account for center clustering to ensure valid inference. For binary outcomes, it is particularly challenging to properly adjust for center when the number of centers or total sample size is small, or when there are few events per center. Our objective was to evaluate the performance of generalized estimating equation (GEE) log-binomial and Poisson models, generalized linear mixed models (GLMMs) assuming binomial and Poisson distributions, and a Bayesian binomial GLMM to account for center effect in these scenarios. We conducted a simulation study with few centers (≤30) and 50 or fewer subjects per center, using both a randomized controlled trial and an observational study design to estimate relative risk. We compared the GEE and GLMM models with a log-binomial model without adjustment for clustering in terms of bias, root mean square error (RMSE), and coverage. For the Bayesian GLMM, we used informative neutral priors that are skeptical of large treatment effects that are almost never observed in studies of medical interventions. All frequentist methods exhibited little bias, and the RMSE was very similar across the models. The binomial GLMM had poor convergence rates, ranging from 27% to 85%, but performed well otherwise. The results show that both GEE models need to use small sample corrections for robust SEs to achieve proper coverage of 95% CIs. The Bayesian GLMM had similar convergence rates but resulted in slightly more biased estimates for the smallest sample sizes. However, it had the smallest RMSE and good coverage across all scenarios. These results were very similar for both study designs. For the analyses of multicenter studies with a binary outcome and few centers, we recommend adjustment for center with either a GEE log-binomial or Poisson model with appropriate small sample corrections or a Bayesian binomial GLMM with informative priors.
Maximum a posteriori Bayesian estimation of mycophenolic Acid area under the concentration-time curve: is this clinically useful for dosage prediction yet?

PubMed

Staatz, Christine E; Tett, Susan E

2011-12-01

This review seeks to summarize the available data about Bayesian estimation of area under the plasma concentration-time curve (AUC) and dosage prediction for mycophenolic acid (MPA) and evaluate whether sufficient evidence is available for routine use of Bayesian dosage prediction in clinical practice. A literature search identified 14 studies that assessed the predictive performance of maximum a posteriori Bayesian estimation of MPA AUC and one report that retrospectively evaluated how closely dosage recommendations based on Bayesian forecasting achieved targeted MPA exposure. Studies to date have mostly been undertaken in renal transplant recipients, with limited investigation in patients treated with MPA for autoimmune disease or haematopoietic stem cell transplantation. All of these studies have involved use of the mycophenolate mofetil (MMF) formulation of MPA, rather than the enteric-coated mycophenolate sodium (EC-MPS) formulation. Bias associated with estimation of MPA AUC using Bayesian forecasting was generally less than 10%. However some difficulties with imprecision was evident, with values ranging from 4% to 34% (based on estimation involving two or more concentration measurements). Evaluation of whether MPA dosing decisions based on Bayesian forecasting (by the free website service https://pharmaco.chu-limoges.fr) achieved target drug exposure has only been undertaken once. When MMF dosage recommendations were applied by clinicians, a higher proportion (72-80%) of subsequent estimated MPA AUC values were within the 30-60 mg · h/L target range, compared with when dosage recommendations were not followed (only 39-57% within target range). Such findings provide evidence that Bayesian dosage prediction is clinically useful for achieving target MPA AUC. This study, however, was retrospective and focussed only on adult renal transplant recipients. Furthermore, in this study, Bayesian-generated AUC estimations and dosage predictions were not compared with a later full measured AUC but rather with a further AUC estimate based on a second Bayesian analysis. This study also provided some evidence that a useful monitoring schedule for MPA AUC following adult renal transplant would be every 2 weeks during the first month post-transplant, every 1-3 months between months 1 and 12, and each year thereafter. It will be interesting to see further validations in different patient groups using the free website service. In summary, the predictive performance of Bayesian estimation of MPA, comparing estimated with measured AUC values, has been reported in several studies. However, the next step of predicting dosages based on these Bayesian-estimated AUCs, and prospectively determining how closely these predicted dosages give drug exposure matching targeted AUCs, remains largely unaddressed. Further prospective studies are required, particularly in non-renal transplant patients and with the EC-MPS formulation. Other important questions remain to be answered, such as: do Bayesian forecasting methods devised to date use the best population pharmacokinetic models or most accurate algorithms; are the methods simple to use for routine clinical practice; do the algorithms actually improve dosage estimations beyond empirical recommendations in all groups that receive MPA therapy; and, importantly, do the dosage predictions, when followed, improve patient health outcomes?
Classification of emotional states from electrocardiogram signals: a non-linear approach based on hurst

PubMed Central

2013-01-01

Background Identifying the emotional state is helpful in applications involving patients with autism and other intellectual disabilities; computer-based training, human computer interaction etc. Electrocardiogram (ECG) signals, being an activity of the autonomous nervous system (ANS), reflect the underlying true emotional state of a person. However, the performance of various methods developed so far lacks accuracy, and more robust methods need to be developed to identify the emotional pattern associated with ECG signals. Methods Emotional ECG data was obtained from sixty participants by inducing the six basic emotional states (happiness, sadness, fear, disgust, surprise and neutral) using audio-visual stimuli. The non-linear feature ‘Hurst’ was computed using Rescaled Range Statistics (RRS) and Finite Variance Scaling (FVS) methods. New Hurst features were proposed by combining the existing RRS and FVS methods with Higher Order Statistics (HOS). The features were then classified using four classifiers – Bayesian Classifier, Regression Tree, K- nearest neighbor and Fuzzy K-nearest neighbor. Seventy percent of the features were used for training and thirty percent for testing the algorithm. Results Analysis of Variance (ANOVA) conveyed that Hurst and the proposed features were statistically significant (p < 0.001). Hurst computed using RRS and FVS methods showed similar classification accuracy. The features obtained by combining FVS and HOS performed better with a maximum accuracy of 92.87% and 76.45% for classifying the six emotional states using random and subject independent validation respectively. Conclusions The results indicate that the combination of non-linear analysis and HOS tend to capture the finer emotional changes that can be seen in healthy ECG data. This work can be further fine tuned to develop a real time system. PMID:23680041
Power in Bayesian Mediation Analysis for Small Sample Research

PubMed Central

Miočević, Milica; MacKinnon, David P.; Levy, Roy

2018-01-01

It was suggested that Bayesian methods have potential for increasing power in mediation analysis (Koopman, Howe, Hollenbeck, & Sin, 2015; Yuan & MacKinnon, 2009). This paper compares the power of Bayesian credibility intervals for the mediated effect to the power of normal theory, distribution of the product, percentile, and bias-corrected bootstrap confidence intervals at N≤ 200. Bayesian methods with diffuse priors have power comparable to the distribution of the product and bootstrap methods, and Bayesian methods with informative priors had the most power. Varying degrees of precision of prior distributions were also examined. Increased precision led to greater power only when N≥ 100 and the effects were small, N < 60 and the effects were large, and N < 200 and the effects were medium. An empirical example from psychology illustrated a Bayesian analysis of the single mediator model from prior selection to interpreting results. PMID:29662296
Power in Bayesian Mediation Analysis for Small Sample Research.

PubMed

Miočević, Milica; MacKinnon, David P; Levy, Roy

2017-01-01

It was suggested that Bayesian methods have potential for increasing power in mediation analysis (Koopman, Howe, Hollenbeck, & Sin, 2015; Yuan & MacKinnon, 2009). This paper compares the power of Bayesian credibility intervals for the mediated effect to the power of normal theory, distribution of the product, percentile, and bias-corrected bootstrap confidence intervals at N≤ 200. Bayesian methods with diffuse priors have power comparable to the distribution of the product and bootstrap methods, and Bayesian methods with informative priors had the most power. Varying degrees of precision of prior distributions were also examined. Increased precision led to greater power only when N≥ 100 and the effects were small, N < 60 and the effects were large, and N < 200 and the effects were medium. An empirical example from psychology illustrated a Bayesian analysis of the single mediator model from prior selection to interpreting results.
Bayesian approach to non-Gaussian field statistics for diffusive broadband terahertz pulses.

PubMed

Pearce, Jeremy; Jian, Zhongping; Mittleman, Daniel M

2005-11-01

We develop a closed-form expression for the probability distribution function for the field components of a diffusive broadband wave propagating through a random medium. We consider each spectral component to provide an individual observation of a random variable, the configurationally averaged spectral intensity. Since the intensity determines the variance of the field distribution at each frequency, this random variable serves as the Bayesian prior that determines the form of the non-Gaussian field statistics. This model agrees well with experimental results.
Bayes multiple decision functions.

PubMed

Wu, Wensong; Peña, Edsel A

2013-01-01

This paper deals with the problem of simultaneously making many ( M ) binary decisions based on one realization of a random data matrix X . M is typically large and X will usually have M rows associated with each of the M decisions to make, but for each row the data may be low dimensional. Such problems arise in many practical areas such as the biological and medical sciences, where the available dataset is from microarrays or other high-throughput technology and with the goal being to decide which among of many genes are relevant with respect to some phenotype of interest; in the engineering and reliability sciences; in astronomy; in education; and in business. A Bayesian decision-theoretic approach to this problem is implemented with the overall loss function being a cost-weighted linear combination of Type I and Type II loss functions. The class of loss functions considered allows for use of the false discovery rate (FDR), false nondiscovery rate (FNR), and missed discovery rate (MDR) in assessing the quality of decision. Through this Bayesian paradigm, the Bayes multiple decision function (BMDF) is derived and an efficient algorithm to obtain the optimal Bayes action is described. In contrast to many works in the literature where the rows of the matrix X are assumed to be stochastically independent, we allow a dependent data structure with the associations obtained through a class of frailty-induced Archimedean copulas. In particular, non-Gaussian dependent data structure, which is typical with failure-time data, can be entertained. The numerical implementation of the determination of the Bayes optimal action is facilitated through sequential Monte Carlo techniques. The theory developed could also be extended to the problem of multiple hypotheses testing, multiple classification and prediction, and high-dimensional variable selection. The proposed procedure is illustrated for the simple versus simple hypotheses setting and for the composite hypotheses setting through simulation studies. The procedure is also applied to a subset of a microarray data set from a colon cancer study.
CMB hemispherical asymmetry from non-linear isocurvature perturbations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Assadullahi, Hooshyar; Wands, David; Firouzjahi, Hassan

2015-04-01

We investigate whether non-adiabatic perturbations from inflation could produce an asymmetric distribution of temperature anisotropies on large angular scales in the cosmic microwave background (CMB). We use a generalised non-linear δ N formalism to calculate the non-Gaussianity of the primordial density and isocurvature perturbations due to the presence of non-adiabatic, but approximately scale-invariant field fluctuations during multi-field inflation. This local-type non-Gaussianity leads to a correlation between very long wavelength inhomogeneities, larger than our observable horizon, and smaller scale fluctuations in the radiation and matter density. Matter isocurvature perturbations contribute primarily to low CMB multipoles and hence can lead to a hemisphericalmore » asymmetry on large angular scales, with negligible asymmetry on smaller scales. In curvaton models, where the matter isocurvature perturbation is partly correlated with the primordial density perturbation, we are unable to obtain a significant asymmetry on large angular scales while respecting current observational constraints on the observed quadrupole. However in the axion model, where the matter isocurvature and primordial density perturbations are uncorrelated, we find it may be possible to obtain a significant asymmetry due to isocurvature modes on large angular scales. Such an isocurvature origin for the hemispherical asymmetry would naturally give rise to a distinctive asymmetry in the CMB polarisation on large scales.« less
Iterative Assessment of Statistically-Oriented and Standard Algorithms for Determining Muscle Onset with Intramuscular Electromyography.

PubMed

Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A

2017-12-01

The onset of muscle activity, as measured by electromyography (EMG), is a commonly applied metric in biomechanics. Intramuscular EMG is often used to examine deep musculature and there are currently no studies examining the effectiveness of algorithms for intramuscular EMG onset. The present study examines standard surface EMG onset algorithms (linear envelope, Teager-Kaiser Energy Operator, and sample entropy) and novel algorithms (time series mean-variance analysis, sequential/batch processing with parametric and nonparametric methods, and Bayesian changepoint analysis). Thirteen male and 5 female subjects had intramuscular EMG collected during isolated biceps brachii and vastus lateralis contractions, resulting in 103 trials. EMG onset was visually determined twice by 3 blinded reviewers. Since the reliability of visual onset was high (ICC (1,1) : 0.92), the mean of the 6 visual assessments was contrasted with the algorithmic approaches. Poorly performing algorithms were stepwise eliminated via (1) root mean square error analysis, (2) algorithm failure to identify onset/premature onset, (3) linear regression analysis, and (4) Bland-Altman plots. The top performing algorithms were all based on Bayesian changepoint analysis of rectified EMG and were statistically indistinguishable from visual analysis. Bayesian changepoint analysis has the potential to produce more reliable, accurate, and objective intramuscular EMG onset results than standard methodologies.
Assessing Local Model Adequacy in Bayesian Hierarchical Models Using the Partitioned Deviance Information Criterion

PubMed Central

Wheeler, David C.; Hickson, DeMarc A.; Waller, Lance A.

2010-01-01

Many diagnostic tools and goodness-of-fit measures, such as the Akaike information criterion (AIC) and the Bayesian deviance information criterion (DIC), are available to evaluate the overall adequacy of linear regression models. In addition, visually assessing adequacy in models has become an essential part of any regression analysis. In this paper, we focus on a spatial consideration of the local DIC measure for model selection and goodness-of-fit evaluation. We use a partitioning of the DIC into the local DIC, leverage, and deviance residuals to assess local model fit and influence for both individual observations and groups of observations in a Bayesian framework. We use visualization of the local DIC and differences in local DIC between models to assist in model selection and to visualize the global and local impacts of adding covariates or model parameters. We demonstrate the utility of the local DIC in assessing model adequacy using HIV prevalence data from pregnant women in the Butare province of Rwanda during 1989-1993 using a range of linear model specifications, from global effects only to spatially varying coefficient models, and a set of covariates related to sexual behavior. Results of applying the diagnostic visualization approach include more refined model selection and greater understanding of the models as applied to the data. PMID:21243121
Wavelets, non-linearity and turbulence in fusion plasmas

NASA Astrophysics Data System (ADS)

van Milligen, B. Ph.

Introduction Linear spectral analysis tools Wavelet analysis Wavelet spectra and coherence Joint wavelet phase-frequency spectra Non-linear spectral analysis tools Wavelet bispectra and bicoherence Interpretation of the bicoherence Analysis of computer-generated data Coupled van der Pol oscillators A large eddy simulation model for two-fluid plasma turbulence A long wavelength plasma drift wave model Analysis of plasma edge turbulence from Langmuir probe data Radial coherence observed on the TJ-IU torsatron Bicoherence profile at the L/H transition on CCT Conclusions
A network model of successive partitioning-limited solute diffusion through the stratum corneum.

PubMed

Schumm, Phillip; Scoglio, Caterina M; van der Merwe, Deon

2010-02-07

As the most exposed point of contact with the external environment, the skin is an important barrier to many chemical exposures, including medications, potentially toxic chemicals and cosmetics. Traditional dermal absorption models treat the stratum corneum lipids as a homogenous medium through which solutes diffuse according to Fick's first law of diffusion. This approach does not explain non-linear absorption and irregular distribution patterns within the stratum corneum lipids as observed in experimental data. A network model, based on successive partitioning-limited solute diffusion through the stratum corneum, where the lipid structure is represented by a large, sparse, and regular network where nodes have variable characteristics, offers an alternative, efficient, and flexible approach to dermal absorption modeling that simulates non-linear absorption data patterns. Four model versions are presented: two linear models, which have unlimited node capacities, and two non-linear models, which have limited node capacities. The non-linear model outputs produce absorption to dose relationships that can be best characterized quantitatively by using power equations, similar to the equations used to describe non-linear experimental data.

Benchmarking for Bayesian Reinforcement Learning

PubMed Central

Ernst, Damien; Couëtoux, Adrien

2016-01-01

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed. PMID:27304891
Benchmarking for Bayesian Reinforcement Learning.

PubMed

Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael

2016-01-01

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.
Whole Organism High-Content Screening by Label-Free, Image-Based Bayesian Classification for Parasitic Diseases

PubMed Central

Paveley, Ross A.; Mansour, Nuha R.; Hallyburton, Irene; Bleicher, Leo S.; Benn, Alex E.; Mikic, Ivana; Guidi, Alessandra; Gilbert, Ian H.; Hopkins, Andrew L.; Bickle, Quentin D.

2012-01-01

Sole reliance on one drug, Praziquantel, for treatment and control of schistosomiasis raises concerns about development of widespread resistance, prompting renewed interest in the discovery of new anthelmintics. To discover new leads we designed an automated label-free, high content-based, high throughput screen (HTS) to assess drug-induced effects on in vitro cultured larvae (schistosomula) using bright-field imaging. Automatic image analysis and Bayesian prediction models define morphological damage, hit/non-hit prediction and larval phenotype characterization. Motility was also assessed from time-lapse images. In screening a 10,041 compound library the HTS correctly detected 99.8% of the hits scored visually. A proportion of these larval hits were also active in an adult worm ex-vivo screen and are the subject of ongoing studies. The method allows, for the first time, screening of large compound collections against schistosomes and the methods are adaptable to other whole organism and cell-based screening by morphology and motility phenotyping. PMID:22860151
Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers.

PubMed

Campbell, Kieran R; Yau, Christopher

2017-03-15

Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov-Chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an Empirical-Bayes like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply or model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context practical bioinformatics analyses.
Classifying environmentally significant urban land uses with satellite imagery.

PubMed

Park, Mi-Hyun; Stenstrom, Michael K

2008-01-01

We investigated Bayesian networks to classify urban land use from satellite imagery. Landsat Enhanced Thematic Mapper Plus (ETM(+)) images were used for the classification in two study areas: (1) Marina del Rey and its vicinity in the Santa Monica Bay Watershed, CA and (2) drainage basins adjacent to the Sweetwater Reservoir in San Diego, CA. Bayesian networks provided 80-95% classification accuracy for urban land use using four different classification systems. The classifications were robust with small training data sets with normal and reduced radiometric resolution. The networks needed only 5% of the total data (i.e., 1500 pixels) for sample size and only 5- or 6-bit information for accurate classification. The network explicitly showed the relationship among variables from its structure and was also capable of utilizing information from non-spectral data. The classification can be used to provide timely and inexpensive land use information over large areas for environmental purposes such as estimating stormwater pollutant loads.
Automatic control: the vertebral column of dogfish sharks behaves as a continuously variable transmission with smoothly shifting functions.

PubMed

Porter, Marianne E; Ewoldt, Randy H; Long, John H

2016-09-15

During swimming in dogfish sharks, Squalus acanthias, both the intervertebral joints and the vertebral centra undergo significant strain. To investigate this system, unique among vertebrates, we cyclically bent isolated segments of 10 vertebrae and nine joints. For the first time in the biomechanics of fish vertebral columns, we simultaneously characterized non-linear elasticity and viscosity throughout the bending oscillation, extending recently proposed techniques for large-amplitude oscillatory shear (LAOS) characterization to large-amplitude oscillatory bending (LAOB). The vertebral column segments behave as non-linear viscoelastic springs. Elastic properties dominate for all frequencies and curvatures tested, increasing as either variable increases. Non-linearities within a bending cycle are most in evidence at the highest frequency, 2.0 Hz, and curvature, 5 m -1 Viscous bending properties are greatest at low frequencies and high curvatures, with non-linear effects occurring at all frequencies and curvatures. The range of mechanical behaviors includes that of springs and brakes, with smooth transitions between them that allow for continuously variable power transmission by the vertebral column to assist in the mechanics of undulatory propulsion. © 2016. Published by The Company of Biologists Ltd.
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies

PubMed Central

2010-01-01

Background All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. Results The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. Conclusions This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study are limited, and are consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general. PMID:20144194
Constraining mass anomalies in the interior of spherical bodies using Trans-dimensional Bayesian Hierarchical inference.

NASA Astrophysics Data System (ADS)

Izquierdo, K.; Lekic, V.; Montesi, L.

2017-12-01

Gravity inversions are especially important for planetary applications since measurements of the variations in gravitational acceleration are often the only constraint available to map out lateral density variations in the interiors of planets and other Solar system objects. Currently, global gravity data is available for the terrestrial planets and the Moon. Although several methods for inverting these data have been developed and applied, the non-uniqueness of global density models that fit the data has not yet been fully characterized. We make use of Bayesian inference and a Reversible Jump Markov Chain Monte Carlo (RJMCMC) approach to develop a Trans-dimensional Hierarchical Bayesian (THB) inversion algorithm that yields a large sample of models that fit a gravity field. From this group of models, we can determine the most likely value of parameters of a global density model and a measure of the non-uniqueness of each parameter when the number of anomalies describing the gravity field is not fixed a priori. We explore the use of a parallel tempering algorithm and fast multipole method to reduce the number of iterations and computing time needed. We applied this method to a synthetic gravity field of the Moon and a long wavelength synthetic model of density anomalies in the Earth's lower mantle. We obtained a good match between the given gravity field and the gravity field produced by the most likely model in each inversion. The number of anomalies of the models showed parsimony of the algorithm, the value of the noise variance of the input data was retrieved, and the non-uniqueness of the models was quantified. Our results show that the ability to constrain the latitude and longitude of density anomalies, which is excellent at shallow locations (<200 km), decreases with increasing depth. With higher computational resources, this THB method for gravity inversion could give new information about the overall density distribution of celestial bodies even when there is no other geophysical data available.
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies.

PubMed

David, Maria Pamela C; Concepcion, Gisela P; Padlan, Eduardo A

2010-02-08

All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study are limited, and are consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general.
2D Bayesian automated tilted-ring fitting of disc galaxies in large H I galaxy surveys: 2DBAT

NASA Astrophysics Data System (ADS)

Oh, Se-Heon; Staveley-Smith, Lister; Spekkens, Kristine; Kamphuis, Peter; Koribalski, Bärbel S.

2018-01-01

We present a novel algorithm based on a Bayesian method for 2D tilted-ring analysis of disc galaxy velocity fields. Compared to the conventional algorithms based on a chi-squared minimization procedure, this new Bayesian-based algorithm suffers less from local minima of the model parameters even with highly multimodal posterior distributions. Moreover, the Bayesian analysis, implemented via Markov Chain Monte Carlo sampling, only requires broad ranges of posterior distributions of the parameters, which makes the fitting procedure fully automated. This feature will be essential when performing kinematic analysis on the large number of resolved galaxies expected to be detected in neutral hydrogen (H I) surveys with the Square Kilometre Array and its pathfinders. The so-called 2D Bayesian Automated Tilted-ring fitter (2DBAT) implements Bayesian fits of 2D tilted-ring models in order to derive rotation curves of galaxies. We explore 2DBAT performance on (a) artificial H I data cubes built based on representative rotation curves of intermediate-mass and massive spiral galaxies, and (b) Australia Telescope Compact Array H I data from the Local Volume H I Survey. We find that 2DBAT works best for well-resolved galaxies with intermediate inclinations (20° < i < 70°), complementing 3D techniques better suited to modelling inclined galaxies.
Krylov Subspace Methods for Complex Non-Hermitian Linear Systems. Thesis

NASA Technical Reports Server (NTRS)

Freund, Roland W.

1991-01-01

We consider Krylov subspace methods for the solution of large sparse linear systems Ax = b with complex non-Hermitian coefficient matrices. Such linear systems arise in important applications, such as inverse scattering, numerical solution of time-dependent Schrodinger equations, underwater acoustics, eddy current computations, numerical computations in quantum chromodynamics, and numerical conformal mapping. Typically, the resulting coefficient matrices A exhibit special structures, such as complex symmetry, or they are shifted Hermitian matrices. In this paper, we first describe a Krylov subspace approach with iterates defined by a quasi-minimal residual property, the QMR method, for solving general complex non-Hermitian linear systems. Then, we study special Krylov subspace methods designed for the two families of complex symmetric respectively shifted Hermitian linear systems. We also include some results concerning the obvious approach to general complex linear systems by solving equivalent real linear systems for the real and imaginary parts of x. Finally, numerical experiments for linear systems arising from the complex Helmholtz equation are reported.
A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.

PubMed

Wang, Tingting; Chen, Yi-Ping Phoebe; Bowman, Phil J; Goddard, Michael E; Hayes, Ben J

2016-09-21

Bayesian mixture models in which the effects of SNP are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Monte Carlo Markov Chain (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm, followed by a limited number of MCMC without the requirement for burn-in. To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breeds that were not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC, however computational time was reduced by up to 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300 K SNPs in a case/control design, from the Welcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic predictions and genetic architecture inference while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR. The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform equally well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, but with up to 17 times faster than the full MCMC implementations. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets.
Internal Medicine residents use heuristics to estimate disease probability

PubMed Central

Phang, Sen Han; Ravani, Pietro; Schaefer, Jeffrey; Wright, Bruce; McLaughlin, Kevin

2015-01-01

Background Training in Bayesian reasoning may have limited impact on accuracy of probability estimates. In this study, our goal was to explore whether residents previously exposed to Bayesian reasoning use heuristics rather than Bayesian reasoning to estimate disease probabilities. We predicted that if residents use heuristics then post-test probability estimates would be increased by non-discriminating clinical features or a high anchor for a target condition. Method We randomized 55 Internal Medicine residents to different versions of four clinical vignettes and asked them to estimate probabilities of target conditions. We manipulated the clinical data for each vignette to be consistent with either 1) using a representative heuristic, by adding non-discriminating prototypical clinical features of the target condition, or 2) using anchoring with adjustment heuristic, by providing a high or low anchor for the target condition. Results When presented with additional non-discriminating data the odds of diagnosing the target condition were increased (odds ratio (OR) 2.83, 95% confidence interval [1.30, 6.15], p = 0.009). Similarly, the odds of diagnosing the target condition were increased when a high anchor preceded the vignette (OR 2.04, [1.09, 3.81], p = 0.025). Conclusions Our findings suggest that despite previous exposure to the use of Bayesian reasoning, residents use heuristics, such as the representative heuristic and anchoring with adjustment, to estimate probabilities. Potential reasons for attribute substitution include the relative cognitive ease of heuristics vs. Bayesian reasoning or perhaps residents in their clinical practice use gist traces rather than precise probability estimates when diagnosing. PMID:27004080
Bayesian spatial transformation models with applications in neuroimaging data

PubMed Central

Miranda, Michelle F.; Zhu, Hongtu; Ibrahim, Joseph G.

2013-01-01

Summary The aim of this paper is to develop a class of spatial transformation models (STM) to spatially model the varying association between imaging measures in a three-dimensional (3D) volume (or 2D surface) and a set of covariates. Our STMs include a varying Box-Cox transformation model for dealing with the issue of non-Gaussian distributed imaging data and a Gaussian Markov Random Field model for incorporating spatial smoothness of the imaging data. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. Simulations and real data analysis demonstrate that the STM significantly outperforms the voxel-wise linear model with Gaussian noise in recovering meaningful geometric patterns. Our STM is able to reveal important brain regions with morphological changes in children with attention deficit hyperactivity disorder. PMID:24128143
Low-rank separated representation surrogates of high-dimensional stochastic functions: Application in Bayesian inference

NASA Astrophysics Data System (ADS)

Validi, AbdoulAhad

2014-03-01

This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.
Single and multiple phenotype QTL analyses of downy mildew resistance in interspecific grapevines.

PubMed

Divilov, Konstantin; Barba, Paola; Cadle-Davidson, Lance; Reisch, Bruce I

2018-05-01

Downy mildew resistance across days post-inoculation, experiments, and years in two interspecific grapevine F 1 families was investigated using linear mixed models and Bayesian networks, and five new QTL were identified. Breeding grapevines for downy mildew disease resistance has traditionally relied on qualitative gene resistance, which can be overcome by pathogen evolution. Analyzing two interspecific F 1 families, both having ancestry derived from Vitis vinifera and wild North American Vitis species, across 2 years and multiple experiments, we found multiple loci associated with downy mildew sporulation and hypersensitive response in both families using a single phenotype model. The loci explained between 7 and 17% of the variance for either phenotype, suggesting a complex genetic architecture for these traits in the two families studied. For two loci, we used RNA-Seq to detect differentially transcribed genes and found that the candidate genes at these loci were likely not NBS-LRR genes. Additionally, using a multiple phenotype Bayesian network analysis, we found effects between the leaf trichome density, hypersensitive response, and sporulation phenotypes. Moderate-high heritabilities were found for all three phenotypes, suggesting that selection for downy mildew resistance is an achievable goal by breeding for either physical- or non-physical-based resistance mechanisms, with the combination of the two possibly providing durable resistance.
Discussion of “Bayesian design of experiments for industrial and scientific applications via gaussian processes”

DOE PAGES

Anderson-Cook, Christine M.; Burke, Sarah E.

2016-10-18

First, we would like to commend Dr. Woods on his thought-provoking paper and insightful presentation at the 4th Annual Stu Hunter conference. We think that the material presented highlights some important needs in the area of design of experiments for generalized linear models (GLMs). In addition, we agree with Dr. Woods that design of experiements of GLMs does implicitly require expert judgement about model parameters, and hence using a Bayesian approach to capture this knowledge is a natural strategy to summarize what is known with the opportunity to incorporate associated uncertainty about that information.
Discussion of “Bayesian design of experiments for industrial and scientific applications via gaussian processes”

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anderson-Cook, Christine M.; Burke, Sarah E.

First, we would like to commend Dr. Woods on his thought-provoking paper and insightful presentation at the 4th Annual Stu Hunter conference. We think that the material presented highlights some important needs in the area of design of experiments for generalized linear models (GLMs). In addition, we agree with Dr. Woods that design of experiements of GLMs does implicitly require expert judgement about model parameters, and hence using a Bayesian approach to capture this knowledge is a natural strategy to summarize what is known with the opportunity to incorporate associated uncertainty about that information.
Design of a 6 TeV muon collider

DOE PAGES

Wang, M-H.; Nosochkov, Y.; Cai, Y.; ...

2016-09-09

Here, a preliminary design of a muon collider ring with the center of mass (CM) energy of 6 TeV is presented. The ring circumference is 6.3 km, and themore » $$\\beta$$ functions at collision point are 1 cm in each plane. The ring linear optics, the non-linear chromaticity compensation in the Interaction Region (IR), and the additional non-linear orthogonal correcting knobs are described. Magnet specifications are based on the maximum pole-tip field of 20T in dipoles and 15T in quadrupoles. Careful compensation of the non-linear chromatic and amplitude dependent effects provide a sufficiently large dynamic aperture for the momentum range of up to $$\\pm$$0.5% without considering magnet errors.« less
Combined N-of-1 trials to investigate mexiletine in non-dystrophic myotonia using a Bayesian approach; study rationale and protocol.

PubMed

Stunnenberg, Bas C; Woertman, Willem; Raaphorst, Joost; Statland, Jeffrey M; Griggs, Robert C; Timmermans, Janneke; Saris, Christiaan G; Schouwenberg, Bas J; Groenewoud, Hans M; Stegeman, Dick F; van Engelen, Baziel G M; Drost, Gea; van der Wilt, Gert Jan

2015-03-25

To obtain evidence for the clinical and cost-effectiveness of treatments for patients with rare diseases is a challenge. Non-dystrophic myotonia (NDM) is a group of inherited, rare muscle diseases characterized by muscle stiffness. The reimbursement of mexiletine, the expert opinion drug for NDM, has been discontinued in some countries due to a lack of independent randomized controlled trials (RCTs). It remains unclear however, which concessions can be accepted towards the level 1 evidence needed for coverage decisions, in rare diseases. Considering the large number of rare diseases with a lack of treatment evidence, more experience with innovative trial designs is needed. Both NDM and mexiletine are well suited for an N-of-1 trial design. A Bayesian approach allows for the combination of N-of-1 trials, which enables the assessment of outcomes on the patient and group level simultaneously. We will combine 30 individual, double-blind, randomized, placebo-controlled N-of-1 trials of mexiletine (600 mg daily) vs. placebo in genetically confirmed NDM patients using hierarchical Bayesian modeling. Our results will be compared and combined with the main results of an international cross-over RCT (mexiletine vs. placebo in NDM) published in 2012 that will be used as an informative prior. Similar criteria of eligibility, treatment regimen, end-points and measurement instruments are employed as used in the international cross-over RCT. The treatment of patients with NDM with mexiletine offers a unique opportunity to compare outcomes and efficiency of novel N-of-1 trial-based designs and conventional approaches in producing evidence of clinical and cost-effectiveness of treatments for patients with rare diseases. ClinicalTrials.gov Identifier: NCT02045667.

A non-parametric postprocessor for bias-correcting multi-model ensemble forecasts of hydrometeorological and hydrologic variables

NASA Astrophysics Data System (ADS)

Brown, James; Seo, Dong-Jun

2010-05-01

Operational forecasts of hydrometeorological and hydrologic variables often contain large uncertainties, for which ensemble techniques are increasingly used. However, the utility of ensemble forecasts depends on the unbiasedness of the forecast probabilities. We describe a technique for quantifying and removing biases from ensemble forecasts of hydrometeorological and hydrologic variables, intended for use in operational forecasting. The technique makes no a priori assumptions about the distributional form of the variables, which is often unknown or difficult to model parametrically. The aim is to estimate the conditional cumulative distribution function (ccdf) of the observed variable given a (possibly biased) real-time ensemble forecast from one or several forecasting systems (multi-model ensembles). The technique is based on Bayesian optimal linear estimation of indicator variables, and is analogous to indicator cokriging (ICK) in geostatistics. By developing linear estimators for the conditional expectation of the observed variable at many thresholds, ICK provides a discrete approximation of the full ccdf. Since ICK minimizes the conditional error variance of the indicator expectation at each threshold, it effectively minimizes the Continuous Ranked Probability Score (CRPS) when infinitely many thresholds are employed. However, the ensemble members used as predictors in ICK, and other bias-correction techniques, are often highly cross-correlated, both within and between models. Thus, we propose an orthogonal transform of the predictors used in ICK, which is analogous to using their principal components in the linear system of equations. This leads to a well-posed problem in which a minimum number of predictors are used to provide maximum information content in terms of the total variance explained. The technique is used to bias-correct precipitation ensemble forecasts from the NCEP Global Ensemble Forecast System (GEFS), for which independent validation results are presented. Extension to multimodel ensembles from the NCEP GFS and Short Range Ensemble Forecast (SREF) systems is also proposed.
Random forests in non-invasive sensorimotor rhythm brain-computer interfaces: a practical and convenient non-linear classifier.

PubMed

Steyrl, David; Scherer, Reinhold; Faller, Josef; Müller-Putz, Gernot R

2016-02-01

There is general agreement in the brain-computer interface (BCI) community that although non-linear classifiers can provide better results in some cases, linear classifiers are preferable. Particularly, as non-linear classifiers often involve a number of parameters that must be carefully chosen. However, new non-linear classifiers were developed over the last decade. One of them is the random forest (RF) classifier. Although popular in other fields of science, RFs are not common in BCI research. In this work, we address three open questions regarding RFs in sensorimotor rhythm (SMR) BCIs: parametrization, online applicability, and performance compared to regularized linear discriminant analysis (LDA). We found that the performance of RF is constant over a large range of parameter values. We demonstrate - for the first time - that RFs are applicable online in SMR-BCIs. Further, we show in an offline BCI simulation that RFs statistically significantly outperform regularized LDA by about 3%. These results confirm that RFs are practical and convenient non-linear classifiers for SMR-BCIs. Taking into account further properties of RFs, such as independence from feature distributions, maximum margin behavior, multiclass and advanced data mining capabilities, we argue that RFs should be taken into consideration for future BCIs.
Modeling Error Distributions of Growth Curve Models through Bayesian Methods

ERIC Educational Resources Information Center

Zhang, Zhiyong

2016-01-01

Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is…
Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data

PubMed Central

Tian, Ting; McLachlan, Geoffrey J.; Dieters, Mark J.; Basford, Kaye E.

2015-01-01

It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways, multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean match. The later three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher accuracy of estimation performance than those using non-Bayesian analysis but they were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the overall best performances. PMID:26689369
Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data.

PubMed

Tian, Ting; McLachlan, Geoffrey J; Dieters, Mark J; Basford, Kaye E

2015-01-01

It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways, multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean match. The later three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher accuracy of estimation performance than those using non-Bayesian analysis but they were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the overall best performances.
Clinical Outcome Prediction in Aneurysmal Subarachnoid Hemorrhage Using Bayesian Neural Networks with Fuzzy Logic Inferences

PubMed Central

Lo, Benjamin W. Y.; Macdonald, R. Loch; Baker, Andrew; Levine, Mitchell A. H.

2013-01-01

Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH). Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients). Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odd ratios (ORs). Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique) denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication. PMID:23690884
On combination of strict Bayesian principles with model reduction technique or how stochastic model calibration can become feasible for large-scale applications

NASA Astrophysics Data System (ADS)

Oladyshkin, S.; Schroeder, P.; Class, H.; Nowak, W.

2013-12-01

Predicting underground carbon dioxide (CO2) storage represents a challenging problem in a complex dynamic system. Due to lacking information about reservoir parameters, quantification of uncertainties may become the dominant question in risk assessment. Calibration on past observed data from pilot-scale test injection can improve the predictive power of the involved geological, flow, and transport models. The current work performs history matching to pressure time series from a pilot storage site operated in Europe, maintained during an injection period. Simulation of compressible two-phase flow and transport (CO2/brine) in the considered site is computationally very demanding, requiring about 12 days of CPU time for an individual model run. For that reason, brute-force approaches for calibration are not feasible. In the current work, we explore an advanced framework for history matching based on the arbitrary polynomial chaos expansion (aPC) and strict Bayesian principles. The aPC [1] offers a drastic but accurate stochastic model reduction. Unlike many previous chaos expansions, it can handle arbitrary probability distribution shapes of uncertain parameters, and can therefore handle directly the statistical information appearing during the matching procedure. We capture the dependence of model output on these multipliers with the expansion-based reduced model. In our study we keep the spatial heterogeneity suggested by geophysical methods, but consider uncertainty in the magnitude of permeability trough zone-wise permeability multipliers. Next combined the aPC with Bootstrap filtering (a brute-force but fully accurate Bayesian updating mechanism) in order to perform the matching. In comparison to (Ensemble) Kalman Filters, our method accounts for higher-order statistical moments and for the non-linearity of both the forward model and the inversion, and thus allows a rigorous quantification of calibrated model uncertainty. The usually high computational costs of accurate filtering become very feasible for our suggested aPC-based calibration framework. However, the power of aPC-based Bayesian updating strongly depends on the accuracy of prior information. In the current study, the prior assumptions on the model parameters were not satisfactory and strongly underestimate the reservoir pressure. Thus, the aPC-based response surface used in Bootstrap filtering is fitted to a distant and poorly chosen region within the parameter space. Thanks to the iterative procedure suggested in [2] we overcome this drawback with small computational costs. The iteration successively improves the accuracy of the expansion around the current estimation of the posterior distribution. The final result is a calibrated model of the site that can be used for further studies, with an excellent match to the data. References [1] Oladyshkin S. and Nowak W. Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliability Engineering and System Safety, 106:179-190, 2012. [2] Oladyshkin S., Class H., Nowak W. Bayesian updating via Bootstrap filtering combined with data-driven polynomial chaos expansions: methodology and application to history matching for carbon dioxide storage in geological formations. Computational Geosciences, 17 (4), 671-687, 2013.
Exploring links between juvenile offenders and social disorganization at a large map scale: a Bayesian spatial modeling approach

NASA Astrophysics Data System (ADS)

Law, Jane; Quick, Matthew

2013-01-01

This paper adopts a Bayesian spatial modeling approach to investigate the distribution of young offender residences in York Region, Southern Ontario, Canada, at the census dissemination area level. Few geographic researches have analyzed offender (as opposed to offense) data at a large map scale (i.e., using a relatively small areal unit of analysis) to minimize aggregation effects. Providing context is the social disorganization theory, which hypothesizes that areas with economic deprivation, high population turnover, and high ethnic heterogeneity exhibit social disorganization and are expected to facilitate higher instances of young offenders. Non-spatial and spatial Poisson models indicate that spatial methods are superior to non-spatial models with respect to model fit and that index of ethnic heterogeneity, residential mobility (1 year moving rate), and percentage of residents receiving government transfer payments are, respectively, the most significant explanatory variables related to young offender location. These findings provide overwhelming support for social disorganization theory as it applies to offender location in York Region, Ontario. Targeting areas where prevalence of young offenders could or could not be explained by social disorganization through decomposing the estimated risk map are helpful for dealing with juvenile offenders in the region. Results prompt discussion into geographically targeted police services and young offender placement pertaining to risk of recidivism. We discuss possible reasons for differences and similarities between the previous findings (that analyzed offense data and/or were conducted at a smaller map scale) and our findings, limitations of our study, and practical outcomes of this research from a law enforcement perspective.
Information Theoretic Approaches to Rapid Discovery of Relationships in Large Climate Data Sets

NASA Technical Reports Server (NTRS)

Knuth, Kevin H.; Rossow, William B.; Clancy, Daniel (Technical Monitor)

2002-01-01

Mutual information as the asymptotic Bayesian measure of independence is an excellent starting point for investigating the existence of possible relationships among climate-relevant variables in large data sets, As mutual information is a nonlinear function of of its arguments, it is not beholden to the assumption of a linear relationship between the variables in question and can reveal features missed in linear correlation analyses. However, as mutual information is symmetric in its arguments, it only has the ability to reveal the probability that two variables are related. it provides no information as to how they are related; specifically, causal interactions or a relation based on a common cause cannot be detected. For this reason we also investigate the utility of a related quantity called the transfer entropy. The transfer entropy can be written as a difference between mutual informations and has the capability to reveal whether and how the variables are causally related. The application of these information theoretic measures is rested on some familiar examples using data from the International Satellite Cloud Climatology Project (ISCCP) to identify relation between global cloud cover and other variables, including equatorial pacific sea surface temperature (SST), over seasonal and El Nino Southern Oscillation (ENSO) cycles.
Progressive Mid-latitude Afforestation: Local and Remote Climate Impacts in the Framework of Two Coupled Earth System Models

NASA Astrophysics Data System (ADS)

Lague, Marysa

Vegetation influences the atmosphere in complex and non-linear ways, such that large-scale changes in vegetation cover can drive changes in climate on both local and global scales. Large-scale land surface changes have been shown to introduce excess energy to one hemisphere, causing a shift in atmospheric circulation on a global scale. However, past work has not quantified how the climate response scales with the area of vegetation. Here, we systematically evaluate the response of climate to linearly increasing the area of forest cover over the northern mid-latitudes. We show that the magnitude of afforestation of the northern mid-latitudes determines the climate response in a non-linear fashion, and identify a threshold in vegetation-induced cloud feedbacks - a concept not previously addressed by large-scale vegetation manipulation experiments. Small increases in tree cover drive compensating cloud feedbacks, while latent heat fluxes reach a threshold after sufficiently large increases in tree cover, causing the troposphere to warm and dry, subsequently reducing cloud cover. Increased absorption of solar radiation at the surface is driven by both surface albedo changes and cloud feedbacks. We identify how vegetation-induced changes in cloud cover further feedback on changes in the global energy balance. We also show how atmospheric cross-equatorial energy transport changes as the area of afforestation is incrementally increased (a relationship which has not previously been demonstrated). This work demonstrates that while some climate effects (such as energy transport) of large scale mid-latitude afforestation scale roughly linearly across a wide range of afforestation areas, others (such as the local partitioning of the surface energy budget) are non-linear, and sensitive to the particular magnitude of mid-latitude forcing. Our results highlight the importance of considering both local and remote climate responses to large-scale vegetation change, and explore the scaling relationship between changes in vegetation cover and the resulting climate impacts.
Bayesian Gaussian regression analysis of malnutrition for children under five years of age in Ethiopia, EMDHS 2014.

PubMed

Mohammed, Seid; Asfaw, Zeytu G

2018-01-01

The term malnutrition generally refers to both under-nutrition and over-nutrition, but this study uses the term to refer solely to a deficiency of nutrition. In Ethiopia, child malnutrition is one of the most serious public health problem and the highest in the world. The purpose of the present study was to identify the high risk factors of malnutrition and test different statistical models for childhood malnutrition and, thereafter weighing the preferable model through model comparison criteria. Bayesian Gaussian regression model was used to analyze the effect of selected socioeconomic, demographic, health and environmental covariates on malnutrition under five years old child's. Inference was made using Bayesian approach based on Markov Chain Monte Carlo (MCMC) simulation techniques in BayesX. The study found that the variables such as sex of a child, preceding birth interval, age of the child, father's education level, source of water, mother's body mass index, head of household sex, mother's age at birth, wealth index, birth order, diarrhea, child's size at birth and duration of breast feeding showed significant effects on children's malnutrition in Ethiopia. The age of child, mother's age at birth and mother's body mass index could also be important factors with a non linear effect for the child's malnutrition in Ethiopia. Thus, the present study emphasizes a special care on variables such as sex of child, preceding birth interval, father's education level, source of water, sex of head of household, wealth index, birth order, diarrhea, child's size at birth, duration of breast feeding, age of child, mother's age at birth and mother's body mass index to combat childhood malnutrition in developing countries.
Bayesian estimation of the discrete coefficient of determination.

PubMed

Chen, Ting; Braga-Neto, Ulisses M

2016-12-01

The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discrete CoD. We derive analytically the optimal minimum mean-square error (MMSE) CoD estimator, as well as a CoD estimator based on the Optimal Bayesian Predictor (OBP). For the latter estimator, exact expressions for its bias, variance, and root-mean-square (RMS) are given. The accuracy of both Bayesian CoD estimators with non-informative and informative priors, under fixed or random parameters, is studied via analytical and numerical approaches. We also demonstrate the application of the proposed Bayesian approach in the inference of gene regulatory networks, using gene-expression data from a previously published study on metastatic melanoma.
Ensemble Bayesian forecasting system Part I: Theory and algorithms

NASA Astrophysics Data System (ADS)

Herr, Henry D.; Krzysztofowicz, Roman

2015-05-01

The ensemble Bayesian forecasting system (EBFS), whose theory was published in 2001, is developed for the purpose of quantifying the total uncertainty about a discrete-time, continuous-state, non-stationary stochastic process such as a time series of stages, discharges, or volumes at a river gauge. The EBFS is built of three components: an input ensemble forecaster (IEF), which simulates the uncertainty associated with random inputs; a deterministic hydrologic model (of any complexity), which simulates physical processes within a river basin; and a hydrologic uncertainty processor (HUP), which simulates the hydrologic uncertainty (an aggregate of all uncertainties except input). It works as a Monte Carlo simulator: an ensemble of time series of inputs (e.g., precipitation amounts) generated by the IEF is transformed deterministically through a hydrologic model into an ensemble of time series of outputs, which is next transformed stochastically by the HUP into an ensemble of time series of predictands (e.g., river stages). Previous research indicated that in order to attain an acceptable sampling error, the ensemble size must be on the order of hundreds (for probabilistic river stage forecasts and probabilistic flood forecasts) or even thousands (for probabilistic stage transition forecasts). The computing time needed to run the hydrologic model this many times renders the straightforward simulations operationally infeasible. This motivates the development of the ensemble Bayesian forecasting system with randomization (EBFSR), which takes full advantage of the analytic meta-Gaussian HUP and generates multiple ensemble members after each run of the hydrologic model; this auxiliary randomization reduces the required size of the meteorological input ensemble and makes it operationally feasible to generate a Bayesian ensemble forecast of large size. Such a forecast quantifies the total uncertainty, is well calibrated against the prior (climatic) distribution of predictand, possesses a Bayesian coherence property, constitutes a random sample of the predictand, and has an acceptable sampling error-which makes it suitable for rational decision making under uncertainty.
A Comparison of the β-Substitution Method and a Bayesian Method for Analyzing Left-Censored Data

PubMed Central

Huynh, Tran; Quick, Harrison; Ramachandran, Gurumurthy; Banerjee, Sudipto; Stenzel, Mark; Sandler, Dale P.; Engel, Lawrence S.; Kwok, Richard K.; Blair, Aaron; Stewart, Patricia A.

2016-01-01

Classical statistical methods for analyzing exposure data with values below the detection limits are well described in the occupational hygiene literature, but an evaluation of a Bayesian approach for handling such data is currently lacking. Here, we first describe a Bayesian framework for analyzing censored data. We then present the results of a simulation study conducted to compare the β-substitution method with a Bayesian method for exposure datasets drawn from lognormal distributions and mixed lognormal distributions with varying sample sizes, geometric standard deviations (GSDs), and censoring for single and multiple limits of detection. For each set of factors, estimates for the arithmetic mean (AM), geometric mean, GSD, and the 95th percentile (X0.95) of the exposure distribution were obtained. We evaluated the performance of each method using relative bias, the root mean squared error (rMSE), and coverage (the proportion of the computed 95% uncertainty intervals containing the true value). The Bayesian method using non-informative priors and the β-substitution method were generally comparable in bias and rMSE when estimating the AM and GM. For the GSD and the 95th percentile, the Bayesian method with non-informative priors was more biased and had a higher rMSE than the β-substitution method, but use of more informative priors generally improved the Bayesian method’s performance, making both the bias and the rMSE more comparable to the β-substitution method. An advantage of the Bayesian method is that it provided estimates of uncertainty for these parameters of interest and good coverage, whereas the β-substitution method only provided estimates of uncertainty for the AM, and coverage was not as consistent. Selection of one or the other method depends on the needs of the practitioner, the availability of prior information, and the distribution characteristics of the measurement data. We suggest the use of Bayesian methods if the practitioner has the computational resources and prior information, as the method would generally provide accurate estimates and also provides the distributions of all of the parameters, which could be useful for making decisions in some applications. PMID:26209598
How does non-linear dynamics affect the baryon acoustic oscillation?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sugiyama, Naonori S.; Spergel, David N., E-mail: nao.s.sugiyama@gmail.com, E-mail: dns@astro.princeton.edu

2014-02-01

We study the non-linear behavior of the baryon acoustic oscillation in the power spectrum and the correlation function by decomposing the dark matter perturbations into the short- and long-wavelength modes. The evolution of the dark matter fluctuations can be described as a global coordinate transformation caused by the long-wavelength displacement vector acting on short-wavelength matter perturbation undergoing non-linear growth. Using this feature, we investigate the well known cancellation of the high-k solutions in the standard perturbation theory. While the standard perturbation theory naturally satisfies the cancellation of the high-k solutions, some of the recently proposed improved perturbation theories do notmore » guarantee the cancellation. We show that this cancellation clarifies the success of the standard perturbation theory at the 2-loop order in describing the amplitude of the non-linear power spectrum even at high-k regions. We propose an extension of the standard 2-loop level perturbation theory model of the non-linear power spectrum that more accurately models the non-linear evolution of the baryon acoustic oscillation than the standard perturbation theory. The model consists of simple and intuitive parts: the non-linear evolution of the smoothed power spectrum without the baryon acoustic oscillations and the non-linear evolution of the baryon acoustic oscillations due to the large-scale velocity of dark matter and due to the gravitational attraction between dark matter particles. Our extended model predicts the smoothing parameter of the baryon acoustic oscillation peak at z = 0.35 as ∼ 7.7Mpc/h and describes the small non-linear shift in the peak position due to the galaxy random motions.« less
Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

NASA Astrophysics Data System (ADS)

Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

2017-05-01

Selection in plant breeding could be more effective and more efficient if it is based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most of GP models used linear methods that ignore effects of interaction among genes and effects of higher order nonlinearities. Deep belief network (DBN), one of the architectural in deep learning methods, is able to model data in high level of abstraction that involves nonlinearities effects of the data. This study implemented DBN for developing a GP model utilizing whole-genome Single Nucleotide Polymorphisms (SNPs) as data for training and testing. The case study was a set of traits in maize. The maize dataset was acquisitioned from CIMMYT’s (International Maize and Wheat Improvement Center) Global Maize program. Based on Pearson correlation, DBN is outperformed than other methods, kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), best linear unbiased predictor (BLUP), in case allegedly non-additive traits. DBN achieves correlation of 0.579 within -1 to 1 range.
Detangling complex relationships in forensic data: principles and use of causal networks and their application to clinical forensic science.

PubMed

Lefèvre, Thomas; Lepresle, Aude; Chariot, Patrick

2015-09-01

The search for complex, nonlinear relationships and causality in data is hindered by the availability of techniques in many domains, including forensic science. Linear multivariable techniques are useful but present some shortcomings. In the past decade, Bayesian approaches have been introduced in forensic science. To date, authors have mainly focused on providing an alternative to classical techniques for quantifying effects and dealing with uncertainty. Causal networks, including Bayesian networks, can help detangle complex relationships in data. A Bayesian network estimates the joint probability distribution of data and graphically displays dependencies between variables and the circulation of information between these variables. In this study, we illustrate the interest in utilizing Bayesian networks for dealing with complex data through an application in clinical forensic science. Evaluating the functional impairment of assault survivors is a complex task for which few determinants are known. As routinely estimated in France, the duration of this impairment can be quantified by days of 'Total Incapacity to Work' ('Incapacité totale de travail,' ITT). In this study, we used a Bayesian network approach to identify the injury type, victim category and time to evaluation as the main determinants of the 'Total Incapacity to Work' (TIW). We computed the conditional probabilities associated with the TIW node and its parents. We compared this approach with a multivariable analysis, and the results of both techniques were converging. Thus, Bayesian networks should be considered a reliable means to detangle complex relationships in data.
A polymer, random walk model for the size-distribution of large DNA fragments after high linear energy transfer radiation

NASA Technical Reports Server (NTRS)

Ponomarev, A. L.; Brenner, D.; Hlatky, L. R.; Sachs, R. K.

2000-01-01

DNA double-strand breaks (DSBs) produced by densely ionizing radiation are not located randomly in the genome: recent data indicate DSB clustering along chromosomes. Stochastic DSB clustering at large scales, from > 100 Mbp down to < 0.01 Mbp, is modeled using computer simulations and analytic equations. A random-walk, coarse-grained polymer model for chromatin is combined with a simple track structure model in Monte Carlo software called DNAbreak and is applied to data on alpha-particle irradiation of V-79 cells. The chromatin model neglects molecular details but systematically incorporates an increase in average spatial separation between two DNA loci as the number of base-pairs between the loci increases. Fragment-size distributions obtained using DNAbreak match data on large fragments about as well as distributions previously obtained with a less mechanistic approach. Dose-response relations, linear at small doses of high linear energy transfer (LET) radiation, are obtained. They are found to be non-linear when the dose becomes so large that there is a significant probability of overlapping or close juxtaposition, along one chromosome, for different DSB clusters from different tracks. The non-linearity is more evident for large fragments than for small. The DNAbreak results furnish an example of the RLC (randomly located clusters) analytic formalism, which generalizes the broken-stick fragment-size distribution of the random-breakage model that is often applied to low-LET data.
Applying Bayesian Modeling and Receiver Operating Characteristic Methodologies for Test Utility Analysis

ERIC Educational Resources Information Center

Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S.

2013-01-01

This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Linear velocity fields in non-Gaussian models for large-scale structure

NASA Technical Reports Server (NTRS)

Scherrer, Robert J.

1992-01-01

Linear velocity fields in two types of physically motivated non-Gaussian models are examined for large-scale structure: seed models, in which the density field is a convolution of a density profile with a distribution of points, and local non-Gaussian fields, derived from a local nonlinear transformation on a Gaussian field. The distribution of a single component of the velocity is derived for seed models with randomly distributed seeds, and these results are applied to the seeded hot dark matter model and the global texture model with cold dark matter. An expression for the distribution of a single component of the velocity in arbitrary local non-Gaussian models is given, and these results are applied to such fields with chi-squared and lognormal distributions. It is shown that all seed models with randomly distributed seeds and all local non-Guassian models have single-component velocity distributions with positive kurtosis.

Cosmological Ohm's law and dynamics of non-minimal electromagnetism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hollenstein, Lukas; Jain, Rajeev Kumar; Urban, Federico R., E-mail: lukas.hollenstein@cea.fr, E-mail: jain@cp3.dias.sdu.dk, E-mail: furban@ulb.ac.be

2013-01-01

The origin of large-scale magnetic fields in cosmic structures and the intergalactic medium is still poorly understood. We explore the effects of non-minimal couplings of electromagnetism on the cosmological evolution of currents and magnetic fields. In this context, we revisit the mildly non-linear plasma dynamics around recombination that are known to generate weak magnetic fields. We use the covariant approach to obtain a fully general and non-linear evolution equation for the plasma currents and derive a generalised Ohm law valid on large scales as well as in the presence of non-minimal couplings to cosmological (pseudo-)scalar fields. Due to the sizeablemore » conductivity of the plasma and the stringent observational bounds on such couplings, we conclude that modifications of the standard (adiabatic) evolution of magnetic fields are severely limited in these scenarios. Even at scales well beyond a Mpc, any departure from flux freezing behaviour is inhibited.« less
Linear time-varying models can reveal non-linear interactions of biomolecular regulatory networks using multiple time-series data.

PubMed

Kim, Jongrae; Bates, Declan G; Postlethwaite, Ian; Heslop-Harrison, Pat; Cho, Kwang-Hyun

2008-05-15

Inherent non-linearities in biomolecular interactions make the identification of network interactions difficult. One of the principal problems is that all methods based on the use of linear time-invariant models will have fundamental limitations in their capability to infer certain non-linear network interactions. Another difficulty is the multiplicity of possible solutions, since, for a given dataset, there may be many different possible networks which generate the same time-series expression profiles. A novel algorithm for the inference of biomolecular interaction networks from temporal expression data is presented. Linear time-varying models, which can represent a much wider class of time-series data than linear time-invariant models, are employed in the algorithm. From time-series expression profiles, the model parameters are identified by solving a non-linear optimization problem. In order to systematically reduce the set of possible solutions for the optimization problem, a filtering process is performed using a phase-portrait analysis with random numerical perturbations. The proposed approach has the advantages of not requiring the system to be in a stable steady state, of using time-series profiles which have been generated by a single experiment, and of allowing non-linear network interactions to be identified. The ability of the proposed algorithm to correctly infer network interactions is illustrated by its application to three examples: a non-linear model for cAMP oscillations in Dictyostelium discoideum, the cell-cycle data for Saccharomyces cerevisiae and a large-scale non-linear model of a group of synchronized Dictyostelium cells. The software used in this article is available from http://sbie.kaist.ac.kr/software
Logistic random effects regression models: a comparison of statistical packages for binary and ordinal outcomes.

PubMed

Li, Baoyue; Lingsma, Hester F; Steyerberg, Ewout W; Lesaffre, Emmanuel

2011-05-23

Logistic random effects models are a popular tool to analyze multilevel also called hierarchical data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR, Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC.Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain.
Sparse Bayesian learning for DOA estimation with mutual coupling.

PubMed

Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi

2015-10-16

Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
Tackling non-linearities with the effective field theory of dark energy and modified gravity

NASA Astrophysics Data System (ADS)

Frusciante, Noemi; Papadomanolakis, Georgios

2017-12-01

We present the extension of the effective field theory framework to the mildly non-linear scales. The effective field theory approach has been successfully applied to the late time cosmic acceleration phenomenon and it has been shown to be a powerful method to obtain predictions about cosmological observables on linear scales. However, mildly non-linear scales need to be consistently considered when testing gravity theories because a large part of the data comes from those scales. Thus, non-linear corrections to predictions on observables coming from the linear analysis can help in discriminating among different gravity theories. We proceed firstly by identifying the necessary operators which need to be included in the effective field theory Lagrangian in order to go beyond the linear order in perturbations and then we construct the corresponding non-linear action. Moreover, we present the complete recipe to map any single field dark energy and modified gravity models into the non-linear effective field theory framework by considering a general action in the Arnowitt-Deser-Misner formalism. In order to illustrate this recipe we proceed to map the beyond-Horndeski theory and low-energy Hořava gravity into the effective field theory formalism. As a final step we derived the 4th order action in term of the curvature perturbation. This allowed us to identify the non-linear contributions coming from the linear order perturbations which at the next order act like source terms. Moreover, we confirm that the stability requirements, ensuring the positivity of the kinetic term and the speed of propagation for scalar mode, are automatically satisfied once the viability of the theory is demanded at linear level. The approach we present here will allow to construct, in a model independent way, all the relevant predictions on observables at mildly non-linear scales.
Estimation of white matter fiber parameters from compressed multiresolution diffusion MRI using sparse Bayesian learning.

PubMed

Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe

2018-02-15

We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
Prior robust empirical Bayes inference for large-scale data by conditioning on rank with application to microarray data

PubMed Central

Liao, J. G.; Mcmurry, Timothy; Berg, Arthur

2014-01-01

Empirical Bayes methods have been extensively used for microarray data analysis by modeling the large number of unknown parameters as random effects. Empirical Bayes allows borrowing information across genes and can automatically adjust for multiple testing and selection bias. However, the standard empirical Bayes model can perform poorly if the assumed working prior deviates from the true prior. This paper proposes a new rank-conditioned inference in which the shrinkage and confidence intervals are based on the distribution of the error conditioned on rank of the data. Our approach is in contrast to a Bayesian posterior, which conditions on the data themselves. The new method is almost as efficient as standard Bayesian methods when the working prior is close to the true prior, and it is much more robust when the working prior is not close. In addition, it allows a more accurate (but also more complex) non-parametric estimate of the prior to be easily incorporated, resulting in improved inference. The new method’s prior robustness is demonstrated via simulation experiments. Application to a breast cancer gene expression microarray dataset is presented. Our R package rank.Shrinkage provides a ready-to-use implementation of the proposed methodology. PMID:23934072
Bayesian cloud detection for MERIS, AATSR, and their combination

NASA Astrophysics Data System (ADS)

Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.

2014-11-01

A broad range of different of Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud masks were designed to be numerically efficient and suited for the processing of large amounts of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient amounts of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and on setting up new classification schemes based on manually classified scenes.
Bayesian cloud detection for MERIS, AATSR, and their combination

NASA Astrophysics Data System (ADS)

Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.

2015-04-01

A broad range of different of Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud detection schemes were designed to be numerically efficient and suited for the processing of large numbers of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient numbers of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and on setting up new classification schemes based on manually classified scenes.
A Bayesian test for Hardy–Weinberg equilibrium of biallelic X-chromosomal markers

PubMed Central

Puig, X; Ginebra, J; Graffelman, J

2017-01-01

The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy–Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy–Weinberg equilibrium with biallelic markers at the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male to female allele frequency ratio are computed, and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test, and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used. PMID:28900292
A Two-Step Bayesian Approach for Propensity Score Analysis: Simulations and Case Study.

PubMed

Kaplan, David; Chen, Jianshen

2012-07-01

A two-step Bayesian propensity score approach is introduced that incorporates prior information in the propensity score equation and outcome equation without the problems associated with simultaneous Bayesian propensity score approaches. The corresponding variance estimators are also provided. The two-step Bayesian propensity score is provided for three methods of implementation: propensity score stratification, weighting, and optimal full matching. Three simulation studies and one case study are presented to elaborate the proposed two-step Bayesian propensity score approach. Results of the simulation studies reveal that greater precision in the propensity score equation yields better recovery of the frequentist-based treatment effect. A slight advantage is shown for the Bayesian approach in small samples. Results also reveal that greater precision around the wrong treatment effect can lead to seriously distorted results. However, greater precision around the correct treatment effect parameter yields quite good results, with slight improvement seen with greater precision in the propensity score equation. A comparison of coverage rates for the conventional frequentist approach and proposed Bayesian approach is also provided. The case study reveals that credible intervals are wider than frequentist confidence intervals when priors are non-informative.
Working memory training in older adults: Bayesian evidence supporting the absence of transfer.

PubMed

Guye, Sabrina; von Bastian, Claudia C

2017-12-01

The question of whether working memory training leads to generalized improvements in untrained cognitive abilities is a longstanding and heatedly debated one. Previous research provides mostly ambiguous evidence regarding the presence or absence of transfer effects in older adults. Thus, to draw decisive conclusions regarding the effectiveness of working memory training interventions, methodologically sound studies with larger sample sizes are needed. In this study, we investigated whether or not a computer-based working memory training intervention induced near and far transfer in a large sample of 142 healthy older adults (65 to 80 years). Therefore, we randomly assigned participants to either the experimental group, which completed 25 sessions of adaptive, process-based working memory training, or to the active, adaptive visual search control group. Bayesian linear mixed-effects models were used to estimate performance improvements on the level of abilities, using multiple indicator tasks for near (working memory) and far transfer (fluid intelligence, shifting, and inhibition). Our data provided consistent evidence supporting the absence of near transfer to untrained working memory tasks and the absence of far transfer effects to all of the assessed abilities. Our results suggest that working memory training is not an effective way to improve general cognitive functioning in old age. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Nonlinear dynamic model for visual object tracking on Grassmann manifolds with partial occlusion handling.

PubMed

Khan, Zulfiqar Hasan; Gu, Irene Yu-Hua

2013-12-01

This paper proposes a novel Bayesian online learning and tracking scheme for video objects on Grassmann manifolds. Although manifold visual object tracking is promising, large and fast nonplanar (or out-of-plane) pose changes and long-term partial occlusions of deformable objects in video remain a challenge that limits the tracking performance. The proposed method tackles these problems with the main novelties on: 1) online estimation of object appearances on Grassmann manifolds; 2) optimal criterion-based occlusion handling for online updating of object appearances; 3) a nonlinear dynamic model for both the appearance basis matrix and its velocity; and 4) Bayesian formulations, separately for the tracking process and the online learning process, that are realized by employing two particle filters: one is on the manifold for generating appearance particles and another on the linear space for generating affine box particles. Tracking and online updating are performed in an alternating fashion to mitigate the tracking drift. Experiments using the proposed tracker on videos captured by a single dynamic/static camera have shown robust tracking performance, particularly for scenarios when target objects contain significant nonplanar pose changes and long-term partial occlusions. Comparisons with eight existing state-of-the-art/most relevant manifold/nonmanifold trackers with evaluations have provided further support to the proposed scheme.
Bowhead whale localization using time-difference-of-arrival data from asynchronous recorders.

PubMed

Warner, Graham A; Dosso, Stan E; Hannay, David E

2017-03-01

This paper estimates bowhead whale locations and uncertainties using nonlinear Bayesian inversion of the time-difference-of-arrival (TDOA) of low-frequency whale calls recorded on onmi-directional asynchronous recorders in the shallow waters of the northeastern Chukchi Sea, Alaska. A Y-shaped cluster of seven autonomous ocean-bottom hydrophones, separated by 0.5-9.2 km, was deployed for several months over which time their clocks drifted out of synchronization. Hundreds of recorded whale calls are manually associated between recorders. The TDOA between hydrophone pairs are calculated from filtered waveform cross correlations and depend on the whale locations, hydrophone locations, relative recorder clock offsets, and effective waveguide sound speed. A nonlinear Bayesian inversion estimates all of these parameters and their uncertainties as well as data error statistics. The problem is highly nonlinear and a linearized inversion did not produce physically realistic results. Whale location uncertainties from nonlinear inversion can be low enough to allow accurate tracking of migrating whales that vocalize repeatedly over several minutes. Estimates of clock drift rates are obtained from inversions of TDOA data over two weeks and agree with corresponding estimates obtained from long-time averaged ambient noise cross correlations. The inversion is suitable for application to large data sets of manually or automatically detected whale calls.
Bayesian Normalization Model for Label-Free Quantitative Analysis by LC-MS

PubMed Central

Nezami Ranjbar, Mohammad R.; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.

2016-01-01

We introduce a new method for normalization of data acquired by liquid chromatography coupled with mass spectrometry (LC-MS) in label-free differential expression analysis. Normalization of LC-MS data is desired prior to subsequent statistical analysis to adjust variabilities in ion intensities that are not caused by biological differences but experimental bias. There are different sources of bias including variabilities during sample collection and sample storage, poor experimental design, noise, etc. In addition, instrument variability in experiments involving a large number of LC-MS runs leads to a significant drift in intensity measurements. Although various methods have been proposed for normalization of LC-MS data, there is no universally applicable approach. In this paper, we propose a Bayesian normalization model (BNM) that utilizes scan-level information from LC-MS data. Specifically, the proposed method uses peak shapes to model the scan-level data acquired from extracted ion chromatograms (EIC) with parameters considered as a linear mixed effects model. We extended the model into BNM with drift (BNMD) to compensate for the variability in intensity measurements due to long LC-MS runs. We evaluated the performance of our method using synthetic and experimental data. In comparison with several existing methods, the proposed BNM and BNMD yielded significant improvement. PMID:26357332
Bayesian quantile regression-based partially linear mixed-effects joint models for longitudinal data with multiple features.

PubMed

Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara

2017-01-01

In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most of common models to analyze such complex longitudinal data are based on mean-regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish a Bayesian joint models that accounts for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Genomic prediction based on data from three layer lines using non-linear regression models.

PubMed

Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L

2014-11-06

Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.
Soft tissue modelling through autowaves for surgery simulation.

PubMed

Zhong, Yongmin; Shirinzadeh, Bijan; Alici, Gursel; Smith, Julian

2006-09-01

Modelling of soft tissue deformation is of great importance to virtual reality based surgery simulation. This paper presents a new methodology for simulation of soft tissue deformation by drawing an analogy between autowaves and soft tissue deformation. The potential energy stored in a soft tissue as a result of a deformation caused by an external force is propagated among mass points of the soft tissue by non-linear autowaves. The novelty of the methodology is that (i) autowave techniques are established to describe the potential energy distribution of a deformation for extrapolating internal forces, and (ii) non-linear materials are modelled with non-linear autowaves other than geometric non-linearity. Integration with a haptic device has been achieved to simulate soft tissue deformation with force feedback. The proposed methodology not only deals with large-range deformations, but also accommodates isotropic, anisotropic and inhomogeneous materials by simply changing diffusion coefficients.
A Large-Particle Monte Carlo Code for Simulating Non-Linear High-Energy Processes Near Compact Objects

NASA Technical Reports Server (NTRS)

Stern, Boris E.; Svensson, Roland; Begelman, Mitchell C.; Sikora, Marek

1995-01-01

High-energy radiation processes in compact cosmic objects are often expected to have a strongly non-linear behavior. Such behavior is shown, for example, by electron-positron pair cascades and the time evolution of relativistic proton distributions in dense radiation fields. Three independent techniques have been developed to simulate these non-linear problems: the kinetic equation approach; the phase-space density (PSD) Monte Carlo method; and the large-particle (LP) Monte Carlo method. In this paper, we present the latest version of the LP method and compare it with the other methods. The efficiency of the method in treating geometrically complex problems is illustrated by showing results of simulations of 1D, 2D and 3D systems. The method is shown to be powerful enough to treat non-spherical geometries, including such effects as bulk motion of the background plasma, reflection of radiation from cold matter, and anisotropic distributions of radiating particles. It can therefore be applied to simulate high-energy processes in such astrophysical systems as accretion discs with coronae, relativistic jets, pulsar magnetospheres and gamma-ray bursts.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

PubMed

Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

2015-09-01

A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.

Numerical and Experimental Dynamic Characteristics of Thin-Film Membranes

NASA Technical Reports Server (NTRS)

Young, Leyland G.; Ramanathan, Suresh; Hu, Jia-Zhu; Pai, P. Frank

2004-01-01

Presented is a total-Lagrangian displacement-based non-linear finite-element model of thin-film membranes for static and dynamic large-displacement analyses. The membrane theory fully accounts for geometric non-linearities. Fully non-linear static analysis followed by linear modal analysis is performed for an inflated circular cylindrical Kapton membrane tube under different pressures, and for a rectangular membrane under different tension loads at four comers. Finite element results show that shell modes dominate the dynamics of the inflated tube when the inflation pressure is low, and that vibration modes localized along four edges dominate the dynamics of the rectangular membrane. Numerical dynamic characteristics of the two membrane structures were experimentally verified using a Polytec PI PSV-200 scanning laser vibrometer and an EAGLE-500 8-camera motion analysis system.
Non-Linear Structural Dynamics Characterization using a Scanning Laser Vibrometer

NASA Technical Reports Server (NTRS)

Pai, P. F.; Lee, S.-Y.

2003-01-01

This paper presents the use of a scanning laser vibrometer and a signal decomposition method to characterize non-linear dynamics of highly flexible structures. A Polytec PI PSV-200 scanning laser vibrometer is used to measure transverse velocities of points on a structure subjected to a harmonic excitation. Velocity profiles at different times are constructed using the measured velocities, and then each velocity profile is decomposed using the first four linear mode shapes and a least-squares curve-fitting method. From the variations of the obtained modal \\ielocities with time we search for possible non-linear phenomena. A cantilevered titanium alloy beam subjected to harmonic base-excitations around the second. third, and fourth natural frequencies are examined in detail. Influences of the fixture mass. gravity. mass centers of mode shapes. and non-linearities are evaluated. Geometrically exact equations governing the planar, harmonic large-amplitude vibrations of beams are solved for operational deflection shapes using the multiple shooting method. Experimental results show the existence of 1:3 and 1:2:3 external and internal resonances. energy transfer from high-frequency modes to the first mode. and amplitude- and phase- modulation among several modes. Moreover, the existence of non-linear normal modes is found to be questionable.
Bayesian inversion of surface-wave data for radial and azimuthal shear-wave anisotropy, with applications to central Mongolia and west-central Italy

NASA Astrophysics Data System (ADS)

Ravenna, Matteo; Lebedev, Sergei

2018-04-01

Seismic anisotropy provides important information on the deformation history of the Earth's interior. Rayleigh and Love surface-waves are sensitive to and can be used to determine both radial and azimuthal shear-wave anisotropies at depth, but parameter trade-offs give rise to substantial model non-uniqueness. Here, we explore the trade-offs between isotropic and anisotropic structure parameters and present a suite of methods for the inversion of surface-wave, phase-velocity curves for radial and azimuthal anisotropies. One Markov chain Monte Carlo (McMC) implementation inverts Rayleigh and Love dispersion curves for a radially anisotropic shear velocity profile of the crust and upper mantle. Another McMC implementation inverts Rayleigh phase velocities and their azimuthal anisotropy for profiles of vertically polarized shear velocity and its depth-dependent azimuthal anisotropy. The azimuthal anisotropy inversion is fully non-linear, with the forward problem solved numerically at different azimuths for every model realization, which ensures that any linearization biases are avoided. The computations are performed in parallel, in order to reduce the computing time. The often challenging issue of data noise estimation is addressed by means of a Hierarchical Bayesian approach, with the variance of the noise treated as an unknown during the radial anisotropy inversion. In addition to the McMC inversions, we also present faster, non-linear gradient-search inversions for the same anisotropic structure. The results of the two approaches are mutually consistent; the advantage of the McMC inversions is that they provide a measure of uncertainty of the models. Applying the method to broad-band data from the Baikal-central Mongolia region, we determine radial anisotropy from the crust down to the transition-zone depths. Robust negative anisotropy (Vsh < Vsv) in the asthenosphere, at 100-300 km depths, presents strong new evidence for a vertical component of asthenospheric flow. This is consistent with an upward flow from below the thick lithosphere of the Siberian Craton to below the thinner lithosphere of central Mongolia, likely to give rise to decompression melting and the scattered, sporadic volcanism observed in the Baikal Rift area, as proposed previously. Inversion of phase-velocity data from west-central Italy for azimuthal anisotropy reveals a clear change in the shear-wave fast-propagation direction at 70-100 km depths, near the lithosphere-asthenosphere boundary. The orientation of the fabric in the lithosphere is roughly E-W, parallel to the direction of stretching over the last 10 m.y. The orientation of the fabric in the asthenosphere is NW-SE, matching the fast directions inferred from shear-wave splitting and probably indicating the direction of the asthenospheric flow.
Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

NASA Astrophysics Data System (ADS)

Passemier, Damien; McKay, Matthew R.; Chen, Yang

2015-07-01

Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce an correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.
Acoustic Wave Dispersion and Scattering in Complex Marine Sediment Structures

DTIC Science & Technology

2018-03-21

Developed theory and methodology to distinguish between the two major classes of volume heterogeneities, discrete particles or a fluctuation...acoustics of muddy sediments has become of intense interest in the ONR community and very large and non -linear gradients have been observed in such...method was applied to measured reflection data in a muddy sediment area, where highly non -linear depth-dependent profiles were obtained – informed by the
Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.

PubMed

Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J

2016-10-03

Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank - up to 26mm; this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models our proposed method is an alternative to linear scaling methods. Copyright © 2016 The Author. Published by Elsevier Ltd.. All rights reserved.
Optical measurement of the weak non-linearity in the eardrum vibration response to auditory stimuli

NASA Astrophysics Data System (ADS)

Aerts, Johan

The mammalian hearing organ consists of the external ear (auricle and ear canal) followed by the middle ear (eardrum and ossicles) and the inner ear (cochlea). Its function is to convert the incoming sound waves and convert them into nerve pulses which are processed in the final stage by the brain. The main task of the external and middle ear is to concentrate the incoming sound waves on a smaller surface to reduce the loss that would normally occur in transmission from air to inner ear fluid. In the past it has been shown that this is a linear process, thus without serious distortions, for sound waves going up to pressures of 130 dB SPL (˜90 Pa). However, at large pressure changes up to several kPa, the middle ear movement clearly shows non-linear behaviour. Thus, it is possible that some small non-linear distortions are also present in the middle ear vibration at lower sound pressures. In this thesis a sensitive measurement set-up is presented to detect this weak non-linear behaviour. Essentially, this set-up consists of a loud-speaker which excites the middle ear, and the resulting vibration is measured with an heterodyne vibrometer. The use of specially designed acoustic excitation signals (odd random phase multisines) enables the separation of the linear and non-linear response. The application of this technique on the middle ear demonstrates that there are already non-linear distortions present in the vibration of the middle ear at a sound pressure of 93 dB SPL. This non-linear component also grows strongly with increasing sound pressure. Knowledge of this non-linear component can contribute to the improvement of modern hearing aids, which operate at higher sound pressures where the non-linearities could distort the signal considerably. It is also important to know the contribution of middle ear non-linearity to otoacoustic emissions. This are non-linearities caused by the active feedback amplifier in the inner ear, and can be detected in the external and middle ear. These signals are used for diagnostic purposes, and therefore it is important to have an estimate the non-linear middle ear contribution to these emissions.
The non-linear power spectrum of the Lyman alpha forest

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arinyo-i-Prats, Andreu; Miralda-Escudé, Jordi; Viel, Matteo

2015-12-01

The Lyman alpha forest power spectrum has been measured on large scales by the BOSS survey in SDSS-III at z∼ 2.3, has been shown to agree well with linear theory predictions, and has provided the first measurement of Baryon Acoustic Oscillations at this redshift. However, the power at small scales, affected by non-linearities, has not been well examined so far. We present results from a variety of hydrodynamic simulations to predict the redshift space non-linear power spectrum of the Lyα transmission for several models, testing the dependence on resolution and box size. A new fitting formula is introduced to facilitate themore » comparison of our simulation results with observations and other simulations. The non-linear power spectrum has a generic shape determined by a transition scale from linear to non-linear anisotropy, and a Jeans scale below which the power drops rapidly. In addition, we predict the two linear bias factors of the Lyα forest and provide a better physical interpretation of their values and redshift evolution. The dependence of these bias factors and the non-linear power on the amplitude and slope of the primordial fluctuations power spectrum, the temperature-density relation of the intergalactic medium, and the mean Lyα transmission, as well as the redshift evolution, is investigated and discussed in detail. A preliminary comparison to the observations shows that the predicted redshift distortion parameter is in good agreement with the recent determination of Blomqvist et al., but the density bias factor is lower than observed. We make all our results publicly available in the form of tables of the non-linear power spectrum that is directly obtained from all our simulations, and parameters of our fitting formula.« less
Elastic Properties of Novel Co- and CoNi-Based Superalloys Determined through Bayesian Inference and Resonant Ultrasound Spectroscopy

NASA Astrophysics Data System (ADS)

Goodlet, Brent R.; Mills, Leah; Bales, Ben; Charpagne, Marie-Agathe; Murray, Sean P.; Lenthe, William C.; Petzold, Linda; Pollock, Tresa M.

2018-06-01

Bayesian inference is employed to precisely evaluate single crystal elastic properties of novel γ -γ ' Co- and CoNi-based superalloys from simple and non-destructive resonant ultrasound spectroscopy (RUS) measurements. Nine alloys from three Co-, CoNi-, and Ni-based alloy classes were evaluated in the fully aged condition, with one alloy per class also evaluated in the solution heat-treated condition. Comparisons are made between the elastic properties of the three alloy classes and among the alloys of a single class, with the following trends observed. A monotonic rise in the c_{44} (shear) elastic constant by a total of 12 pct is observed between the three alloy classes as Co is substituted for Ni. Elastic anisotropy ( A) is also increased, with a large majority of the nearly 13 pct increase occurring after Co becomes the dominant constituent. Together the five CoNi alloys, with Co:Ni ratios from 1:1 to 1.5:1, exhibited remarkably similar properties with an average A 1.8 pct greater than the Ni-based alloy CMSX-4. Custom code demonstrating a substantial advance over previously reported methods for RUS inversion is also reported here for the first time. CmdStan-RUS is built upon the open-source probabilistic programing language of Stan and formulates the inverse problem using Bayesian methods. Bayesian posterior distributions are efficiently computed with Hamiltonian Monte Carlo (HMC), while initial parameterization is randomly generated from weakly informative prior distributions. Remarkably robust convergence behavior is demonstrated across multiple independent HMC chains in spite of initial parameterization often very far from actual parameter values. Experimental procedures are substantially simplified by allowing any arbitrary misorientation between the specimen and crystal axes, as elastic properties and misorientation are estimated simultaneously.
Solving Fuzzy Optimization Problem Using Hybrid Ls-Sa Method

NASA Astrophysics Data System (ADS)

Vasant, Pandian

2011-06-01

Fuzzy optimization problem has been one of the most and prominent topics inside the broad area of computational intelligent. It's especially relevant in the filed of fuzzy non-linear programming. It's application as well as practical realization can been seen in all the real world problems. In this paper a large scale non-linear fuzzy programming problem has been solved by hybrid optimization techniques of Line Search (LS), Simulated Annealing (SA) and Pattern Search (PS). As industrial production planning problem with cubic objective function, 8 decision variables and 29 constraints has been solved successfully using LS-SA-PS hybrid optimization techniques. The computational results for the objective function respect to vagueness factor and level of satisfaction has been provided in the form of 2D and 3D plots. The outcome is very promising and strongly suggests that the hybrid LS-SA-PS algorithm is very efficient and productive in solving the large scale non-linear fuzzy programming problem.
Reconstructing Information in Large-Scale Structure via Logarithmic Mapping

NASA Astrophysics Data System (ADS)

Szapudi, Istvan

We propose to develop a new method to extract information from large-scale structure data combining two-point statistics and non-linear transformations; before, this information was available only with substantially more complex higher-order statistical methods. Initially, most of the cosmological information in large-scale structure lies in two-point statistics. With non- linear evolution, some of that useful information leaks into higher-order statistics. The PI and group has shown in a series of theoretical investigations how that leakage occurs, and explained the Fisher information plateau at smaller scales. This plateau means that even as more modes are added to the measurement of the power spectrum, the total cumulative information (loosely speaking the inverse errorbar) is not increasing. Recently we have shown in Neyrinck et al. (2009, 2010) that a logarithmic (and a related Gaussianization or Box-Cox) transformation on the non-linear Dark Matter or galaxy field reconstructs a surprisingly large fraction of this missing Fisher information of the initial conditions. This was predicted by the earlier wave mechanical formulation of gravitational dynamics by Szapudi & Kaiser (2003). The present proposal is focused on working out the theoretical underpinning of the method to a point that it can be used in practice to analyze data. In particular, one needs to deal with the usual real-life issues of galaxy surveys, such as complex geometry, discrete sam- pling (Poisson or sub-Poisson noise), bias (linear, or non-linear, deterministic, or stochastic), redshift distortions, pro jection effects for 2D samples, and the effects of photometric redshift errors. We will develop methods for weak lensing and Sunyaev-Zeldovich power spectra as well, the latter specifically targetting Planck. In addition, we plan to investigate the question of residual higher- order information after the non-linear mapping, and possible applications for cosmology. Our aim will be to work out practical methods, with the ultimate goal of cosmological parameter estimation. We will quantify with standard MCMC and Fisher methods (including DETF Figure of merit when applicable) the efficiency of our estimators, comparing with the conventional method, that uses the un-transformed field. Preliminary results indicate that the increase for NASA's WFIRST in the DETF Figure of Merit would be 1.5-4.2 using a range of pessimistic to optimistic assumptions, respectively.
Assessing noninferiority in a three-arm trial using the Bayesian approach.

PubMed

Ghosh, Pulak; Nathoo, Farouk; Gönen, Mithat; Tiwari, Ram C

2011-07-10

Non-inferiority trials, which aim to demonstrate that a test product is not worse than a competitor by more than a pre-specified small amount, are of great importance to the pharmaceutical community. As a result, methodology for designing and analyzing such trials is required, and developing new methods for such analysis is an important area of statistical research. The three-arm trial consists of a placebo, a reference and an experimental treatment, and simultaneously tests the superiority of the reference over the placebo along with comparing this reference to an experimental treatment. In this paper, we consider the analysis of non-inferiority trials using Bayesian methods which incorporate both parametric as well as semi-parametric models. The resulting testing approach is both flexible and robust. The benefit of the proposed Bayesian methods is assessed via simulation, based on a study examining home-based blood pressure interventions. Copyright © 2011 John Wiley & Sons, Ltd.
Optimal Sequential Rules for Computer-Based Instruction.

ERIC Educational Resources Information Center

Vos, Hans J.

1998-01-01

Formulates sequential rules for adapting the appropriate amount of instruction to learning needs in the context of computer-based instruction. Topics include Bayesian decision theory, threshold and linear-utility structure, psychometric model, optimal sequential number of test questions, and an empirical example of sequential instructional…
Distributed Estimation using Bayesian Consensus Filtering

DTIC Science & Technology

2014-06-06

Convergence rate analysis of distributed gossip (linear parameter) estimation: Fundamental limits and tradeoffs,” IEEE J. Sel. Topics Signal Process...Dimakis, S. Kar, J. Moura, M. Rabbat, and A. Scaglione, “ Gossip algorithms for distributed signal processing,” Proc. of the IEEE, vol. 98, no. 11, pp
Automating approximate Bayesian computation by local linear regression.

PubMed

Thornton, Kevin R

2009-07-07

In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular.Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, the ABCreg simplifies implementing ABC based on local-linear regression.
Commensurate Priors for Incorporating Historical Information in Clinical Trials Using General and Generalized Linear Models

PubMed Central

Hobbs, Brian P.; Sargent, Daniel J.; Carlin, Bradley P.

2014-01-01

Assessing between-study variability in the context of conventional random-effects meta-analysis is notoriously difficult when incorporating data from only a small number of historical studies. In order to borrow strength, historical and current data are often assumed to be fully homogeneous, but this can have drastic consequences for power and Type I error if the historical information is biased. In this paper, we propose empirical and fully Bayesian modifications of the commensurate prior model (Hobbs et al., 2011) extending Pocock (1976), and evaluate their frequentist and Bayesian properties for incorporating patient-level historical data using general and generalized linear mixed regression models. Our proposed commensurate prior models lead to preposterior admissible estimators that facilitate alternative bias-variance trade-offs than those offered by pre-existing methodologies for incorporating historical data from a small number of historical studies. We also provide a sample analysis of a colon cancer trial comparing time-to-disease progression using a Weibull regression model. PMID:24795786
Evidence of major genes affecting stress response in rainbow trout using Bayesian methods of complex segregation analysis

USDA-ARS?s Scientific Manuscript database

As a first step towards the genetic mapping of quantitative trait loci (QTL) affecting stress response variation in rainbow trout, we performed complex segregation analyses (CSA) fitting mixed inheritance models of plasma cortisol using Bayesian methods in large full-sib families of rainbow trout. ...
An Intuitive Dashboard for Bayesian Network Inference

NASA Astrophysics Data System (ADS)

Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.

2014-03-01

Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
When mechanism matters: Bayesian forecasting using models of ecological diffusion

USGS Publications Warehouse

Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

2017-01-01

Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bates, Cameron Russell; Mckigney, Edward Allen

The use of Bayesian inference in data analysis has become the standard for large scienti c experiments [1, 2]. The Monte Carlo Codes Group(XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform at-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on speci c application requirements. This document describes the algorithmic choices made and presents two use cases.

Estimation for coefficient of variation of an extension of the exponential distribution under type-II censoring scheme

NASA Astrophysics Data System (ADS)

Bakoban, Rana A.

2017-08-01

The coefficient of variation [CV] has several applications in applied statistics. So in this paper, we adopt Bayesian and non-Bayesian approaches for the estimation of CV under type-II censored data from extension exponential distribution [EED]. The point and interval estimate of the CV are obtained for each of the maximum likelihood and parametric bootstrap techniques. Also the Bayesian approach with the help of MCMC method is presented. A real data set is presented and analyzed, hence the obtained results are used to assess the obtained theoretical results.
Comparison of linear and non-linear models for the adsorption of fluoride onto geo-material: limonite.

PubMed

Sahin, Rubina; Tapadia, Kavita

2015-01-01

The three widely used isotherms Langmuir, Freundlich and Temkin were examined in an experiment using fluoride (F⁻) ion adsorption on a geo-material (limonite) at four different temperatures by linear and non-linear models. Comparison of linear and non-linear regression models were given in selecting the optimum isotherm for the experimental results. The coefficient of determination, r², was used to select the best theoretical isotherm. The four Langmuir linear equations (1, 2, 3, and 4) are discussed. Langmuir isotherm parameters obtained from the four Langmuir linear equations using the linear model differed but they were the same when using the nonlinear model. Langmuir-2 isotherm is one of the linear forms, and it had the highest coefficient of determination (r² = 0.99) compared to the other Langmuir linear equations (1, 3 and 4) in linear form, whereas, for non-linear, Langmuir-4 fitted best among all the isotherms because it had the highest coefficient of determination (r² = 0.99). The results showed that the non-linear model may be a better way to obtain the parameters. In the present work, the thermodynamic parameters show that the absorption of fluoride onto limonite is both spontaneous (ΔG < 0) and endothermic (ΔH > 0). Scanning electron microscope and X-ray diffraction images also confirm the adsorption of F⁻ ion onto limonite. The isotherm and kinetic study reveals that limonite can be used as an adsorbent for fluoride removal. In future we can develop new technology for fluoride removal in large scale by using limonite which is cost-effective, eco-friendly and is easily available in the study area.
Plant selection for ethnobotanical uses on the Amalfi Coast (Southern Italy).

PubMed

Savo, V; Joy, R; Caneva, G; McClatchey, W C

2015-07-15

Many ethnobotanical studies have investigated selection criteria for medicinal and non-medicinal plants. In this paper we test several statistical methods using different ethnobotanical datasets in order to 1) define to which extent the nature of the datasets can affect the interpretation of results; 2) determine if the selection for different plant uses is based on phylogeny, or other selection criteria. We considered three different ethnobotanical datasets: two datasets of medicinal plants and a dataset of non-medicinal plants (handicraft production, domestic and agro-pastoral practices) and two floras of the Amalfi Coast. We performed residual analysis from linear regression, the binomial test and the Bayesian approach for calculating under-used and over-used plant families within ethnobotanical datasets. Percentages of agreement were calculated to compare the results of the analyses. We also analyzed the relationship between plant selection and phylogeny, chorology, life form and habitat using the chi-square test. Pearson's residuals for each of the significant chi-square analyses were examined for investigating alternative hypotheses of plant selection criteria. The three statistical analysis methods differed within the same dataset, and between different datasets and floras, but with some similarities. In the two medicinal datasets, only Lamiaceae was identified in both floras as an over-used family by all three statistical methods. All statistical methods in one flora agreed that Malvaceae was over-used and Poaceae under-used, but this was not found to be consistent with results of the second flora in which one statistical result was non-significant. All other families had some discrepancy in significance across methods, or floras. Significant over- or under-use was observed in only a minority of cases. The chi-square analyses were significant for phylogeny, life form and habitat. Pearson's residuals indicated a non-random selection of woody species for non-medicinal uses and an under-use of plants of temperate forests for medicinal uses. Our study showed that selection criteria for plant uses (including medicinal) are not always based on phylogeny. The comparison of different statistical methods (regression, binomial and Bayesian) under different conditions led to the conclusion that the most conservative results are obtained using regression analysis.
Novel method to construct large-scale design space in lubrication process utilizing Bayesian estimation based on a small-scale design-of-experiment and small sets of large-scale manufacturing data.

PubMed

Maeda, Jin; Suzuki, Tatsuya; Takayama, Kozo

2012-12-01

A large-scale design space was constructed using a Bayesian estimation method with a small-scale design of experiments (DoE) and small sets of large-scale manufacturing data without enforcing a large-scale DoE. The small-scale DoE was conducted using various Froude numbers (X(1)) and blending times (X(2)) in the lubricant blending process for theophylline tablets. The response surfaces, design space, and their reliability of the compression rate of the powder mixture (Y(1)), tablet hardness (Y(2)), and dissolution rate (Y(3)) on a small scale were calculated using multivariate spline interpolation, a bootstrap resampling technique, and self-organizing map clustering. The constant Froude number was applied as a scale-up rule. Three experiments under an optimal condition and two experiments under other conditions were performed on a large scale. The response surfaces on the small scale were corrected to those on a large scale by Bayesian estimation using the large-scale results. Large-scale experiments under three additional sets of conditions showed that the corrected design space was more reliable than that on the small scale, even if there was some discrepancy in the pharmaceutical quality between the manufacturing scales. This approach is useful for setting up a design space in pharmaceutical development when a DoE cannot be performed at a commercial large manufacturing scale.
Understanding the impact of career academy attendance: an application of the principal stratification framework for causal effects accounting for partial compliance.

PubMed

Page, Lindsay C

2012-04-01

Results from MDRC's longitudinal, random-assignment evaluation of career-academy high schools reveal that several years after high-school completion, those randomized to receive the academy opportunity realized a $175 (11%) increase in monthly earnings, on average. In this paper, I investigate the impact of duration of actual academy enrollment, as nearly half of treatment group students either never enrolled or participated for only a portion of high school. I capitalize on data from this experimental evaluation and utilize a principal stratification framework and Bayesian inference to investigate the causal impact of academy participation. This analysis focuses on a sample of 1,306 students across seven sites in the MDRC evaluation. Participation is measured by number of years of academy enrollment, and the outcome of interest is average monthly earnings in the period of four to eight years after high school graduation. I estimate an average causal effect of treatment assignment on subsequent monthly earnings of approximately $588 among males who remained enrolled in an academy throughout high school and more modest impacts among those who participated only partially. Different from an instrumental variables approach to treatment non-compliance, which allows for the estimation of linear returns to treatment take-up, the more general framework of principal stratification allows for the consideration of non-linear returns, although at the expense of additional model-based assumptions.
Uncertainty aggregation and reduction in structure-material performance prediction

NASA Astrophysics Data System (ADS)

Hu, Zhen; Mahadevan, Sankaran; Ao, Dan

2018-02-01

An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.
A Large Signal Model for CMUT Arrays with Arbitrary Membrane Geometries Operating in Non-Collapsed Mode

PubMed Central

Satir, Sarp; Zahorian, Jaime; Degertekin, F. Levent

2014-01-01

A large signal, transient model has been developed to predict the output characteristics of a CMUT array operated in the non-collapse mode. The model is based on separation of the nonlinear electrostatic voltage-to-force relation and the linear acoustic array response. For linear acoustic radiation and crosstalk effects, the boundary element method is used. The stiffness matrix in the vibroacoustics calculations is obtained using static finite element analysis of a single membrane which can have arbitrary geometry and boundary conditions. A lumped modeling approach is used to reduce the order of the system for modeling the transient nonlinear electrostatic actuation. To accurately capture the dynamics of the non-uniform electrostatic force distribution over the CMUT electrode during large deflections, the membrane electrode is divided into patches shaped to match higher order membrane modes, each introducing a variable to the system model. This reduced order nonlinear lumped model is solved in the time domain using Simulink. The model has two linear blocks to calculate the displacement profile of the electrode patches and the output pressure for a given force distribution over the array, respectively. The force to array displacement block uses the linear acoustic model, and the Rayleigh integral is evaluated to calculate the pressure at any field point. Using the model, the transient transmitted pressure can be simulated for different large signal drive signal configurations. The acoustic model is verified by comparison to harmonic FEA in vacuum and fluid for high and low aspect ratio membranes as well as mass-loaded membranes. The overall Simulink model is verified by comparison to transient 3D FEA and experimental results for different large drive signals; and an example for a phased array simulation is given. PMID:24158297
Comparison between the basic least squares and the Bayesian approach for elastic constants identification

NASA Astrophysics Data System (ADS)

Gogu, C.; Haftka, R.; LeRiche, R.; Molimard, J.; Vautrin, A.; Sankar, B.

2008-11-01

The basic formulation of the least squares method, based on the L2 norm of the misfit, is still widely used today for identifying elastic material properties from experimental data. An alternative statistical approach is the Bayesian method. We seek here situations with significant difference between the material properties found by the two methods. For a simple three bar truss example we illustrate three such situations in which the Bayesian approach leads to more accurate results: different magnitude of the measurements, different uncertainty in the measurements and correlation among measurements. When all three effects add up, the Bayesian approach can have a large advantage. We then compared the two methods for identification of elastic constants from plate vibration natural frequencies.
Comparison of estimation methods for creating small area rates of acute myocardial infarction among Medicare beneficiaries in California.

PubMed

Yasaitis, Laura C; Arcaya, Mariana C; Subramanian, S V

2015-09-01

Creating local population health measures from administrative data would be useful for health policy and public health monitoring purposes. While a wide range of options--from simple spatial smoothers to model-based methods--for estimating such rates exists, there are relatively few side-by-side comparisons, especially not with real-world data. In this paper, we compare methods for creating local estimates of acute myocardial infarction rates from Medicare claims data. A Bayesian Monte Carlo Markov Chain estimator that incorporated spatial and local random effects performed best, followed by a method-of-moments spatial Empirical Bayes estimator. As the former is more complicated and time-consuming, spatial linear Empirical Bayes methods may represent a good alternative for non-specialist investigators. Copyright © 2015 Elsevier Ltd. All rights reserved.
Bayesian spatial transformation models with applications in neuroimaging data.

PubMed

Miranda, Michelle F; Zhu, Hongtu; Ibrahim, Joseph G

2013-12-01

The aim of this article is to develop a class of spatial transformation models (STM) to spatially model the varying association between imaging measures in a three-dimensional (3D) volume (or 2D surface) and a set of covariates. The proposed STM include a varying Box-Cox transformation model for dealing with the issue of non-Gaussian distributed imaging data and a Gaussian Markov random field model for incorporating spatial smoothness of the imaging data. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. Simulations and real data analysis demonstrate that the STM significantly outperforms the voxel-wise linear model with Gaussian noise in recovering meaningful geometric patterns. Our STM is able to reveal important brain regions with morphological changes in children with attention deficit hyperactivity disorder. © 2013, The International Biometric Society.
Estimating neural response functions from fMRI

PubMed Central

Kumar, Sukhbinder; Penny, William

2014-01-01

This paper proposes a methodology for estimating Neural Response Functions (NRFs) from fMRI data. These NRFs describe non-linear relationships between experimental stimuli and neuronal population responses. The method is based on a two-stage model comprising an NRF and a Hemodynamic Response Function (HRF) that are simultaneously fitted to fMRI data using a Bayesian optimization algorithm. This algorithm also produces a model evidence score, providing a formal model comparison method for evaluating alternative NRFs. The HRF is characterized using previously established “Balloon” and BOLD signal models. We illustrate the method with two example applications based on fMRI studies of the auditory system. In the first, we estimate the time constants of repetition suppression and facilitation, and in the second we estimate the parameters of population receptive fields in a tonotopic mapping study. PMID:24847246
Reconstructing for joint angles on the shoulder and elbow from non-invasive electroencephalographic signals through electromyography

PubMed Central

Choi, Kyuwan

2013-01-01

In this study, first the cortical activities over 2240 vertexes on the brain were estimated from 64 channels electroencephalography (EEG) signals using the Hierarchical Bayesian estimation while 5 subjects did continuous arm reaching movements. From the estimated cortical activities, a sparse linear regression method selected only useful features in reconstructing the electromyography (EMG) signals and estimated the EMG signals of 9 arm muscles. Then, a modular artificial neural network was used to estimate four joint angles from the estimated EMG signals of 9 muscles: one for movement control and the other for posture control. The estimated joint angles using this method have the correlation coefficient (CC) of 0.807 (±0.10) and the normalized root-mean-square error (nRMSE) of 0.176 (±0.29) with the actual joint angles. PMID:24167469
A Bayesian Model for the Estimation of Latent Interaction and Quadratic Effects When Latent Variables Are Non-Normally Distributed

ERIC Educational Resources Information Center

Kelava, Augustin; Nagengast, Benjamin

2012-01-01

Structural equation models with interaction and quadratic effects have become a standard tool for testing nonlinear hypotheses in the social sciences. Most of the current approaches assume normally distributed latent predictor variables. In this article, we present a Bayesian model for the estimation of latent nonlinear effects when the latent…
Application of Bayesian Maximum Entropy Filter in parameter calibration of groundwater flow model in PingTung Plain

NASA Astrophysics Data System (ADS)

Cheung, Shao-Yong; Lee, Chieh-Han; Yu, Hwa-Lung

2017-04-01

Due to the limited hydrogeological observation data and high levels of uncertainty within, parameter estimation of the groundwater model has been an important issue. There are many methods of parameter estimation, for example, Kalman filter provides a real-time calibration of parameters through measurement of groundwater monitoring wells, related methods such as Extended Kalman Filter and Ensemble Kalman Filter are widely applied in groundwater research. However, Kalman Filter method is limited to linearity. This study propose a novel method, Bayesian Maximum Entropy Filtering, which provides a method that can considers the uncertainty of data in parameter estimation. With this two methods, we can estimate parameter by given hard data (certain) and soft data (uncertain) in the same time. In this study, we use Python and QGIS in groundwater model (MODFLOW) and development of Extended Kalman Filter and Bayesian Maximum Entropy Filtering in Python in parameter estimation. This method may provide a conventional filtering method and also consider the uncertainty of data. This study was conducted through numerical model experiment to explore, combine Bayesian maximum entropy filter and a hypothesis for the architecture of MODFLOW groundwater model numerical estimation. Through the virtual observation wells to simulate and observe the groundwater model periodically. The result showed that considering the uncertainty of data, the Bayesian maximum entropy filter will provide an ideal result of real-time parameters estimation.
Non-Linear Dynamics of Saturn’s Rings

NASA Astrophysics Data System (ADS)

Esposito, Larry W.

2015-11-01

Non-linear processes can explain why Saturn’s rings are so active and dynamic. Ring systems differ from simple linear systems in two significant ways: 1. They are systems of granular material: where particle-to-particle collisions dominate; thus a kinetic, not a fluid description needed. We find that stresses are strikingly inhomogeneous and fluctuations are large compared to equilibrium. 2. They are strongly forced by resonances: which drive a non-linear response, pushing the system across thresholds that lead to persistent states.Some of this non-linearity is captured in a simple Predator-Prey Model: Periodic forcing from the moon causes streamline crowding; This damps the relative velocity, and allows aggregates to grow. About a quarter phase later, the aggregates stir the system to higher relative velocity and the limit cycle repeats each orbit.Summary of Halo Results: A predator-prey model for ring dynamics produces transient structures like ‘straw’ that can explain the halo structure and spectroscopy: This requires energetic collisions (v ≈ 10m/sec, with throw distances about 200km, implying objects of scale R ≈ 20km).Transform to Duffing Eqn : With the coordinate transformation, z = M2/3, the Predator-Prey equations can be combined to form a single second-order differential equation with harmonic resonance forcing.Ring dynamics and history implications: Moon-triggered clumping at perturbed regions in Saturn’s rings creates both high velocity dispersion and large aggregates at these distances, explaining both small and large particles observed there. We calculate the stationary size distribution using a cell-to-cell mapping procedure that converts the phase-plane trajectories to a Markov chain. Approximating the Markov chain as an asymmetric random walk with reflecting boundaries allows us to determine the power law index from results of numerical simulations in the tidal environment surrounding Saturn. Aggregates can explain many dynamic aspects of the rings and can renew rings by shielding and recycling the material within them, depending on how long the mass is sequestered. We can ask: Are Saturn’s rings a chaotic non-linear driven system?
Diagnostics for generalized linear hierarchical models in network meta-analysis.

PubMed

Zhao, Hong; Hodges, James S; Carlin, Bradley P

2017-09-01

Network meta-analysis (NMA) combines direct and indirect evidence comparing more than 2 treatments. Inconsistency arises when these 2 information sources differ. Previous work focuses on inconsistency detection, but little has been done on how to proceed after identifying inconsistency. The key issue is whether inconsistency changes an NMA's substantive conclusions. In this paper, we examine such discrepancies from a diagnostic point of view. Our methods seek to detect influential and outlying observations in NMA at a trial-by-arm level. These observations may have a large effect on the parameter estimates in NMA, or they may deviate markedly from other observations. We develop formal diagnostics for a Bayesian hierarchical model to check the effect of deleting any observation. Diagnostics are specified for generalized linear hierarchical NMA models and investigated for both published and simulated datasets. Results from our example dataset using either contrast- or arm-based models and from the simulated datasets indicate that the sources of inconsistency in NMA tend not to be influential, though results from the example dataset suggest that they are likely to be outliers. This mimics a familiar result from linear model theory, in which outliers with low leverage are not influential. Future extensions include incorporating baseline covariates and individual-level patient data. Copyright © 2017 John Wiley & Sons, Ltd.
Pyrogenic carbon distribution in mineral topsoils of the northeastern United States

USGS Publications Warehouse

Jauss, Verena; Sullivan, Patrick J.; Sanderman, Jonathan; Smith, David; Lehmann, Johannes

2017-01-01

Due to its slow turnover rates in soil, pyrogenic carbon (PyC) is considered an important C pool and relevant to climate change processes. Therefore, the amounts of soil PyC were compared to environmental covariates over an area of 327,757 km2 in the northeastern United States in order to understand the controls on PyC distribution over large areas. Topsoil (defined as the soil A horizon, after removal of any organic horizons) samples were collected at 165 field sites in a generalised random tessellation stratified design that corresponded to approximately 1 site per 1600 km2 and PyC was estimated from diffuse reflectance mid-infrared spectroscopy measurements using a partial least-squares regression analysis in conjunction with a large database of PyC measurements based on a solid-state 13C nuclear magnetic resonance spectroscopy technique. Three spatial models were applied to the data in order to relate critical environmental covariates to the changes in spatial density of PyC over the landscape. Regional mean density estimates of PyC were 11.0 g kg− 1 (0.84 Gg km− 2) for Ordinary Kriging, 25.8 g kg− 1(12.2 Gg km− 2) for Multivariate Linear Regression, and 26.1 g kg− 1 (12.4 Gg km− 2) for Bayesian Regression Kriging. Akaike Information Criterion (AIC) indicated that the Multivariate Linear Regression model performed best (AIC = 842.6; n = 165) compared to Ordinary Kriging (AIC = 982.4) and Bayesian Regression Kriging (AIC = 979.2). Soil PyC concentrations correlated well with total soil sulphur (P < 0.001; n = 165), plant tissue lignin (P = 0.003), and drainage class (P = 0.008). This suggests the opportunity of including related environmental parameters in the spatial assessment of PyC in soils. Better estimates of the contribution of PyC to the global carbon cycle will thus also require more accurate assessments of these covariates.
Online Variational Bayesian Filtering-Based Mobile Target Tracking in Wireless Sensor Networks

PubMed Central

Zhou, Bingpeng; Chen, Qingchun; Li, Tiffany Jing; Xiao, Pei

2014-01-01

The received signal strength (RSS)-based online tracking for a mobile node in wireless sensor networks (WSNs) is investigated in this paper. Firstly, a multi-layer dynamic Bayesian network (MDBN) is introduced to characterize the target mobility with either directional or undirected movement. In particular, it is proposed to employ the Wishart distribution to approximate the time-varying RSS measurement precision's randomness due to the target movement. It is shown that the proposed MDBN offers a more general analysis model via incorporating the underlying statistical information of both the target movement and observations, which can be utilized to improve the online tracking capability by exploiting the Bayesian statistics. Secondly, based on the MDBN model, a mean-field variational Bayesian filtering (VBF) algorithm is developed to realize the online tracking of a mobile target in the presence of nonlinear observations and time-varying RSS precision, wherein the traditional Bayesian filtering scheme cannot be directly employed. Thirdly, a joint optimization between the real-time velocity and its prior expectation is proposed to enable online velocity tracking in the proposed online tacking scheme. Finally, the associated Bayesian Cramer–Rao Lower Bound (BCRLB) analysis and numerical simulations are conducted. Our analysis unveils that, by exploiting the potential state information via the general MDBN model, the proposed VBF algorithm provides a promising solution to the online tracking of a mobile node in WSNs. In addition, it is shown that the final tracking accuracy linearly scales with its expectation when the RSS measurement precision is time-varying. PMID:25393784
Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: An example from a vertigo phase III study with longitudinal count data as primary endpoint

PubMed Central

2012-01-01

Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). Results The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. Conclusions The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint. PMID:22962944
Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: an example from a vertigo phase III study with longitudinal count data as primary endpoint.

PubMed

Adrion, Christine; Mansmann, Ulrich

2012-09-10

A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint.

Low-Loss Materials for Josephson Qubits

DTIC Science & Technology

2014-10-09

quantum circuit. It also intuitively explains how for a linear circuit the standard results for electrical circuits are obtained, justifying the use of... linear concepts for a weakly non- linear device such as the transmon. It has also become common to use a double sided noise spectrum to represent...loss tangent of large area pad junction. (c) Effective linearized circuit for the double junction, which makes up the admittance $Y$. $L_j$ is the
Core turbulence behavior moving from ion-temperature-gradient regime towards trapped-electron-mode regime in the ASDEX Upgrade tokamak and comparison with gyrokinetic simulation

NASA Astrophysics Data System (ADS)

Happel, T.; Navarro, A. Bañón; Conway, G. D.; Angioni, C.; Bernert, M.; Dunne, M.; Fable, E.; Geiger, B.; Görler, T.; Jenko, F.; McDermott, R. M.; Ryter, F.; Stroth, U.

2015-03-01

Additional electron cyclotron resonance heating (ECRH) is used in an ion-temperature-gradient instability dominated regime to increase R / L Te in order to approach the trapped-electron-mode instability regime. The radial ECRH deposition location determines to a large degree the effect on R / L Te . Accompanying scale-selective turbulence measurements at perpendicular wavenumbers between k⊥ = 4-18 cm-1 (k⊥ρs = 0.7-4.2) show a pronounced increase of large-scale density fluctuations close to the ECRH radial deposition location at mid-radius, along with a reduction in phase velocity of large-scale density fluctuations. Measurements are compared with results from linear and non-linear flux-matched gyrokinetic (GK) simulations with the gyrokinetic code GENE. Linear GK simulations show a reduction of phase velocity, indicating a pronounced change in the character of the dominant instability. Comparing measurement and non-linear GK simulation, as a central result, agreement is obtained in the shape of radial turbulence level profiles. However, the turbulence intensity is increasing with additional heating in the experiment, while gyrokinetic simulations show a decrease.
Non-linear interactions between CO_2 radiative and physiological effects on Amazonian evapotranspiration in an Earth system model

NASA Astrophysics Data System (ADS)

Halladay, Kate; Good, Peter

2017-10-01

We present a detailed analysis of mechanisms underlying the evapotranspiration response to increased CO_2 in HadGEM2-ES, focussed on western Amazonia. We use three simulations from CMIP5 in which atmospheric CO_2 increases at 1% per year reaching approximately four times pre-industrial levels after 140 years. Using 3-hourly data, we found that evapotranspiration (ET) change was dominated by decreased stomatal conductance (g_s), and to a lesser extent by decreased canopy water and increased moisture gradient (specific humidity difference between surface and near-surface). There were large, non-linear decreases in ET in the simulation in which radiative and physiological forcings could interact. This non-linearity arises from non-linearity in the conductance term (includes aerodynamic and stomatal resistance and partitioning between the two, which is determined by canopy water availability), the moisture gradient, and negative correlation between these two terms. The conductance term is non-linear because GPP responds non-linearly to temperature and GPP is the dominant control on g_s in HadGEM2-ES. In addition, canopy water declines, mainly due to increases in potential evaporation, which further decrease the conductance term. The moisture gradient responds non-linearly owing to the non-linear response of temperature to CO_2 increases, which increases the Bowen ratio. Moisture gradient increases resulting from ET decline increase ET and thus constitute a negative feedback. This analysis highlights the importance of the g_s parametrisation in determining the ET response and the potential differences between offline and online simulations owing to feedbacks on ET via the atmosphere, some of which would not occur in an offline simulation.
Use of limited data to construct Bayesian networks for probabilistic risk assessment.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Groth, Katrina M.; Swiler, Laura Painton

2013-03-01

Probabilistic Risk Assessment (PRA) is a fundamental part of safety/quality assurance for nuclear power and nuclear weapons. Traditional PRA very effectively models complex hardware system risks using binary probabilistic models. However, traditional PRA models are not flexible enough to accommodate non-binary soft-causal factors, such as digital instrumentation&control, passive components, aging, common cause failure, and human errors. Bayesian Networks offer the opportunity to incorporate these risks into the PRA framework. This report describes the results of an early career LDRD project titled %E2%80%9CUse of Limited Data to Construct Bayesian Networks for Probabilistic Risk Assessment%E2%80%9D. The goal of the work was tomore » establish the capability to develop Bayesian Networks from sparse data, and to demonstrate this capability by producing a data-informed Bayesian Network for use in Human Reliability Analysis (HRA) as part of nuclear power plant Probabilistic Risk Assessment (PRA). This report summarizes the research goal and major products of the research.« less
Assessment of parametric uncertainty for groundwater reactive transport modeling,

USGS Publications Warehouse

Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun

2014-01-01

The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.
Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests

NASA Astrophysics Data System (ADS)

Wheeler, David C.; Waller, Lance A.

2009-03-01

In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization

ERIC Educational Resources Information Center

Gelman, Andrew; Lee, Daniel; Guo, Jiqiang

2015-01-01

Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
A Comparison of the β-Substitution Method and a Bayesian Method for Analyzing Left-Censored Data.

PubMed

Huynh, Tran; Quick, Harrison; Ramachandran, Gurumurthy; Banerjee, Sudipto; Stenzel, Mark; Sandler, Dale P; Engel, Lawrence S; Kwok, Richard K; Blair, Aaron; Stewart, Patricia A

2016-01-01

Classical statistical methods for analyzing exposure data with values below the detection limits are well described in the occupational hygiene literature, but an evaluation of a Bayesian approach for handling such data is currently lacking. Here, we first describe a Bayesian framework for analyzing censored data. We then present the results of a simulation study conducted to compare the β-substitution method with a Bayesian method for exposure datasets drawn from lognormal distributions and mixed lognormal distributions with varying sample sizes, geometric standard deviations (GSDs), and censoring for single and multiple limits of detection. For each set of factors, estimates for the arithmetic mean (AM), geometric mean, GSD, and the 95th percentile (X0.95) of the exposure distribution were obtained. We evaluated the performance of each method using relative bias, the root mean squared error (rMSE), and coverage (the proportion of the computed 95% uncertainty intervals containing the true value). The Bayesian method using non-informative priors and the β-substitution method were generally comparable in bias and rMSE when estimating the AM and GM. For the GSD and the 95th percentile, the Bayesian method with non-informative priors was more biased and had a higher rMSE than the β-substitution method, but use of more informative priors generally improved the Bayesian method's performance, making both the bias and the rMSE more comparable to the β-substitution method. An advantage of the Bayesian method is that it provided estimates of uncertainty for these parameters of interest and good coverage, whereas the β-substitution method only provided estimates of uncertainty for the AM, and coverage was not as consistent. Selection of one or the other method depends on the needs of the practitioner, the availability of prior information, and the distribution characteristics of the measurement data. We suggest the use of Bayesian methods if the practitioner has the computational resources and prior information, as the method would generally provide accurate estimates and also provides the distributions of all of the parameters, which could be useful for making decisions in some applications. © The Author 2015. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
Phylodynamic Analysis Reveals CRF01_AE Dissemination between Japan and Neighboring Asian Countries and the Role of Intravenous Drug Use in Transmission

PubMed Central

Shiino, Teiichiro; Hattori, Junko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

2014-01-01

Background One major circulating HIV-1 subtype in Southeast Asian countries is CRF01_AE, but little is known about its epidemiology in Japan. We conducted a molecular phylodynamic study of patients newly diagnosed with CRF01_AE from 2003 to 2010. Methods Plasma samples from patients registered in Japanese Drug Resistance HIV-1 Surveillance Network were analyzed for protease-reverse transcriptase sequences; all sequences undergo subtyping and phylogenetic analysis using distance-matrix-based, maximum likelihood and Bayesian coalescent Markov Chain Monte Carlo (MCMC) phylogenetic inferences. Transmission clusters were identified using interior branch test and depth-first searches for sub-tree partitions. Times of most recent common ancestor (tMRCAs) of significant clusters were estimated using Bayesian MCMC analysis. Results Among 3618 patient registered in our network, 243 were infected with CRF01_AE. The majority of individuals with CRF01_AE were Japanese, predominantly male, and reported heterosexual contact as their risk factor. We found 5 large clusters with ≥5 members and 25 small clusters consisting of pairs of individuals with highly related CRF01_AE strains. The earliest cluster showed a tMRCA of 1996, and consisted of individuals with their known risk as heterosexual contacts. The other four large clusters showed later tMRCAs between 2000 and 2002 with members including intravenous drug users (IVDU) and non-Japanese, but not men who have sex with men (MSM). In contrast, small clusters included a high frequency of individuals reporting MSM risk factors. Phylogenetic analysis also showed that some individuals infected with HIV strains spread in East and South-eastern Asian countries. Conclusions Introduction of CRF01_AE viruses into Japan is estimated to have occurred in the 1990s. CFR01_AE spread via heterosexual behavior, then among persons connected with non-Japanese, IVDU, and MSM. Phylogenetic analysis demonstrated that some viral variants are largely restricted to Japan, while others have a broad geographic distribution. PMID:25025900
Using Bayesian neural networks to classify forest scenes

NASA Astrophysics Data System (ADS)

Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni

1998-10-01

We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. Classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects and diverse lighting conditions under operation. This makes it difficult to determine which are optimal image features for the classification. A natural way to proceed is to extract many different types of potentially suitable features, and to evaluate their usefulness in later processing stages. One approach to cope with large number of features is to use Bayesian methods to control the model complexity. Bayesian learning uses a prior on model parameters, combines this with evidence from a training data, and the integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) The evidence framework of MacKay uses a Gaussian approximation to the posterior weight distribution and maximizes with respect to hyperparameters. (2) In a Markov Chain Monte Carlo (MCMC) method due to Neal, the posterior distribution of the network parameters is numerically integrated using the MCMC method. As baseline classifiers for comparison we use (3) MLP early stop committee, (4) K-nearest-neighbor and (5) Classification And Regression Tree.
Quantifying uncertainties of seismic Bayesian inversion of Northern Great Plains

NASA Astrophysics Data System (ADS)

Gao, C.; Lekic, V.

2017-12-01

Elastic waves excited by earthquakes are the fundamental observations of the seismological studies. Seismologists measure information such as travel time, amplitude, and polarization to infer the properties of earthquake source, seismic wave propagation, and subsurface structure. Across numerous applications, seismic imaging has been able to take advantage of complimentary seismic observables to constrain profiles and lateral variations of Earth's elastic properties. Moreover, seismic imaging plays a unique role in multidisciplinary studies of geoscience by providing direct constraints on the unreachable interior of the Earth. Accurate quantification of uncertainties of inferences made from seismic observations is of paramount importance for interpreting seismic images and testing geological hypotheses. However, such quantification remains challenging and subjective due to the non-linearity and non-uniqueness of geophysical inverse problem. In this project, we apply a reverse jump Markov chain Monte Carlo (rjMcMC) algorithm for a transdimensional Bayesian inversion of continental lithosphere structure. Such inversion allows us to quantify the uncertainties of inversion results by inverting for an ensemble solution. It also yields an adaptive parameterization that enables simultaneous inversion of different elastic properties without imposing strong prior information on the relationship between them. We present retrieved profiles of shear velocity (Vs) and radial anisotropy in Northern Great Plains using measurements from USArray stations. We use both seismic surface wave dispersion and receiver function data due to their complementary constraints of lithosphere structure. Furthermore, we analyze the uncertainties of both individual and joint inversion of those two data types to quantify the benefit of doing joint inversion. As an application, we infer the variation of Moho depths and crustal layering across the northern Great Plains.
Bayesian soft X-ray tomography using non-stationary Gaussian Processes

NASA Astrophysics Data System (ADS)

Li, Dong; Svensson, J.; Thomsen, H.; Medina, F.; Werner, A.; Wolf, R.

2013-08-01

In this study, a Bayesian based non-stationary Gaussian Process (GP) method for the inference of soft X-ray emissivity distribution along with its associated uncertainties has been developed. For the investigation of equilibrium condition and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is of importance to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in a probability form which enhances the capability of uncertainty analysis, in consequence, scientists who concern the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumption can be optimized through a Bayesian Occam's Razor formalism and thereby automatically adjust the model complexity. This method is shown to produce convincing reconstructions and good agreements with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.
Bayesian soft X-ray tomography using non-stationary Gaussian Processes.

PubMed

Li, Dong; Svensson, J; Thomsen, H; Medina, F; Werner, A; Wolf, R

2013-08-01

In this study, a Bayesian based non-stationary Gaussian Process (GP) method for the inference of soft X-ray emissivity distribution along with its associated uncertainties has been developed. For the investigation of equilibrium condition and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is of importance to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in a probability form which enhances the capability of uncertainty analysis, in consequence, scientists who concern the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumption can be optimized through a Bayesian Occam's Razor formalism and thereby automatically adjust the model complexity. This method is shown to produce convincing reconstructions and good agreements with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.
A statistical assessment of seismic models of the U.S. continental crust using Bayesian inversion of ambient noise surface wave dispersion data

NASA Astrophysics Data System (ADS)

Olugboji, T. M.; Lekic, V.; McDonough, W.

2017-07-01

We present a new approach for evaluating existing crustal models using ambient noise data sets and its associated uncertainties. We use a transdimensional hierarchical Bayesian inversion approach to invert ambient noise surface wave phase dispersion maps for Love and Rayleigh waves using measurements obtained from Ekström (2014). Spatiospectral analysis shows that our results are comparable to a linear least squares inverse approach (except at higher harmonic degrees), but the procedure has additional advantages: (1) it yields an autoadaptive parameterization that follows Earth structure without making restricting assumptions on model resolution (regularization or damping) and data errors; (2) it can recover non-Gaussian phase velocity probability distributions while quantifying the sources of uncertainties in the data measurements and modeling procedure; and (3) it enables statistical assessments of different crustal models (e.g., CRUST1.0, LITHO1.0, and NACr14) using variable resolution residual and standard deviation maps estimated from the ensemble. These assessments show that in the stable old crust of the Archean, the misfits are statistically negligible, requiring no significant update to crustal models from the ambient noise data set. In other regions of the U.S., significant updates to regionalization and crustal structure are expected especially in the shallow sedimentary basins and the tectonically active regions, where the differences between model predictions and data are statistically significant.
Choosing the Optimal Number of B-spline Control Points (Part 1: Methodology and Approximation of Curves)

NASA Astrophysics Data System (ADS)

Harmening, Corinna; Neuner, Hans

2016-09-01

Due to the establishment of terrestrial laser scanner, the analysis strategies in engineering geodesy change from pointwise approaches to areal ones. These areal analysis strategies are commonly built on the modelling of the acquired point clouds. Freeform curves and surfaces like B-spline curves/surfaces are one possible approach to obtain space continuous information. A variety of parameters determines the B-spline's appearance; the B-spline's complexity is mostly determined by the number of control points. Usually, this number of control points is chosen quite arbitrarily by intuitive trial-and-error-procedures. In this paper, the Akaike Information Criterion and the Bayesian Information Criterion are investigated with regard to a justified and reproducible choice of the optimal number of control points of B-spline curves. Additionally, we develop a method which is based on the structural risk minimization of the statistical learning theory. Unlike the Akaike and the Bayesian Information Criteria this method doesn't use the number of parameters as complexity measure of the approximating functions but their Vapnik-Chervonenkis-dimension. Furthermore, it is also valid for non-linear models. Thus, the three methods differ in their target function to be minimized and consequently in their definition of optimality. The present paper will be continued by a second paper dealing with the choice of the optimal number of control points of B-spline surfaces.
Fossil Signatures Using Elemental Abundance Distributions and Bayesian Probabilistic Classification

NASA Technical Reports Server (NTRS)

Hoover, Richard B.; Storrie-Lombardi, Michael C.

2004-01-01

Elemental abundances (C6, N7, O8, Na11, Mg12, Al3, P15, S16, Cl17, K19, Ca20, Ti22, Mn25, Fe26, and Ni28) were obtained for a set of terrestrial fossils and the rock matrix surrounding them. Principal Component Analysis extracted five factors accounting for the 92.5% of the data variance, i.e. information content, of the elemental abundance data. Hierarchical Cluster Analysis provided unsupervised sample classification distinguishing fossil from matrix samples on the basis of either raw abundances or PCA input that agreed strongly with visual classification. A stochastic, non-linear Artificial Neural Network produced a Bayesian probability of correct sample classification. The results provide a quantitative probabilistic methodology for discriminating terrestrial fossils from the surrounding rock matrix using chemical information. To demonstrate the applicability of these techniques to the assessment of meteoritic samples or in situ extraterrestrial exploration, we present preliminary data on samples of the Orgueil meteorite. In both systems an elemental signature produces target classification decisions remarkably consistent with morphological classification by a human expert using only structural (visual) information. We discuss the possibility of implementing a complexity analysis metric capable of automating certain image analysis and pattern recognition abilities of the human eye using low magnification optical microscopy images and discuss the extension of this technique across multiple scales.
A Gaussian Approximation Approach for Value of Information Analysis.

PubMed

Jalal, Hawre; Alarid-Escudero, Fernando

2018-02-01

Most decisions are associated with uncertainty. Value of information (VOI) analysis quantifies the opportunity loss associated with choosing a suboptimal intervention based on current imperfect information. VOI can inform the value of collecting additional information, resource allocation, research prioritization, and future research designs. However, in practice, VOI remains underused due to many conceptual and computational challenges associated with its application. Expected value of sample information (EVSI) is rooted in Bayesian statistical decision theory and measures the value of information from a finite sample. The past few years have witnessed a dramatic growth in computationally efficient methods to calculate EVSI, including metamodeling. However, little research has been done to simplify the experimental data collection step inherent to all EVSI computations, especially for correlated model parameters. This article proposes a general Gaussian approximation (GA) of the traditional Bayesian updating approach based on the original work by Raiffa and Schlaifer to compute EVSI. The proposed approach uses a single probabilistic sensitivity analysis (PSA) data set and involves 2 steps: 1) a linear metamodel step to compute the EVSI on the preposterior distributions and 2) a GA step to compute the preposterior distribution of the parameters of interest. The proposed approach is efficient and can be applied for a wide range of data collection designs involving multiple non-Gaussian parameters and unbalanced study designs. Our approach is particularly useful when the parameters of an economic evaluation are correlated or interact.
A new framework for modeling decisions about changing information: The Piecewise Linear Ballistic Accumulator model

PubMed Central

Heathcote, Andrew

2016-01-01

In the real world, decision making processes must be able to integrate non-stationary information that changes systematically while the decision is in progress. Although theories of decision making have traditionally been applied to paradigms with stationary information, non-stationary stimuli are now of increasing theoretical interest. We use a random-dot motion paradigm along with cognitive modeling to investigate how the decision process is updated when a stimulus changes. Participants viewed a cloud of moving dots, where the motion switched directions midway through some trials, and were asked to determine the direction of motion. Behavioral results revealed a strong delay effect: after presentation of the initial motion direction there is a substantial time delay before the changed motion information is integrated into the decision process. To further investigate the underlying changes in the decision process, we developed a Piecewise Linear Ballistic Accumulator model (PLBA). The PLBA is efficient to simulate, enabling it to be fit to participant choice and response-time distribution data in a hierarchal modeling framework using a non-parametric approximate Bayesian algorithm. Consistent with behavioral results, PLBA fits confirmed the presence of a long delay between presentation and integration of new stimulus information, but did not support increased response caution in reaction to the change. We also found the decision process was not veridical, as symmetric stimulus change had an asymmetric effect on the rate of evidence accumulation. Thus, the perceptual decision process was slow to react to, and underestimated, new contrary motion information. PMID:26760448
Non-Linear Vibroisolation Pads Design, Numerical FEM Analysis and Introductory Experimental Investigations

NASA Astrophysics Data System (ADS)

Zielnica, J.; Ziółkowski, A.; Cempel, C.

2003-03-01

Design and theoretical and experimental investigation of vibroisolation pads with non-linear static and dynamic responses is the objective of the paper. The analytical investigations are based on non-linear finite element analysis where the load-deflection response is traced against the shape and material properties of the analysed model of the vibroisolation pad. A new model of vibroisolation pad of antisymmetrical type was designed and analysed by the finite element method based on the second-order theory (large displacements and strains) with the assumption of material's non-linearities (Mooney-Rivlin model). Stability loss phenomenon was used in the design of the vibroisolators, and it was proved that it would be possible to design a model of vibroisolator in the form of a continuous pad with non-linear static and dynamic response, typical to vibroisolation purposes. The materials used for the vibroisolator are those of rubber, elastomers, and similar ones. The results of theoretical investigations were examined experimentally. A series of models made of soft rubber were designed for the test purposes. The experimental investigations of the vibroisolation models, under static and dynamic loads, confirmed the results of the FEM analysis.
A Bayesian Semiparametric Latent Variable Model for Mixed Responses

ERIC Educational Resources Information Center

Fahrmeir, Ludwig; Raach, Alexander

2007-01-01

In this paper we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric Gaussian regression model. We extend existing LVMs with the usual linear covariate effects by including nonparametric components for nonlinear…

Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations.

PubMed

Kopelman, Naama M; Stone, Lewi; Wang, Chaolong; Gefel, Dov; Feldman, Marcus W; Hillel, Jossi; Rosenberg, Noah A

2009-12-08

Genetic studies have often produced conflicting results on the question of whether distant Jewish populations in different geographic locations share greater genetic similarity to each other or instead, to nearby non-Jewish populations. We perform a genome-wide population-genetic study of Jewish populations, analyzing 678 autosomal microsatellite loci in 78 individuals from four Jewish groups together with similar data on 321 individuals from 12 non-Jewish Middle Eastern and European populations. We find that the Jewish populations show a high level of genetic similarity to each other, clustering together in several types of analysis of population structure. Further, Bayesian clustering, neighbor-joining trees, and multidimensional scaling place the Jewish populations as intermediate between the non-Jewish Middle Eastern and European populations. These results support the view that the Jewish populations largely share a common Middle Eastern ancestry and that over their history they have undergone varying degrees of admixture with non-Jewish populations of European descent.
Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations

PubMed Central

2009-01-01

Background Genetic studies have often produced conflicting results on the question of whether distant Jewish populations in different geographic locations share greater genetic similarity to each other or instead, to nearby non-Jewish populations. We perform a genome-wide population-genetic study of Jewish populations, analyzing 678 autosomal microsatellite loci in 78 individuals from four Jewish groups together with similar data on 321 individuals from 12 non-Jewish Middle Eastern and European populations. Results We find that the Jewish populations show a high level of genetic similarity to each other, clustering together in several types of analysis of population structure. Further, Bayesian clustering, neighbor-joining trees, and multidimensional scaling place the Jewish populations as intermediate between the non-Jewish Middle Eastern and European populations. Conclusion These results support the view that the Jewish populations largely share a common Middle Eastern ancestry and that over their history they have undergone varying degrees of admixture with non-Jewish populations of European descent. PMID:19995433
A fast estimator for the bispectrum and beyond - a practical method for measuring non-Gaussianity in 21-cm maps

NASA Astrophysics Data System (ADS)

Watkinson, Catherine A.; Majumdar, Suman; Pritchard, Jonathan R.; Mondal, Rajesh

2017-12-01

In this paper, we establish the accuracy and robustness of a fast estimator for the bispectrum - the 'FFT-bispectrum estimator'. The implementation of the estimator presented here offers speed and simplicity benefits over a direct-measurement approach. We also generalize the derivation so it may be easily be applied to any order polyspectra, such as the trispectrum, with the cost of only a handful of Fast-Fourier Transforms (FFTs). All lower order statistics can also be calculated simultaneously for little extra cost. To test the estimator, we make use of a non-linear density field, and for a more strongly non-Gaussian test case, we use a toy-model of reionization in which ionized bubbles at a given redshift are all of equal size and are randomly distributed. Our tests find that the FFT-estimator remains accurate over a wide range of k, and so should be extremely useful for analysis of 21-cm observations. The speed of the FFT-bispectrum estimator makes it suitable for sampling applications, such as Bayesian inference. The algorithm we describe should prove valuable in the analysis of simulations and observations, and whilst, we apply it within the field of cosmology, this estimator is useful in any field that deals with non-Gaussian data.
Implementation of Dynamic Extensible Adaptive Locally Exchangeable Measures (IDEALEM) v 0.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sim, Alex; Lee, Dongeun; Wu, K. John

2016-03-04

Handling large streaming data is essential for various applications such as network traffic analysis, social networks, energy cost trends, and environment modeling. However, it is in general intractable to store, compute, search, and retrieve large streaming data. This software addresses a fundamental issue, which is to reduce the size of large streaming data and still obtain accurate statistical analysis. As an example, when a high-speed network such as 100 Gbps network is monitored, the collected measurement data rapidly grows so that polynomial time algorithms (e.g., Gaussian processes) become intractable. One possible solution to reduce the storage of vast amounts ofmore » measured data is to store a random sample, such as one out of 1000 network packets. However, such static sampling methods (linear sampling) have drawbacks: (1) it is not scalable for high-rate streaming data, and (2) there is no guarantee of reflecting the underlying distribution. In this software, we implemented a dynamic sampling algorithm, based on the recent technology from the relational dynamic bayesian online locally exchangeable measures, that reduces the storage of data records in a large scale, and still provides accurate analysis of large streaming data. The software can be used for both online and offline data records.« less
A new prior for bayesian anomaly detection: application to biosurveillance.

PubMed

Shen, Y; Cooper, G F

2010-01-01

Bayesian anomaly detection computes posterior probabilities of anomalous events by combining prior beliefs and evidence from data. However, the specification of prior probabilities can be challenging. This paper describes a Bayesian prior in the context of disease outbreak detection. The goal is to provide a meaningful, easy-to-use prior that yields a posterior probability of an outbreak that performs at least as well as a standard frequentist approach. If this goal is achieved, the resulting posterior could be usefully incorporated into a decision analysis about how to act in light of a possible disease outbreak. This paper describes a Bayesian method for anomaly detection that combines learning from data with a semi-informative prior probability over patterns of anomalous events. A univariate version of the algorithm is presented here for ease of illustration of the essential ideas. The paper describes the algorithm in the context of disease-outbreak detection, but it is general and can be used in other anomaly detection applications. For this application, the semi-informative prior specifies that an increased count over baseline is expected for the variable being monitored, such as the number of respiratory chief complaints per day at a given emergency department. The semi-informative prior is derived based on the baseline prior, which is estimated from using historical data. The evaluation reported here used semi-synthetic data to evaluate the detection performance of the proposed Bayesian method and a control chart method, which is a standard frequentist algorithm that is closest to the Bayesian method in terms of the type of data it uses. The disease-outbreak detection performance of the Bayesian method was statistically significantly better than that of the control chart method when proper baseline periods were used to estimate the baseline behavior to avoid seasonal effects. When using longer baseline periods, the Bayesian method performed as well as the control chart method. The time complexity of the Bayesian algorithm is linear in the number of the observed events being monitored, due to a novel, closed-form derivation that is introduced in the paper. This paper introduces a novel prior probability for Bayesian outbreak detection that is expressive, easy-to-apply, computationally efficient, and performs as well or better than a standard frequentist method.
The Bayesian Cramér-Rao lower bound in Astrometry

NASA Astrophysics Data System (ADS)

Mendez, R. A.; Echeverria, A.; Silva, J.; Orchard, M.

2018-01-01

A determination of the highest precision that can be achieved in the measurement of the location of a stellar-like object has been a topic of permanent interest by the astrometric community. The so-called (parametric, or non-Bayesian) Cramér-Rao (CR hereafter) bound provides a lower bound for the variance with which one could estimate the position of a point source. This has been studied recently by Mendez et al. (2013, 2014, 2015). In this work we present a different approach to the same problem (Echeverria et al. 2016), using a Bayesian CR setting which has a number of advantages over the parametric scenario.
The Bayesian Cramér-Rao lower bound in Astrometry

NASA Astrophysics Data System (ADS)

Mendez, R. A.; Echeverria, A.; Silva, J.; Orchard, M.

2017-07-01

A determination of the highest precision that can be achieved in the measurement of the location of a stellar-like object has been a topic of permanent interest by the astrometric community. The so-called (parametric, or non-Bayesian) Cramér-Rao (CR hereafter) bound provides a lower bound for the variance with which one could estimate the position of a point source. This has been studied recently by Mendez and collaborators (2014, 2015). In this work we present a different approach to the same problem (Echeverria et al. 2016), using a Bayesian CR setting which has a number of advantages over the parametric scenario.
Structural mapping in statistical word problems: A relational reasoning approach to Bayesian inference.

PubMed

Johnson, Eric D; Tubau, Elisabet

2017-06-01

Presenting natural frequencies facilitates Bayesian inferences relative to using percentages. Nevertheless, many people, including highly educated and skilled reasoners, still fail to provide Bayesian responses to these computationally simple problems. We show that the complexity of relational reasoning (e.g., the structural mapping between the presented and requested relations) can help explain the remaining difficulties. With a non-Bayesian inference that required identical arithmetic but afforded a more direct structural mapping, performance was universally high. Furthermore, reducing the relational demands of the task through questions that directed reasoners to use the presented statistics, as compared with questions that prompted the representation of a second, similar sample, also significantly improved reasoning. Distinct error patterns were also observed between these presented- and similar-sample scenarios, which suggested differences in relational-reasoning strategies. On the other hand, while higher numeracy was associated with better Bayesian reasoning, higher-numerate reasoners were not immune to the relational complexity of the task. Together, these findings validate the relational-reasoning view of Bayesian problem solving and highlight the importance of considering not only the presented task structure, but also the complexity of the structural alignment between the presented and requested relations.
Bayesian sparse channel estimation

NASA Astrophysics Data System (ADS)

Chen, Chulong; Zoltowski, Michael D.

2012-05-01

In Orthogonal Frequency Division Multiplexing (OFDM) systems, the technique used to estimate and track the time-varying multipath channel is critical to ensure reliable, high data rate communications. It is recognized that wireless channels often exhibit a sparse structure, especially for wideband and ultra-wideband systems. In order to exploit this sparse structure to reduce the number of pilot tones and increase the channel estimation quality, the application of compressed sensing to channel estimation is proposed. In this article, to make the compressed channel estimation more feasible for practical applications, it is investigated from a perspective of Bayesian learning. Under the Bayesian learning framework, the large-scale compressed sensing problem, as well as large time delay for the estimation of the doubly selective channel over multiple consecutive OFDM symbols, can be avoided. Simulation studies show a significant improvement in channel estimation MSE and less computing time compared to the conventional compressed channel estimation techniques.
Imprints of dark energy on cosmic structure formation - I. Realistic quintessence models and the non-linear matter power spectrum

NASA Astrophysics Data System (ADS)

Alimi, J.-M.; Füzfa, A.; Boucher, V.; Rasera, Y.; Courtin, J.; Corasaniti, P.-S.

2010-01-01

Quintessence has been proposed to account for dark energy (DE) in the Universe. This component causes a typical modification of the background cosmic expansion, which, in addition to its clustering properties, can leave a potentially distinctive signature on large-scale structures. Many previous studies have investigated this topic, particularly in relation to the non-linear regime of structure formation. However, no careful pre-selection of viable quintessence models with high precision cosmological data was performed. Here we show that this has led to a misinterpretation (and underestimation) of the imprint of quintessence on the distribution of large-scale structures. To this purpose, we perform a likelihood analysis of the combined Supernova Ia UNION data set and Wilkinson Microwave Anisotropy Probe 5-yr data to identify realistic quintessence models. These are specified by different model parameter values, but still statistically indistinguishable from the vanilla Λ cold dark matter (ΛCDM). Differences are especially manifest in the predicted amplitude and shape of the linear matter power spectrum though these remain within the uncertainties of the Sloan Digital Sky Survey data. We use these models as a benchmark for studying the clustering properties of dark matter haloes by performing a series of high-resolution N-body simulations. In this first paper, we specifically focus on the non-linear matter power spectrum. We find that realistic quintessence models allow for relevant differences of the dark matter distribution with respect to the ΛCDM scenario well into the non-linear regime, with deviations of up to 40 per cent in the non-linear power spectrum. Such differences are shown to depend on the nature of DE, as well as the scale and epoch considered. At small scales (k ~ 1-5hMpc-1, depending on the redshift), the structure formation process is about 20 per cent more efficient than in ΛCDM. We show that these imprints are a specific record of the cosmic structure formation history in DE cosmologies and therefore cannot be accounted for in standard fitting functions of the non-linear matter power spectrum.
Correction of Non-Linear Propagation Artifact in Contrast-Enhanced Ultrasound Imaging of Carotid Arteries: Methods and in Vitro Evaluation.

PubMed

Yildiz, Yesna O; Eckersley, Robert J; Senior, Roxy; Lim, Adrian K P; Cosgrove, David; Tang, Meng-Xing

2015-07-01

Non-linear propagation of ultrasound creates artifacts in contrast-enhanced ultrasound images that significantly affect both qualitative and quantitative assessments of tissue perfusion. This article describes the development and evaluation of a new algorithm to correct for this artifact. The correction is a post-processing method that estimates and removes non-linear artifact in the contrast-specific image using the simultaneously acquired B-mode image data. The method is evaluated on carotid artery flow phantoms with large and small vessels containing microbubbles of various concentrations at different acoustic pressures. The algorithm significantly reduces non-linear artifacts while maintaining the contrast signal from bubbles to increase the contrast-to-tissue ratio by up to 11 dB. Contrast signal from a small vessel 600 μm in diameter buried in tissue artifacts before correction was recovered after the correction. Copyright © 2015 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Non-linearities in Holocene floodplain sediment storage

NASA Astrophysics Data System (ADS)

Notebaert, Bastiaan; Nils, Broothaerts; Jean-François, Berger; Gert, Verstraeten

2013-04-01

Floodplain sediment storage is an important part of the sediment cascade model, buffering sediment delivery between hillslopes and oceans, which is hitherto not fully quantified in contrast to other global sediment budget components. Quantification and dating of floodplain sediment storage is data and financially demanding, limiting contemporary estimates for larger spatial units to simple linear extrapolations from a number of smaller catchments. In this paper we will present non-linearities in both space and time for floodplain sediment budgets in three different catchments. Holocene floodplain sediments of the Dijle catchment in the Belgian loess region, show a clear distinction between morphological stages: early Holocene peat accumulation, followed by mineral floodplain aggradation from the start of the agricultural period on. Contrary to previous assumptions, detailed dating of this morphological change at different shows an important non-linearity in geomorphologic changes of the floodplain, both between and within cross sections. A second example comes from the Pre-Alpine French Valdaine region, where non-linearities and complex system behavior exists between (temporal) patterns of soil erosion and floodplain sediment deposition. In this region Holocene floodplain deposition is characterized by different cut-and-fill phases. The quantification of these different phases shows a complicated image of increasing and decreasing floodplain sediment storage, which hampers the image of increasing sediment accumulation over time. Although fill stages may correspond with large quantities of deposited sediment and traditionally calculated sedimentation rates for such stages are high, they do not necessary correspond with a long-term net increase in floodplain deposition. A third example is based on the floodplain sediment storage in the Amblève catchment, located in the Belgian Ardennes uplands. Detailed floodplain sediment quantification for this catchments shows that a strong multifractality is present in the scaling relationship between sediment storage and catchment area, depending on geomorphic landscape properties. Extrapolation of data from one spatial scale to another inevitably leads to large errors: when only the data of the upper floodplains are considered, a regression analysis results in an overestimation of total floodplain deposition for the entire catchment of circa 115%. This example demonstrates multifractality and related non-linearity in scaling relationships, which influences extrapolations beyond the initial range of measurements. These different examples indicate how traditional extrapolation techniques and assumptions in sediment budget studies can be challenged by field data, further complicating our understanding of these systems. Although simplifications are often necessary when working on large spatial scale, such non-linearities may form challenges for a better understanding of system behavior.
Temporal Gain Correction for X-Ray Calorimeter Spectrometers

NASA Technical Reports Server (NTRS)

Porter, F. S.; Chiao, M. P.; Eckart, M. E.; Fujimoto, R.; Ishisaki, Y.; Kelley, R. L.; Kilbourne, C. A.; Leutenegger, M. A.; McCammon, D.; Mitsuda, K.

2016-01-01

Calorimetric X-ray detectors are very sensitive to their environment. The boundary conditions can have a profound effect on the gain including heat sink temperature, the local radiation temperature, bias, and the temperature of the readout electronics. Any variation in the boundary conditions can cause temporal variations in the gain of the detector and compromise both the energy scale and the resolving power of the spectrometer. Most production X-ray calorimeter spectrometers, both on the ground and in space, have some means of tracking the gain as a function of time, often using a calibration spectral line. For small gain changes, a linear stretch correction is often sufficient. However, the detectors are intrinsically non-linear and often the event analysis, i.e., shaping, optimal filters etc., add additional non-linearity. Thus for large gain variations or when the best possible precision is required, a linear stretch correction is not sufficient. Here, we discuss a new correction technique based on non-linear interpolation of the energy-scale functions. Using Astro-HSXS calibration data, we demonstrate that the correction can recover the X-ray energy to better than 1 part in 104 over the entire spectral band to above 12 keV even for large-scale gain variations. This method will be used to correct any temporal drift of the on-orbit per-pixel gain using on-board calibration sources for the SXS instrument on the Astro-H observatory.
Logistic random effects regression models: a comparison of statistical packages for binary and ordinal outcomes

PubMed Central

2011-01-01

Background Logistic random effects models are a popular tool to analyze multilevel also called hierarchical data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. Methods We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR, Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. Results The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. Conclusions On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain. PMID:21605357
Discrete-time neural network for fast solving large linear L1 estimation problems and its application to image restoration.

PubMed

Xia, Youshen; Sun, Changyin; Zheng, Wei Xing

2012-05-01

There is growing interest in solving linear L1 estimation problems for sparsity of the solution and robustness against non-Gaussian noise. This paper proposes a discrete-time neural network which can calculate large linear L1 estimation problems fast. The proposed neural network has a fixed computational step length and is proved to be globally convergent to an optimal solution. Then, the proposed neural network is efficiently applied to image restoration. Numerical results show that the proposed neural network is not only efficient in solving degenerate problems resulting from the nonunique solutions of the linear L1 estimation problems but also needs much less computational time than the related algorithms in solving both linear L1 estimation and image restoration problems.
Automatic physical inference with information maximizing neural networks

NASA Astrophysics Data System (ADS)

Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

2018-04-01

Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
Classical and Bayesian Seismic Yield Estimation: The 1998 Indian and Pakistani Tests

NASA Astrophysics Data System (ADS)

Shumway, R. H.

2001-10-01

- The nuclear tests in May, 1998, in India and Pakistan have stimulated a renewed interest in yield estimation, based on limited data from uncalibrated test sites. We study here the problem of estimating yields using classical and Bayesian methods developed by Shumway (1992), utilizing calibration data from the Semipalatinsk test site and measured magnitudes for the 1998 Indian and Pakistani tests given by Murphy (1998). Calibration is done using multivariate classical or Bayesian linear regression, depending on the availability of measured magnitude-yield data and prior information. Confidence intervals for the classical approach are derived applying an extension of Fieller's method suggested by Brown (1982). In the case where prior information is available, the posterior predictive magnitude densities are inverted to give posterior intervals for yield. Intervals obtained using the joint distribution of magnitudes are comparable to the single-magnitude estimates produced by Murphy (1998) and reinforce the conclusion that the announced yields of the Indian and Pakistani tests were too high.
Classical and Bayesian Seismic Yield Estimation: The 1998 Indian and Pakistani Tests

NASA Astrophysics Data System (ADS)

Shumway, R. H.

The nuclear tests in May, 1998, in India and Pakistan have stimulated a renewed interest in yield estimation, based on limited data from uncalibrated test sites. We study here the problem of estimating yields using classical and Bayesian methods developed by Shumway (1992), utilizing calibration data from the Semipalatinsk test site and measured magnitudes for the 1998 Indian and Pakistani tests given by Murphy (1998). Calibration is done using multivariate classical or Bayesian linear regression, depending on the availability of measured magnitude-yield data and prior information. Confidence intervals for the classical approach are derived applying an extension of Fieller's method suggested by Brown (1982). In the case where prior information is available, the posterior predictive magnitude densities are inverted to give posterior intervals for yield. Intervals obtained using the joint distribution of magnitudes are comparable to the single-magnitude estimates produced by Murphy (1998) and reinforce the conclusion that the announced yields of the Indian and Pakistani tests were too high.
A Bayesian approach to identifying structural nonlinearity using free-decay response: Application to damage detection in composites

USGS Publications Warehouse

Nichols, J.M.; Link, W.A.; Murphy, K.D.; Olson, C.C.

2010-01-01

This work discusses a Bayesian approach to approximating the distribution of parameters governing nonlinear structural systems. Specifically, we use a Markov Chain Monte Carlo method for sampling the posterior parameter distributions thus producing both point and interval estimates for parameters. The method is first used to identify both linear and nonlinear parameters in a multiple degree-of-freedom structural systems using free-decay vibrations. The approach is then applied to the problem of identifying the location, size, and depth of delamination in a model composite beam. The influence of additive Gaussian noise on the response data is explored with respect to the quality of the resulting parameter estimates.
Bayesian flood forecasting methods: A review

NASA Astrophysics Data System (ADS)

Han, Shasha; Coulibaly, Paulin

2017-08-01

Over the past few decades, floods have been seen as one of the most common and largely distributed natural disasters in the world. If floods could be accurately forecasted in advance, then their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of uncertainty associated with the hydrologic forecast is of great importance for flood estimation and rational decision making. Bayesian forecasting system (BFS) offers an ideal theoretic framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides suitable theoretical structure, empirically validated models and reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review on Bayesian forecasting approaches applied in flood forecasting from 1999 till now. The review starts with an overview of fundamentals of BFS and recent advances in BFS, followed with BFS application in river stage forecasting and real-time flood forecasting, then move to a critical analysis by evaluating advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses the future research direction in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way for flood estimation, it considers all sources of uncertainties and produces a predictive distribution of the river stage, river discharge or runoff, thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. ensemble Bayesian forecasting system, Bayesian multi-model combination) were shown to overcome limitations of single model or fixed model weight and effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvements. Future research in the context of Bayesian flood forecasting should be on assimilation of various sources of newly available information and improvement of predictive performance assessment methods.

Genetic basis of climatic adaptation in scots pine by bayesian quantitative trait locus analysis.

PubMed Central

Hurme, P; Sillanpää, M J; Arjas, E; Repo, T; Savolainen, O

2000-01-01

We examined the genetic basis of large adaptive differences in timing of bud set and frost hardiness between natural populations of Scots pine. As a mapping population, we considered an "open-pollinated backcross" progeny by collecting seeds of a single F(1) tree (cross between trees from southern and northern Finland) growing in southern Finland. Due to the special features of the design (no marker information available on grandparents or the father), we applied a Bayesian quantitative trait locus (QTL) mapping method developed previously for outcrossed offspring. We found four potential QTL for timing of bud set and seven for frost hardiness. Bayesian analyses detected more QTL than ANOVA for frost hardiness, but the opposite was true for bud set. These QTL included alleles with rather large effects, and additionally smaller QTL were supported. The largest QTL for bud set date accounted for about a fourth of the mean difference between populations. Thus, natural selection during adaptation has resulted in selection of at least some alleles of rather large effect. PMID:11063704
Nonlinear study of the parallel velocity/tearing instability using an implicit, nonlinear resistive MHD solver

NASA Astrophysics Data System (ADS)

Chacon, L.; Finn, J. M.; Knoll, D. A.

2000-10-01

Recently, a new parallel velocity instability has been found.(J. M. Finn, Phys. Plasmas), 2, 12 (1995) This mode is a tearing mode driven unstable by curvature effects and sound wave coupling in the presence of parallel velocity shear. Under such conditions, linear theory predicts that tearing instabilities will grow even in situations in which the classical tearing mode is stable. This could then be a viable seed mechanism for the neoclassical tearing mode, and hence a non-linear study is of interest. Here, the linear and non-linear stages of this instability are explored using a fully implicit, fully nonlinear 2D reduced resistive MHD code,(L. Chacon et al), ``Implicit, Jacobian-free Newton-Krylov 2D reduced resistive MHD nonlinear solver,'' submitted to J. Comput. Phys. (2000) including viscosity and particle transport effects. The nonlinear implicit time integration is performed using the Newton-Raphson iterative algorithm. Krylov iterative techniques are employed for the required algebraic matrix inversions, implemented Jacobian-free (i.e., without ever forming and storing the Jacobian matrix), and preconditioned with a ``physics-based'' preconditioner. Nonlinear results indicate that, for large total plasma beta and large parallel velocity shear, the instability results in the generation of large poloidal shear flows and large magnetic islands even in regimes when the classical tearing mode is absolutely stable. For small viscosity, the time asymptotic state can be turbulent.
Bayesian Methods for the Physical Sciences. Learning from Examples in Astronomy and Physics.

NASA Astrophysics Data System (ADS)

Andreon, Stefano; Weaver, Brian

2015-05-01

Chapter 1: This chapter presents some basic steps for performing a good statistical analysis, all summarized in about one page. Chapter 2: This short chapter introduces the basics of probability theory inan intuitive fashion using simple examples. It also illustrates, again with examples, how to propagate errors and the difference between marginal and profile likelihoods. Chapter 3: This chapter introduces the computational tools and methods that we use for sampling from the posterior distribution. Since all numerical computations, and Bayesian ones are no exception, may end in errors, we also provide a few tips to check that the numerical computation is sampling from the posterior distribution. Chapter 4: Many of the concepts of building, running, and summarizing the resultsof a Bayesian analysis are described with this step-by-step guide using a basic (Gaussian) model. The chapter also introduces examples using Poisson and Binomial likelihoods, and how to combine repeated independent measurements. Chapter 5: All statistical analyses make assumptions, and Bayesian analyses are no exception. This chapter emphasizes that results depend on data and priors (assumptions). We illustrate this concept with examples where the prior plays greatly different roles, from major to negligible. We also provide some advice on how to look for information useful for sculpting the prior. Chapter 6: In this chapter we consider examples for which we want to estimate more than a single parameter. These common problems include estimating location and spread. We also consider examples that require the modeling of two populations (one we are interested in and a nuisance population) or averaging incompatible measurements. We also introduce quite complex examples dealing with upper limits and with a larger-than-expected scatter. Chapter 7: Rarely is a sample randomly selected from the population we wish to study. Often, samples are affected by selection effects, e.g., easier-to-collect events or objects are over-represented in samples and difficult-to-collect are under-represented if not missing altogether. In this chapter we show how to account for non-random data collection to infer the properties of the population from the studied sample. Chapter 8: In this chapter we introduce regression models, i.e., how to fit (regress) one, or more quantities, against each other through a functional relationship and estimate any unknown parameters that dictate this relationship. Questions of interest include: how to deal with samples affected by selection effects? How does a rich data structure influence the fitted parameters? And what about non-linear multiple-predictor fits, upper/lower limits, measurements errors of different amplitudes and an intrinsic variety in the studied populations or an extra source of variability? A number of examples illustrate how to answer these questions and how to predict the value of an unavailable quantity by exploiting the existence of a trend with another, available, quantity. Chapter 9: This chapter provides some advice on how the careful scientist should perform model checking and sensitivity analysis, i.e., how to answer the following questions: is the considered model at odds with the current available data (the fitted data), for example because it is over-simplified compared to some specific complexity pointed out by the data? Furthermore, are the data informative about the quantity being measured or are results sensibly dependent on details of the fitted model? And, finally, what about if assumptions are uncertain? A number of examples illustrate how to answer these questions. Chapter 10: This chapter compares the performance of Bayesian methods against simple, non-Bayesian alternatives, such as maximum likelihood, minimal chi square, ordinary and weighted least square, bivariate correlated errors and intrinsic scatter, and robust estimates of location and scale. Performances are evaluated in terms of quality of the prediction, accuracy of the estimates, and fairness and noisiness of the quoted errors. We also focus on three failures of maximum likelihood methods occurring with small samples, with mixtures, and with regressions with errors in the predictor quantity.
Improved inference in Bayesian segmentation using Monte Carlo sampling: application to hippocampal subfield volumetry.

PubMed

Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen

2013-10-01

Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the technique also allows one to compute informative "error bars" on the volume estimates of individual structures. Copyright © 2013 Elsevier B.V. All rights reserved.
Improved Inference in Bayesian Segmentation Using Monte Carlo Sampling: Application to Hippocampal Subfield Volumetry

PubMed Central

Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Leemput, Koen Van

2013-01-01

Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer’s disease classification task. As an additional benefit, the technique also allows one to compute informative “error bars” on the volume estimates of individual structures. PMID:23773521
Bayesian inference of a historical bottleneck in a heavily exploited marine mammal.

PubMed

Hoffman, J I; Grant, S M; Forcada, J; Phillips, C D

2011-10-01

Emerging Bayesian analytical approaches offer increasingly sophisticated means of reconstructing historical population dynamics from genetic data, but have been little applied to scenarios involving demographic bottlenecks. Consequently, we analysed a large mitochondrial and microsatellite dataset from the Antarctic fur seal Arctocephalus gazella, a species subjected to one of the most extreme examples of uncontrolled exploitation in history when it was reduced to the brink of extinction by the sealing industry during the late eighteenth and nineteenth centuries. Classical bottleneck tests, which exploit the fact that rare alleles are rapidly lost during demographic reduction, yielded ambiguous results. In contrast, a strong signal of recent demographic decline was detected using both Bayesian skyline plots and Approximate Bayesian Computation, the latter also allowing derivation of posterior parameter estimates that were remarkably consistent with historical observations. This was achieved using only contemporary samples, further emphasizing the potential of Bayesian approaches to address important problems in conservation and evolutionary biology. © 2011 Blackwell Publishing Ltd.
Efficient Stochastic Inversion Using Adjoint Models and Kernel-PCA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thimmisetty, Charanraj A.; Zhao, Wenju; Chen, Xiao

2017-10-18

Performing stochastic inversion on a computationally expensive forward simulation model with a high-dimensional uncertain parameter space (e.g. a spatial random field) is computationally prohibitive even when gradient information can be computed efficiently. Moreover, the ‘nonlinear’ mapping from parameters to observables generally gives rise to non-Gaussian posteriors even with Gaussian priors, thus hampering the use of efficient inversion algorithms designed for models with Gaussian assumptions. In this paper, we propose a novel Bayesian stochastic inversion methodology, which is characterized by a tight coupling between the gradient-based Langevin Markov Chain Monte Carlo (LMCMC) method and a kernel principal component analysis (KPCA). Thismore » approach addresses the ‘curse-of-dimensionality’ via KPCA to identify a low-dimensional feature space within the high-dimensional and nonlinearly correlated parameter space. In addition, non-Gaussian posterior distributions are estimated via an efficient LMCMC method on the projected low-dimensional feature space. We will demonstrate this computational framework by integrating and adapting our recent data-driven statistics-on-manifolds constructions and reduction-through-projection techniques to a linear elasticity model.« less
A Bayesian-based system to assess wave-driven flooding hazards on coral reef-lined coasts

USGS Publications Warehouse

Pearson, S. G.; Storlazzi, Curt; van Dongeren, A. R.; Tissier, M. F. S.; Reniers, A. J. H. M.

2017-01-01

Many low-elevation, coral reef-lined, tropical coasts are vulnerable to the effects of climate change, sea level rise, and wave-induced flooding. The considerable morphological diversity of these coasts and the variability of the hydrodynamic forcing that they are exposed to make predicting wave-induced flooding a challenge. A process-based wave-resolving hydrodynamic model (XBeach Non-Hydrostatic, “XBNH”) was used to create a large synthetic database for use in a “Bayesian Estimator for Wave Attack in Reef Environments” (BEWARE), relating incident hydrodynamics and coral reef geomorphology to coastal flooding hazards on reef-lined coasts. Building on previous work, BEWARE improves system understanding of reef hydrodynamics by examining the intrinsic reef and extrinsic forcing factors controlling runup and flooding on reef-lined coasts. The Bayesian estimator has high predictive skill for the XBNH model outputs that are flooding indicators, and was validated for a number of available field cases. It was found that, in order to accurately predict flooding hazards, water depth over the reef flat, incident wave conditions, and reef flat width are the most essential factors, whereas other factors such as beach slope and bed friction due to the presence or absence of corals are less important. BEWARE is a potentially powerful tool for use in early warning systems or risk assessment studies, and can be used to make projections about how wave-induced flooding on coral reef-lined coasts may change due to climate change.
A Non-linear Lifting Line Model for Design and Analysis of Trochoidal Propulsors

NASA Astrophysics Data System (ADS)

Roesler, Bernard; Epps, Brenden

2014-11-01

Flapping wing propulsors may increase the propulsive efficiency of large shipping vessels. A comparison of the design of a notional propulsor for a large shipping vessel with (a) a conventional ducted propeller versus (b) a flapping wing propulsor is presented. Calculations for flapping wing propulsors are performed using an open-source MATLAB software suite developed by the authors, CyROD, implementing an unsteady lifting-line model with free vortex wake roll-up to study the non-linear effects of foil-wake, and foil-foil interactions. Improvements to the traditional lifting line theory are made using further discretization of the wake vortex ring spacing near the trailing edge. Considerations of packaging options for a flapping wing propulsor on a large shipping vessel are presented, and compared with those for a conventional ducted propeller.
A guide to Bayesian model selection for ecologists

USGS Publications Warehouse

Hooten, Mevin B.; Hobbs, N.T.

2015-01-01

The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Bayesian Fundamentalism or Enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition.

PubMed

Jones, Matt; Love, Bradley C

2011-08-01

The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls that have plagued previous theoretical movements.
DEMNUni: massive neutrinos and the bispectrum of large scale structures

NASA Astrophysics Data System (ADS)

Ruggeri, Rossana; Castorina, Emanuele; Carbone, Carmelita; Sefusatti, Emiliano

2018-03-01

The main effect of massive neutrinos on the large-scale structure consists in a few percent suppression of matter perturbations on all scales below their free-streaming scale. Such effect is of particular importance as it allows to constraint the value of the sum of neutrino masses from measurements of the galaxy power spectrum. In this work, we present the first measurements of the next higher-order correlation function, the bispectrum, from N-body simulations that include massive neutrinos as particles. This is the simplest statistics characterising the non-Gaussian properties of the matter and dark matter halos distributions. We investigate, in the first place, the suppression due to massive neutrinos on the matter bispectrum, comparing our measurements with the simplest perturbation theory predictions, finding the approximation of neutrinos contributing at quadratic order in perturbation theory to provide a good fit to the measurements in the simulations. On the other hand, as expected, a linear approximation for neutrino perturbations would lead to Script O(fν) errors on the total matter bispectrum at large scales. We then attempt an extension of previous results on the universality of linear halo bias in neutrino cosmologies, to non-linear and non-local corrections finding consistent results with the power spectrum analysis.
Origin of optical non-linear response in TiN owing to excitation dynamics of surface plasmon resonance electronic oscillations

NASA Astrophysics Data System (ADS)

Divya, S.; Nampoori, V. P. N.; Radhakrishnan, P.; Mujeeb, A.

2014-08-01

TiN nanoparticles of average size 55 nm were investigated for their optical non-linear properties. During the experiment the irradiated laser wavelength coincided with the surface plasmon resonance (SPR) peak of the nanoparticle. The large non-linearity of the nanoparticle was attributed to the plasmon resonance, which largely enhanced the local field within the nanoparticle. Both open and closed aperture Z-scan experiments were performed and the corresponding optical constants were explored. The post-excitation absorption spectra revealed the interesting phenomenon of photo fragmentation leading to the blue shift in band gap and red shift in the SPR. The results are discussed in terms of enhanced interparticle interaction simultaneous with size reduction. Here, the optical constants being intrinsic constants for a particular sample change unusually with laser power intensity. The dependence of χ(3) is discussed in terms of the size variation caused by photo fragmentation. The studies proved that the TiN nanoparticles are potential candidates in photonics technology offering huge scope to study unexplored research for various expedient applications.
Extraordinary Claims Require Extraordinary Evidence: The Case of Non-Local Perception, a Classical and Bayesian Review of Evidences

PubMed Central

Tressoldi, Patrizio E.

2011-01-01

Starting from the famous phrase “extraordinary claims require extraordinary evidence,” we will present the evidence supporting the concept that human visual perception may have non-local properties, in other words, that it may operate beyond the space and time constraints of sensory organs, in order to discuss which criteria can be used to define evidence as extraordinary. This evidence has been obtained from seven databases which are related to six different protocols used to test the reality and the functioning of non-local perception, analyzed using both a frequentist and a new Bayesian meta-analysis statistical procedure. According to a frequentist meta-analysis, the null hypothesis can be rejected for all six protocols even if the effect sizes range from 0.007 to 0.28. According to Bayesian meta-analysis, the Bayes factors provides strong evidence to support the alternative hypothesis (H1) over the null hypothesis (H0), but only for three out of the six protocols. We will discuss whether quantitative psychology can contribute to defining the criteria for the acceptance of new scientific ideas in order to avoid the inconclusive controversies between supporters and opponents. PMID:21713069
Validating Bayesian truth serum in large-scale online human experiments.

PubMed

Frank, Morgan R; Cebrian, Manuel; Pickard, Galen; Rahwan, Iyad

2017-01-01

Bayesian truth serum (BTS) is an exciting new method for improving honesty and information quality in multiple-choice survey, but, despite the method's mathematical reliance on large sample sizes, existing literature about BTS only focuses on small experiments. Combined with the prevalence of online survey platforms, such as Amazon's Mechanical Turk, which facilitate surveys with hundreds or thousands of participants, BTS must be effective in large-scale experiments for BTS to become a readily accepted tool in real-world applications. We demonstrate that BTS quantifiably improves honesty in large-scale online surveys where the "honest" distribution of answers is known in expectation on aggregate. Furthermore, we explore a marketing application where "honest" answers cannot be known, but find that BTS treatment impacts the resulting distributions of answers.
Validating Bayesian truth serum in large-scale online human experiments

PubMed Central

Frank, Morgan R.; Cebrian, Manuel; Pickard, Galen; Rahwan, Iyad

2017-01-01

Bayesian truth serum (BTS) is an exciting new method for improving honesty and information quality in multiple-choice survey, but, despite the method’s mathematical reliance on large sample sizes, existing literature about BTS only focuses on small experiments. Combined with the prevalence of online survey platforms, such as Amazon’s Mechanical Turk, which facilitate surveys with hundreds or thousands of participants, BTS must be effective in large-scale experiments for BTS to become a readily accepted tool in real-world applications. We demonstrate that BTS quantifiably improves honesty in large-scale online surveys where the “honest” distribution of answers is known in expectation on aggregate. Furthermore, we explore a marketing application where “honest” answers cannot be known, but find that BTS treatment impacts the resulting distributions of answers. PMID:28494000
Broad-band simulation of M7.2 earthquake on the North Tehran fault, considering non-linear soil effects

NASA Astrophysics Data System (ADS)

Majidinejad, A.; Zafarani, H.; Vahdani, S.

2018-05-01

The North Tehran fault (NTF) is known to be one of the most drastic sources of seismic hazard on the city of Tehran. In this study, we provide broad-band (0-10 Hz) ground motions for the city as a consequence of probable M7.2 earthquake on the NTF. Low-frequency motions (0-2 Hz) are provided from spectral element dynamic simulation of 17 scenario models. High-frequency (2-10 Hz) motions are calculated with a physics-based method based on S-to-S backscattering theory. Broad-band ground motions at the bedrock level show amplifications, both at low and high frequencies, due to the existence of deep Tehran basin in the vicinity of the NTF. By employing soil profiles obtained from regional studies, effect of shallow soil layers on broad-band ground motions is investigated by both linear and non-linear analyses. While linear soil response overestimate ground motion prediction equations, non-linear response predicts plausible results within one standard deviation of empirical relationships. Average Peak Ground Accelerations (PGAs) at the northern, central and southern parts of the city are estimated about 0.93, 0.59 and 0.4 g, respectively. Increased damping caused by non-linear soil behaviour, reduces the soil linear responses considerably, in particular at frequencies above 3 Hz. Non-linear deamplification reduces linear spectral accelerations up to 63 per cent at stations above soft thick sediments. By performing more general analyses, which exclude source-to-site effects on stations, a correction function is proposed for typical site classes of Tehran. Parameters for the function which reduces linear soil response in order to take into account non-linear soil deamplification are provided for various frequencies in the range of engineering interest. In addition to fully non-linear analyses, equivalent-linear calculations were also conducted which their comparison revealed appropriateness of the method for large peaks and low frequencies, but its shortage for small to medium peaks and motions with higher than 3 Hz frequencies.
Linear phase compressive filter

DOEpatents

McEwan, Thomas E.

1995-01-01

A phase linear filter for soliton suppression is in the form of a laddered series of stages of non-commensurate low pass filters with each low pass filter having a series coupled inductance (L) and a reverse biased, voltage dependent varactor diode, to ground which acts as a variable capacitance (C). L and C values are set to levels which correspond to a linear or conventional phase linear filter. Inductance is mapped directly from that of an equivalent nonlinear transmission line and capacitance is mapped from the linear case using a large signal equivalent of a nonlinear transmission line.
Lagged kernel machine regression for identifying time windows of susceptibility to exposures of complex mixtures.

PubMed

Liu, Shelley H; Bobb, Jennifer F; Lee, Kyu Ha; Gennings, Chris; Claus Henn, Birgit; Bellinger, David; Austin, Christine; Schnaas, Lourdes; Tellez-Rojo, Martha M; Hu, Howard; Wright, Robert O; Arora, Manish; Coull, Brent A

2018-07-01

The impact of neurotoxic chemical mixtures on children's health is a critical public health concern. It is well known that during early life, toxic exposures may impact cognitive function during critical time intervals of increased vulnerability, known as windows of susceptibility. Knowledge on time windows of susceptibility can help inform treatment and prevention strategies, as chemical mixtures may affect a developmental process that is operating at a specific life phase. There are several statistical challenges in estimating the health effects of time-varying exposures to multi-pollutant mixtures, such as: multi-collinearity among the exposures both within time points and across time points, and complex exposure-response relationships. To address these concerns, we develop a flexible statistical method, called lagged kernel machine regression (LKMR). LKMR identifies critical exposure windows of chemical mixtures, and accounts for complex non-linear and non-additive effects of the mixture at any given exposure window. Specifically, LKMR estimates how the effects of a mixture of exposures change with the exposure time window using a Bayesian formulation of a grouped, fused lasso penalty within a kernel machine regression (KMR) framework. A simulation study demonstrates the performance of LKMR under realistic exposure-response scenarios, and demonstrates large gains over approaches that consider each time window separately, particularly when serial correlation among the time-varying exposures is high. Furthermore, LKMR demonstrates gains over another approach that inputs all time-specific chemical concentrations together into a single KMR. We apply LKMR to estimate associations between neurodevelopment and metal mixtures in Early Life Exposures in Mexico and Neurotoxicology, a prospective cohort study of child health in Mexico City.
Spatial distribution of psychotic disorders in an urban area of France: an ecological study.

PubMed

Pignon, Baptiste; Schürhoff, Franck; Baudin, Grégoire; Ferchiou, Aziz; Richard, Jean-Romain; Saba, Ghassen; Leboyer, Marion; Kirkbride, James B; Szöke, Andrei

2016-05-18

Previous analyses of neighbourhood variations of non-affective psychotic disorders (NAPD) have focused mainly on incidence. However, prevalence studies provide important insights on factors associated with disease evolution as well as for healthcare resource allocation. This study aimed to investigate the distribution of prevalent NAPD cases in an urban area in France. The number of cases in each neighbourhood was modelled as a function of potential confounders and ecological variables, namely: migrant density, economic deprivation and social fragmentation. This was modelled using statistical models of increasing complexity: frequentist models (using Poisson and negative binomial regressions), and several Bayesian models. For each model, assumptions validity were checked and compared as to how this fitted to the data, in order to test for possible spatial variation in prevalence. Data showed significant overdispersion (invalidating the Poisson regression model) and residual autocorrelation (suggesting the need to use Bayesian models). The best Bayesian model was Leroux's model (i.e. a model with both strong correlation between neighbouring areas and weaker correlation between areas further apart), with economic deprivation as an explanatory variable (OR = 1.13, 95% CI [1.02-1.25]). In comparison with frequentist methods, the Bayesian model showed a better fit. The number of cases showed non-random spatial distribution and was linked to economic deprivation.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.