The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B; Neyer, Franz J; van Aken, Marcel AG
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are introduced using a simplified example. Thereafter, the advantages and pitfalls of the specification of prior knowledge are discussed. To illustrate Bayesian methods explained in this study, in a second example a series of studies that examine the theoretical framework of dynamic interactionism are considered. In the Discussion the advantages and disadvantages of using Bayesian statistics are reviewed, and guidelines on how to report on Bayesian statistics are provided.
Bayesian Statistics for Biological Data: Pedigree Analysis
ERIC Educational Resources Information Center
Stanfield, William D.; Carlton, Matthew A.
2004-01-01
Bayes' formula is applied to the biological problem of pedigree analysis to show that Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First-year college students of biology can be introduced to Bayesian statistics.
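To make the contrast concrete, consider the standard pedigree calculation for an autosomal recessive trait (a worked example of our own, not taken from the article): both parents are known carriers (Aa), and their child is phenotypically unaffected. The naive unconditional Mendelian answer for the child being a carrier is 1/2, but conditioning on the observed unaffected phenotype via Bayes' formula gives 2/3:

    P(Aa \mid \text{unaffected})
      = \frac{P(\text{unaffected} \mid Aa)\, P(Aa)}{P(\text{unaffected})}
      = \frac{1 \cdot \tfrac{1}{2}}{\tfrac{3}{4}} = \frac{2}{3}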
A Primer on Bayesian Analysis for Experimental Psychopathologists
Krypotos, Angelos-Miltiadis; Blanken, Tessa F.; Arnaudova, Inna; Matzke, Dora; Beckers, Tom
2016-01-01
The principal goals of experimental psychopathology (EPP) research are to offer insights into the pathogenic mechanisms of mental disorders and to provide a stable ground for the development of clinical interventions. The main message of the present article is that those goals are better served by the adoption of Bayesian statistics than by the continued use of null-hypothesis significance testing (NHST). In the first part of the article we list the main disadvantages of NHST and explain why those disadvantages limit the conclusions that can be drawn from EPP research. Next, we highlight the advantages of Bayesian statistics. To illustrate, we then pit NHST and Bayesian analysis against each other using an experimental data set from our lab. Finally, we discuss some challenges when adopting Bayesian statistics. We hope that the present article will encourage experimental psychopathologists to embrace Bayesian statistics, which could strengthen the conclusions drawn from EPP research.
[Bayesian statistics in medicine -- part II: main applications and inference].
Montomoli, C; Nichelatti, M
2008-01-01
Bayesian statistics is not only used when one is dealing with 2-way tables; it can also be used for inferential purposes. Using the basic concepts presented in the first part, this paper aims to give a simple overview of Bayesian methods by introducing their foundation (Bayes' theorem) and then applying this rule to a very simple practical example; whenever possible, the elementary processes at the basis of the analysis are compared to those of frequentist (classical) statistical analysis. Bayesian reasoning is naturally connected to medical activity, since it appears to be quite similar to a diagnostic process.
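The diagnostic connection fits in a few lines. The sketch below applies Bayes' theorem to a screening test; the prevalence, sensitivity, and specificity values are illustrative assumptions, not figures from the paper.

    # Bayes' theorem for a diagnostic test (illustrative numbers)
    prevalence = 0.01           # P(disease): the prior
    sensitivity = 0.95          # P(test positive | disease)
    specificity = 0.90          # P(test negative | no disease)

    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    ppv = sensitivity * prevalence / p_pos   # posterior P(disease | positive)
    print(f"P(disease | positive test) = {ppv:.3f}")   # about 0.088

Even with a sensitive test, the low prior (prevalence) keeps the posterior probability of disease below 10%, which is exactly the diagnostic intuition the abstract describes.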
An introduction to Bayesian statistics in health psychology.
Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske
2017-09-01
The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.
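As a minimal illustration of how a prior enters the estimation, here is a conjugate normal-normal update for the mean blood-pressure change after an acute stressor, in the spirit of the article's example. The prior mean and SD, the residual SD, and the simulated data are all hypothetical assumptions.

    import numpy as np

    # Conjugate normal-normal update (illustrative; not the paper's analysis).
    mu0, tau0 = 10.0, 5.0             # prior mean and prior SD (assumed), mmHg
    sigma = 8.0                       # known residual SD (assumed)

    rng = np.random.default_rng(1)
    y = rng.normal(12.0, sigma, size=25)   # simulated reactivity data, n = 25

    n = len(y)
    post_var = 1.0 / (1.0 / tau0**2 + n / sigma**2)
    post_mean = post_var * (mu0 / tau0**2 + y.sum() / sigma**2)
    print(f"posterior mean = {post_mean:.2f}, posterior SD = {np.sqrt(post_var):.2f}")
    # A tighter prior (smaller tau0) pulls the posterior mean toward mu0,
    # which is the sensitivity-to-prior behaviour the authors recommend examining.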
Applying Bayesian statistics to the study of psychological trauma: A suggestion for future research.
Yalch, Matthew M
2016-03-01
Several contemporary researchers have noted the virtues of Bayesian methods of data analysis. Although debates continue about whether conventional or Bayesian statistics is the "better" approach for researchers in general, there are reasons why Bayesian methods may be well suited to the study of psychological trauma in particular. This article describes how Bayesian statistics offers practical solutions to the problems of data non-normality, small sample size, and missing data common in research on psychological trauma. After a discussion of these problems and the effects they have on trauma research, this article explains the basic philosophical and statistical foundations of Bayesian statistics and how it provides solutions to these problems using an applied example. Results of the literature review and the accompanying example indicate the utility of Bayesian statistics in addressing problems common in trauma research. Bayesian statistics provides a set of methodological tools and a broader philosophical framework that is useful for trauma researchers. Methodological resources are also provided so that interested readers can learn more.
Daniel Goodman’s empirical approach to Bayesian statistics
Gerrodette, Tim; Ward, Eric; Taylor, Rebecca L.; Schwarz, Lisa K.; Eguchi, Tomoharu; Wade, Paul; Himes Boor, Gina
2016-01-01
Bayesian statistics, in contrast to classical statistics, uses probability to represent uncertainty about the state of knowledge. Bayesian statistics has often been associated with the idea that knowledge is subjective and that a probability distribution represents a personal degree of belief. Dr. Daniel Goodman considered this viewpoint problematic for issues of public policy. He sought to ground his Bayesian approach in data, and advocated the construction of a prior as an empirical histogram of "similar" cases. In this way, the posterior distribution that results from a Bayesian analysis combined comparable previous data with case-specific current data, using Bayes' formula. Goodman championed such a data-based approach, but he acknowledged that it was difficult in practice. Goodman argued that, if based on a true representation of our knowledge and uncertainty, risk assessment and decision-making could be an exact science, despite the uncertainties. In his view, Bayesian statistics is a critical component of this science because a Bayesian analysis produces the probabilities of future outcomes. Indeed, Goodman maintained that the Bayesian machinery, following the rules of conditional probability, offered the best legitimate inference from available data. We give an example of an informative prior in a recent study of Steller sea lion spatial use patterns in Alaska.
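A minimal sketch of Goodman's empirical-prior idea, under assumptions of our own: past estimates of a survival rate from "similar" cases form a histogram prior, which is combined with case-specific binomial data on a grid. The hard zeros outside the range of past cases hint at the practical difficulty Goodman acknowledged.

    import numpy as np

    # Empirical histogram prior from hypothetical "similar" past cases.
    past_rates = np.array([0.78, 0.82, 0.70, 0.85, 0.75, 0.80, 0.72, 0.88])

    grid = np.linspace(0.01, 0.99, 99)
    hist, edges = np.histogram(past_rates, bins=10, range=(0, 1), density=True)
    prior = hist[np.clip(np.digitize(grid, edges) - 1, 0, 9)]
    prior = prior / prior.sum()

    # Case-specific current data: 41 successes in 50 trials (assumed).
    k, n = 41, 50
    like = grid**k * (1 - grid)**(n - k)
    post = prior * like
    post = post / post.sum()
    print(f"posterior mean = {(grid * post).sum():.3f}")
    # Note: the sparse histogram assigns zero prior mass outside [0.7, 0.9),
    # so the posterior cannot leave that range -- one reason a purely
    # empirical prior is harder in practice than in principle.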
Prior approval: the growth of Bayesian methods in psychology.
Andrews, Mark; Baguley, Thom
2013-02-01
Within the last few years, Bayesian methods of data analysis in psychology have proliferated. In this paper, we briefly review the history of the Bayesian approach to statistics, and consider the implications that Bayesian methods have for the theory and practice of data analysis in psychology.
Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall
2016-01-01
Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approaches, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis also discusses several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.
Bayesian Posterior Odds Ratios: Statistical Tools for Collaborative Evaluations
ERIC Educational Resources Information Center
Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon
2018-01-01
To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…
A Tutorial in Bayesian Potential Outcomes Mediation Analysis.
Miočević, Milica; Gonzalez, Oscar; Valente, Matthew J; MacKinnon, David P
2018-01-01
Statistical mediation analysis is used to investigate intermediate variables in the relation between independent and dependent variables. Causal interpretation of mediation analyses is challenging because randomization of subjects to levels of the independent variable does not rule out the possibility of unmeasured confounders of the mediator to outcome relation. Furthermore, commonly used frequentist methods for mediation analysis compute the probability of the data given the null hypothesis, which is not the probability of a hypothesis given the data as in Bayesian analysis. Under certain assumptions, applying the potential outcomes framework to mediation analysis allows for the computation of causal effects, and statistical mediation in the Bayesian framework gives indirect effects probabilistic interpretations. This tutorial combines causal inference and Bayesian methods for mediation analysis so the indirect and direct effects have both causal and probabilistic interpretations. Steps in Bayesian causal mediation analysis are shown in the application to an empirical example.
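A sketch of the probabilistic interpretation of an indirect effect: given posterior draws for the a-path (X to M) and b-path (M to Y, adjusting for X), the posterior for the indirect effect is the draw-by-draw product, from which credible intervals and direct probability statements follow. The normal approximations to the path posteriors below are assumed for illustration, not fitted to data.

    import numpy as np

    rng = np.random.default_rng(7)
    a = rng.normal(0.40, 0.10, 100_000)    # posterior draws of the a-path (assumed)
    b = rng.normal(0.30, 0.12, 100_000)    # posterior draws of the b-path (assumed)

    ab = a * b                             # posterior of the indirect effect
    lo, hi = np.percentile(ab, [2.5, 97.5])
    print(f"95% credible interval for ab: ({lo:.3f}, {hi:.3f})")
    print(f"P(ab > 0) = {(ab > 0).mean():.3f}")   # a direct probability statement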
The Application of Bayesian Analysis to Issues in Developmental Research
ERIC Educational Resources Information Center
Walker, Lawrence J.; Gustafson, Paul; Frimer, Jeremy A.
2007-01-01
This article reviews the concepts and methods of Bayesian statistical analysis, which can offer innovative and powerful solutions to some challenging analytical problems that characterize developmental research. In this article, we demonstrate the utility of Bayesian analysis, explain its unique adeptness in some circumstances, address some…
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. The book presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians; covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more; deemphasizes computer coding in favor of basic principles; and explains how to write out properly factored statistical expressions representing Bayesian models.
Bayesian analyses of time-interval data for environmental radiation monitoring.
Luo, Peng; Sharp, Julia L; DeVol, Timothy A
2013-01-01
Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and a conventional frequentist analysis of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performances of the three methods were compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode. Simulated data were obtained using Monte Carlo techniques to obtain a random sampling of the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a detection probability similar to that of Bayesian analysis of count information, but the authors were able to make a decision with fewer pulses at relatively higher radiation levels. In addition, for cases in which the source is present only briefly (< count time), time-interval information is more sensitive in detecting a change than count information, since the source counts are averaged with the background counts over the entire count time. The relationships of the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
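The core of the time-interval idea fits in a few lines: for Poisson pulses, the intervals between consecutive pulses are exponential in the count rate, so a Gamma prior is conjugate and the posterior can be updated pulse by pulse. The rates and prior parameters below are illustrative assumptions, not the paper's values.

    import numpy as np

    rng = np.random.default_rng(3)
    t = rng.exponential(1 / 5.0, size=20)   # 20 intervals at a true rate of 5 cps

    a0, b0 = 1.0, 1.0                       # weak Gamma(shape, rate) prior (assumed)
    a_post = a0 + len(t)                    # posterior shape
    b_post = b0 + t.sum()                   # posterior rate
    print(f"posterior mean rate = {a_post / b_post:.2f} cps")
    # Decisions can be made as each interval arrives, which is why interval
    # data can flag a brief source faster than a fixed count time.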
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
BCM: toolkit for Bayesian analysis of Computational Models using samplers.
Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A
2016-10-21
Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.
ERIC Educational Resources Information Center
Hsieh, Chueh-An; Maier, Kimberly S.
2009-01-01
The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection.…
Ockham's razor and Bayesian analysis. [statistical theory for systems evaluation
NASA Technical Reports Server (NTRS)
Jefferys, William H.; Berger, James O.
1992-01-01
'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
Carvalho, Pedro; Marques, Rui Cunha
2016-02-15
This study aims to search for economies of size and scope in the Portuguese water sector applying Bayesian and classical statistics to make inference in stochastic frontier analysis (SFA). This study demonstrates the usefulness and advantages of applying Bayesian statistics for making inference in SFA over traditional SFA, which uses only classical statistics. The resulting Bayesian methods allow overcoming some problems that arise in the application of traditional SFA, such as the bias in small samples and skewness of residuals. In the present case study of the water sector in Portugal, these Bayesian methods provide more plausible and acceptable results. Based on the results obtained we found that there are important economies of output density, economies of size, economies of vertical integration and economies of scope in the Portuguese water sector, pointing to the considerable advantages of undertaking mergers by joining the retail and wholesale components and by joining the drinking water and wastewater services.
A Bayesian test for Hardy–Weinberg equilibrium of biallelic X-chromosomal markers
Puig, X; Ginebra, J; Graffelman, J
2017-01-01
The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy–Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy–Weinberg equilibrium with biallelic markers at the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male to female allele frequency ratio are computed, and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test, and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used.
Bayesian Statistics in Educational Research: A Look at the Current State of Affairs
ERIC Educational Resources Information Center
König, Christoph; van de Schoot, Rens
2018-01-01
The ability of a scientific discipline to build cumulative knowledge depends on its predominant method of data analysis. A steady accumulation of knowledge requires approaches which allow researchers to consider results from comparable prior research. Bayesian statistics is especially relevant for establishing a cumulative scientific discipline,…
The Importance of Proving the Null
ERIC Educational Resources Information Center
Gallistel, C. R.
2009-01-01
Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is…
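Gallistel's point about vagueness can be made with a binomial example (our own illustration): the Bayes factor in favor of the null H0: theta = 0.5 grows as the Beta(c, c) alternative spreads its prior mass more widely (smaller c).

    import numpy as np
    from scipy.special import betaln

    # Data: 52 successes in 100 trials. H0: theta = 0.5; H1: theta ~ Beta(c, c).
    # The binomial coefficient cancels in the marginal-likelihood ratio.
    k, n = 52, 100
    log_m0 = k * np.log(0.5) + (n - k) * np.log(0.5)

    for c in [50.0, 5.0, 1.0]:             # from concentrated to vague
        log_m1 = betaln(k + c, n - k + c) - betaln(c, c)
        bf01 = np.exp(log_m0 - log_m1)
        print(f"Beta({c},{c}) alternative: BF01 = {bf01:.1f}")
    # The vaguer the alternative, the larger BF01, i.e. the stronger the
    # quantified evidence in favor of the null.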
Single-Case Time Series with Bayesian Analysis: A Practitioner's Guide.
ERIC Educational Resources Information Center
Jones, W. Paul
2003-01-01
This article illustrates a simplified time series analysis for use by the counseling researcher practitioner in single-case baseline plus intervention studies with a Bayesian probability analysis to integrate findings from replications. The C statistic is recommended as a primary analysis tool with particular relevance in the context of actual…
The Development of Bayesian Theory and Its Applications in Business and Bioinformatics
NASA Astrophysics Data System (ADS)
Zhang, Yifei
2018-03-01
Bayesian Theory originated with an essay by the British mathematician Thomas Bayes, published posthumously in 1763, and after its development in the 20th century, Bayesian Statistics has been taking a significant part in the statistical study of all fields. Due to recent breakthroughs in high-dimensional integration, Bayesian Statistics has been improved and perfected, and now it can be used to solve problems that Classical Statistics failed to solve. This paper summarizes Bayesian Statistics' history, concepts and applications, which are illustrated in five parts: the history of Bayesian Statistics, the weaknesses of Classical Statistics, Bayesian Theory, its development, and its applications. The first two parts make a comparison between Bayesian Statistics and Classical Statistics in a macroscopic aspect. The last three parts focus on Bayesian Theory specifically, from introducing particular Bayesian concepts to outlining their development and, finally, their applications.
Application of a data-mining method based on Bayesian networks to lesion-deficit analysis
NASA Technical Reports Server (NTRS)
Herskovits, Edward H.; Gerring, Joan P.
2003-01-01
Although lesion-deficit analysis (LDA) has provided extensive information about structure-function associations in the human brain, LDA has suffered from the difficulties inherent to the analysis of spatial data, i.e., there are many more variables than subjects, and data may be difficult to model using standard distributions, such as the normal distribution. We herein describe a Bayesian method for LDA; this method is based on data-mining techniques that employ Bayesian networks to represent structure-function associations. These methods are computationally tractable, and can represent complex, nonlinear structure-function associations. When applied to the evaluation of data obtained from a study of the psychiatric sequelae of traumatic brain injury in children, this method generates a Bayesian network that demonstrates complex, nonlinear associations among lesions in the left caudate, right globus pallidus, right side of the corpus callosum, right caudate, and left thalamus, and subsequent development of attention-deficit hyperactivity disorder, confirming and extending our previous statistical analysis of these data. Furthermore, analysis of simulated data indicates that methods based on Bayesian networks may be more sensitive and specific for detecting associations among categorical variables than methods based on chi-square and Fisher exact statistics.
Bartlett, Jonathan W; Keogh, Ruth H
2018-06-01
Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
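For intuition about the problem both approaches address, the sketch below simulates classical measurement error and applies the regression-calibration correction the authors use as a comparator; a Bayesian treatment would instead model the latent covariate and propagate its uncertainty into the posterior for the slope. All simulation settings are assumptions.

    import numpy as np

    # Classical measurement error: regressing Y on the error-prone W = X + U
    # attenuates the slope by the reliability ratio lambda = var(X)/var(W).
    rng = np.random.default_rng(11)
    n, sx, su = 5_000, 1.0, 1.0
    x = rng.normal(0, sx, n)                # latent true covariate
    w = x + rng.normal(0, su, n)            # observed, error-prone covariate
    y = 1.0 * x + rng.normal(0, 1, n)       # true slope = 1

    b_naive = np.cov(w, y)[0, 1] / np.var(w, ddof=1)
    lam = sx**2 / (sx**2 + su**2)           # reliability ratio (known here)
    print(f"naive slope = {b_naive:.2f}, calibrated = {b_naive / lam:.2f}")
    # Regression calibration rescales the naive slope; the Bayesian approach
    # additionally carries the measurement-error uncertainty into inference.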
Bayesian statistics: estimating plant demographic parameters
James S. Clark; Michael Lavine
2001-01-01
There are times when external information should be brought to bear on an ecological analysis. Experiments are never conducted in a knowledge-free context. The inference we draw from an observation may depend on everything else we know about the process. Bayesian analysis is a method that brings outside evidence into the analysis of experimental and observational data...
Harrison, Jay M; Breeze, Matthew L; Harrigan, George G
2011-08-01
Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation.
Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T
2016-12-20
Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers.
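To show the model the paper works with (though not its meta-regression/pseudo-data algorithm), here is a transparent grid-based random-effects meta-analysis with an informative inverse-gamma prior on the between-study variance. The five effect estimates, their within-study variances, and the prior's shape and scale are hypothetical.

    import numpy as np
    from scipy import stats

    # Hypothetical log-odds-ratio estimates y with within-study variances v.
    y = np.array([-0.35, -0.10, -0.42, 0.05, -0.25])
    v = np.array([0.04, 0.09, 0.06, 0.12, 0.05])

    mu = np.linspace(-1.0, 1.0, 201)[:, None]       # overall effect grid
    tau2 = np.linspace(1e-4, 0.5, 200)[None, :]     # between-study variance grid

    # Informative inverse-gamma prior on tau^2 (illustrative shape/scale),
    # flat prior on mu; likelihood: y_i ~ N(mu, v_i + tau^2).
    log_prior = stats.invgamma.logpdf(tau2, a=2.0, scale=0.08)
    log_like = stats.norm.logpdf(
        y[:, None, None], mu[None],
        np.sqrt(v[:, None, None] + tau2[None])).sum(axis=0)
    post = np.exp(log_like + log_prior)
    post /= post.sum()

    mu_post = (post.sum(axis=1) * mu.ravel()).sum()
    print(f"posterior mean effect = {mu_post:.3f}")
    # With only 5 studies, the informative prior on tau^2 does real work,
    # which is the motivation for the paper's predictive heterogeneity priors.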
A Bayesian approach to meta-analysis of plant pathology studies.
Mila, A L; Ngugi, H K
2011-01-01
Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard) which was evaluated only in seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework. Bayesian meta-analysis can readily include information not easily incorporated in classical methods, and allow for a full evaluation of competing models. Given the power and flexibility of Bayesian methods, we expect them to become widely adopted for meta-analysis of plant pathology studies.
Applications of Bayesian Statistics to Problems in Gamma-Ray Bursts
NASA Technical Reports Server (NTRS)
Meegan, Charles A.
1997-01-01
This presentation will describe two applications of Bayesian statistics to Gamma Ray Bursts (GRBs). The first attempts to quantify the evidence for a cosmological versus galactic origin of GRBs using only the observations of the dipole and quadrupole moments of the angular distribution of bursts. The cosmological hypothesis predicts isotropy, while the galactic hypothesis is assumed to produce a uniform probability distribution over positive values for these moments. The observed isotropic distribution indicates that the Bayes factor for the cosmological hypothesis over the galactic hypothesis is about 300. Another application of Bayesian statistics is in the estimation of chance associations of optical counterparts with galaxies. The Bayesian approach is preferred to frequentist techniques here because the Bayesian approach easily accounts for galaxy mass distributions and because one can incorporate three disjoint hypotheses: (1) bursts come from galactic centers, (2) bursts come from galaxies in proportion to luminosity, and (3) bursts do not come from external galaxies. This technique was used in the analysis of the optical counterpart to GRB970228.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data
Zhu, Hongtu; Ibrahim, Joseph G.; Tang, Niansheng
2013-01-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable (not missing at random, NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
BaTMAn: Bayesian Technique for Multi-image Analysis
NASA Astrophysics Data System (ADS)
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP (http://www.jasp-stats.org), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
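As an indication of what runs underneath such a t-test, below is a from-scratch sketch of the default two-sided one-sample JZS Bayes factor (Rouder et al., 2009), the family of test the BayesFactor package implements. The numerical details are our own, so treat this as an approximation of those tools rather than a reference implementation; the t and n values are arbitrary.

    import numpy as np
    from scipy import integrate

    def jzs_bf10(t, n, r=np.sqrt(2) / 2):
        """One-sample JZS Bayes factor: effect size ~ Cauchy(0, r) under H1,
        written as delta | g ~ N(0, g) with g ~ InvGamma(1/2, r^2/2).
        BF10 > 1 favours H1."""
        v = n - 1
        def integrand(g):
            return ((1 + n * g) ** -0.5
                    * (1 + t**2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                    * (r**2 / 2) ** 0.5 / np.sqrt(np.pi)
                    * g ** -1.5 * np.exp(-r**2 / (2 * g)))
        num, _ = integrate.quad(integrand, 0, np.inf)
        den = (1 + t**2 / v) ** (-(v + 1) / 2)
        return num / den

    print(f"BF10 for t = 2.5, n = 30: {jzs_bf10(2.5, 30):.2f}")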
A Comparison of Imputation Methods for Bayesian Factor Analysis Models
ERIC Educational Resources Information Center
Merkle, Edgar C.
2011-01-01
Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amiable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…
Bayesian networks and statistical analysis application to analyze the diagnostic test accuracy
NASA Astrophysics Data System (ADS)
Orzechowski, P.; Makal, Jaroslaw; Onisko, A.
2005-02-01
A computer-aided BPH (benign prostatic hyperplasia) diagnosis system based on a Bayesian network is described in the paper. First results are compared to a standard statistical method. Different statistical methods have been used successfully in medicine for years. However, the undoubted advantages of probabilistic methods make them useful in newly created systems, which are frequent in medicine but often lack full and reliable domain knowledge. The article presents the advantages of the computer-aided BPH diagnosis system in the clinical practice of urologists.
Bayesian approach for counting experiment statistics applied to a neutrino point source analysis
NASA Astrophysics Data System (ADS)
Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.
2013-12-01
In this paper we present a model-independent analysis method following Bayesian statistics to analyse data from a generic counting experiment and apply it to the search for neutrinos from point sources. We discuss a test statistic defined following a Bayesian framework that will be used in the search for a signal. In case no signal is found, we derive an upper limit without the introduction of approximations. The Bayesian approach allows us to obtain the full probability density function for both the background and the signal rate. As such, we have direct access to any signal upper limit. The upper limit derivation compares directly with a frequentist approach and is robust in the case of low-counting observations. Furthermore, it also allows previous upper limits obtained by other analyses to be taken into account as prior information, without the need for ad hoc trial factors. To investigate the validity of the presented Bayesian approach, we have applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and we have obtained a flux upper limit, which is in agreement with the upper limits determined via a frequentist approach. Furthermore, the upper limit obtained compares well with the previously published result of IceCube, using the same data set.
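The robust low-count behaviour can be seen in a minimal version of the calculation: with n observed events, a known expected background b, and a flat prior on the signal rate s >= 0, the posterior is available directly and any credible upper limit follows from its CDF. The counts below are illustrative, not from the IceCube analysis.

    import numpy as np

    # Posterior for a Poisson counting experiment: p(s | n) ∝ exp(-(s+b)) (s+b)^n
    n_obs, b = 3, 2.1
    s = np.linspace(0, 30, 30_001)
    log_post = -(s + b) + n_obs * np.log(s + b)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    cdf = np.cumsum(post)
    s90 = s[np.searchsorted(cdf, 0.90)]
    print(f"90% credible upper limit on the signal rate: s < {s90:.2f}")
    # The same posterior yields any quantile, and an earlier limit can enter
    # as a non-flat prior on s in place of the flat prior used here.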
Buddhavarapu, Prasad; Smit, Andre F; Prozzi, Jorge A
2015-07-01
Permeable friction course (PFC), a porous hot-mix asphalt, is typically applied to improve wet weather safety on high-speed roadways in Texas. In order to warrant expensive PFC construction, a statistical evaluation of its safety benefits is essential. Generally, the literature on the effectiveness of porous mixes in reducing wet-weather crashes is limited and often inconclusive. In this study, the safety effectiveness of PFC was evaluated using a fully Bayesian before-after safety analysis. First, two groups of road segments overlaid with PFC and non-PFC material were identified across Texas; the non-PFC or reference road segments selected were similar to their PFC counterparts in terms of site specific features. Second, a negative binomial data generating process was assumed to model the underlying distribution of crash counts of PFC and reference road segments to perform Bayesian inference on the safety effectiveness. A data-augmentation based computationally efficient algorithm was employed for a fully Bayesian estimation. The statistical analysis shows that PFC is not effective in reducing wet weather crashes. It should be noted that the findings of this study are in agreement with the existing literature, although these studies were not based on a fully Bayesian statistical analysis. Our study suggests that the safety effectiveness of PFC road surfaces, or any other safety infrastructure, largely relies on its interrelationship with the road user. The results suggest that the safety infrastructure must be properly used to reap the benefits of the substantial investments.
Bayesian statistics in medicine: a 25 year review.
Ashby, Deborah
2006-11-15
This review examines the state of Bayesian thinking as Statistics in Medicine was launched in 1982, reflecting particularly on its applicability and uses in medical research. It then looks at each subsequent five-year epoch, with a focus on papers appearing in Statistics in Medicine, putting these in the context of major developments in Bayesian thinking and computation with reference to important books, landmark meetings and seminal papers. It charts the growth of Bayesian statistics as it is applied to medicine and makes predictions for the future. From sparse beginnings, where Bayesian statistics was barely mentioned, Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modelling, longitudinal modelling, survival modelling, molecular genetics and decision-making in respect of new technologies.
Maritime Transportation Risk Assessment of Tianjin Port with Bayesian Belief Networks.
Zhang, Jinfen; Teixeira, Ângelo P; Guedes Soares, C; Yan, Xinping; Liu, Kezhong
2016-06-01
This article develops a Bayesian belief network model for the prediction of accident consequences in the Tianjin port. The study starts with a statistical analysis of historical accident data of six years from 2008 to 2013. Then a Bayesian belief network is constructed to express the dependencies between the indicator variables and accident consequences. The statistics and expert knowledge are synthesized in the Bayesian belief network model to obtain the probability distribution of the consequences. By a sensitivity analysis, several indicator variables that have influence on the consequences are identified, including navigational area, ship type and time of the day. The results indicate that the consequences are most sensitive to the position where the accidents occurred, followed by time of day and ship length. The results also indicate that the navigational risk of the Tianjin port is at an acceptable level, although there is still room for improvement. These results can be used by the Maritime Safety Administration to take effective measures to enhance maritime safety in the Tianjin port.
Quantitative trait nucleotide analysis using Bayesian model selection.
Blangero, John; Goring, Harald H H; Kent, Jack W; Williams, Jeff T; Peterson, Charles P; Almasy, Laura; Dyer, Thomas D
2005-10-01
Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.
Bayesian demography 250 years after Bayes
Bijak, Jakub; Bryant, John
2016-01-01
Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms.
Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A
2017-01-01
The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60-90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms performs equally well when the time series has multiple bursts of muscle activity.
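A minimal single-changepoint sketch (far simpler than the full Bayesian changepoint machinery evaluated in the study): rectified EMG is modeled as zero-mean Gaussian noise whose variance jumps at the onset sample, and a Jeffreys prior on each segment variance yields a closed-form marginal likelihood, hence a posterior over the onset location. The signal is simulated with a known onset at sample 400.

    import numpy as np
    from scipy.special import gammaln

    rng = np.random.default_rng(5)
    y = np.concatenate([rng.normal(0, 1, 400),      # baseline noise
                        rng.normal(0, 4, 200)])     # muscle activity after onset

    N = len(y)
    c2 = np.cumsum(y**2)
    ks = np.arange(20, N - 20)                      # candidate onset samples
    s1, s2 = c2[ks - 1], c2[-1] - c2[ks - 1]        # segment sums of squares
    n1, n2 = ks, N - ks
    # log marginal likelihood of each split; the improper-prior constant and
    # the shared (2*pi) factors cancel across candidate onsets.
    logml = (gammaln(n1 / 2) - (n1 / 2) * np.log(s1 / 2)
             + gammaln(n2 / 2) - (n2 / 2) * np.log(s2 / 2))
    post = np.exp(logml - logml.max())
    post /= post.sum()
    print(f"posterior mode of onset sample: {ks[post.argmax()]}")   # near 400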
Spectral Analysis of B Stars: An Application of Bayesian Statistics
NASA Astrophysics Data System (ADS)
Mugnes, J.-M.; Robert, C.
2012-12-01
To better understand the processes involved in stellar physics, it is necessary to obtain accurate stellar parameters (effective temperature, surface gravity, abundances…). Spectral analysis is a powerful tool for investigating stars, but it is also vital to reduce uncertainties at a decent computational cost. Here we present a spectral analysis method based on a combination of Bayesian statistics and grids of synthetic spectra obtained with TLUSTY. This method simultaneously constrains the stellar parameters by using all the lines accessible in observed spectra and thus greatly reduces uncertainties and improves the overall spectrum fitting. Preliminary results are shown using spectra from the Observatoire du Mont-Mégantic.
Suggestions for presenting the results of data analyses
Anderson, David R.; Link, William A.; Johnson, Douglas H.; Burnham, Kenneth P.
2001-01-01
We give suggestions for the presentation of research results from frequentist, information-theoretic, and Bayesian analysis paradigms, followed by several general suggestions. The information-theoretic and Bayesian methods offer alternative approaches to data analysis and inference compared to traditionally used methods. Guidance is lacking on the presentation of results under these alternative procedures and on nontesting aspects of classical frequentist methods of statistical analysis. Null hypothesis testing has come under intense criticism. We recommend less reporting of the results of statistical tests of null hypotheses in cases where the null is surely false anyway, or where the null hypothesis is of little interest to science or management.
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
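The flavor of modeling test statistics directly can be conveyed with the classic two-groups setup (parameters fixed for illustration rather than estimated from the data, unlike the paper): each z-score is null with posterior probability given by the local fdr, and rejections accumulate while the running mean of those probabilities, a Bayesian FDR, stays below the target.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    z = np.concatenate([rng.normal(0, 1, 9000),     # true nulls
                        rng.normal(3, 1, 1000)])    # true signals

    pi0, mu1 = 0.9, 3.0                             # mixture parameters (assumed)
    f0 = pi0 * stats.norm.pdf(z, 0, 1)
    f1 = (1 - pi0) * stats.norm.pdf(z, mu1, 1)
    lfdr = f0 / (f0 + f1)                           # posterior P(null | z)

    order = np.argsort(lfdr)                        # most convincing tests first
    bayes_fdr = np.cumsum(lfdr[order]) / np.arange(1, len(z) + 1)
    n_reject = (bayes_fdr <= 0.05).sum()            # largest set with FDR <= 5%
    print(f"rejections at Bayesian FDR 0.05: {n_reject}")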
Antal, Péter; Kiszel, Petra Sz.; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F.; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba
2012-01-01
Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods and we applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses Bayesian network representation to provide detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to assess whether a variable is directly relevant or its association is only mediated. With frequentist methods one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43 (1.2–1.8); p = 3×10^-4). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and human asthmatics. In the BN-BMLA analysis altogether 5 SNPs in 4 genes were found relevant in connection with asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong relevance based methods to include partial relevance, global characterization of relevance and multi-target relevance.
A Bayesian Missing Data Framework for Generalized Multiple Outcome Mixed Treatment Comparisons
ERIC Educational Resources Information Center
Hong, Hwanhee; Chu, Haitao; Zhang, Jing; Carlin, Bradley P.
2016-01-01
Bayesian statistical approaches to mixed treatment comparisons (MTCs) are becoming more popular because of their flexibility and interpretability. Many randomized clinical trials report multiple outcomes with possible inherent correlations. Moreover, MTC data are typically sparse (although richer than standard meta-analysis, comparing only two…
Moving beyond qualitative evaluations of Bayesian models of cognition.
Hemmer, Pernille; Tauber, Sean; Steyvers, Mark
2015-06-01
Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.
Tweedell, Andrew J.; Haynes, Courtney A.
2017-01-01
The timing of muscle activity is a commonly applied analytic method for understanding how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms for determining muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated: linear envelope, Teager-Kaiser energy operator + linear envelope, and sample entropy were the established methods, while general time series mean/variance analysis, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold-standard onset and evaluated with 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations in which the prior parameter (p0) was zero; the best performing Bayesian algorithms used p0 = 0 and a posterior probability for onset determination of 60–90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine whether this class of algorithms performs equally well when the time series has multiple bursts of muscle activity. PMID:28489897
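To make the changepoint idea concrete, here is a toy sketch (not the authors' algorithm; it assumes a single mean shift, a flat prior on the two segment means, and a known noise SD) of how a posterior distribution over the onset time of a rectified-EMG-like series can be computed in closed form.

```python
# Toy Bayesian changepoint detection for a single onset: the mean shifts
# at unknown time tau; with a flat prior on the segment means and known
# noise SD, each segment's marginal likelihood has a closed form, so the
# posterior over tau follows directly (the improper-prior constant is the
# same for every tau and cancels).
import numpy as np

rng = np.random.default_rng(0)
sigma, true_onset = 0.05, 600
y = np.abs(rng.normal(0.0, sigma, 1000))           # rectified baseline
y[true_onset:] += 0.15                             # muscle "turns on"

def log_marginal(seg, sigma):
    n, ss = len(seg), np.sum((seg - seg.mean()) ** 2)
    return -0.5 * (n - 1) * np.log(2 * np.pi * sigma**2) \
           - 0.5 * np.log(n) - ss / (2 * sigma**2)

taus = np.arange(20, len(y) - 20)                  # keep segments non-trivial
logpost = np.array([log_marginal(y[:t], sigma) + log_marginal(y[t:], sigma)
                    for t in taus])
post = np.exp(logpost - logpost.max())
post /= post.sum()
print("posterior mode for onset:", taus[np.argmax(post)])
```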
Bayesian conditional-independence modeling of the AIDS epidemic in England and Wales
NASA Astrophysics Data System (ADS)
Gilks, Walter R.; De Angelis, Daniela; Day, Nicholas E.
We describe the use of conditional-independence modeling, Bayesian inference, and Markov chain Monte Carlo to model and project the HIV-AIDS epidemic in homosexual/bisexual males in England and Wales. Complexity in this analysis arises through selectively missing data, indirectly observed underlying processes, and measurement error. Our emphasis is on the presentation and discussion of the concepts, not on the technicalities of this analysis, which can be found elsewhere [D. De Angelis, W.R. Gilks, N.E. Day, Bayesian projection of the acquired immune deficiency syndrome epidemic (with discussion), Applied Statistics, in press].
NASA Astrophysics Data System (ADS)
Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew
2007-04-01
One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian Networks, and Possibility Theory, in the form of Fuzzy Logic systems, has recently been introduced to provide a rigorous framework for high-level inference. Previous research developed the theoretical basis and benefits of the hybrid approach, but a concrete experimental comparison of the hybrid framework with traditional fusion methods, to demonstrate and quantify this benefit, has been lacking. The goal of this research, therefore, is to provide a statistical comparison of the accuracy and performance of hybrid network theory against pure Bayesian and Fuzzy systems and an inexact Bayesian system approximated using Particle Filtering. To accomplish this task, domain-specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo simulation, against situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed, to quantify the benefit of hybrid inference over other fusion tools.
BATSE gamma-ray burst line search. 2: Bayesian consistency methodology
NASA Technical Reports Server (NTRS)
Band, D. L.; Ford, L. A.; Matteson, J. L.; Briggs, M.; Paciesas, W.; Pendleton, G.; Preece, R.; Palmer, D.; Teegarden, B.; Schaefer, B.
1994-01-01
We describe a Bayesian methodology to evaluate the consistency between the reported Ginga and Burst and Transient Source Experiment (BATSE) detections of absorption features in gamma-ray burst spectra. Currently no features have been detected by BATSE, but this methodology will still be applicable if and when such features are discovered. The Bayesian methodology permits the comparison of hypotheses regarding the two detectors' observations and makes explicit the subjective aspects of our analysis (e.g., the quantification of our confidence in detector performance). We also present non-Bayesian consistency statistics. Based on preliminary calculations of line detectability, we find that both the Bayesian and non-Bayesian techniques show that the BATSE and Ginga observations are consistent given our understanding of these detectors.
Sirota, Miroslav; Kostovičová, Lenka; Juanchich, Marie
2014-08-01
Knowing which properties of visual displays facilitate statistical reasoning has practical and theoretical implications. We therefore studied the effect of one property of visual displays, iconicity (i.e., the resemblance of a visual sign to its referent), on Bayesian reasoning. Two main accounts of statistical reasoning predict different effects of iconicity on Bayesian reasoning. The ecological-rationality account predicts a positive iconicity effect, because more highly iconic signs resemble more individuated objects, which tap better into an evolutionarily designed frequency-coding mechanism that, in turn, facilitates Bayesian reasoning. The nested-sets account predicts a null iconicity effect, because iconicity does not affect the salience of a nested-sets structure, the factor that facilitates Bayesian reasoning via a general reasoning mechanism. In two well-powered experiments (N = 577), we found no support for a positive iconicity effect across different iconicity levels manipulated in different visual displays (meta-analytical overall effect: log OR = -0.13, 95% CI [-0.53, 0.28]). A Bayes factor analysis provided strong evidence in favor of the null hypothesis of no iconicity effect. Thus, these findings corroborate the nested-sets rather than the ecological-rationality account of statistical reasoning.
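For readers unfamiliar with how such a Bayes factor can be obtained from a reported effect and interval, the following sketch uses the Savage-Dickey density ratio under a normal approximation; the prior SD of 0.5 on the log OR is an illustrative assumption, not the authors' specification.

```python
# Savage-Dickey sketch: with a normal likelihood summary (estimate and SE
# recovered from the 95% CI) and a normal prior centered at 0, the Bayes
# factor for the null is the posterior density at 0 over the prior
# density at 0. The prior SD is an assumed value for illustration.
from scipy import stats

est, lo, hi = -0.13, -0.53, 0.28                  # log OR and 95% CI (from text)
se = (hi - lo) / (2 * 1.96)                       # ~0.207
prior_sd = 0.5                                    # assumed prior on log OR

post_var = 1 / (1 / prior_sd**2 + 1 / se**2)      # conjugate normal update
post_mean = post_var * (est / se**2)
bf01 = stats.norm(post_mean, post_var**0.5).pdf(0) / stats.norm(0, prior_sd).pdf(0)
print(f"BF01 = {bf01:.1f} (evidence for the null iconicity effect)")
```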
Statistical Symbolic Execution with Informed Sampling
NASA Technical Reports Server (NTRS)
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space, and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that informed sampling obtains more precise results and converges faster than a purely statistical analysis, and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
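The Bayesian estimation step lends itself to a compact illustration. The sketch below (an assumed form, not the Symbolic PathFinder implementation) treats each sampled path as a Bernoulli trial for reaching the target event, updates a Beta(1, 1) prior, and reads hypothesis-test quantities off the posterior.

```python
# Beta-Bernoulli sketch for the probability of reaching a target event:
# each sampled symbolic path either hits the event or not; the Beta(1, 1)
# prior updates to a Beta posterior, from which estimates, credible
# intervals, and hypothesis probabilities follow. Counts are hypothetical.
from scipy import stats

hits, trials = 17, 2000                            # hypothetical sampling outcome
posterior = stats.beta(1 + hits, 1 + trials - hits)

print("posterior mean P(assert violation):", posterior.mean())
print("95% credible interval:", posterior.ppf([0.025, 0.975]))
print("P(p > 1e-2 | data):", posterior.sf(1e-2))   # Bayesian hypothesis test
```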
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.
Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)
NASA Astrophysics Data System (ADS)
Kasibhatla, P.
2004-12-01
In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities, and inverse source estimates are derived for fixed values of pdf parameters. While the advantage of this approach is that closed-form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches that allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
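As a concrete illustration of the approach, the following bare-bones random-walk Metropolis sampler (toy dimensions and priors, not the TransCom3 configuration) samples the posterior of a linear tracer inversion y = Gs + noise directly, with no need for closed-form solutions.

```python
# Random-walk Metropolis for a toy linear inversion y = G s + noise with
# Gaussian likelihood and a Gaussian prior on the sources s; the sampler
# works unchanged if these are swapped for non-Gaussian alternatives,
# which is the point of using MCMC here. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(3)
n_obs, n_src = 40, 4
G = rng.random((n_obs, n_src))                     # transport operator (toy)
s_true = np.array([2.0, -1.0, 0.5, 1.5])
y = G @ s_true + rng.normal(0, 0.3, n_obs)

def log_post(s, sigma=0.3, prior_sd=5.0):
    resid = y - G @ s
    return -0.5 * np.sum(resid**2) / sigma**2 - 0.5 * np.sum(s**2) / prior_sd**2

s, chain = np.zeros(n_src), []
lp = log_post(s)
for _ in range(20000):
    prop = s + rng.normal(0, 0.05, n_src)          # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:        # Metropolis accept/reject
        s, lp = prop, lp_prop
    chain.append(s)
chain = np.array(chain)[5000:]                     # drop burn-in
print("posterior means:", chain.mean(axis=0).round(2))
```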
Asteroid orbital error analysis: Theory and application
NASA Technical Reports Server (NTRS)
Muinonen, K.; Bowell, Edward
1992-01-01
We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation, the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation gives the same results as Fisherian estimation when no a priori information is assumed (Lehtinen 1988, and references therein).
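In the linearized Gaussian case described above, the Bayesian and Fisherian covariances differ only by a prior term, as the following toy sketch shows (all matrices are illustrative stand-ins, not real orbital partials).

```python
# Linearized Gaussian error estimation: with design matrix A (partials of
# observations w.r.t. the six orbital elements) and observation precision
# S_inv, the Bayesian posterior covariance adds a prior precision term
# that vanishes (recovering the Fisherian result) as the prior widens.
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(30, 6))                       # 30 obs, 6 orbital elements
S_inv = np.eye(30) / 0.5**2                        # white noise, toy sigma

fisher_cov = np.linalg.inv(A.T @ S_inv @ A)        # no-prior (Fisherian) case
P0_inv = np.eye(6) / 10.0**2                       # weak Gaussian prior
bayes_cov = np.linalg.inv(A.T @ S_inv @ A + P0_inv)

print("element 1 sigma, no prior  :", fisher_cov[0, 0] ** 0.5)
print("element 1 sigma, with prior:", bayes_cov[0, 0] ** 0.5)
```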
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and use of technology is re-examined in this study in the context of e-learning via Facebook using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019
Bayesian multimodel inference for dose-response studies
Link, W.A.; Albers, P.H.
2007-01-01
Statistical inference in dose-response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose-response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparverius) exposed to various sublethal dietary concentrations of methylmercury.
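A minimal numerical illustration of Bayesian multimodel inference (toy binomial dose-response data, not the kestrel analysis) computes marginal likelihoods for two candidate models by direct integration over their parameters and converts them into posterior model probabilities.

```python
# Multimodel inference in miniature: marginal likelihoods of a constant
# model and a logistic dose-response model are approximated by averaging
# the likelihood over uniform priors on a grid, then turned into posterior
# model probabilities under equal prior model odds. Counts are hypothetical.
import numpy as np
from scipy import stats
from scipy.special import expit

dose = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
n = np.full(5, 20)
deaths = np.array([1, 2, 3, 8, 15])               # hypothetical counts

def loglik(p):
    return stats.binom.logpmf(deaths, n, p).sum()

# M1: constant response probability, p ~ Uniform(0, 1)
p_grid = np.linspace(0.001, 0.999, 999)
ml1 = np.mean([np.exp(loglik(p)) for p in p_grid])

# M2: logistic dose-response, uniform priors on intercept and slope
a_grid, b_grid = np.linspace(-5, 5, 101), np.linspace(-1, 3, 81)
ml2 = np.mean([np.exp(loglik(expit(a + b * dose)))
               for a in a_grid for b in b_grid])

post = np.array([ml1, ml2]) / (ml1 + ml2)         # equal prior model odds
print("posterior model probabilities [constant, logistic]:", post.round(4))
```

A model-averaged prediction then weights each model's estimate by these posterior probabilities rather than conditioning on a single selected model.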
A Bayesian approach to the statistical analysis of device preference studies.
Fu, Haoda; Qu, Yongming; Zhu, Baojin; Huster, William
2012-01-01
Drug delivery devices are required to have excellent technical specifications to deliver drugs accurately, and in addition, the devices should provide a satisfactory experience to patients, because this can have a direct effect on drug compliance. To compare patients' experience with two devices, cross-over studies with patient-reported outcomes (PRO) as response variables are often used. Because of the strength of cross-over designs, each subject can directly compare the two devices by using the PRO variables, and variables indicating preference (preferring A, preferring B, or no preference) can be easily derived. Traditionally, methods based on frequentist statistics can be used to analyze such preference data, but frequentist methods have some limitations. Recently, Bayesian methods have come to be considered acceptable by the US Food and Drug Administration for designing and analyzing device studies. In this paper, we propose a Bayesian statistical method to analyze the data from preference trials. We demonstrate that the new Bayesian estimator enjoys some optimal properties versus the frequentist estimator. Copyright © 2012 John Wiley & Sons, Ltd.
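A simple sketch of the Bayesian machinery for such a preference endpoint (an assumed Dirichlet-multinomial form; the paper's estimator includes further refinements) is shown below.

```python
# Dirichlet-multinomial sketch for a three-category preference endpoint:
# counts of "prefer A / prefer B / no preference" update a uniform
# Dirichlet prior, and P(A preferred over B) is read off posterior draws.
# The counts are hypothetical.
import numpy as np

rng = np.random.default_rng(11)
counts = np.array([58, 37, 25])                    # hypothetical cross-over result
prior = np.array([1.0, 1.0, 1.0])                  # uniform Dirichlet prior

draws = rng.dirichlet(prior + counts, size=100000)
print("posterior mean preference rates:", draws.mean(axis=0).round(3))
print("P(device A preferred over B):", (draws[:, 0] > draws[:, 1]).mean())
```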
A bayesian approach to classification criteria for spectacled eiders
Taylor, B.L.; Wade, P.R.; Stehn, R.A.; Cochrane, J.F.
1996-01-01
To facilitate decisions to classify species according to risk of extinction, we used Bayesian methods to analyze trend data for the Spectacled Eider, an arctic sea duck. Trend data from three independent surveys of the Yukon-Kuskokwim Delta were analyzed individually and in combination to yield posterior distributions for population growth rates. We used classification criteria developed by the recovery team for Spectacled Eiders that seek to equalize the errors of under- or overprotecting the species. We conducted both a Bayesian decision analysis and a frequentist (classical statistical inference) decision analysis. Bayesian decision analyses are computationally easier, yield essentially the same results, and produce results that are easier to explain to nonscientists. With the exception of the aerial survey analysis of the 10 most recent years, both Bayesian and frequentist methods indicated that an endangered classification is warranted. The discrepancy between surveys warrants further research. Although the trend data are abundance indices, we used a preliminary estimate of absolute abundance to demonstrate how to calculate extinction distributions using the joint probability distributions for population growth rate and variance in growth rate generated by the Bayesian analysis. Recent apparent increases in abundance highlight the need for models that apply to declining and then recovering species.
Online Variational Bayesian Filtering-Based Mobile Target Tracking in Wireless Sensor Networks
Zhou, Bingpeng; Chen, Qingchun; Li, Tiffany Jing; Xiao, Pei
2014-01-01
The received signal strength (RSS)-based online tracking of a mobile node in wireless sensor networks (WSNs) is investigated in this paper. Firstly, a multi-layer dynamic Bayesian network (MDBN) is introduced to characterize the target mobility with either directional or undirected movement. In particular, we propose employing the Wishart distribution to approximate the randomness of the time-varying RSS measurement precision caused by target movement. It is shown that the proposed MDBN offers a more general analysis model by incorporating the underlying statistical information of both the target movement and the observations, which can be utilized to improve the online tracking capability by exploiting Bayesian statistics. Secondly, based on the MDBN model, a mean-field variational Bayesian filtering (VBF) algorithm is developed to realize online tracking of a mobile target in the presence of nonlinear observations and time-varying RSS precision, where the traditional Bayesian filtering scheme cannot be directly employed. Thirdly, a joint optimization between the real-time velocity and its prior expectation is proposed to enable online velocity tracking within the proposed scheme. Finally, the associated Bayesian Cramer–Rao Lower Bound (BCRLB) analysis and numerical simulations are conducted. Our analysis reveals that, by exploiting the potential state information via the general MDBN model, the proposed VBF algorithm provides a promising solution to the online tracking of a mobile node in WSNs. In addition, it is shown that the final tracking accuracy scales linearly with its expectation when the RSS measurement precision is time-varying. PMID:25393784
NASA Astrophysics Data System (ADS)
Cox, M.; Shirono, K.
2017-10-01
A criticism levelled at the Guide to the Expression of Uncertainty in Measurement (GUM) is that it is based on a mixture of frequentist and Bayesian thinking. In particular, the GUM's Type A (statistical) uncertainty evaluations are frequentist, whereas the Type B evaluations, using state-of-knowledge distributions, are Bayesian. In contrast, making the GUM fully Bayesian implies, among other things, that a conventional objective Bayesian approach to Type A uncertainty evaluation for a number n of observations leads to the impractical consequence that n must be at least equal to 4, thus presenting a difficulty for many metrologists. This paper presents a Bayesian analysis of Type A uncertainty evaluation that applies for all n ≥ 2, as in the frequentist analysis in the current GUM. The analysis is based on assuming that the observations are drawn from a normal distribution (as in the conventional objective Bayesian analysis), but uses an informative prior based on lower and upper bounds for the standard deviation of the sampling distribution for the quantity under consideration. The main outcome of the analysis is a closed-form mathematical expression for the factor by which the standard deviation of the mean observation should be multiplied to calculate the required standard uncertainty. Metrological examples are used to illustrate the approach, which is straightforward to apply using a formula or look-up table.
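The bounded-prior analysis can also be reproduced numerically rather than through the closed-form expression. The sketch below (toy observations and assumed prior bounds) evaluates the joint posterior on a grid and reports the posterior SD of the mean as a multiple of s/sqrt(n).

```python
# Numerical sketch of the idea: n observations assumed normal, a flat
# prior on the mean, and a prior on the sampling SD uniform between
# assumed lower and upper bounds. The standard uncertainty is the
# posterior SD of the mean, reported relative to s/sqrt(n).
import numpy as np

y = np.array([10.12, 10.07, 10.19])                # toy observations
n, ybar, s = len(y), y.mean(), y.std(ddof=1)
sig_lo, sig_hi = 0.02, 0.20                        # assumed prior bounds on sigma

mu = np.linspace(ybar - 1.0, ybar + 1.0, 2001)
sig = np.linspace(sig_lo, sig_hi, 400)
M, S = np.meshgrid(mu, sig, indexing="ij")
logp = -n * np.log(S) - np.sum((y[:, None, None] - M) ** 2, axis=0) / (2 * S**2)
p = np.exp(logp - logp.max())

pmu = p.sum(axis=1)                                # marginal posterior of mu
pmu /= pmu.sum()
post_sd = np.sqrt(np.sum(pmu * (mu - np.sum(pmu * mu)) ** 2))
print("multiplier vs s/sqrt(n):", post_sd / (s / np.sqrt(n)))
```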
NASA Astrophysics Data System (ADS)
Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The
2015-11-01
While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.
A systematic review of Bayesian articles in psychology: The last 25 years.
van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah
2017-06-01
Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
ERIC Educational Resources Information Center
Galbraith, Craig S.; Merrill, Gregory B.; Kline, Doug M.
2012-01-01
In this study we investigate the underlying relational structure between student evaluations of teaching effectiveness (SETEs) and achievement of student learning outcomes in 116 business related courses. Utilizing traditional statistical techniques, a neural network analysis and a Bayesian data reduction and classification algorithm, we find…
On the blind use of statistical tools in the analysis of globular cluster stars
NASA Astrophysics Data System (ADS)
D'Antona, Francesca; Caloi, Vittoria; Tailo, Marco
2018-04-01
As with most data analysis methods, the Bayesian method must be handled with care. We show that its application to determine stellar evolution parameters within globular clusters can lead to paradoxical results if used without the necessary precautions. This is a cautionary tale on the use of statistical tools for big data analysis.
NASA Astrophysics Data System (ADS)
Rosenheim, B. E.; Firesinger, D.; Roberts, M. L.; Burton, J. R.; Khan, N.; Moyer, R. P.
2016-12-01
Radiocarbon (14C) sediment core chronologies benefit from a high density of dates, even when the precision of individual dates is sacrificed. This is demonstrated by a combined approach of rapid 14C analysis of CO2 gas generated from carbonates and organic material, coupled with Bayesian statistical modeling. Analysis of 14C is facilitated by the gas ion source on the Continuous Flow Accelerator Mass Spectrometry (CFAMS) system at the Woods Hole Oceanographic Institution's National Ocean Sciences Accelerator Mass Spectrometry facility. This instrument is capable of producing a 14C determination with ±100 14C yr precision every 4-5 minutes, with limited sample handling (dissolution of carbonates and/or combustion of organic carbon in evacuated containers). Rapid analysis allows over-preparation of samples to include replicates at each depth and/or comparison of different sample types at particular depths in a sediment or peat core. Analysis priority is given to depths that have the least chronologic precision, as determined by Bayesian modeling of the chronology of calibrated ages. Using such a statistical approach to determine the order in which samples are run ensures that the chronology constantly improves so long as material is available for the analysis of chronologic weak points. Ultimately, the accuracy of the chronology is determined by the material that is actually being dated, and our combined approach allows testing of different constituents of the organic carbon pool and the carbonate minerals within a core. We will present preliminary results from a deep-sea sediment core abundant in deep-sea foraminifera as well as coastal wetland peat cores, to demonstrate statistical improvements in sediment- and peat-core chronologies obtained by increasing the quantity and decreasing the quality of individual dates.
Spatiotemporal Bayesian analysis of Lyme disease in New York state, 1990-2000.
Chen, Haiyan; Stratton, Howard H; Caraco, Thomas B; White, Dennis J
2006-07-01
Mapping ordinarily increases our understanding of nontrivial spatial and temporal heterogeneities in disease rates. However, the large number of parameters required by the corresponding statistical models often complicates detailed analysis. This study investigates the feasibility of a fully Bayesian hierarchical regression approach to the problem and identifies how it outperforms two more popular methods: crude rate estimates (CRE) and empirical Bayes standardization (EBS). In particular, we apply a fully Bayesian approach to the spatiotemporal analysis of Lyme disease incidence in New York state for the period 1990-2000. These results are compared with those obtained by CRE and EBS in Chen et al. (2005). We show that the fully Bayesian regression model not only gives more reliable estimates of disease rates than the other two approaches but also allows for tractable models that can accommodate more numerous sources of variation and unknown parameters.
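The advantage of hierarchical pooling over crude rate estimates is easy to demonstrate in miniature. The following simplified sketch (a conjugate Poisson-gamma model with hypothetical counties, far simpler than the paper's spatiotemporal regression) shows how small-population rates are shrunk toward the overall level.

```python
# Why hierarchical Bayesian rates beat crude rates for sparse areas: a
# Gamma prior shared across areas shrinks unstable small-population
# estimates toward the overall rate. Populations and the prior are
# hypothetical; exposure is measured in units of 100,000 person-years.
import numpy as np

rng = np.random.default_rng(5)
pop = np.array([500, 2000, 50000, 120000])         # hypothetical populations
true_rate = 20 / 100000                            # cases per person-year
cases = rng.poisson(true_rate * pop)

crude = cases / pop * 100000                       # crude rate per 100,000
a, b = 2.0, 2.0 / 20.0                             # Gamma prior centered at 20
bayes = (a + cases) / (b + pop / 100000)           # posterior mean rate per 1e5
print("crude rates:", crude.round(1))
print("Bayes rates:", bayes.round(1))
```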
BATMAN: Bayesian Technique for Multi-image Analysis
NASA Astrophysics Data System (ADS)
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2017-04-01
This paper describes the Bayesian Technique for Multi-image Analysis (BATMAN), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical data set containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e., identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real integral-field spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with a low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in the presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of 2. Our analysis reveals that the algorithm prioritizes conservation of all the statistically significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BaTMAn is not to be used as a 'black box' to improve the signal-to-noise ratio, but as a new approach to characterizing spatially resolved data prior to analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems
NASA Technical Reports Server (NTRS)
He, Yuning; Davies, Misty Dawn
2014-01-01
The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
Computational statistics using the Bayesian Inference Engine
NASA Astrophysics Data System (ADS)
Weinberg, Martin D.
2013-09-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible, object-oriented framework that implements every aspect of Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences
Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric
2016-01-01
Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566
Iocca, Oreste; Farcomeni, Alessio; Pardiñas Lopez, Simon; Talib, Huzefa S
2017-01-01
To conduct a traditional meta-analysis and a Bayesian network meta-analysis to synthesize the information coming from randomized controlled trials on different socket grafting materials, and to combine the resulting indirect evidence in order to make inferences on treatments that have not been compared directly. RCTs were identified for inclusion in the systematic review and subsequent statistical analysis. Bone height and width remodelling were selected as the summary measures for comparison. First, a series of pairwise meta-analyses were performed and the overall mean difference (MD) in mm with 95% CI was calculated between grafted and non-grafted sockets. Then, a Bayesian network meta-analysis was performed to draw indirect conclusions on which grafting materials are most likely the best compared to the others. From the six included studies, seven comparisons were obtained. Traditional meta-analysis showed statistically significant results in favour of grafting the socket compared to no graft, both for height (MD 1.02, 95% CI 0.44-1.59, p value < 0.001) and for width (MD 1.52, 95% CI 1.18-1.86, p value < 0.000001) remodelling. The Bayesian network meta-analysis yielded a ranking of intervention efficacy. On the basis of the results of the present analysis, socket grafting seems to be more favourable than unassisted socket healing. Moreover, the Bayesian network meta-analysis indicates that freeze-dried bone graft plus membrane is most likely the most effective in reducing bone height remodelling, while autologous bone marrow was most likely the most effective for width remodelling. Studies with larger samples and less risk of bias should be conducted in the future in order to further strengthen the results of this analysis. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Philosophy and the practice of Bayesian statistics
Gelman, Andrew; Shalizi, Cosma Rohilla
2015-01-01
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575
A Gibbs sampler for Bayesian analysis of site-occupancy data
Dorazio, Robert M.; Rodriguez, Daniel Taylor
2012-01-01
1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
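The engine of such a sampler is the Albert-Chib data augmentation for probit regression. A stripped-down sketch follows (occupancy latent states omitted and data simulated, so this is only the core ingredient, not the full site-occupancy sampler): latent utilities are drawn from truncated normals given the current coefficients, and the coefficients are then drawn from their conjugate normal full conditional.

```python
# Albert-Chib Gibbs sampler for probit regression: alternate between
# (1) latent utilities z_i ~ N(x_i'beta, 1) truncated by the observed
# outcome, sampled via inverse-CDF, and (2) beta from its conjugate
# normal full conditional. Data and priors are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.3, 1.2])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(int)

B0_inv = np.eye(2) / 100.0                         # vague normal prior on beta
V = np.linalg.inv(X.T @ X + B0_inv)                # posterior covariance (sigma=1)
beta, draws = np.zeros(2), []
for _ in range(3000):
    # 1. latent utilities from truncated normals given current beta
    mu = X @ beta
    u = rng.uniform(size=n)
    lo = stats.norm.cdf(-mu)                       # P(z < 0 | mu)
    z = mu + stats.norm.ppf(np.where(y == 1, lo + u * (1 - lo), u * lo))
    # 2. beta from its conjugate normal full conditional
    beta = rng.multivariate_normal(V @ (X.T @ z), V)
    draws.append(beta)
print("posterior means:", np.mean(draws[500:], axis=0).round(2))
```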
NASA Technical Reports Server (NTRS)
He, Yuning
2015-01-01
The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.
A Monte Carlo–Based Bayesian Approach for Measuring Agreement in a Qualitative Scale
Pérez Sánchez, Carlos Javier
2014-01-01
Agreement analysis has been an active research area whose techniques have been widely applied in psychology and other fields. However, statistical agreement among raters has mainly been considered from a classical statistics point of view. Bayesian methodology is a viable alternative that allows the inclusion of subjective initial information coming from expert opinions, personal judgments, or historical data. A Bayesian approach is proposed by providing a unified Monte Carlo-based framework to estimate all types of measures of agreement on a qualitative response scale. The approach is conceptually simple and has a low computational cost. Both informative and non-informative scenarios are considered. In case no initial information is available, the results are in line with classical methodology, while providing more information on the measures of agreement. For the informative case, some guidelines are presented to elicit the prior distribution. The approach is illustrated with two applications related to schizophrenia diagnosis and sensory analysis. PMID:29881002
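For a single agreement measure, the proposed machinery reduces to a few lines. The sketch below (hypothetical ratings; the paper covers many measures beyond kappa) assigns the contingency-table probabilities a Dirichlet posterior and converts each posterior draw into a Cohen's kappa value.

```python
# Monte Carlo posterior for Cohen's kappa: the 3x3 table of two raters'
# counts updates a uniform Dirichlet prior over cell probabilities; each
# posterior draw yields observed agreement, chance agreement, and kappa.
import numpy as np

rng = np.random.default_rng(4)
table = np.array([[45, 5, 2],                      # hypothetical ratings
                  [6, 30, 4],
                  [1, 3, 24]])
prior = np.ones(9)                                 # uniform Dirichlet prior

draws = rng.dirichlet(prior + table.ravel(), size=50000).reshape(-1, 3, 3)
po = np.trace(draws, axis1=1, axis2=2)             # observed agreement
rows, cols = draws.sum(axis=2), draws.sum(axis=1)
pe = (rows * cols).sum(axis=1)                     # chance agreement
kappa = (po - pe) / (1 - pe)
print("posterior median kappa:", np.median(kappa).round(3))
print("95% credible interval:", np.percentile(kappa, [2.5, 97.5]).round(3))
```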
Wang, Tianli; Baron, Kyle; Zhong, Wei; Brundage, Richard; Elmquist, William
2014-03-01
The current study presents a Bayesian approach to non-compartmental analysis (NCA), which provides accurate and precise estimates of AUC(0–∞) and of any AUC(0–∞)-based NCA parameter or derivation. In order to assess the performance of the proposed method, 1,000 simulated datasets were generated under different scenarios. A Bayesian method was used to estimate the tissue and plasma AUC(0–∞) and the tissue-to-plasma AUC(0–∞) ratio. The posterior medians and the coverage of 95% credible intervals for the true parameter values were examined. The method was applied to laboratory data from a mouse brain distribution study with a serial sacrifice design for illustration. The Bayesian NCA approach is accurate and precise in point estimation of AUC(0–∞) and the partition coefficient under a serial sacrifice design. It also provides a consistently good variance estimate, even considering the variability of the data and the physiological structure of the pharmacokinetic model. The application in the case study obtained a physiologically reasonable posterior distribution of AUC, with a posterior median close to the value estimated by classic Bailer-type methods. This Bayesian NCA approach for sparse data analysis provides statistical inference on the variability of AUC(0–∞)-based parameters such as the partition coefficient and drug targeting index, so that the comparison of these parameters following destructive sampling becomes statistically feasible.
Harrigan, George G; Harrison, Jay M
2012-01-01
New transgenic (GM) crops are subjected to extensive safety assessments that include compositional comparisons with conventional counterparts as a cornerstone of the process. The influence of germplasm, location, environment, and agronomic treatments on compositional variability is, however, often obscured in these pair-wise comparisons. Furthermore, classical statistical significance testing can often provide an incomplete and over-simplified summary of highly responsive variables such as crop composition. In order to more clearly describe the influence of the numerous sources of compositional variation we present an introduction to two alternative but complementary approaches to data analysis and interpretation. These include i) exploratory data analysis (EDA) with its emphasis on visualization and graphics-based approaches and ii) Bayesian statistical methodology that provides easily interpretable and meaningful evaluations of data in terms of probability distributions. The EDA case-studies include analyses of herbicide-tolerant GM soybean and insect-protected GM maize and soybean. Bayesian approaches are presented in an analysis of herbicide-tolerant GM soybean. Advantages of these approaches over classical frequentist significance testing include the more direct interpretation of results in terms of probabilities pertaining to quantities of interest and no confusion over the application of corrections for multiple comparisons. It is concluded that a standardized framework for these methodologies could provide specific advantages through enhanced clarity of presentation and interpretation in comparative assessments of crop composition.
Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A
2017-12-01
The onset of muscle activity, as measured by electromyography (EMG), is a commonly applied metric in biomechanics. Intramuscular EMG is often used to examine deep musculature, and there are currently no studies examining the effectiveness of algorithms for intramuscular EMG onset. The present study examines standard surface EMG onset algorithms (linear envelope, Teager-Kaiser Energy Operator, and sample entropy) and novel algorithms (time series mean-variance analysis, sequential/batch processing with parametric and nonparametric methods, and Bayesian changepoint analysis). Thirteen male and five female subjects had intramuscular EMG collected during isolated biceps brachii and vastus lateralis contractions, resulting in 103 trials. EMG onset was visually determined twice by 3 blinded reviewers. Since the reliability of visual onset was high (ICC(1,1): 0.92), the mean of the 6 visual assessments was contrasted with the algorithmic approaches. Poorly performing algorithms were eliminated stepwise via (1) root mean square error analysis, (2) algorithm failure to identify onset/premature onset, (3) linear regression analysis, and (4) Bland-Altman plots. The top performing algorithms were all based on Bayesian changepoint analysis of rectified EMG and were statistically indistinguishable from visual analysis. Bayesian changepoint analysis has the potential to produce more reliable, accurate, and objective intramuscular EMG onset results than standard methodologies.
The image recognition based on neural network and Bayesian decision
NASA Astrophysics Data System (ADS)
Wang, Chugege
2018-04-01
Artificial neural network research began in the 1940s, and neural networks are an important part of artificial intelligence. At present, they are a hot topic in neuroscience, computer science, brain science, mathematics, and psychology. Bayes' theorem was first published in 1763. After further development in the twentieth century, it became widespread in all areas of statistics. In recent years, with the solution of the problem of high-dimensional integral calculation, Bayesian statistics has advanced theoretically, solving many problems that classical statistics cannot, and it has also been applied in interdisciplinary fields. In this paper, the related concepts and principles of artificial neural networks are introduced. The paper also summarizes the basic content and principles of Bayesian statistics, combines artificial neural network technology with Bayesian decision theory, and applies them to aspects of image recognition, such as enhanced face detection based on a neural network and Bayesian decision, as well as image classification based on Bayesian decision. It can be seen that the combination of artificial intelligence and statistical algorithms remains a hot research topic.
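A schematic of the combination described above (illustrative numbers throughout; treating the softmax output as a class-conditional score is purely for demonstration) reweights network outputs by class priors and picks the minimum-risk decision.

```python
# Bayesian decision layer on top of a neural network: network scores are
# reweighted by class priors, normalized to a posterior, and the action
# minimizing expected loss is chosen. Under 0-1 loss this reduces to the
# maximum-posterior class; the cost matrix can encode asymmetric losses.
import numpy as np

net_probs = np.array([0.60, 0.30, 0.10])           # softmax output for one image
priors = np.array([0.10, 0.60, 0.30])              # class frequencies (assumed)
posterior = net_probs * priors
posterior /= posterior.sum()

cost = 1 - np.eye(3)                               # 0-1 loss (illustrative)
risk = cost @ posterior                            # expected loss per action
print("posterior:", posterior.round(3), "-> decide class", int(np.argmin(risk)))
```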
Bayesian Inference for Functional Dynamics Exploring in fMRI Data.
Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing
2016-01-01
This paper reviews state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. In particular, we focus on one long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference that has been shown to be a powerful tool for encoding dependence relationships among variables under uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer the functional interaction patterns based on the corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, the Bayesian Magnitude Change Point Model (BMCPM), the Bayesian Connectivity Change Point Model (BCCPM), and the Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more refined Bayesian inference models will emerge and play increasingly important roles in modeling brain function in the years to come.
2017-09-01
This dissertation explores the efficacy of statistical post-processing methods applied downstream of dynamical model components, using a hierarchical multivariate Bayesian approach. Keywords: Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction.
Nariai, N; Kim, S; Imoto, S; Miyano, S
2004-01-01
We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method incorporates knowledge of protein-protein interactions into the estimation of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks and successfully identifies some biological facts.
Advances in Bayesian Modeling in Educational Research
ERIC Educational Resources Information Center
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines
NASA Astrophysics Data System (ADS)
Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.
2016-12-01
Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model while satisfying the requirements of flexibility, accuracy, and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation, and it allows us to perform global sensitivity analysis with ease. We also present an approach to the calibration of simulators using field data.
2016-05-31
Final Report: Technical Topic 3.2.2.d, Bayesian and Non-parametric Statistics: Integration of Neural ... The approach included explosives such as TATP, HMTD, RDX, ammonium nitrate, potassium perchlorate, potassium nitrate, sugar, and TNT.
Adrion, Christine; Mansmann, Ulrich
2012-09-10
Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). Results The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. Conclusions The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint. PMID:22962944
Wijeysundera, Duminda N; Austin, Peter C; Hux, Janet E; Beattie, W Scott; Laupacis, Andreas
2009-01-01
Randomized trials generally use "frequentist" statistics based on P-values and 95% confidence intervals. Frequentist methods have limitations that might be overcome, in part, by Bayesian inference. To illustrate these advantages, we re-analyzed randomized trials published in four general medical journals during 2004. We used Medline to identify randomized superiority trials with two parallel arms, individual-level randomization and dichotomous or time-to-event primary outcomes. Studies with P<0.05 in favor of the intervention were deemed "positive"; otherwise, they were "negative." We used several prior distributions and exact conjugate analyses to calculate Bayesian posterior probabilities for clinically relevant effects. Of 88 included studies, 39 were positive using a frequentist analysis. Although the Bayesian posterior probabilities of any benefit (relative risk or hazard ratio<1) were high in positive studies, these probabilities were lower and variable for larger benefits. The positive studies had only moderate probabilities for exceeding the effects that were assumed for calculating the sample size. By comparison, there were moderate probabilities of any benefit in negative studies. Bayesian and frequentist analyses complement each other when interpreting the results of randomized trials. Future reports of randomized trials should include both.
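To illustrate the style of computation involved, here is a hedged R sketch of an exact conjugate re-analysis of a two-arm trial with a dichotomous outcome. The event counts and the uniform Beta priors are invented for illustration, not taken from any of the reviewed trials.

```r
# Conjugate Beta-binomial re-analysis of a hypothetical two-arm trial:
# posterior probability of benefit (relative risk < 1) by Monte Carlo.
set.seed(1)
events_trt <- 30; n_trt <- 200   # treatment arm (hypothetical counts)
events_ctl <- 45; n_ctl <- 200   # control arm (hypothetical counts)
a <- 1; b <- 1                   # uniform Beta(1, 1) priors on each risk
p_trt <- rbeta(1e5, a + events_trt, b + n_trt - events_trt)
p_ctl <- rbeta(1e5, a + events_ctl, b + n_ctl - events_ctl)
rr <- p_trt / p_ctl
mean(rr < 1)      # posterior probability of any benefit
mean(rr < 0.75)   # posterior probability of a larger, prespecified benefit
```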
Rediscovery of Good-Turing estimators via Bayesian nonparametrics.
Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye
2016-03-01
The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, designs of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, which is a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. © 2015, The International Biometric Society.
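For readers unfamiliar with the frequentist side of this comparison, a minimal R sketch of the classical Good-Turing discovery probability follows; the abundance vector is a toy example, and the smoothed Poisson-Dirichlet version discussed in the article is not reproduced here.

```r
# Classical Good-Turing estimate of the discovery probability: the chance
# that the next observation is a new species is m1 / n, where m1 is the
# number of species seen exactly once among n observations.
good_turing_p0 <- function(counts) {
  n  <- sum(counts)        # total observations
  m1 <- sum(counts == 1)   # number of singletons
  m1 / n
}
species_counts <- c(5, 3, 1, 1, 1, 2)   # toy abundance vector
good_turing_p0(species_counts)          # estimated probability of a new species
```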
Bayesian survival analysis in clinical trials: What methods are used in practice?
Brard, Caroline; Le Teuff, Gwénaël; Le Deley, Marie-Cécile; Hampson, Lisa V
2017-02-01
Background Bayesian statistics are an appealing alternative to the traditional frequentist approach to designing, analysing, and reporting of clinical trials, especially in rare diseases. Time-to-event endpoints are widely used in many medical fields. There are additional complexities to designing Bayesian survival trials which arise from the need to specify a model for the survival distribution. The objective of this article was to critically review the use and reporting of Bayesian methods in survival trials. Methods A systematic review of clinical trials using Bayesian survival analyses was performed through PubMed and Web of Science databases. This was complemented by a full text search of the online repositories of pre-selected journals. Cost-effectiveness, dose-finding studies, meta-analyses, and methodological papers using clinical trials were excluded. Results In total, 28 articles met the inclusion criteria, 25 were original reports of clinical trials and 3 were re-analyses of a clinical trial. Most trials were in oncology (n = 25), were randomised controlled (n = 21) phase III trials (n = 13), and half considered a rare disease (n = 13). Bayesian approaches were used for monitoring in 14 trials and for the final analysis only in 14 trials. In the latter case, Bayesian survival analyses were used for the primary analysis in four cases, for the secondary analysis in seven cases, and for the trial re-analysis in three cases. Overall, 12 articles reported fitting Bayesian regression models (semi-parametric, n = 3; parametric, n = 9). Prior distributions were often incompletely reported: 20 articles did not define the prior distribution used for the parameter of interest. Over half of the trials used only non-informative priors for monitoring and the final analysis (n = 12) when it was specified. Indeed, no articles fitting Bayesian regression models placed informative priors on the parameter of interest. The prior for the treatment effect was based on historical data in only four trials. Decision rules were pre-defined in eight cases when trials used Bayesian monitoring, and in only one case when trials adopted a Bayesian approach to the final analysis. Conclusion Few trials implemented a Bayesian survival analysis and few incorporated external data into priors. There is scope to improve the quality of reporting of Bayesian methods in survival trials. Extension of the Consolidated Standards of Reporting Trials statement for reporting Bayesian clinical trials is recommended.
NASA Astrophysics Data System (ADS)
Yan, Wang-Ji; Ren, Wei-Xin
2018-01-01
This study applies the theoretical findings on the circularly-symmetric complex normal ratio distribution of Yan and Ren (2016) [1,2] to transmissibility-based modal analysis from a statistical viewpoint. A probabilistic model of the transmissibility function in the vicinity of the resonant frequency is formulated in the modal domain, and some insightful comments are offered. It is shown theoretically that the statistics of the transmissibility function around the resonant frequency depend solely on the 'noise-to-signal' ratio and the mode shapes. As a sequel to the development of this probabilistic model, the study poses the process of modal identification in a Bayesian framework by borrowing a novel paradigm. Implementation issues unique to the proposed approach are resolved by a Lagrange multiplier approach. The study also explores the possibility of applying Bayesian analysis to distinguish harmonic components from structural ones. The approaches are verified using simulated data and experimental test data. The uncertainty behavior due to the variation of different factors is also discussed in detail.
Bayesian Group Bridge for Bi-level Variable Selection.
Mallick, Himel; Yi, Nengjun
2017-06-01
A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.
Bayesian analysis of CCDM models
NASA Astrophysics Data System (ADS)
Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of the Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Bayesian Evidence (BE). These criteria allow models to be compared in terms of goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH0 model, can be discarded from the current analysis. Three other scenarios are discarded, either because of poor fit or because of an excess of free parameters. A method of increasing the Bayesian evidence through reparameterization, in order to reduce parameter degeneracy, is also developed.
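The two simpler criteria are cheap to compute from any maximized fit. A generic R sketch follows, with invented log-likelihoods standing in for the supernova fits; the Bayesian Evidence is omitted because it requires integrating the likelihood over the priors.

```r
# Information criteria from a maximized log-likelihood (logL), number of
# free parameters (k), and number of data points (N):
#   AIC = -2*logL + 2k,  BIC = -2*logL + k*log(N)
ic_table <- function(logL, k, N) {
  data.frame(AIC = -2 * logL + 2 * k,
             BIC = -2 * logL + k * log(N))
}
# Toy comparison of two hypothetical fits to N = 580 supernovae:
ic_table(logL = c(-281.3, -280.9), k = c(1, 2), N = 580)
```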
Why Are People Bad at Detecting Randomness? A Statistical Argument
ERIC Educational Resources Information Center
Williams, Joseph J.; Griffiths, Thomas L.
2013-01-01
Errors in detecting randomness are often explained in terms of biases and misconceptions. We propose and provide evidence for an account that characterizes the contribution of the inherent statistical difficulty of the task. Our account is based on a Bayesian statistical analysis, focusing on the fact that a random process is a special case of…
ERIC Educational Resources Information Center
Jackson, Dan
2013-01-01
Statistical inference is problematic in the common situation in meta-analysis where the random effects model is fitted to just a handful of studies. In particular, the asymptotic theory of maximum likelihood provides a poor approximation, and Bayesian methods are sensitive to the prior specification. Hence, less efficient, but easily computed and…
NASA Technical Reports Server (NTRS)
Gilkey, Kelly M.; Myers, Jerry G.; McRae, Michael P.; Griffin, Elise A.; Kallrui, Aditya S.
2012-01-01
The Exploration Medical Capability project is creating a catalog of risk assessments using the Integrated Medical Model (IMM). The IMM is a software-based system intended to assist mission planners in preparing for spaceflight missions by helping them to make informed decisions about medical preparations and supplies needed for combating and treating various medical events using Probabilistic Risk Assessment. The objective is to use statistical analyses to inform the IMM decision tool with estimated probabilities of medical events occurring during an exploration mission. Because data regarding astronaut health are limited, Bayesian statistical analysis is used. Bayesian inference combines prior knowledge, such as data from the general U.S. population, the U.S. Submarine Force, or the analog astronaut population located at the NASA Johnson Space Center, with observed data for the medical condition of interest. The posterior results reflect the best evidence for specific medical events occurring in flight. Bayes theorem provides a formal mechanism for combining available observed data with data from similar studies to support the quantification process. The IMM team performed Bayesian updates on the following medical events: angina, appendicitis, atrial fibrillation, atrial flutter, dental abscess, dental caries, dental periodontal disease, gallstone disease, herpes zoster, renal stones, seizure, and stroke.
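As a hedged illustration of the Bayesian updating described here, the following R snippet performs a conjugate Beta-binomial update for a single medical condition, treating the analog-population data as prior pseudo-counts. All numbers are invented and do not come from the IMM.

```r
# Beta-binomial Bayesian update for one medical condition.
# Prior pseudo-counts from an analog population are combined with
# (hypothetical) observed astronaut data.
prior_events <- 12; prior_exposure <- 4000   # analog population (assumed)
obs_events   <- 1;  obs_exposure   <- 150    # observed flight data (assumed)
a <- prior_events + obs_events
b <- (prior_exposure - prior_events) + (obs_exposure - obs_events)
# Posterior quantiles for the per-exposure event probability:
qbeta(c(0.025, 0.5, 0.975), a, b)
```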
NASA Astrophysics Data System (ADS)
Li, Zhijun; Feng, Maria Q.; Luo, Longxi; Feng, Dongming; Xu, Xiuli
2018-01-01
Uncertainty in modal parameter estimation arises to a significant extent in the structural health monitoring (SHM) practice of civil engineering, owing to environmental influences and modeling errors. Sound methodologies are needed for processing this uncertainty. Bayesian inference can provide a promising and feasible identification solution for the purposes of SHM. However, there has been relatively little research on the application of the Bayesian spectral method to modal identification using SHM data sets. To extract modal parameters from the large data sets collected by an SHM system, the Bayesian spectral density algorithm was applied to address the uncertainty of mode extraction from the output-only response of a long-span suspension bridge. The most probable values of the modal parameters and their uncertainties were estimated through Bayesian inference. A long-term variation and statistical analysis was performed using the sensor data sets collected from the SHM system of the suspension bridge over a one-year period. The t location-scale distribution was shown to be a better candidate function for the frequencies of the lower modes. On the other hand, the Burr distribution provided the best fit to the higher modes, which are sensitive to temperature. In addition, wind-induced variation of the modal parameters was also investigated. It was observed that both the damping ratios and the modal forces increased during periods of typhoon excitation. Meanwhile, the modal damping ratios exhibit significant correlation with the spectral intensities of the corresponding modal forces.
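The distribution-fitting step is easy to reproduce. Below is a hedged R sketch that fits a t location-scale distribution to a simulated year of daily frequency estimates using MASS::fitdistr(); the magnitudes are invented, not the bridge data.

```r
# Fit a t location-scale distribution to (simulated) daily estimates of a
# modal frequency; fitdistr() estimates location m, scale s, and df.
library(MASS)
set.seed(1)
freq_mode1 <- 0.35 + 0.01 * rt(365, df = 4)  # toy daily estimates, in Hz
fitdistr(freq_mode1, densfun = "t")
```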
What Is the Probability You Are a Bayesian?
ERIC Educational Resources Information Center
Wulff, Shaun S.; Robinson, Timothy J.
2014-01-01
Bayesian methodology continues to be widely used in statistical applications. As a result, it is increasingly important to introduce students to Bayesian thinking at early stages in their mathematics and statistics education. While many students in upper level probability courses can recite the differences in the Frequentist and Bayesian…
Analysis and meta-analysis of single-case designs: an introduction.
Shadish, William R
2014-04-01
The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model
NASA Astrophysics Data System (ADS)
Al Sobhi, Mashail M.
2015-02-01
Bayesian estimates of the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Bayesian prediction bounds for future DGOS from the exponentiated Weibull model are also obtained. Both symmetric and asymmetric loss functions are considered for the Bayesian computations. Markov chain Monte Carlo (MCMC) methods are used to compute the Bayes estimates and prediction bounds. The results have been specialized to the lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.
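To show what the MCMC step can look like, here is a hedged random-walk Metropolis sketch in R for the exponentiated Weibull density on a toy complete sample; the DGOS-specific likelihood of the paper would replace the density below, and flat priors on the log-parameters are an assumption.

```r
# Random-walk Metropolis for the exponentiated Weibull model
# f(x) = a * [1 - exp(-(x/s)^k)]^(a-1) * exp(-(x/s)^k) * (k/s) * (x/s)^(k-1)
set.seed(1)
x <- rweibull(50, shape = 1.5, scale = 2)   # stand-in data (the a = 1 case)
loglik <- function(p) {                     # p = log(a, k, s)
  a <- exp(p[1]); k <- exp(p[2]); s <- exp(p[3])
  z <- (x / s)^k
  sum(log(a) + (a - 1) * log1p(-exp(-z)) - z + log(k / s) + (k - 1) * log(x / s))
}
p <- c(0, 0, 0); ll <- loglik(p)
chain <- matrix(NA, 5000, 3)
for (i in 1:5000) {
  prop <- p + rnorm(3, sd = 0.1)            # symmetric proposal
  llp <- loglik(prop)
  if (log(runif(1)) < llp - ll) { p <- prop; ll <- llp }
  chain[i, ] <- exp(p)
}
colMeans(chain[-(1:1000), ])   # posterior means of (a, k, s) after burn-in
```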
ACHCAR, J. A.; MARTINEZ, E. Z.; RUFFINO-NETTO, A.; PAULINO, C. D.; SOARES, P.
2008-01-01
SUMMARY We considered a Bayesian analysis of the prevalence of tuberculosis cases in New York City from 1970 to 2000. This counting dataset presented two change-points during this period. We modelled the counts using non-homogeneous Poisson processes in the presence of the two change-points. A Bayesian analysis of the data is carried out using Markov chain Monte Carlo methods. Simulated Gibbs samples for the parameters of interest were obtained using the WinBUGS software. PMID:18346287
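A single-change-point simplification of this model has a closed-form posterior once the Poisson rates are marginalized under conjugate Gamma priors. The R sketch below uses simulated yearly counts, not the New York City data, and arbitrary Gamma(0.5, 0.01) hyperparameters.

```r
# Single change-point Poisson model: marginalize segment rates under
# Gamma(a, b) priors, giving a closed-form posterior over the change year.
set.seed(1)
y <- c(rpois(15, 8), rpois(16, 15))   # toy counts; change after year 15
a <- 0.5; b <- 0.01                   # Gamma prior hyperparameters (assumed)
lmarg <- function(cnt, yrs) lgamma(a + sum(cnt)) - (a + sum(cnt)) * log(b + yrs)
lp <- sapply(1:(length(y) - 1), function(k)
  lmarg(y[1:k], k) + lmarg(y[-(1:k)], length(y) - k))
post <- exp(lp - max(lp)); post <- post / sum(post)
which.max(post)                        # most probable change year
```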
Modeling Statistical Insensitivity: Sources of Suboptimal Behavior
ERIC Educational Resources Information Center
Gagliardi, Annie; Feldman, Naomi H.; Lidz, Jeffrey
2017-01-01
Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the…
Bayesian Latent Class Analysis Tutorial.
Li, Yuelin; Lord-Bessen, Jennifer; Shiyko, Mariya; Loeb, Rebecca
2018-01-01
This article is a how-to guide on Bayesian computation using Gibbs sampling, demonstrated in the context of Latent Class Analysis (LCA). It is written for students in quantitative psychology or related fields who have a working knowledge of Bayes' theorem and conditional probability and have experience in writing computer programs in the statistical language R. The overall goals are to provide an accessible and self-contained tutorial, along with a practical computation tool. We begin with how Bayesian computation is typically described in academic articles. Technical difficulties are addressed by a hypothetical, worked-out example. We show how Bayesian computation can be broken down into a series of simpler calculations, which can then be assembled together to complete a computationally more complex model. The details are described much more explicitly than what is typically available in elementary introductions to Bayesian modeling so that readers are not overwhelmed by the mathematics. Moreover, the provided computer program shows how Bayesian LCA can be implemented with relative ease. The computer program is then applied in a large, real-world data set and explained line-by-line. We outline the general steps in how to extend these considerations to other methodological applications. We conclude with suggestions for further readings.
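In the spirit of that tutorial, here is a compact, hedged Gibbs sampler in R for a two-class LCA with binary items, using conjugate Beta(1, 1) priors; the data are simulated and a real analysis would retain full chains rather than the last draw.

```r
# Gibbs sampler for a two-class latent class model with J binary items.
set.seed(1)
N <- 300; J <- 4
true_z <- rbinom(N, 1, 0.4)                       # latent classes (simulation)
p_true <- rbind(c(0.9, 0.8, 0.7, 0.9), c(0.2, 0.3, 0.1, 0.2))
Y <- t(sapply(true_z, function(z) rbinom(J, 1, p_true[z + 1, ])))
pi1 <- 0.5; p <- matrix(0.5, 2, J)                # initial values
for (it in 1:2000) {
  # 1. sample class memberships given parameters
  l1 <- pi1       * apply(Y, 1, function(y) prod(p[1, ]^y * (1 - p[1, ])^(1 - y)))
  l2 <- (1 - pi1) * apply(Y, 1, function(y) prod(p[2, ]^y * (1 - p[2, ])^(1 - y)))
  z <- rbinom(N, 1, l2 / (l1 + l2))               # z = 1 means class 2
  # 2. sample parameters given memberships (conjugate Beta updates)
  pi1 <- rbeta(1, 1 + sum(z == 0), 1 + sum(z == 1))
  for (j in 1:J) {
    p[1, j] <- rbeta(1, 1 + sum(Y[z == 0, j]), 1 + sum(1 - Y[z == 0, j]))
    p[2, j] <- rbeta(1, 1 + sum(Y[z == 1, j]), 1 + sum(1 - Y[z == 1, j]))
  }
}
round(p, 2)   # last-draw item-response probabilities (label switching aside)
```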
[Bayesian approach for the cost-effectiveness evaluation of healthcare technologies].
Berchialla, Paola; Gregori, Dario; Brunello, Franco; Veltri, Andrea; Petrinco, Michele; Pagano, Eva
2009-01-01
The development of Bayesian statistical methods for the assessment of the cost-effectiveness of health care technologies is reviewed. Although many studies adopt a frequentist approach, several authors have advocated the use of Bayesian methods in health economics. Emphasis has been placed on the advantages of the Bayesian approach, which include: (i) the ability to make more intuitive and meaningful inferences; (ii) the ability to tackle complex problems, such as allowing for the inclusion of patients who generate no cost, thanks to the availability of powerful computational algorithms; (iii) the importance of a full use of quantitative and structural prior information to produce realistic inferences. Much literature comparing the cost-effectiveness of two treatments is based on the incremental cost-effectiveness ratio. However, new methods are arising with the purpose of decision making. These methods are based on a net benefits approach. In the present context, the cost-effectiveness acceptability curves have been pointed out to be intrinsically Bayesian in their formulation. They plot the probability of a positive net benefit against the threshold cost of a unit increase in efficacy. A case study is presented in order to illustrate the Bayesian statistics in the cost-effectiveness analysis. Emphasis is placed on the cost-effectiveness acceptability curves. Advantages and disadvantages of the method described in this paper have been compared to frequentist methods and discussed.
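The acceptability curve is simple to compute from posterior draws. The R sketch below uses simulated placeholder draws of incremental cost and effect, not results from the case study.

```r
# Cost-effectiveness acceptability curve (CEAC): for each willingness-to-pay
# threshold lambda, Pr(net benefit > 0) where NB = lambda * dE - dC.
set.seed(1)
dE <- rnorm(5000, 0.05, 0.03)   # posterior draws of incremental effect (toy)
dC <- rnorm(5000, 1200, 500)    # posterior draws of incremental cost (toy)
lambda <- seq(0, 100000, by = 1000)
ceac <- sapply(lambda, function(l) mean(l * dE - dC > 0))
plot(lambda, ceac, type = "l",
     xlab = "Willingness to pay per unit of effect",
     ylab = "Pr(positive net benefit)")
```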
Wright, David K.; MacEachern, Scott; Lee, Jaeyong
2014-01-01
The locations of diy-geδ-bay (DGB) sites in the Mandara Mountains, northern Cameroon are hypothesized to occur as a function of their ability to see and be seen from points on the surrounding landscape. A series of geostatistical, two-way and Bayesian logistic regression analyses were performed to test two hypotheses related to the intervisibility of the sites to one another and their visual prominence on the landscape. We determine that the intervisibility of the sites to one another is highly statistically significant when compared to 10 stratified-random permutations of DGB sites. Bayesian logistic regression additionally demonstrates that the visibility of the sites to points on the surrounding landscape is statistically significant. The location of sites appears to have also been selected on the basis of lower slope than random permutations of sites. Using statistical measures, many of which are not commonly employed in archaeological research, to evaluate aspects of visibility on the landscape, we conclude that the placement of DGB sites improved their conspicuousness for enhanced ritual, social cooperation and/or competition purposes. PMID:25383883
Bayesian analysis of the flutter margin method in aeroelasticity
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-08-27
A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method, which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-squares based estimation technique, which relies on a Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis–Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. In conclusion, the probabilistic (Bayesian) approach reduces the number of test points required to provide a flutter speed estimate with a given accuracy and precision.
Bayesian Estimation of Thermonuclear Reaction Rates for Deuterium+Deuterium Reactions
NASA Astrophysics Data System (ADS)
Gómez Iñesta, Á.; Iliadis, C.; Coc, A.
2017-11-01
The study of d+d reactions is of major interest since their reaction rates affect the predicted abundances of D, 3He, and 7Li. In particular, recent measurements of primordial D/H ratios call for reduced uncertainties in the theoretical abundances predicted by Big Bang nucleosynthesis (BBN). Different authors have studied reactions involved in BBN by incorporating new experimental data and a careful treatment of systematic and probabilistic uncertainties. To analyze the experimental data, Coc et al. used results of ab initio models for the theoretical calculation of the energy dependence of S-factors, in conjunction with traditional statistical methods based on χ2 minimization. Bayesian methods have now spread to many scientific fields and provide numerous advantages in data analysis. Astrophysical S-factors and reaction rates using Bayesian statistics were calculated by Iliadis et al. Here we present a similar analysis for two d+d reactions, d(d, n)3He and d(d, p)3H, which translates into a total decrease of 0.16% in the predicted D/H value.
Teaching Bayesian Statistics to Undergraduate Students through Debates
ERIC Educational Resources Information Center
Stewart, Sepideh; Stewart, Wayne
2014-01-01
This paper describes a lecturer's approach to teaching Bayesian statistics to students who were only exposed to the classical paradigm. The study shows how the lecturer extended himself by making use of ventriloquist dolls to grab hold of students' attention and embed important ideas in revealing the differences between the Bayesian and classical…
Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina
2013-01-01
In the last decade, different statistical techniques have been introduced to improve the assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. First, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Second, we tested the accuracy of our model in differentiating between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Nevertheless, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients in the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(z_i | D), of examinees' effort levels. In conclusion, both the high classification accuracy levels and the Bayesian individual estimates of effort may be very useful for clinicians when assessing effort in medico-legal settings.
Krypotos, Angelos-Miltiadis; Klugkist, Irene; Engelhard, Iris M.
2017-01-01
Threat conditioning procedures have allowed the experimental investigation of the pathogenesis of Post-Traumatic Stress Disorder. The findings of these procedures have also provided stable foundations for the development of relevant intervention programs (e.g. exposure therapy). Statistical inference of threat conditioning procedures is commonly based on p-values and Null Hypothesis Significance Testing (NHST). Nowadays, however, there is a growing concern about this statistical approach, as many scientists point to the various limitations of p-values and NHST. As an alternative, the use of Bayes factors and Bayesian hypothesis testing has been suggested. In this article, we apply this statistical approach to threat conditioning data. In order to enable the easy computation of Bayes factors for threat conditioning data we present a new R package named condir, which can be used either via the R console or via a Shiny application. This article provides both a non-technical introduction to Bayesian analysis for researchers using the threat conditioning paradigm, and the necessary tools for computing Bayes factors easily. PMID:29038683
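For a flavor of the computation, the hedged R sketch below obtains a Bayes factor for differential responding (CS+ versus CS-) with the CRAN package BayesFactor; the condir package described in the article wraps comparable functionality, and the data here are simulated.

```r
# Bayes factor for a paired comparison of conditioned responses,
# using the BayesFactor package on simulated (not experimental) data.
library(BayesFactor)
set.seed(1)
csp <- rnorm(30, mean = 5.2, sd = 1.5)   # responses to CS+ (toy)
csm <- rnorm(30, mean = 4.1, sd = 1.5)   # responses to CS- (toy)
ttestBF(x = csp, y = csm, paired = TRUE) # evidence for a CS+/CS- difference
```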
Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing
2016-01-01
A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to extract weak fault features under background noise: it uses statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference (noise) signal and the original signal, and removes the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to assess the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, SPs with high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006
NASA Astrophysics Data System (ADS)
Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.
2015-12-01
Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimates of models and their uncertainties, constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for the error statistics and model parameterizations involved, and in turn allow more rigorous estimation of both. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia, including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. For the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics, with rigorously estimated uncertainties from these novel Bayesian methods, provide enhanced monitoring and discrimination of seismic events in northeast Asia.
Population forecasts for Bangladesh, using a Bayesian methodology.
Mahsin, Md; Hossain, Syed Shahadat
2012-12-01
Population projection for many developing countries can be quite a challenging task for demographers, mostly due to the lack of sufficient reliable data. The objective of this paper is to present an overview of the existing methods for population forecasting and to propose an alternative based on Bayesian statistics, which formally combines observed data with prior expert judgement. The analysis has been carried out using the Markov Chain Monte Carlo (MCMC) technique for Bayesian methodology available with the WinBUGS software. Convergence diagnostic techniques available with WinBUGS have been applied to ensure the convergence of the chains necessary for the implementation of MCMC. The Bayesian approach allows for the use of observed data and expert judgements by means of appropriate priors, and more realistic population forecasts, along with their associated uncertainty, have been obtained.
Exoplanet Biosignatures: Future Directions.
Walker, Sara I; Bains, William; Cronin, Leroy; DasSarma, Shiladitya; Danielache, Sebastian; Domagal-Goldman, Shawn; Kacar, Betul; Kiang, Nancy Y; Lenardic, Adrian; Reinhard, Christopher T; Moore, William; Schwieterman, Edward W; Shkolnik, Evgenya L; Smith, Harrison B
2018-06-01
We introduce a Bayesian method for guiding future directions for detection of life on exoplanets. We describe empirical and theoretical work necessary to place constraints on the relevant likelihoods, including those emerging from better understanding stellar environment, planetary climate and geophysics, geochemical cycling, the universalities of physics and chemistry, the contingencies of evolutionary history, the properties of life as an emergent complex system, and the mechanisms driving the emergence of life. We provide examples for how the Bayesian formalism could guide future search strategies, including determining observations to prioritize or deciding between targeted searches or larger lower resolution surveys to generate ensemble statistics and address how a Bayesian methodology could constrain the prior probability of life with or without a positive detection. Key Words: Exoplanets-Biosignatures-Life detection-Bayesian analysis. Astrobiology 18, 779-824.
Bayesian Propensity Score Analysis: Simulation and Case Study
ERIC Educational Resources Information Center
Kaplan, David; Chen, Cassie J. S.
2011-01-01
Propensity score analysis (PSA) has been used in a variety of settings, such as education, epidemiology, and sociology. Most typically, propensity score analysis has been implemented within the conventional frequentist perspective of statistics. This perspective, as is well known, does not account for uncertainty in either the parameters of the…
Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.
Verde, Pablo E
2010-12-30
In recent decades, the number of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity, and they can be regarded as so far removed from the classical domain of meta-analysis that they provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. The meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. The new approach also allows substantial model checking, model diagnostics and model selection to be performed. Statistical computations are implemented in public domain statistical software (WinBUGS and R) and illustrated with real data examples. Copyright © 2010 John Wiley & Sons, Ltd.
Markov Random Fields, Stochastic Quantization and Image Analysis
1990-01-01
Markov random fields based on the lattice Z2 have been extensively used in image analysis in a Bayesian framework as a priori models for the ... If the Bayesian approach to Image Analysis can be given some fundamental justification, then there is a remarkable connection between Probabilistic Image Analysis, Statistical Mechanics and Lattice-based Euclidean Quantum Field Theory.
Modified Bayesian Kriging for Noisy Response Problems for Reliability Analysis
2015-01-01
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
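The quasi-analytic idea can be sketched directly: with a flat prior, the marginal posterior of the model error variance is evaluated on a grid, profiling out the regression coefficients by GLS at each grid point. The R sketch below uses simulated stand-ins for the regional skew data and sampling covariances.

```r
# Bayesian GLS sketch: grid posterior for the model error variance s2,
# with Lambda(s2) = s2 * I + V, V the known sampling-error covariance.
set.seed(1)
n <- 25
X <- cbind(1, runif(n))              # regional covariate (toy)
V <- diag(runif(n, 0.05, 0.2))       # known sampling error covariance (toy)
y <- X %*% c(0.2, 0.5) + rnorm(n, sd = sqrt(0.1 + diag(V)))
log_post <- function(s2) {
  L  <- s2 * diag(n) + V
  Li <- solve(L)
  B  <- solve(t(X) %*% Li %*% X, t(X) %*% Li %*% y)   # GLS coefficients
  r  <- y - X %*% B
  -0.5 * (determinant(L)$modulus + determinant(t(X) %*% Li %*% X)$modulus +
          t(r) %*% Li %*% r)
}
grid <- seq(0.001, 0.5, length.out = 200)
lp <- sapply(grid, log_post)
post <- exp(lp - max(lp)); post <- post / sum(post)
grid[which.max(post)]   # posterior mode of the model error variance
```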
Bayes in biological anthropology.
Konigsberg, Lyle W; Frankenberg, Susan R
2013-12-01
In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as a primer to Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.
Multiscale hidden Markov models for photon-limited imaging
NASA Astrophysics Data System (ADS)
Nowak, Robert D.
1999-06-01
Photon-limited image analysis is often hindered by low signal-to-noise ratios. A novel Bayesian multiscale modeling and analysis method is developed in this paper to assist in these challenging situations. In addition to providing a very natural and useful framework for modeling and processing images, Bayesian multiscale analysis is often much less computationally demanding compared to classical Markov random field models. This paper focuses on a probabilistic graph model called the multiscale hidden Markov model (MHMM), which captures the key inter-scale dependencies present in natural image intensities. The MHMM framework presented here is specifically designed for photon-limited imaging applications involving Poisson statistics, and applications to image intensity analysis are examined.
Bayesian statistics in radionuclide metrology: measurement of a decaying source
NASA Astrophysics Data System (ADS)
Bochud, François O.; Bailat, Claude J.; Laedermann, Jean-Pascal
2007-08-01
The most intuitive way of defining a probability is perhaps through the frequency at which it appears when a large number of trials are realized in identical conditions. The probability derived from the obtained histogram characterizes the so-called frequentist or conventional statistical approach. In this sense, probability is defined as a physical property of the observed system. By contrast, in Bayesian statistics, a probability is not a physical property or a directly observable quantity, but a degree of belief or an element of inference. The goal of this paper is to show how Bayesian statistics can be used in radionuclide metrology and what its advantages and disadvantages are compared with conventional statistics. This is performed through the example of an yttrium-90 source typically encountered in environmental surveillance measurement. Because of the very low activity of this kind of source and the small half-life of the radionuclide, this measurement takes several days, during which the source decays significantly. Several methods are proposed to compute simultaneously the number of unstable nuclei at a given reference time, the decay constant and the background. Asymptotically, all approaches give the same result. However, Bayesian statistics produces coherent estimates and confidence intervals in a much smaller number of measurements. Apart from the conceptual understanding of statistics, the main difficulty that could deter radionuclide metrologists from using Bayesian statistics is the complexity of the computation.
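A simplified version of the decaying-source inference is easy to write down: counts in successive intervals are Poisson with a mean that combines background and the nuclei decaying in that interval. The R sketch below computes a grid posterior for the number of nuclei and the decay constant, with the background assumed known for brevity; all counts are simulated, not measured.

```r
# Grid posterior for a decaying source: counts in intervals of length dt
# follow Poisson(b*dt + N0*(exp(-lam*t0) - exp(-lam*t1))).
set.seed(1)
dt <- 1; t_edges <- seq(0, 10, by = dt)   # days
N0 <- 500; lam <- 0.26; b <- 3            # "true" values; Y-90: ~0.26/day
mu <- function(n0, l) b * dt + n0 * (exp(-l * head(t_edges, -1)) -
                                     exp(-l * tail(t_edges, -1)))
counts <- rpois(length(t_edges) - 1, mu(N0, lam))
N0_grid  <- seq(300, 700, length.out = 120)
lam_grid <- seq(0.15, 0.40, length.out = 120)
lp <- outer(N0_grid, lam_grid, Vectorize(function(n0, l)
  sum(dpois(counts, mu(n0, l), log = TRUE))))
post <- exp(lp - max(lp))
idx <- which(post == max(post), arr.ind = TRUE)
c(N0 = N0_grid[idx[1]], lam = lam_grid[idx[2]])   # joint posterior mode
```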
Kwon, Deukwoo; Reis, Isildinha M
2015-08-12
When conducting a meta-analysis of a continuous outcome, estimated means and standard deviations from the selected studies are required in order to obtain an overall estimate of the mean effect and its confidence interval. If these quantities are not directly reported in the publications, they must be estimated from other reported summary statistics, such as the median, the minimum, the maximum, and quartiles. We propose a simulation-based estimation approach using the Approximate Bayesian Computation (ABC) technique for estimating mean and standard deviation based on various sets of summary statistics found in published studies. We conduct a simulation study to compare the proposed ABC method with the existing methods of Hozo et al. (2005), Bland (2015), and Wan et al. (2014). In the estimation of the standard deviation, our ABC method performs better than the other methods when data are generated from skewed or heavy-tailed distributions. The corresponding average relative error (ARE) approaches zero as sample size increases. In data generated from the normal distribution, our ABC performs well. However, the Wan et al. method is best for estimating standard deviation under normal distribution. In the estimation of the mean, our ABC method is best regardless of assumed distribution. ABC is a flexible method for estimating the study-specific mean and standard deviation for meta-analysis, especially with underlying skewed or heavy-tailed distributions. The ABC method can be applied using other reported summary statistics such as the posterior mean and 95 % credible interval when Bayesian analysis has been employed.
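The core of the ABC idea fits in a few lines of R: draw candidate (mean, SD) pairs, simulate a sample of the reported size, and retain draws whose summaries land near the reported ones. The reported values and tolerances below are invented for illustration.

```r
# ABC sketch: recover a study's mean and SD from its reported
# median, minimum, and maximum (assuming an underlying normal model).
set.seed(1)
n <- 100
reported <- c(min = 2.1, med = 5.0, max = 9.8)   # hypothetical published values
M <- 20000
mu <- runif(M, 0, 10); sigma <- runif(M, 0.1, 5) # candidate parameter draws
dist <- sapply(1:M, function(i) {
  x <- rnorm(n, mu[i], sigma[i])
  sum(abs(c(min(x), median(x), max(x)) - reported))
})
keep <- dist <= quantile(dist, 0.01)             # retain the closest 1% of draws
c(mean = mean(mu[keep]), sd = mean(sigma[keep])) # ABC estimates
```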
Bayesian B-spline mapping for dynamic quantitative traits.
Xing, Jun; Li, Jiahan; Yang, Runqing; Zhou, Xiaojing; Xu, Shizhong
2012-04-01
Owing to their ability and flexibility to describe individual gene expression at different time points, random regression (RR) analyses have become a popular procedure for the genetic analysis of dynamic traits whose phenotypes are collected over time. Specifically, when modelling the dynamic patterns of gene expressions in the RR framework, B-splines have been proved successful as an alternative to orthogonal polynomials. In the so-called Bayesian B-spline quantitative trait locus (QTL) mapping, B-splines are used to characterize the patterns of QTL effects and individual-specific time-dependent environmental errors over time, and the Bayesian shrinkage estimation method is employed to estimate model parameters. Extensive simulations demonstrate that (1) in terms of statistical power, Bayesian B-spline mapping outperforms the interval mapping based on the maximum likelihood; (2) for the simulated dataset with complicated growth curve simulated by B-splines, Legendre polynomial-based Bayesian mapping is not capable of identifying the designed QTLs accurately, even when higher-order Legendre polynomials are considered and (3) for the simulated dataset using Legendre polynomials, the Bayesian B-spline mapping can find the same QTLs as those identified by Legendre polynomial analysis. All simulation results support the necessity and flexibility of B-spline in Bayesian mapping of dynamic traits. The proposed method is also applied to a real dataset, where QTLs controlling the growth trajectory of stem diameters in Populus are located.
Bettenbühl, Mario; Rusconi, Marco; Engbert, Ralf; Holschneider, Matthias
2012-01-01
Complex biological dynamics often generate sequences of discrete events which can be described as a Markov process. The order of the underlying Markovian stochastic process is fundamental for characterizing statistical dependencies within sequences. As an example for this class of biological systems, we investigate the Markov order of sequences of microsaccadic eye movements from human observers. We calculate the integrated likelihood of a given sequence for various orders of the Markov process and use this in a Bayesian framework for statistical inference on the Markov order. Our analysis shows that data from most participants are best explained by a first-order Markov process. This is compatible with recent findings of a statistical coupling of subsequent microsaccade orientations. Our method might prove to be useful for a broad class of biological systems.
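The integrated-likelihood comparison has a closed form for discrete sequences under Dirichlet priors. Here is a hedged R sketch comparing order-0 and order-1 Markov models for a toy two-symbol sequence; the uniform Dirichlet(1) priors are an assumption.

```r
# Bayesian Markov-order inference: closed-form Dirichlet-multinomial
# marginal likelihoods for order-0 vs order-1 models of a symbol sequence.
set.seed(1)
s <- sample(1:2, 500, replace = TRUE, prob = c(0.3, 0.7))   # toy sequence
K <- 2
ldirmult <- function(cnt)   # log marginal of counts under Dirichlet(1) prior
  lgamma(length(cnt)) + sum(lgamma(cnt + 1)) - lgamma(sum(cnt) + length(cnt))
# Order 0: symbol counts (conditioning on the first symbol for comparability)
m0 <- ldirmult(as.numeric(table(factor(tail(s, -1), levels = 1:K))))
# Order 1: one Dirichlet-multinomial marginal per preceding state
trans <- table(factor(head(s, -1), levels = 1:K),
               factor(tail(s, -1), levels = 1:K))
m1 <- sum(apply(trans, 1, function(r) ldirmult(as.numeric(r))))
m0 - m1   # log Bayes factor: positive favours the zero-order model
```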
McDonnell, J D; Schunck, N; Higdon, D; Sarich, J; Wild, S M; Nazarewicz, W
2015-03-27
Statistical tools of uncertainty quantification can be used to assess the information content of measured observables with respect to present-day theoretical models, to estimate model errors and thereby improve predictive capability, to extrapolate beyond the regions reached by experiment, and to provide meaningful input to applications and planned measurements. To showcase new opportunities offered by such tools, we make a rigorous analysis of theoretical statistical uncertainties in nuclear density functional theory using Bayesian inference methods. By considering the recent mass measurements from the Canadian Penning Trap at Argonne National Laboratory, we demonstrate how the Bayesian analysis and a direct least-squares optimization, combined with high-performance computing, can be used to assess the information content of the new data with respect to a model based on the Skyrme energy density functional approach. Employing the posterior probability distribution computed with a Gaussian process emulator, we apply the Bayesian framework to propagate theoretical statistical uncertainties in predictions of nuclear masses, two-neutron dripline, and fission barriers. Overall, we find that the new mass measurements do not impose a constraint that is strong enough to lead to significant changes in the model parameters. The example discussed in this study sets the stage for quantifying and maximizing the impact of new measurements with respect to current modeling and guiding future experimental efforts, thus enhancing the experiment-theory cycle in the scientific method.
Probabilistic models in human sensorimotor control
Wolpert, Daniel M.
2009-01-01
Sensory and motor uncertainty form a fundamental constraint on human sensorimotor control. Bayesian decision theory (BDT) has emerged as a unifying framework to understand how the central nervous system performs optimal estimation and control in the face of such uncertainty. BDT has two components: Bayesian statistics and decision theory. Here we review Bayesian statistics and show how it applies to estimating the state of the world and our own body. Recent results suggest that when learning novel tasks we are able to learn the statistical properties of both the world and our own sensory apparatus so as to perform estimation using Bayesian statistics. We review studies which suggest that humans can combine multiple sources of information to form maximum likelihood estimates, can incorporate prior beliefs about possible states of the world so as to generate maximum a posteriori estimates and can use Kalman filter-based processes to estimate time-varying states. Finally, we review Bayesian decision theory in motor control and how the central nervous system processes errors to determine loss functions and optimal actions. We review results that suggest we plan movements based on statistics of our actions that result from signal-dependent noise on our motor outputs. Taken together these studies provide a statistical framework for how the motor system performs in the presence of uncertainty. PMID:17628731
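The maximum likelihood cue-combination rule reviewed here has a compact closed form: each cue is weighted by its inverse variance, i.e. its reliability. A minimal sketch with made-up visual and proprioceptive numbers:

```python
import numpy as np

def combine_cues(estimates, variances):
    """Reliability-weighted (maximum likelihood) fusion of independent
    Gaussian cues: weights are inverse variances."""
    w = 1.0 / np.asarray(variances, dtype=float)
    fused_mean = np.sum(w * np.asarray(estimates)) / np.sum(w)
    fused_var = 1.0 / np.sum(w)
    return fused_mean, fused_var

# Hypothetical numbers: vision says 10.0 deg (sd 1.0), proprioception
# says 12.0 deg (sd 2.0).
mean, var = combine_cues([10.0, 12.0], [1.0**2, 2.0**2])
print(f"fused estimate {mean:.2f} deg, sd {var**0.5:.2f}")
```

Note that the fused standard deviation is smaller than that of either single cue, which is the signature of statistically optimal integration in the studies reviewed.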
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework for making statistical inference and probabilistic forecasts using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Turi, Christina E; Murch, Susan J
2013-07-09
Ethnobotanical research and the study of plants used for rituals, ceremonies and to connect with the spirit world have led to the discovery of many novel psychoactive compounds such as nicotine, caffeine, and cocaine. In North America, spiritual and ceremonial uses of plants are well documented and can be accessed online via the University of Michigan's Native American Ethnobotany Database. The objective of the study was to compare Residual, Bayesian, Binomial and Imprecise Dirichlet Model (IDM) analyses of ritual, ceremonial and spiritual plants in Moerman's ethnobotanical database and to identify genera that may be good candidates for the discovery of novel psychoactive compounds. The database was queried with the following format "Family Name AND Ceremonial OR Spiritual" for 263 North American botanical families. The spiritual and ceremonial flora consisted of 86 families with 517 species belonging to 292 genera. Spiritual taxa were then grouped further into ceremonial medicines and items categories. Residual, Bayesian, Binomial and IDM analyses were performed to identify over- and under-utilized families. The 4 statistical approaches were in good agreement when identifying under-utilized families, but large families (>393 species) were underemphasized by the Binomial, Bayesian and IDM approaches for over-utilization. Residual, Binomial, and IDM analyses identified similar families as over-utilized in the medium (92-392 species) and small (<92 species) classes. The families Apiaceae, Asteraceae, Ericaceae, Pinaceae and Salicaceae were identified as significantly over-utilized as ceremonial medicines among medium and large families. Analysis of genera within the Apiaceae and Asteraceae suggests that the genera Ligusticum and Artemisia are good candidates for facilitating the discovery of novel psychoactive compounds. The 4 statistical approaches were not consistent in identifying over-utilized flora. Residual analysis revealed overall trends that were supported by Binomial analysis when families were separated into small, medium and large classes. The Bayesian, Binomial and IDM approaches identified different genera as potentially important. Species belonging to the genera Artemisia and Ligusticum were most consistently identified and may be valuable in future ethnopharmacological studies. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances, including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, few programs are able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities that make it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
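At the heart of any ABC program of this kind is a simple rejection loop: draw parameters from the prior, simulate data, and keep the draws whose summary statistics fall close to the observed ones. The toy sketch below uses a Gaussian stand-in for the coalescent simulator; the prior range and tolerance are arbitrary choices, not BaySICS defaults.

```python
import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=50)    # stand-in for real data
obs_stat = observed.mean()                  # summary statistic

def simulate(theta, n=50):
    """Toy generative model standing in for a coalescent simulator."""
    return rng.normal(theta, 1.0, size=n).mean()

# ABC rejection: keep prior draws whose simulated summary statistic
# lands within a small tolerance of the observed one.
prior_draws = rng.uniform(-5.0, 5.0, size=100_000)
accepted = np.array([th for th in prior_draws
                     if abs(simulate(th) - obs_stat) < 0.05])

print(f"accepted {accepted.size} draws; "
      f"posterior mean {accepted.mean():.2f}, sd {accepted.std():.2f}")
```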
pyblocxs: Bayesian Low-Counts X-ray Spectral Analysis in Sherpa
NASA Astrophysics Data System (ADS)
Siemiginowska, A.; Kashyap, V.; Refsdal, B.; van Dyk, D.; Connors, A.; Park, T.
2011-07-01
Typical X-ray spectra have low counts and should be modeled using the Poisson distribution. However, the χ2 statistic is often applied instead, with the data assumed to follow a Gaussian distribution, and various weightings of the statistic or binnings of the data are performed to overcome the low-count issues. Such modifications, however, introduce biases and/or a loss of information. Standard modeling packages such as XSPEC and Sherpa provide the Poisson likelihood and allow computation of rudimentary MCMC chains, but so far do not allow a full Bayesian model to be set up. We have implemented a sophisticated Bayesian MCMC-based algorithm to carry out spectral fitting of low-count sources in the Sherpa environment. The code is a Python extension to Sherpa that fits a predefined Sherpa model to high-energy X-ray spectral data and other generic data. We present the algorithm and discuss several issues related to the implementation, including the flexible definition of priors and allowing for variations in the calibration information.
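The statistical point, fitting with the Poisson likelihood rather than χ2, can be shown without the Sherpa machinery. A minimal sketch that maximizes the Poisson (Cash) likelihood for a toy power-law spectrum; the model form, parameter values and optimizer are illustrative and not the pyblocxs implementation:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
energies = np.linspace(1.0, 8.0, 40)        # bin centres, arbitrary units

def model(params, e):
    """Toy power-law source spectrum: expected counts per bin."""
    norm, index = params
    return norm * e ** (-index)

counts = rng.poisson(model((20.0, 1.5), energies))   # low-count data

def neg_log_like(params):
    mu = model(params, energies)
    if np.any(mu <= 0):
        return np.inf
    # Poisson log likelihood (Cash statistic, up to a data-only constant).
    return -np.sum(counts * np.log(mu) - mu)

fit = minimize(neg_log_like, x0=(10.0, 1.0), method="Nelder-Mead")
print("maximum likelihood (norm, index):", fit.x)
```

Unlike χ2 fitting, this likelihood is well defined for bins containing zero counts, which is exactly the regime the abstract describes.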
Rodgers, Joseph Lee
2016-01-01
The Bayesian-frequentist debate typically portrays these statistical perspectives as opposing views. However, both Bayesian and frequentist statisticians have expanded their epistemological basis away from a singular focus on the null hypothesis, to a broader perspective involving the development and comparison of competing statistical/mathematical models. For frequentists, statistical developments such as structural equation modeling and multilevel modeling have facilitated this transition. For Bayesians, the Bayes factor has facilitated this transition. The Bayes factor is treated in articles within this issue of Multivariate Behavioral Research. The current presentation provides brief commentary on those articles and more extended discussion of the transition toward a modern modeling epistemology. In certain respects, Bayesians and frequentists share common goals.
Whose statistical reasoning is facilitated by a causal structure intervention?
McNair, Simon; Feeney, Aidan
2015-02-01
People often struggle when making Bayesian probabilistic estimates on the basis of competing sources of statistical evidence. Recently, Krynski and Tenenbaum (Journal of Experimental Psychology: General, 136, 430-450, 2007) proposed that a causal Bayesian framework accounts for people's errors in Bayesian reasoning and showed that, by clarifying the causal relations among the pieces of evidence, judgments on a classic statistical reasoning problem could be significantly improved. We aimed to understand whose statistical reasoning is facilitated by the causal structure intervention. In Experiment 1, although we observed causal facilitation effects overall, the effect was confined to participants high in numeracy. We did not find an overall facilitation effect in Experiment 2 but did replicate the earlier interaction between numerical ability and the presence or absence of causal content. This effect held when we controlled for general cognitive ability and thinking disposition. Our results suggest that clarifying causal structure facilitates Bayesian judgments, but only for participants with sufficient understanding of basic concepts in probability and statistics.
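For readers unfamiliar with the class of problem studied, the classic medical-diagnosis task works as follows; the figures below are the widely cited mammography numbers, not necessarily those used in the experiments above.

```python
# Base rate 1%, sensitivity 80%, false-positive rate 9.6%.
prior = 0.01
sensitivity = 0.80
false_pos = 0.096

# Bayes' rule: P(disease | positive test).
evidence = sensitivity * prior + false_pos * (1 - prior)
posterior = sensitivity * prior / evidence
print(f"P(disease | positive) = {posterior:.3f}")   # roughly 0.078
```

Most participants answer far above the correct value of about 8%, which is the base-rate neglect error that causal framing is meant to mitigate.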
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
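The key modelling idea, distinguishing the probability of a positive test from the probability of true infection, can be sketched outside WinBUGS as well. Below is a maximum likelihood version in which sensitivity and specificity are treated as known constants; the paper's proposed models are Bayesian and can also place priors on these quantities, so this is only an illustration of the likelihood structure.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(7)
n = 2000
x = rng.normal(size=n)                       # a single risk factor
infected = rng.random(n) < expit(-2.0 + 1.0 * x)
se, sp = 0.85, 0.98                          # assumed known test performance
test_pos = np.where(infected, rng.random(n) < se, rng.random(n) > sp)

def neg_log_like(beta):
    p = expit(beta[0] + beta[1] * x)         # latent infection probability
    # Probability of a positive *test*: true positives plus false positives.
    q = se * p + (1 - sp) * (1 - p)
    return -np.sum(np.where(test_pos, np.log(q), np.log(1 - q)))

fit = minimize(neg_log_like, x0=np.zeros(2))
print("estimates adjusted for test error:", fit.x)
```

Fitting a standard logistic regression to `test_pos` directly would attenuate the slope estimate, which is precisely the bias the abstract warns about.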
NASA Astrophysics Data System (ADS)
Auvinen, Jussi; Bernhard, Jonah E.; Bass, Steffen A.; Karpenko, Iurii
2018-04-01
We determine the probability distributions of the shear viscosity over the entropy density ratio η/s in the quark-gluon plasma formed in Au + Au collisions at √s_NN = 19.6, 39, and 62.4 GeV, using Bayesian inference and Gaussian process emulators for a model-to-data statistical analysis that probes the full input parameter space of a transport + viscous hydrodynamics hybrid model. We find the most likely value of η/s to be larger at smaller √s_NN, although the uncertainties still allow for a constant value between 0.10 and 0.15 for the investigated collision energy range.
ERIC Educational Resources Information Center
Page, Robert; Satake, Eiki
2017-01-01
While interest in Bayesian statistics has been growing in statistics education, the treatment of the topic is still inadequate in both textbooks and the classroom. Because so many fields of study lead to careers that involve a decision-making process requiring an understanding of Bayesian methods, it is becoming increasingly clear that Bayesian…
Application of Turchin's method of statistical regularization
NASA Astrophysics Data System (ADS)
Zelenyi, Mikhail; Poliakova, Mariia; Nozik, Alexander; Khudyakov, Alexey
2018-04-01
During the analysis of experimental data, one often needs to restore a signal after it has been convolved with some kind of apparatus function. According to Hadamard's definition, this problem is ill-posed and requires regularization to provide sensible results. In this article we describe an implementation of Turchin's method of statistical regularization, based on the Bayesian approach to the regularization strategy.
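To see why regularization is needed, consider the discrete deconvolution problem. The sketch below uses plain Tikhonov regularization with a fixed strength λ as a stand-in; Turchin's method goes further by choosing the regularization through a Bayesian prior on the smoothness of the solution, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
t = np.linspace(0.0, 1.0, n)
signal = np.exp(-0.5 * ((t - 0.4) / 0.05) ** 2)        # true signal

# Apparatus function: a Gaussian smearing matrix K.
K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.03) ** 2)
K /= K.sum(axis=1, keepdims=True)
data = K @ signal + 0.01 * rng.normal(size=n)          # convolved + noise

# Unregularized inversion amplifies the noise; a Tikhonov penalty
# (strength lam) stabilizes the solution.
lam = 1e-3
restored = np.linalg.solve(K.T @ K + lam * np.eye(n), K.T @ data)
print("max deviation from truth:", np.abs(restored - signal).max())
```

Setting `lam = 0` in this sketch typically produces a wildly oscillating "solution", which is the ill-posedness the abstract refers to.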
A crash course on data analysis in asteroseismology
NASA Astrophysics Data System (ADS)
Appourchaux, Thierry
2014-02-01
In this course, I try to provide a few basics required for performing data analysis in asteroseismology. First, I address how one can properly treat time series: the sampling, the filtering effect, the use of the Fourier transform, and the associated statistics. Second, I address how one can apply statistics for decision making and for parameter estimation, in either a frequentist or a Bayesian framework. Last, I review how these basic principles have been applied (or not) in asteroseismology.
Bayesian networks for evaluation of evidence from forensic entomology.
Andersson, M Gunnar; Sundström, Anders; Lindström, Anders
2013-09-01
In the aftermath of a CBRN incident, there is an urgent need to reconstruct events in order to bring the perpetrators to court and to take preventive actions for the future. The challenge is to discriminate, based on available information, between alternative scenarios. Forensic interpretation is used to evaluate to what extent results from the forensic investigation favor the prosecutors' or the defendants' arguments, using the framework of Bayesian hypothesis testing. Recently, several new scientific disciplines have been used in a forensic context. In the AniBioThreat project, the framework was applied to veterinary forensic pathology, tracing of pathogenic microorganisms, and forensic entomology. Forensic entomology is an important tool for estimating the postmortem interval in, for example, homicide investigations, as a complement to more traditional methods. In this article we demonstrate the applicability of the Bayesian framework for evaluating entomological evidence in a forensic investigation through the analysis of a hypothetical scenario involving suspect movement of carcasses from a clandestine laboratory. Probabilities of different findings under the alternative hypotheses were estimated using a combination of statistical analysis of data, expert knowledge, and simulation, and the entomological findings were used to update the beliefs about the prosecutors' and defendants' hypotheses and to calculate the value of evidence. The Bayesian framework proved useful for evaluating complex hypotheses using findings from several insect species, accounting for uncertainty about development rate, temperature, and precolonization. The applicability of the forensic statistics approach to evaluating forensic results from a CBRN incident is discussed.
Bayesian Computation for Log-Gaussian Cox Processes: A Comparative Analysis of Methods
Teng, Ming; Nathoo, Farouk S.; Johnson, Timothy D.
2017-01-01
The Log-Gaussian Cox process is a commonly used model for the analysis of spatial point pattern data. Fitting this model is difficult because of its doubly stochastic property, i.e., it is a hierarchical combination of a Poisson process at the first level and a Gaussian process at the second level. Various methods have been proposed to estimate such a process, including traditional likelihood-based approaches as well as Bayesian methods. We focus here on Bayesian methods and several approaches that have been considered for model fitting within this framework, including Hamiltonian Monte Carlo, the integrated nested Laplace approximation (INLA), and variational Bayes. We consider these approaches and make comparisons with respect to statistical and computational efficiency. These comparisons are made through several simulation studies as well as through two applications, the first examining ecological data and the second involving neuroimaging data. PMID:29200537
Assessing noninferiority in a three-arm trial using the Bayesian approach.
Ghosh, Pulak; Nathoo, Farouk; Gönen, Mithat; Tiwari, Ram C
2011-07-10
Non-inferiority trials, which aim to demonstrate that a test product is not worse than a competitor by more than a pre-specified small amount, are of great importance to the pharmaceutical community. As a result, methodology for designing and analyzing such trials is required, and developing new methods for such analysis is an important area of statistical research. The three-arm trial consists of a placebo, a reference and an experimental treatment, and simultaneously tests the superiority of the reference over the placebo along with comparing this reference to an experimental treatment. In this paper, we consider the analysis of non-inferiority trials using Bayesian methods which incorporate both parametric as well as semi-parametric models. The resulting testing approach is both flexible and robust. The benefit of the proposed Bayesian methods is assessed via simulation, based on a study examining home-based blood pressure interventions. Copyright © 2011 John Wiley & Sons, Ltd.
Fenton, Norman; Neil, Martin; Berger, Daniel
2016-01-01
Although the last forty years has seen considerable growth in the use of statistics in legal proceedings, it is primarily classical statistical methods rather than Bayesian methods that have been used. Yet the Bayesian approach avoids many of the problems of classical statistics and is also well suited to a broader range of problems. This paper reviews the potential and actual use of Bayes in the law and explains the main reasons for its lack of impact on legal practice. These include misconceptions by the legal community about Bayes’ theorem, over-reliance on the use of the likelihood ratio and the lack of adoption of modern computational methods. We argue that Bayesian Networks (BNs), which automatically produce the necessary Bayesian calculations, provide an opportunity to address most concerns about using Bayes in the law. PMID:27398389
Vernon, Ian; Liu, Junli; Goldstein, Michael; Rowe, James; Topping, Jen; Lindsey, Keith
2018-01-02
Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure that can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data of various forms. The correct analysis of such models usually requires a global parameter search, over a high-dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, designed to address this high-dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but are extremely fast to evaluate are embedded within an iterative history-matching procedure: an efficient method to search high-dimensional spaces within a more formal statistical setting, while incorporating major sources of uncertainty. The approach is demonstrated via application to a model of hormonal crosstalk in Arabidopsis root development, which has 32 rate parameters, for which we identify the sets of rate parameter values that lead to acceptable matches between model output and observed trend data. The multiple insights into the model's structure that this analysis provides are discussed. The methodology is applied to a second related model, and the biological consequences of the resulting comparison, including the evaluation of gene functions, are described. Bayesian uncertainty analysis for complex models using both emulators and history matching is shown to be a powerful technique that can greatly aid the study of a large class of systems biology models. It both provides insight into model behaviour and identifies the sets of rate parameters of interest.
Fisher, Neyman, and Bayes at FDA.
Rubin, Donald B
2016-01-01
The wise use of statistical ideas in practice essentially requires some Bayesian thinking, in contrast to the classical rigid frequentist dogma. This dogma too often has seemed to influence the applications of statistics, even at agencies like the FDA. Greg Campbell was one of the most important advocates there for more nuanced modes of thought, especially Bayesian statistics. Because two brilliant statisticians, Ronald Fisher and Jerzy Neyman, are often credited with instilling the traditional frequentist approach in current practice, I argue that both men were actually seeking very Bayesian answers, and neither would have endorsed the rigid application of their ideas.
New applications of maximum likelihood and Bayesian statistics in macromolecular crystallography.
McCoy, Airlie J
2002-10-01
Maximum likelihood methods are well known to macromolecular crystallographers as the methods of choice for isomorphous phasing and structure refinement. Recently, the use of maximum likelihood and Bayesian statistics has extended to the areas of molecular replacement and density modification, placing these methods on a stronger statistical foundation and making them more accurate and effective.
Embedding the results of focussed Bayesian fusion into a global context
NASA Astrophysics Data System (ADS)
Sander, Jennifer; Heizmann, Michael
2014-05-01
Bayesian statistics offers a well-founded and powerful methodology for the fusion of heterogeneous information sources. However, except in special cases, the required posterior distribution is not analytically derivable. As a consequence, Bayesian fusion may cause unacceptably high computational and storage costs in practice. Local Bayesian fusion approaches aim at reducing the complexity of the Bayesian fusion methodology significantly, by concentrating the actual Bayesian fusion on the potentially most task-relevant parts of the domain of the Properties of Interest. Our research on these approaches is motivated by an analogy to criminal investigations, where investigators also pursue clues only locally. This publication follows previous publications on a special local Bayesian fusion technique called focussed Bayesian fusion, in which the actual calculation of the posterior distribution is completely restricted to a suitably chosen local context. Consequently, the global posterior distribution is not completely determined, and strategies for using the results of a focussed Bayesian analysis appropriately are needed. In this publication, we primarily contrast different ways of embedding the results of focussed Bayesian fusion explicitly into a global context. To obtain a unique global posterior distribution, we analyze the application of the Maximum Entropy Principle, which has been shown to be successfully applicable in metrology and in several other areas. To address the special need for making further decisions subsequent to the actual fusion task, we further analyze criteria for decision making under partial information.
ERIC Educational Resources Information Center
Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.
2008-01-01
Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
The US EPA's ToxCast™ program seeks to combine advances in high-throughput screening technology with methodologies from statistics and computer science to develop high-throughput decision support tools for assessing chemical hazard and risk. To develop new methods of analysis of...
A study of finite mixture model: Bayesian approach on financial time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, and Bayesian methods can be used to fit such models. Bayesian methods are widely used because of their attractive asymptotic properties, which provide remarkable results. They also exhibit consistency, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of mixture components is selected using the Bayesian Information Criterion; identifying the correct number of components is important because a misspecified model may yield invalid results. The Bayesian method is then utilized to fit the k-component mixture model in order to explore the relationship between rubber prices and stock market prices for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber prices and stock market prices for all selected countries.
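The component-selection step described here can be illustrated with scikit-learn's EM-based mixture fit and its built-in BIC score; this is a frequentist stand-in for the Bayesian fit used in the paper, and the two-component data are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Synthetic two-component data standing in for a returns series.
data = np.concatenate([rng.normal(-1.0, 0.5, 300),
                       rng.normal(1.5, 1.0, 700)]).reshape(-1, 1)

# Fit k-component mixtures and pick the k with the lowest BIC.
for k in range(1, 5):
    gm = GaussianMixture(n_components=k, random_state=0).fit(data)
    print(f"k = {k}: BIC = {gm.bic(data):.1f}")   # minimum expected at k = 2
```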
Lead isotope ratios for bullets, forensic evaluation in a Bayesian paradigm.
Sjåstad, Knut-Endre; Lucy, David; Andersen, Tom
2016-01-01
Forensic science is a discipline concerned with the collection, examination and evaluation of physical evidence related to criminal cases. The results from the activities of the forensic scientist may ultimately be presented to the court in such a way that the triers of fact understand the implications of the data. Forensic science has been, and still is, driven by the development of new technology, and in the last two decades the evaluation of evidence based on logical reasoning and Bayesian statistics has reached some level of general acceptance within the forensic community. Tracing lead fragments of unknown origin to a given source of ammunition is a task that might be of interest to the Court. The use of data from lead isotope ratio analysis, interpreted within a Bayesian framework, has been shown to be a suitable method to guide the Court in drawing its conclusions for such a task. In this work we have used the isotopic composition of lead from small-arms projectiles (cal. .22) and developed an approach based on Bayesian statistics and likelihood ratio calculation. The likelihood ratio is a single quantity that provides a measure of the value of evidence and can be used in the deliberations of the court. Copyright © 2015 Elsevier B.V. All rights reserved.
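A common way to compute such a likelihood ratio is a two-level normal model: isotope ratios vary between ammunition sources with variance τ² and within a source (measurement error) with variance σ². Under the same-source proposition the two measurements are correlated through the shared source mean; under the different-source proposition they are independent draws from the background population. The parameter values below are purely illustrative; real casework uses multivariate isotope ratios and an empirical background database.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

# Background population of a single isotope ratio: between-source sd tau,
# within-source (measurement) sd sigma. All values illustrative.
mu, tau, sigma = 17.5, 0.30, 0.02

def likelihood_ratio(x_crime, x_ref):
    """LR for the same-source vs different-source propositions."""
    v = tau**2 + sigma**2
    # Same source: measurements share a common (unknown) source mean.
    num = multivariate_normal(mean=[mu, mu],
                              cov=[[v, tau**2], [tau**2, v]]).pdf([x_crime, x_ref])
    # Different sources: independent draws from the background.
    den = norm(mu, v**0.5).pdf(x_crime) * norm(mu, v**0.5).pdf(x_ref)
    return num / den

print(f"LR = {likelihood_ratio(17.82, 17.81):.1f}")   # > 1 favours same source
```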
Bayesian Correlation Analysis for Sequence Count Data
Lau, Nelson; Perkins, Theodore J.
2016-01-01
Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities’ measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low—especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities’ signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset. PMID:27701449
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference to Receiver Operating Characteristics (ROC) data. The model is a simple version of general non-linear regression model. Different from Dorfman model, it uses a probit link function with a covariate variable having zero-one two values to express binormal distributions in a single formula. Model also includes a scale parameter. Bayesian inference is implemented by Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). Contrast to the classical statistical theory, Bayesian approach considers model parameters as random variables characterized by prior distributions. With substantial amount of simulated samples generated by sampling algorithm, posterior distributions of parameters as well as parameters themselves can be accurately estimated. MCMC-based BUGS adopts Adaptive Rejection Sampling (ARS) protocol which requires the probability density function (pdf) which samples are drawing from be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior which is conjugate and possesses many analytic and computational advantages is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for MCMC method are assessed by CODA package. Models and methods by using continuous Gaussian prior and discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using MCMC method.
Case studies in Bayesian microbial risk assessments.
Kennedy, Marc C; Clough, Helen E; Turner, Joanne
2009-12-21
The quantification of uncertainty and variability is a key component of quantitative risk analysis. Recent advances in Bayesian statistics make it ideal for integrating multiple sources of information, of different types and quality, and providing a realistic estimate of the combined uncertainty in the final risk estimates. We present two case studies related to foodborne microbial risks. In the first, we combine models to describe the sequence of events resulting in illness from consumption of milk contaminated with VTEC O157. We used Monte Carlo simulation to propagate uncertainty in some of the inputs to computer models describing the farm and pasteurisation process. Resulting simulated contamination levels were then assigned to consumption events from a dietary survey. Finally we accounted for uncertainty in the dose-response relationship and uncertainty due to limited incidence data to derive uncertainty about yearly incidences of illness in young children. Options for altering the risk were considered by running the model with different hypothetical policy-driven exposure scenarios. In the second case study we illustrate an efficient Bayesian sensitivity analysis for identifying the most important parameters of a complex computer code that simulated VTEC O157 prevalence within a managed dairy herd. This was carried out in 2 stages, first to screen out the unimportant inputs, then to perform a more detailed analysis on the remaining inputs. The method works by building a Bayesian statistical approximation to the computer code using a number of known code input/output pairs (training runs). We estimated that the expected total number of children aged 1.5-4.5 who become ill due to VTEC O157 in milk is 8.6 per year, with 95% uncertainty interval (0,11.5). The most extreme policy we considered was banning on-farm pasteurisation of milk, which reduced the estimate to 6.4 with 95% interval (0,11). In the second case study the effective number of inputs was reduced from 30 to 7 in the screening stage, and just 2 inputs were found to explain 82.8% of the output variance. A combined total of 500 runs of the computer code were used. These case studies illustrate the use of Bayesian statistics to perform detailed uncertainty and sensitivity analyses, integrating multiple information sources in a way that is both rigorous and efficient.
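The uncertainty-propagation step of the first case study follows the usual Monte Carlo pattern: sample each uncertain input from its distribution, push the samples through the risk chain, and summarize the resulting output distribution. A toy sketch follows; every distribution and the dose-response form are placeholders, not the VTEC O157 model.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 100_000

# Uncertain inputs (all distributions are placeholders):
conc = rng.lognormal(mean=0.0, sigma=1.0, size=n)   # organisms per litre
serving = rng.uniform(0.1, 0.3, size=n)             # litres consumed
r = rng.beta(2, 1000, size=n)                       # per-organism risk

dose = conc * serving
p_ill = 1.0 - np.exp(-r * dose)                     # exponential dose-response

print(f"mean risk {p_ill.mean():.2e}; 95% interval "
      f"({np.quantile(p_ill, 0.025):.2e}, {np.quantile(p_ill, 0.975):.2e})")
```

Policy scenarios of the kind described above amount to rerunning this loop with modified input distributions and comparing the output intervals.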
Editorial: Bayesian benefits for child psychology and psychiatry researchers.
Oldehinkel, Albertine J
2016-09-01
For many scientists, performing statistical tests has become an almost automated routine. However, p-values are frequently used and interpreted incorrectly; and even when used appropriately, p-values tend to provide answers that do not match researchers' questions and hypotheses well. Bayesian statistics present an elegant and often more suitable alternative. The Bayesian approach has rarely been applied in child psychology and psychiatry research so far, but the development of user-friendly software packages and tutorials has placed it well within reach now. Because Bayesian analyses require a more refined definition of hypothesized probabilities of possible outcomes than the classical approach, going Bayesian may offer the additional benefit of sparking the development and refinement of theoretical models in our field. © 2016 Association for Child and Adolescent Mental Health.
DOT National Transportation Integrated Search
2010-12-01
Recent research suggests that traditional safety evaluation methods may be inadequate in accurately determining the effectiveness of roadway safety measures. In recent years, advanced statistical methods are being utilized in traffic safety studies t...
A BAYESIAN STATISTICAL APPROACH FOR THE EVALUATION OF CMAQ
Bayesian statistical methods are used to evaluate Community Multiscale Air Quality (CMAQ) model simulations of sulfate aerosol over a section of the eastern US for 4-week periods in summer and winter 2001. The observed data come from two U.S. Environmental Protection Agency data ...
Teaching Bayesian Statistics in a Health Research Methodology Program
ERIC Educational Resources Information Center
Pullenayegum, Eleanor M.; Thabane, Lehana
2009-01-01
Despite the appeal of Bayesian methods in health research, they are not widely used. This is partly due to a lack of courses in Bayesian methods at an appropriate level for non-statisticians in health research. Teaching such a course can be challenging because most statisticians have been taught Bayesian methods using a mathematical approach, and…
Wu, Jianyong; Gronewold, Andrew D; Rodriguez, Roberto A; Stewart, Jill R; Sobsey, Mark D
2014-02-01
Rapid quantification of viral pathogens in drinking and recreational water can help reduce waterborne disease risks. For this purpose, small-volume samples (e.g., 1 L) are favored because of the convenience of collection, transportation and processing. However, the results of viral analysis are often subject to uncertainty. To overcome this limitation, we propose an approach that integrates Bayesian statistics, efficient concentration methods, and quantitative PCR (qPCR) to quantify viral pathogens in water. Using this approach, we quantified human adenoviruses (HAdVs) in eighteen samples of source water collected from six drinking water treatment plants. HAdVs were found in seven samples. In the other eleven samples, HAdVs were not detected by qPCR, but might have existed based on Bayesian inference. Our integrated approach that quantifies uncertainty provides a better understanding than conventional assessments of potential risks to public health, particularly in cases when pathogens may present a threat but cannot be detected by traditional methods. © 2013 Elsevier B.V. All rights reserved.
Genome-wide regression and prediction with the BGLR statistical package.
Pérez, Paulino; de los Campos, Gustavo
2014-10-01
Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis. Copyright © 2014 by the Genetics Society of America.
An absolute chronology for early Egypt using radiocarbon dating and Bayesian statistical modelling
Dee, Michael; Wengrow, David; Shortland, Andrew; Stevenson, Alice; Brock, Fiona; Girdland Flink, Linus; Bronk Ramsey, Christopher
2013-01-01
The Egyptian state was formed prior to the existence of verifiable historical records. Conventional dates for its formation are based on the relative ordering of artefacts. This approach is no longer considered sufficient for cogent historical analysis. Here, we produce an absolute chronology for Early Egypt by combining radiocarbon and archaeological evidence within a Bayesian paradigm. Our data cover the full trajectory of Egyptian state formation and indicate that the process occurred more rapidly than previously thought. We provide a timeline for the First Dynasty of Egypt of generational-scale resolution that concurs with prevailing archaeological analysis and produce a chronometric date for the foundation of Egypt that distinguishes between historical estimates. PMID:24204188
Nowcasting Cloud Fields for U.S. Air Force Special Operations
2017-03-01
application of Bayes' Rule offers many advantages over Kernel Density Estimation (KDE) and other commonly used statistical post-processing methods... reflectance and probability of cloud. A statistical post-processing technique is applied using Bayesian estimation to train the system from a set of past...
Numerical study on the sequential Bayesian approach for radioactive materials detection
NASA Astrophysics Data System (ADS)
Qingpei, Xiang; Dongfeng, Tian; Jianyu, Zhu; Fanhua, Hao; Ge, Ding; Jun, Zeng
2013-01-01
A new detection method, based on the sequential Bayesian approach proposed by Candy et al., offers new horizons for research on radioactive material detection. Compared with commonly adopted detection methods that incorporate classical statistical theory, the sequential Bayesian approach offers the advantage of shorter verification times when analysing spectra that contain low total counts, especially for complex radionuclide compositions. In this paper, a simulation platform implementing the methodology of the sequential Bayesian approach was developed. Event sequences of γ-rays associated with the true parameters of a LaBr3(Ce) detector were obtained from an event-sequence generator using Monte Carlo sampling theory to study the performance of the sequential Bayesian approach. The numerical experimental results are in accordance with those of Candy. Moreover, the relationship between the detection model and the event generator, represented respectively by the expected detection rate (Am) and the tested detection rate (Gm) parameters, is investigated. To achieve optimal performance for this processor, the interval of the tested detection rate as a function of the expected detection rate is also presented.
Combining statistical inference and decisions in ecology
Williams, Perry J.; Hooten, Mevin B.
2016-01-01
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation, and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem.
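One concrete payoff of SDT mentioned here is that the loss function determines the optimal Bayesian point estimate: squared-error loss yields the posterior mean, absolute-error loss the posterior median. Given posterior draws from any fitting method, both are one-liners; the skewed gamma posterior below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
# Posterior draws of a parameter from any Bayesian fit; the skewed
# gamma shape here is synthetic.
draws = rng.gamma(shape=2.0, scale=1.5, size=50_000)

print("squared-error loss  -> posterior mean  :", draws.mean())
print("absolute-error loss -> posterior median:", np.median(draws))
```

For a skewed posterior the two estimates differ noticeably, which is why the choice of loss function is a substantive modelling decision rather than a technicality.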
Hierarchical models and bayesian analysis of bird survey information
John R. Sauer; William A. Link; J. Andrew Royle
2005-01-01
Summary of bird survey information is a critical component of conservation activities, but often our summaries rely on statistical methods that do not accommodate the limitations of the information. Prioritization of species requires ranking and analysis of species by magnitude of population trend, but often magnitude of trend is a misleading measure of actual decline...
Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C
2018-03-07
Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
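Several of the approximations the review identifies have simple closed forms, assuming the outcome is roughly normal: the range/4 rule for a missing SD, the IQR-based SD estimate, and the quartile-based mean estimate. A sketch of these follows; the constants are the commonly cited approximations, and the original methodological papers should be consulted before use.

```python
def sd_from_range(minimum, maximum):
    """Rough SD from the range (the range/4 rule)."""
    return (maximum - minimum) / 4.0

def sd_from_quartiles(q1, q3):
    """SD from the interquartile range, assuming approximate normality
    (the IQR spans about 1.35 standard deviations)."""
    return (q3 - q1) / 1.35

def mean_from_quartiles(q1, median, q3):
    """Mean from the three quartiles."""
    return (q1 + median + q3) / 3.0

# A trial reporting median 12 with IQR (8, 18):
print(mean_from_quartiles(8, 12, 18), sd_from_quartiles(8, 18))
```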
Turner, Rebecca M; Jackson, Dan; Wei, Yinghui; Thompson, Simon G; Higgins, Julian P T
2015-01-01
Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:25475839
Dokoumetzidis, Aristides; Aarons, Leon
2005-08-01
We investigated the propagation of population pharmacokinetic information across clinical studies by applying Bayesian techniques. The aim was to summarize the population pharmacokinetic estimates of a study in appropriate statistical distributions in order to use them as Bayesian priors in subsequent population pharmacokinetic analyses. Various data sets of simulated and real clinical data were fitted with WinBUGS, with and without informative priors. The posterior estimates of fittings with non-informative priors were used to build parametric informative priors, and the whole procedure was carried out sequentially. The posterior distributions of the fittings with informative priors were compared to those of the meta-analysis fittings of the respective combinations of data sets. Good agreement was found for the simulated and experimental datasets when the populations were exchangeable: the posterior distributions from the fittings with the prior were nearly identical to the ones estimated by meta-analysis. However, when populations were not exchangeable, an alternative parametric form for the prior, the natural conjugate prior, had to be used in order to obtain consistent results. In conclusion, the results of a population pharmacokinetic analysis may be summarized in Bayesian prior distributions that can be used subsequently with other analyses. The procedure is an alternative to meta-analysis and gives comparable results. It has the advantage that it is faster than meta-analysis, owing to the large datasets used with the latter, and can be performed when the data included in the prior are not actually available.
BAYESIAN ESTIMATION OF THERMONUCLEAR REACTION RATES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Iliadis, C.; Anderson, K. S.; Coc, A.
The problem of estimating non-resonant astrophysical S-factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied to this problem in the past, almost all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extrasolar planets, gravitational waves, and Type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We present astrophysical S-factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the reactions d(p,γ)³He, ³He(³He,2p)⁴He, and ³He(α,γ)⁷Be, important for deuterium burning, solar neutrinos, and Big Bang nucleosynthesis.
The Importance of Proving the Null
Gallistel, C. R.
2010-01-01
Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: Compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose 3 common experimental questions: (a) Are 2 means the same? (b) Is performance at chance? (c) Are factors additive? PMID:19348549
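The proposed sensitivity analysis is easy to reproduce for a normal mean with known σ: under the null μ = 0, and under the alternative μ ~ N(0, τ²), the Bayes factor for the null is a ratio of two normal densities evaluated at the sample mean. The sketch below, with made-up data, shows the odds on the null approaching 1 as τ → 0 and growing as the alternative becomes vaguer.

```python
import numpy as np
from scipy.stats import norm

# A "null" result: n observations, known sigma, small sample mean.
n, sigma, xbar = 50, 1.0, 0.05
se = sigma / np.sqrt(n)

# BF01 = p(xbar | H0) / p(xbar | H1), with H1: mu ~ N(0, tau^2).
# The marginal of xbar under H1 is N(0, tau^2 + se^2).
for tau in (0.01, 0.1, 0.5, 1.0, 5.0):
    bf01 = norm.pdf(xbar, 0.0, se) / norm.pdf(xbar, 0.0, np.sqrt(tau**2 + se**2))
    print(f"tau = {tau:4.2f}  ->  odds for the null BF01 = {bf01:6.2f}")
```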
Bayesian data analysis tools for atomic physics
NASA Astrophysics Data System (ADS)
Trassinelli, Martino
2017-10-01
We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that allows us to assign probabilities to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We also show how to calculate the probability distribution of the main spectral component without having to determine the spectrum model uniquely. For these two studies, we implement the program Nested_fit to calculate the different probability distributions and other related quantities. Nested_fit is a Fortran90/Python code developed in recent years for the analysis of atomic spectra. As indicated by the name, it is based on the nested sampling algorithm, which is presented in detail together with the program itself.
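The Bayesian evidence that drives such model comparisons is the prior-weighted integral of the likelihood. The sketch below is not Nested_fit: it is a brute-force quadrature stand-in on a toy spectrum (flat background with or without a Gaussian satellite line at an assumed, fixed position and width), which is feasible in one or two dimensions where nested sampling becomes necessary in many.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy spectrum: flat background plus a weak satellite line at x = 5.
x = np.linspace(0, 10, 60)
y = 2.0 + 0.8 * np.exp(-0.5 * ((x - 5.0) / 0.4) ** 2) + rng.normal(0, 0.5, x.size)
sig = 0.5                                  # known noise level

def log_like(b, A):
    model = b + A * np.exp(-0.5 * ((x - 5.0) / 0.4) ** 2)
    return np.sum(norm.logpdf(y, model, sig))

# Evidence by quadrature over uniform priors b ~ U(0,5), A ~ U(0,3).
b_grid = np.linspace(0.0, 5.0, 200)
A_grid = np.linspace(0.0, 3.0, 200)
db, dA = b_grid[1] - b_grid[0], A_grid[1] - A_grid[0]

Z0 = np.sum([np.exp(log_like(b, 0.0)) for b in b_grid]) * db / 5.0
L1 = np.array([[np.exp(log_like(b, A)) for A in A_grid] for b in b_grid])
Z1 = L1.sum() * db * dA / (5.0 * 3.0)

# Posterior probability that the satellite line is present, at equal prior odds.
print(f"P(line | data) = {Z1 / (Z0 + Z1):.3f}")
```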
Fully Bayesian tests of neutrality using genealogical summary statistics.
Drummond, Alexei J; Suchard, Marc A
2008-10-31
Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size. Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously neglected owing to the limited availability of theory and methods.
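Posterior predictive checking of this kind reduces to a simple recipe: draw parameters from the posterior, simulate replicate data, and compare a summary statistic of the replicates with that of the observed data. A minimal illustration, using a conjugate Poisson model and the dispersion index as the summary rather than the genealogical statistics of the paper:

```python
import numpy as np

rng = np.random.default_rng(5)

# Observed counts and a conjugate Gamma(1,1)-Poisson fit of the rate.
y_obs = rng.poisson(4.0, size=50)
post_rate = rng.gamma(1 + y_obs.sum(), 1.0 / (1 + y_obs.size), size=2000)

t_obs = y_obs.var() / y_obs.mean()        # dispersion index as the summary
t_rep = np.empty(post_rate.size)
for i, lam in enumerate(post_rate):       # posterior predictive simulation
    y_rep = rng.poisson(lam, size=y_obs.size)
    t_rep[i] = y_rep.var() / y_rep.mean()

# Posterior predictive p-value: extreme values flag model misfit.
print("posterior predictive p-value:", np.mean(t_rep >= t_obs))
```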
Tipping point analysis of atmospheric oxygen concentration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Livina, V. N.; Forbes, A. B.; Vaz Martins, T. M.
2015-03-15
We apply tipping point analysis to nine observational oxygen concentration records around the globe, analyse their dynamics, and perform projections under possible future scenarios leading to oxygen deficiency in the atmosphere. The analysis is based on a statistical physics framework with stochastic modelling, where we represent the observed data as a composition of deterministic and stochastic components estimated from the observed data using Bayesian and wavelet techniques.
A Bayesian pick-the-winner design in a randomized phase II clinical trial.
Chen, Dung-Tsa; Huang, Po-Yu; Lin, Hui-Yi; Chiappori, Alberto A; Gabrilovich, Dmitry I; Haura, Eric B; Antonia, Scott J; Gray, Jhanelle E
2017-10-24
Many phase II clinical trials evaluate unique experimental drugs/combinations through multi-arm design to expedite the screening process (early termination of ineffective drugs) and to identify the most effective drug (pick the winner) to warrant a phase III trial. Various statistical approaches have been developed for the pick-the-winner design but have been criticized for lack of objective comparison among the drug agents. We developed a Bayesian pick-the-winner design by integrating a Bayesian posterior probability with Simon two-stage design in a randomized two-arm clinical trial. The Bayesian posterior probability, as the rule to pick the winner, is defined as probability of the response rate in one arm higher than in the other arm. The posterior probability aims to determine the winner when both arms pass the second stage of the Simon two-stage design. When both arms are competitive (i.e., both passing the second stage), the Bayesian posterior probability performs better to correctly identify the winner compared with the Fisher exact test in the simulation study. In comparison to a standard two-arm randomized design, the Bayesian pick-the-winner design has a higher power to determine a clear winner. In application to two studies, the approach is able to perform statistical comparison of two treatment arms and provides a winner probability (Bayesian posterior probability) to statistically justify the winning arm. We developed an integrated design that utilizes Bayesian posterior probability, Simon two-stage design, and randomization into a unique setting. It gives objective comparisons between the arms to determine the winner.
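The winner-picking rule itself is a one-line posterior computation. With beta priors on each arm's response rate, the posterior probability that one arm beats the other can be estimated by Monte Carlo; the counts below are hypothetical stage-2 results, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stage-2 outcomes of a hypothetical two-arm trial (responders / enrolled).
resp_a, n_a = 14, 36
resp_b, n_b = 9, 36

# Beta(1, 1) priors; posteriors are Beta(1 + responders, 1 + non-responders).
draws_a = rng.beta(1 + resp_a, 1 + n_a - resp_a, size=100_000)
draws_b = rng.beta(1 + resp_b, 1 + n_b - resp_b, size=100_000)

# Winner probability: posterior probability that arm A's response rate is higher.
print(f"P(p_A > p_B | data) = {np.mean(draws_a > draws_b):.3f}")
```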
NASA Astrophysics Data System (ADS)
Shafii, M.; Tolson, B.; Matott, L. S.
2012-04-01
Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in the building of higher levels of complexity into hydrologic models, which eventually makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically based uncertainty analysis routines such as Bayesian inference using Markov chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims to explore a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality, which quantifies the parameter uncertainty using the Pareto solutions; (ii) DDS-AU, which uses the weighted sum of objective functions to derive the prediction limits; and (iii) GLUE, which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden and predictive capacity, evaluated using multiple comparative measures computed for both calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model called HYMOD.
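Of the informal methods listed, GLUE is the most compact to sketch: sample parameters, keep the "behavioral" ones under some informal likelihood measure, and derive weighted prediction limits from the retained ensemble. The toy below uses a one-parameter linear model and a Nash-Sutcliffe threshold as stand-ins for HYMOD and the study's measures.

```python
import numpy as np

rng = np.random.default_rng(2)

# GLUE sketch on a toy one-parameter model y = k*x (stand-in for HYMOD).
x = np.linspace(1, 10, 20)
y_obs = 2.0 * x + rng.normal(0, 1.0, x.size)

k = rng.uniform(0.0, 5.0, 5000)                  # Monte Carlo parameter sampling
sims = k[:, None] * x[None, :]
nse = 1 - ((sims - y_obs) ** 2).sum(1) / ((y_obs - y_obs.mean()) ** 2).sum()

beh = sims[nse > 0.7]                            # informal behavioral threshold
w = nse[nse > 0.7]
w = w / w.sum()                                  # informal likelihood weights

# Weighted 5th/95th percentile prediction limits at the last point.
order = np.argsort(beh[:, -1])
cw = np.cumsum(w[order])
s = beh[order, -1]
print("90% prediction limits at x=10:",
      s[np.searchsorted(cw, 0.05)], s[np.searchsorted(cw, 0.95)])
```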
Bayesian inference for joint modelling of longitudinal continuous, binary and ordinal events.
Li, Qiuju; Pan, Jianxin; Belcher, John
2016-12-01
In medical studies, repeated measurements of continuous, binary and ordinal outcomes are routinely collected from the same patient. Instead of modelling each outcome separately, in this study we propose to jointly model the trivariate longitudinal responses, so as to take account of the inherent association between the different outcomes and thus improve statistical inferences. This work is motivated by a large cohort study in the North West of England involving trivariate responses from each patient: Body Mass Index, Depression (Yes/No) ascertained with a cut-off score of at least 8 on the Hospital Anxiety and Depression Scale, and Pain Interference generated from the Medical Outcomes Study 36-item short-form health survey with values returned on a 1-5 ordinal scale. There are some well-established methods for combining continuous and binary, or even continuous and ordinal, responses, but little work has been done on the joint analysis of continuous, binary and ordinal responses. We propose conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes. Bayesian analysis methods are used to make statistical inferences. Simulation studies show that, by jointly modelling the trivariate outcomes, the standard deviations of the parameter estimates are smaller and much more stable, leading to more efficient parameter estimates and reliable statistical inferences. In the real data analysis, the proposed joint analysis yields a much smaller deviance information criterion value than the separate analyses, and shows other good statistical properties too. © The Author(s) 2014.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for the joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10⁴ simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
DATMAN: A reliability data analysis program using Bayesian updating
DOE Office of Scientific and Technical Information (OSTI.GOV)
Becker, M.; Feltus, M.A.
1996-12-31
Preventive maintenance (PM) techniques focus on the prevention of failures, in particular, system components that are important to plant functions. Reliability-centered maintenance (RCM) improves on the PM techniques by introducing a set of guidelines by which to evaluate the system functions. It also minimizes intrusive maintenance, labor, and equipment downtime without sacrificing system performance when its function is essential for plant safety. Both the PM and RCM approaches require that system reliability data be updated as more component failures and operation time are acquired. Systems reliability and the likelihood of component failures can be calculated by Bayesian statistical methods, which can update these data. The DATMAN computer code has been developed at Penn State to simplify the Bayesian analysis by performing tedious calculations needed for RCM reliability analysis. DATMAN reads data for updating, fits a distribution that best fits the data, and calculates component reliability. DATMAN provides a user-friendly interface menu that allows the user to choose from several common prior and posterior distributions, insert new failure data, and visually select the distribution that matches the data most accurately.
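The Bayesian updating at the core of such a tool is straightforward for the common conjugate families a menu like DATMAN's alludes to. A gamma prior on a Poisson failure rate, for example, updates in closed form; the numbers below are illustrative, not DATMAN's defaults.

```python
import numpy as np
from scipy.stats import gamma

# Conjugate gamma-Poisson updating of a component failure rate.
a0, b0 = 2.0, 4000.0          # prior: roughly 2 failures per 4000 h (toy values)
failures, hours = 3, 9000.0   # newly acquired field data

a1, b1 = a0 + failures, b0 + hours
post = gamma(a1, scale=1.0 / b1)
lam = post.mean()

print(f"posterior failure rate: {lam:.2e} per hour")
print(f"90% credible interval: ({post.ppf(0.05):.2e}, {post.ppf(0.95):.2e})")
print(f"R(1000 h) at the posterior mean rate: {np.exp(-lam * 1000):.3f}")
```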
Non-arbitrage in financial markets: A Bayesian approach for verification
NASA Astrophysics Data System (ADS)
Cerezetti, F. V.; Stern, Julio Michael
2012-10-01
The concept of non-arbitrage plays an essential role in finance theory. Under certain regularity conditions, the Fundamental Theorem of Asset Pricing states that, in non-arbitrage markets, prices of financial instruments are martingale processes. In this theoretical framework, the analysis of the statistical distributions of financial assets can assist in understanding how participants behave in the markets and whether that behaviour may engender arbitrage conditions. Assuming an underlying Variance Gamma statistical model, this study aims to test, using the FBST (Full Bayesian Significance Test), whether there is a relevant price difference between essentially the same financial asset traded at two distinct locations. Specifically, we investigate and compare the behavior of call options on the BOVESPA Index traded at (a) the Equities Segment and (b) the Derivatives Segment of BM&FBovespa. Our results seem to point to significant statistical differences. To what extent this evidence is actually the expression of perennial arbitrage opportunities remains an open question.
Nonparametric Bayesian predictive distributions for future order statistics
Richard A. Johnson; James W. Evans; David W. Green
1999-01-01
We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures
Moore, Brian R.; Höhna, Sebastian; May, Michael R.; Rannala, Bruce; Huelsenbeck, John P.
2016-01-01
Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM. PMID:27512038
Statistical innovations in the medical device world sparked by the FDA.
Campbell, Gregory; Yue, Lilly Q
2016-01-01
The world of medical devices, while highly diverse, is extremely innovative, and this facilitates the adoption of innovative statistical techniques. Statisticians in the Center for Devices and Radiological Health (CDRH) at the Food and Drug Administration (FDA) have provided leadership in implementing statistical innovations. The innovations discussed include: the incorporation of Bayesian methods in clinical trials, adaptive designs, the use and development of propensity score methodology in the design and analysis of non-randomized observational studies, the use of tipping-point analysis for missing data, techniques for diagnostic test evaluation, bridging studies for companion diagnostic tests, quantitative benefit-risk decisions, and patient preference studies.
NASA Astrophysics Data System (ADS)
Gong, Maozhen
Selecting an appropriate prior distribution is a fundamental issue in Bayesian statistics. In this dissertation, under the framework provided by Berger and Bernardo, I derive the reference priors for several models, including analysis of variance (ANOVA)/analysis of covariance (ANCOVA) models with a categorical variable under common ordering constraints, and the conditionally autoregressive (CAR) and simultaneous autoregressive (SAR) models with a spatial autoregression parameter ρ. The performance of the reference priors for the ANOVA/ANCOVA models is evaluated by simulation studies with comparisons to Jeffreys' prior and Least Squares Estimation (LSE). The priors are then illustrated in a Bayesian model of the "Risk of Type 2 Diabetes in New Mexico" data, where the relationship between type 2 diabetes risk (through Hemoglobin A1c) and different smoking levels is investigated. In both the simulation studies and the real data modeling, the reference priors that incorporate internal order information perform well and can be used as default priors. The reference priors for the CAR and SAR models are also illustrated on the "1999 SAT State Average Verbal Scores" data with a comparison to a uniform prior distribution. Due to the complexity of the reference priors for both CAR and SAR models, only a portion (12 states in the Midwest) of the original data set is considered. The reference priors can give a different marginal posterior distribution than a uniform prior, which provides an alternative for prior specification for areal data in spatial statistics.
"Magnitude-based inference": a statistical review.
Welsh, Alan H; Knight, Emma J
2015-04-01
We consider "magnitude-based inference" and its interpretation by examining in detail its use in the problem of comparing two means. We extract from the spreadsheets, which are provided to users of the analysis (http://www.sportsci.org/), a precise description of how "magnitude-based inference" is implemented. We compare the implemented version of the method with general descriptions of it and interpret the method in familiar statistical terms. We show that "magnitude-based inference" is not a progressive improvement on modern statistics. The additional probabilities introduced are not directly related to the confidence interval but, rather, are interpretable either as P values for two different nonstandard tests (for different null hypotheses) or as approximate Bayesian calculations, which also lead to a type of test. We also discuss sample size calculations associated with "magnitude-based inference" and show that the substantial reduction in sample sizes claimed for the method (30% of the sample size obtained from standard frequentist calculations) is not justifiable so the sample size calculations should not be used. Rather than using "magnitude-based inference," a better solution is to be realistic about the limitations of the data and use either confidence intervals or a fully Bayesian analysis.
Bayesian truthing and experimental validation in homeland security and defense
NASA Astrophysics Data System (ADS)
Jannson, Tomasz; Forrester, Thomas; Wang, Wenjian; Kostrzewski, Andrew; Pradhan, Ranjit
2014-05-01
In this paper we discuss relations between Bayesian Truthing (experimental validation), Bayesian statistics, and Binary Sensing in the context of selected Homeland Security and Intelligence, Surveillance, Reconnaissance (ISR) optical and nonoptical application scenarios. The basic Figure of Merit (FoM) is Positive Predictive Value (PPV), as well as false positives and false negatives. By using these simple binary statistics, we can analyze, classify, and evaluate a broad variety of events including: ISR; natural disasters; QC; and terrorism-related, GIS-related, law enforcement-related, and other C3I events.
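The PPV figure of merit is a direct application of Bayes' rule to binary detection, and its behaviour at low prevalence is the key operational lesson; a two-line check with illustrative numbers:

```python
# Positive predictive value from sensitivity, specificity, and prevalence
# (Bayes' rule on a binary detection problem; the numbers are illustrative).
def ppv(sens, spec, prev):
    return sens * prev / (sens * prev + (1 - spec) * (1 - prev))

# A 95%-sensitive, 99%-specific detector screening a rare (0.1%) threat:
print(f"PPV = {ppv(0.95, 0.99, 0.001):.3f}")   # ~0.087: most alarms are false
```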
NASA Astrophysics Data System (ADS)
Berliner, M.
2017-12-01
Bayesian statistical decision theory offers a natural framework for decision and policy making in the presence of uncertainty. Key advantages of the approach include the efficient incorporation of information and observations. However, in complicated settings it is very difficult, perhaps essentially impossible, to formalize the mathematical inputs the approach requires. Nevertheless, using the approach as a template is useful for decision support; that is, for organizing and communicating our analyses. Bayesian hierarchical modeling is valuable for quantifying and managing uncertainty in such cases. I review some aspects of the idea, emphasizing statistical model development and use in the context of sea-level rise.
Fu, Zhibiao; Baker, Daniel; Cheng, Aili; Leighton, Julie; Appelbaum, Edward; Aon, Juan
2016-05-01
The principle of quality by design (QbD) has been widely applied to biopharmaceutical manufacturing processes. Process characterization is an essential step in implementing the QbD concept, establishing the design space, and defining the proven acceptable ranges (PARs) for critical process parameters (CPPs). In this study, we present the characterization of a Saccharomyces cerevisiae fermentation process using risk assessment analysis, statistical design of experiments (DoE), and a multivariate Bayesian predictive approach. The critical quality attributes (CQAs) and CPPs were identified with a risk assessment. The statistical model for each attribute was established using the results from the DoE study, with consideration given to interactions between CPPs. Both the conventional overlapping contour plot and the multivariate Bayesian predictive approach were used to establish the region of process operating conditions where all attributes met their specifications simultaneously. The quantitative Bayesian predictive approach was chosen to define the PARs for the CPPs, which apply to the manufacturing control strategy. Experience from the 10,000 L manufacturing scale process validation, including 64 continued process verification batches, indicates that the CPPs remain in a state of control and within the established PARs. The end product quality attributes were within their drug substance specifications. The probability generated with the Bayesian approach was also used as a tool to assess CPP deviations. The approach can be extended to characterize other production processes and to quantify a reliable operating region. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:799-812, 2016. © 2016 American Institute of Chemical Engineers.
An efficient Bayesian meta-analysis approach for studying cross-phenotype genetic associations
Majumdar, Arunabha; Haldar, Tanushree; Bhattacharya, Sourabh; Witte, John S.
2018-01-01
Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package ‘CPBayes’ implementing the proposed method. PMID:29432419
Fully Bayesian Estimation of Data from Single Case Designs
ERIC Educational Resources Information Center
Rindskopf, David
2013-01-01
Single case designs (SCDs) generally consist of a small number of short time series in two or more phases. The analysis of SCDs statistically fits in the framework of a multilevel model, or hierarchical model. The usual analysis does not take into account the uncertainty in the estimation of the random effects. This not only has an effect on the…
Modeling Soot Oxidation and Gasification with Bayesian Statistics
Josephson, Alexander J.; Gaffin, Neal D.; Smith, Sean T.; ...
2017-08-22
This paper presents a statistical method for model calibration using data collected from the literature. The method is used to calibrate parameters for global models of soot consumption in combustion systems. This consumption is broken into two different submodels: first for oxidation, where soot particles are attacked by certain oxidizing agents; second for gasification, where soot particles are attacked by H₂O or CO₂ molecules. Rate data were collected from 19 studies in the literature and evaluated using Bayesian statistics to calibrate the model parameters. Bayesian statistics are valued for their ability to quantify uncertainty in modeling. The calibrated consumption model with quantified uncertainty is presented here along with a discussion of associated implications. The oxidation results are found to be consistent with previous studies. Significant variation is found in the CO₂ gasification rates.
Krishnamurthy, Krish
2013-12-01
The intrinsic quantitative nature of NMR is increasingly exploited in areas ranging from complex mixture analysis (as in metabolomics and reaction monitoring) to quality assurance/control. Complex NMR spectra are more common than not, and therefore extraction of quantitative information generally involves significant prior knowledge and/or operator interaction to characterize the resonances of interest. Moreover, in most NMR-based metabolomic experiments, the signals from metabolites are normally present as a mixture of overlapping resonances, making quantification difficult. Time-domain Bayesian approaches have been reported to be better than conventional frequency-domain analysis at identifying subtle changes in signal amplitude. We discuss an approach that exploits Bayesian analysis to achieve a complete reduction to amplitude frequency table (CRAFT) in an automated and time-efficient fashion, thus converting the time-domain FID to a frequency-amplitude table. CRAFT uses a two-step approach to FID analysis: first, the FID is digitally filtered and downsampled to several sub-FIDs; second, these sub-FIDs are modeled as sums of decaying sinusoids using the Bayesian approach. CRAFT tables can be used for further data mining of quantitative information using fingerprint chemical shifts of compounds of interest and/or statistical analysis of the modulation of chemical quantity in a biological study (metabolomics), a process study (reaction monitoring), or quality assurance/control. The basic principles behind this approach, as well as results evaluating its effectiveness in mixture analysis, are presented. Copyright © 2013 John Wiley & Sons, Ltd.
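The second CRAFT step exploits the fact that, once frequencies and decay rates are fixed, the amplitudes of decaying sinusoids enter the FID linearly. The sketch below fits amplitudes by a time-domain least-squares solve on a synthetic two-line FID; it assumes known frequencies and a common decay rate, and omits the Bayesian model selection that CRAFT layers on top.

```python
import numpy as np

rng = np.random.default_rng(9)

# Simplified FID: two decaying sinusoids plus noise (complex-valued signal).
t = np.arange(2048) * 1e-3                       # 1 ms dwell time (toy value)
freqs = np.array([50.0, 120.0])                  # line frequencies in Hz
amps_true = np.array([1.0, 0.4])
decay = 5.0                                      # common decay rate (1/s), assumed known
fid = sum(a * np.exp((2j * np.pi * f - decay) * t)
          for a, f in zip(amps_true, freqs))
fid += 0.05 * (rng.normal(size=t.size) + 1j * rng.normal(size=t.size))

# With frequencies and decay fixed, amplitudes are linear: a time-domain
# least-squares solve already yields a frequency-amplitude table.
basis = np.stack([np.exp((2j * np.pi * f - decay) * t) for f in freqs], axis=1)
amps, *_ = np.linalg.lstsq(basis, fid, rcond=None)

for f, a in zip(freqs, amps):
    print(f"{f:6.1f} Hz : amplitude {abs(a):.3f}")
```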
Bayesian data analysis for newcomers.
Kruschke, John K; Liddell, Torrin M
2018-02-01
This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
These proceedings from a course on Bayesian methods in reliability encompass Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and the nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Using Alien Coins to Test Whether Simple Inference Is Bayesian
ERIC Educational Resources Information Center
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
The Role of Probability-Based Inference in an Intelligent Tutoring System.
ERIC Educational Resources Information Center
Mislevy, Robert J.; Gitomer, Drew H.
Probability-based inference in complex networks of interdependent variables is an active topic in statistical research, spurred by such diverse applications as forecasting, pedigree analysis, troubleshooting, and medical diagnosis. This paper concerns the role of Bayesian inference networks for updating student models in intelligent tutoring…
Contributions to Statistical Problems Related to Microarray Data
ERIC Educational Resources Information Center
Hong, Feng
2009-01-01
Microarray is a high-throughput technology for measuring gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes factors to identify differentially expressed genes. Second, we…
USDA-ARS's Scientific Manuscript database
Recent advances in technology have led to the collection of high-dimensional data not previously encountered in many scientific environments. As a result, scientists are often faced with the challenging task of including these high-dimensional data into statistical models. For example, data from sen...
Comparing interval estimates for small sample ordinal CFA models
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002
Bayesian theories of conditioning in a changing world.
Courville, Aaron C; Daw, Nathaniel D; Touretzky, David S
2006-07-01
The recent flowering of Bayesian approaches invites the re-examination of classic issues in behavior, even in areas as venerable as Pavlovian conditioning. A statistical account can offer a new, principled interpretation of behavior, and previous experiments and theories can inform many unexplored aspects of the Bayesian enterprise. Here we consider one such issue: the finding that surprising events provoke animals to learn faster. We suggest that, in a statistical account of conditioning, surprise signals change and therefore uncertainty and the need for new learning. We discuss inference in a world that changes and show how experimental results involving surprise can be interpreted from this perspective, and also how, thus understood, these phenomena help constrain statistical theories of animal and human learning.
Modelling maximum river flow by using Bayesian Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Cheong, R. Y.; Gabda, D.
2017-09-01
Analysis of flood trends is vital since flooding threatens human life and livelihoods in financial, environmental and security terms. The annual maximum river flows in Sabah were fitted to the generalized extreme value (GEV) distribution. The maximum likelihood estimator (MLE) arises naturally when working with the GEV distribution. However, previous research has shown that MLE provides unstable results, especially for small sample sizes. In this study, we used Bayesian Markov chain Monte Carlo (MCMC) methods based on the Metropolis-Hastings algorithm to estimate the GEV parameters. Bayesian MCMC is a statistical inference approach that estimates parameters through the posterior distribution given by Bayes' theorem. The Metropolis-Hastings algorithm is used to cope with the high-dimensional state space that defeats plain Monte Carlo methods. This approach also accounts for more of the uncertainty in parameter estimation, which in turn yields better predictions of maximum river flow in Sabah.
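A random-walk Metropolis-Hastings sampler for the three GEV parameters fits in a few dozen lines. The sketch below runs on synthetic annual maxima with a weak normal prior on the shape parameter; the step sizes and priors are illustrative choices, not those of the study. Note that SciPy's `genextreme` shape parameter c corresponds to -ξ in the usual extreme-value convention.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(7)

# Synthetic annual-maximum flows from a GEV with xi = 0.1.
data = genextreme.rvs(c=-0.1, loc=100.0, scale=20.0, size=30, random_state=rng)

def log_post(theta):
    mu, log_sig, xi = theta
    lp = genextreme.logpdf(data, c=-xi, loc=mu, scale=np.exp(log_sig)).sum()
    lp += -0.5 * (xi / 0.5) ** 2          # weak N(0, 0.5^2) prior on the shape
    return lp if np.isfinite(lp) else -np.inf

theta = np.array([data.mean(), np.log(data.std()), 0.0])
step = np.array([2.0, 0.1, 0.05])         # random-walk proposal scales
chain, lp_cur = [], log_post(theta)

for _ in range(20_000):                   # Metropolis-Hastings iterations
    prop = theta + step * rng.normal(size=3)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp_cur:
        theta, lp_cur = prop, lp_prop
    chain.append(theta.copy())

post = np.array(chain)[5000:]             # discard burn-in
print("posterior means (mu, sigma, xi):",
      post[:, 0].mean(), np.exp(post[:, 1]).mean(), post[:, 2].mean())
```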
Development of uncertainty-based work injury model using Bayesian structural equation modelling.
Chatterjee, Snehamoy
2014-01-01
This paper proposes a Bayesian structural equation model (SEM) of miners' work injury for an underground coal mine in India. The environmental and behavioural variables for work injury were identified and causal relationships were developed. For Bayesian modelling, prior distributions of the SEM parameters are necessary to develop the model. In this paper, two approaches were adopted to obtain prior distributions for the factor loading and structural parameters of the SEM. In the first approach, the prior distributions were taken as fixed distribution functions with specific parameter values, whereas in the second approach, the prior distributions were generated from experts' opinions. The posterior distributions of these parameters were obtained by applying Bayes' rule. Markov chain Monte Carlo sampling, in the form of Gibbs sampling, was applied to sample from the posterior distribution. The results revealed that all coefficients of the structural and measurement model parameters are statistically significant under the experts' opinion-based priors, whereas two coefficients are not statistically significant when the fixed prior-based distributions are applied. The error statistics reveal that the Bayesian structural model provides a reasonably good fit to the work injury data, with a high coefficient of determination (0.91) and a smaller mean squared error than traditional SEM.
Bayesian inference for the spatio-temporal invasion of alien species.
Cook, Alex; Marion, Glenn; Butler, Adam; Gibson, Gavin
2007-08-01
In this paper we develop a Bayesian approach to parameter estimation in a stochastic spatio-temporal model of the spread of invasive species across a landscape. To date, statistical techniques, such as logistic and autologistic regression, have outstripped stochastic spatio-temporal models in their ability to handle large numbers of covariates. Here we seek to address this problem by making use of a range of covariates describing the bio-geographical features of the landscape. Relative to regression techniques, stochastic spatio-temporal models are more transparent in their representation of biological processes. They also explicitly model temporal change, and therefore do not require the assumption that the species' distribution (or other spatial pattern) has already reached equilibrium as is often the case with standard statistical approaches. In order to illustrate the use of such techniques we apply them to the analysis of data detailing the spread of an invasive plant, Heracleum mantegazzianum, across Britain in the 20th Century using geo-referenced covariate information describing local temperature, elevation and habitat type. The use of Markov chain Monte Carlo sampling within a Bayesian framework facilitates statistical assessments of differences in the suitability of different habitat classes for H. mantegazzianum, and enables predictions of future spread to account for parametric uncertainty and system variability. Our results show that ignoring such covariate information may lead to biased estimates of key processes and implausible predictions of future distributions.
Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K
2012-04-01
The objective of this paper is to provide an improved technique that can assist oncopathologists in the correct screening of oral precancerous conditions, especially oral submucous fibrosis (OSF), with significant accuracy on the basis of the collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibre segmentation, textural feature extraction and selection, screening performance enhancement under Gaussian transformation, and finally classification. In this study, collagen fibres are segmented on the R, G, B color channels using a back-propagation neural network from 60 normal and 59 OSF histological images, followed by histogram specification to reduce the stain intensity variation. Textural features of the collagen area are then extracted using fractal approaches, viz. differential box counting and the Brownian motion curve. Feature selection is done using the Kullback-Leibler (KL) divergence criterion, and the screening performance is evaluated with various statistical tests to confirm Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using a hybrid distribution. Moreover, the routine screening is designed around two statistical classifiers, viz. Bayesian classification and support vector machines (SVM), to classify normal and OSF. It is observed that SVM with a linear kernel function provides better classification accuracy (91.64%) than the Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves the Bayesian classifier's performance from 80.69% to 90.75%. The results are presented and discussed.
A Bayesian nonparametric method for prediction in EST analysis
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-01-01
Background: Expressed sequence tag (EST) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed with sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results: In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: (a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; (b) the number of new unique genes to be observed in a future sample; and (c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries previously studied with frequentist methods are analyzed in detail. Conclusion: The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445
An Analysis of Operational Suitability for Test and Evaluation of Highly Reliable Systems
1994-03-04
Exposition," Journal of the American Statistical A iation-59: 353-375 (June 1964). 17. SYS 229, Test and Evaluation Management Coursebook , School of Systems...in hours, 0 is 2-5 the desired MTBCF in hours, R is the number of critical failures, and a is the P[type-I error] of the X2 statistic with 2*R+2...design of experiments (DOE) tables and the use of Bayesian statistics to increase the confidence level of the test results that will be obtained from
Statistical Modeling for Radiation Hardness Assurance
NASA Technical Reports Server (NTRS)
Ladbury, Raymond L.
2014-01-01
We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: what models are used, what errors exist in real test data, and what the models allow us to say about the device under test (DUT). In addition, we cover how to use other sources of data, such as historical, heritage, and similar-part data, and how to apply experience, physics, and expert opinion to the analysis. Also included are the concepts of Bayesian statistics, data fitting, and bounding rates.
Bayesian analyses of seasonal runoff forecasts
NASA Astrophysics Data System (ADS)
Krzysztofowicz, R.; Reese, S.
1991-12-01
Forecasts of seasonal snowmelt runoff volume provide indispensable information for rational decision making by water project operators, irrigation district managers, and farmers in the western United States. Bayesian statistical models and communication frames have been researched in order to enhance the forecast information disseminated to the users, and to characterize forecast skill from the decision maker's point of view. Four products are presented: (i) a Bayesian Processor of Forecasts, which provides a statistical filter for calibrating the forecasts, and a procedure for estimating the posterior probability distribution of the seasonal runoff; (ii) the Bayesian Correlation Score, a new measure of forecast skill, which is related monotonically to the ex ante economic value of forecasts for decision making; (iii) a statistical predictor of monthly cumulative runoffs within the snowmelt season, conditional on the total seasonal runoff forecast; and (iv) a framing of the forecast message that conveys the uncertainty associated with the forecast estimates to the users. All analyses are illustrated with numerical examples of forecasts for six gauging stations from the period 1971-1988.
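A Bayesian Processor of Forecasts in its simplest normal-linear form combines a climatological prior with a historical regression of the forecast on the actual runoff; the coefficients below are invented for illustration.

```python
import numpy as np

# Normal-linear Bayesian processor of forecasts: climatological prior on the
# seasonal runoff w, and a forecast f = a*w + b + N(0, se^2) fitted to history.
mu0, s0 = 500.0, 120.0       # climatological mean/sd of runoff (toy units)
a, b, se = 0.9, 40.0, 80.0   # regression of forecast on runoff (toy values)
f = 520.0                    # this year's forecast

# Conjugate normal update: posterior of runoff given the forecast.
prec = 1 / s0**2 + a**2 / se**2
post_var = 1 / prec
post_mean = post_var * (mu0 / s0**2 + a * (f - b) / se**2)
print(f"posterior runoff: {post_mean:.0f} +/- {np.sqrt(post_var):.0f}")
```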
Combining statistical inference and decisions in ecology.
Williams, Perry J; Hooten, Mevin B
2016-09-01
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods, including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem. © 2016 by the Ecological Society of America.
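The role of the loss function is easy to demonstrate with posterior draws: each loss picks out a different Bayes estimator. A quick check with an arbitrary skewed posterior:

```python
import numpy as np

rng = np.random.default_rng(3)

# Draws from some posterior (a skewed Gamma posterior as a stand-in).
posterior = rng.gamma(shape=2.0, scale=1.5, size=50_000)

# The Bayes estimator minimizes posterior expected loss:
#   squared-error loss  -> posterior mean
#   absolute-error loss -> posterior median
#   0-1 loss            -> posterior mode
print("mean   (squared-error loss):", posterior.mean())
print("median (absolute-error loss):", np.median(posterior))

# Numerical check that the mean beats the median under squared-error loss:
for name, est in [("mean", posterior.mean()), ("median", np.median(posterior))]:
    print(name, "expected squared loss:", np.mean((posterior - est) ** 2))
```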
Bayesian networks in neuroscience: a survey.
Bielza, Concha; Larrañaga, Pedro
2014-01-01
Bayesian networks are a type of probabilistic graphical model that lies at the intersection between statistics and machine learning. They have been shown to be powerful tools for encoding dependence relationships among the variables of a domain under uncertainty. Thanks to their generality, Bayesian networks can accommodate continuous and discrete variables, as well as temporal processes. In this paper we review Bayesian networks and how they can be learned automatically from data by means of structure learning algorithms. Also, we examine how a user can take advantage of these networks for reasoning by exact or approximate inference algorithms that propagate the given evidence through the graphical structure. Despite their applicability in many fields, they have been little used in neuroscience, where applications have focused on specific problems, like functional connectivity analysis from neuroimaging data. Here we survey key research in neuroscience where Bayesian networks have been used with different aims: to discover associations between variables, to perform probabilistic reasoning over the model, and to classify new observations with and without supervision. The networks are learned from data of any kind (morphological, electrophysiological, -omics and neuroimaging), thereby broadening the scope (molecular, cellular, structural, functional, cognitive and medical) of the brain aspects to be studied.
MapReduce Based Parallel Bayesian Network for Manufacturing Quality Control
NASA Astrophysics Data System (ADS)
Zheng, Mao-Kuan; Ming, Xin-Guo; Zhang, Xian-Yu; Li, Guo-Ming
2017-09-01
The increasing complexity of industrial products and manufacturing processes has challenged conventional statistics-based quality management approaches under the circumstances of dynamic production. An integrated Bayesian network and big data analytics approach for manufacturing process quality analysis and control is proposed. Based on the Hadoop distributed architecture and the MapReduce parallel computing model, the large volume and variety of quality-related data generated during the manufacturing process can be handled. Artificial intelligence algorithms, including Bayesian network learning, classification and reasoning, are embedded into the Reduce process. Relying on the ability of the Bayesian network to deal with dynamic and uncertain problems and on the parallel computing power of MapReduce, Bayesian networks of the factors affecting quality are built from the prior probability distribution and modified with the posterior probability distribution. A case study on hull segment manufacturing precision management for ship and offshore platform building shows that computing speed increases almost in direct proportion to the number of computing nodes. The proposed model is also shown to be feasible for locating and reasoning about root causes, forecasting manufacturing outcomes, and making intelligent decisions for precision problem solving. The integration of big data analytics and Bayesian network methods offers a whole new perspective on manufacturing quality control.
A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.
Yu, Qingzhao; Zhu, Lin; Zhu, Han
2017-11-01
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently assign newly recruited patients to the different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate chosen to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence of changing the prior distributions on the design. Simulation studies are used to compare the proposed method with traditional methods in terms of power and actual sample size. The simulations show that, when the total sample size is fixed, the proposed design can achieve greater power and/or require a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample size. The proposed method can further reduce the required sample size. Copyright © 2017 John Wiley & Sons, Ltd.
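One standard way to choose a randomization rate that minimizes the variance of the test statistic is Neyman allocation, which assigns patients in proportion to the arms' outcome standard deviations. The sketch below applies it to binary outcomes with hypothetical interim response rates; the paper's exact algorithm may differ.

```python
import numpy as np

def neyman_rate(p1, p2):
    """Allocation to arm 1 minimizing Var(p1_hat - p2_hat) for fixed total n."""
    s1, s2 = np.sqrt(p1 * (1 - p1)), np.sqrt(p2 * (1 - p2))
    return s1 / (s1 + s2)

# Interim response-rate estimates drive the next block's randomization rate.
p1_hat, p2_hat = 0.45, 0.20
r = neyman_rate(p1_hat, p2_hat)
print(f"randomize {r:.2%} of new patients to arm 1")

# Variance of the difference at n = 100 under this rate vs 1:1 allocation:
n = 100
for rate, label in [(r, "optimal"), (0.5, "equal")]:
    n1 = rate * n
    var = p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / (n - n1)
    print(f"{label}: Var = {var:.5f}")
```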
Bayesian models for cost-effectiveness analysis in the presence of structural zero costs
Baio, Gianluca
2014-01-01
Bayesian modelling for cost-effectiveness data has received much attention in both the health economics and the statistical literature, in recent years. Cost-effectiveness data are characterised by a relatively complex structure of relationships linking a suitable measure of clinical benefit (e.g. quality-adjusted life years) and the associated costs. Simplifying assumptions, such as (bivariate) normality of the underlying distributions, are usually not granted, particularly for the cost variable, which is characterised by markedly skewed distributions. In addition, individual-level data sets are often characterised by the presence of structural zeros in the cost variable. Hurdle models can be used to account for the presence of excess zeros in a distribution and have been applied in the context of cost data. We extend their application to cost-effectiveness data, defining a full Bayesian specification, which consists of a model for the individual probability of null costs, a marginal model for the costs and a conditional model for the measure of effectiveness (given the observed costs). We presented the model using a working example to describe its main features. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:24343868
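A minimal sketch of a hurdle log-likelihood for individual costs (a Bernoulli point mass at zero plus a lognormal for the positive costs; a simplified stand-in for the paper's full joint cost-effectiveness specification, with invented parameter values):

import math

def hurdle_loglik(costs, p_zero, mu, sigma):
    # With probability p_zero a cost is a structural zero; otherwise it is
    # LogNormal(mu, sigma). Returns the log-likelihood of the sample.
    ll = 0.0
    for c in costs:
        if c == 0.0:
            ll += math.log(p_zero)
        else:
            z = (math.log(c) - mu) / sigma
            ll += (math.log(1.0 - p_zero)
                   - math.log(c * sigma * math.sqrt(2.0 * math.pi))
                   - 0.5 * z * z)
    return ll

print(hurdle_loglik([0.0, 120.5, 0.0, 87.3], p_zero=0.3, mu=4.5, sigma=0.6))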
An application of Bayesian statistics to the extragalactic Cepheid distance scale
NASA Astrophysics Data System (ADS)
Barnes, Thomas G., III; Moffett, Thomas J.; Jefferys, W. H.; Forestell, Amy D.
2004-05-01
We have determined quasi-geometric distances to the Magellanic Clouds, M31 and M33. Our analysis uses a Bayesian statistical method to provide mathematically rigorous and objective solutions for individual Cepheids. We combine the individual distances with a hierarchical Bayesian model to determine the galactic distances. We obtain distance moduli 18.87 ± 0.07 mag (LMC, 12 stars), 19.14 ± 0.10 mag (SMC, 8 stars), 23.83 ± 0.35 mag (M33, 1 star) and 25.2 ± 0.6 mag (M31, 1 star), all uncorrected for metallicity. The M31 and M33 distances are very preliminary. If the PL relations of the LMC, SMC, and Galaxy are identical, our results exclude the metallicity effect in the V, (V - R) surface brightness method predicted by Hindsley and Bell (1989) at the 5σ level. Alternately, if Hindsley and Bell's prediction is adopted as true, we find a metallicity effect intrinsic to the Cepheid PL relation requiring a correction Δ(V - MV) = (0.36 ± 0.07)Δ[A/H] mag. The latter has the opposite sign to other observational estimates of the Cepheid metallicity effect.
SIG-VISA: Signal-based Vertically Integrated Seismic Monitoring
NASA Astrophysics Data System (ADS)
Moore, D.; Mayeda, K. M.; Myers, S. C.; Russell, S.
2013-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software; however, while such detections may constitute a useful summary of station activity, they discard large amounts of information present in the original recorded signal. We present SIG-VISA (Signal-based Vertically Integrated Seismic Analysis), a system for seismic monitoring through Bayesian inference on seismic signals. By directly modeling the recorded signal, our approach incorporates additional information unavailable to detection-based methods, enabling higher sensitivity and more accurate localization using techniques such as waveform matching. SIG-VISA's Bayesian forward model of seismic signal envelopes includes physically derived models of travel times and source characteristics as well as Gaussian process (kriging) statistical models of signal properties that combine interpolation of historical data with extrapolation of learned physical trends. Applying Bayesian inference, we evaluate the model on earthquakes as well as the 2009 DPRK test event, demonstrating a waveform matching effect as part of the probabilistic inference, along with results on event localization and sensitivity. In particular, we demonstrate increased sensitivity from signal-based modeling, in which the SIG-VISA signal model finds statistical evidence for arrivals even at stations for which the IMS station processing failed to register any detection.
Bayesian Atmospheric Radiative Transfer (BART) Code and Application to WASP-43b
NASA Astrophysics Data System (ADS)
Blecic, Jasmina; Harrington, Joseph; Cubillos, Patricio; Bowman, Oliver; Rojo, Patricio; Stemm, Madison; Lust, Nathaniel B.; Challener, Ryan; Foster, Austin James; Foster, Andrew S.; Blumenthal, Sarah D.; Bruce, Dylan
2016-01-01
We present a new open-source Bayesian radiative-transfer framework, Bayesian Atmospheric Radiative Transfer (BART, https://github.com/exosports/BART), and its application to WASP-43b. BART initializes a model for the atmospheric retrieval calculation, generates thousands of theoretical model spectra using parametrized pressure and temperature profiles and line-by-line radiative-transfer calculations, and employs a statistical package to compare the models with the observations. It consists of three self-sufficient modules available to the community under the reproducible-research license: the Thermochemical Equilibrium Abundances module (TEA, https://github.com/dzesmin/TEA, Blecic et al. 2015), the radiative-transfer module (Transit, https://github.com/exosports/transit), and the Multi-core Markov-chain Monte Carlo statistical module (MCcubed, https://github.com/pcubillos/MCcubed, Cubillos et al. 2015). We applied BART to all available WASP-43b secondary-eclipse data from space- and ground-based observations, constraining the temperature-pressure profile and molecular abundances of the dayside atmosphere of WASP-43b. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G. JB holds a NASA Earth and Space Science Fellowship.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2016-10-04
A novel method for compound identification in liquid chromatography-high resolution mass spectrometry (LC-HRMS) is proposed. The method, based on Bayesian statistics, accommodates all possible uncertainties involved, from instrumentation up to data analysis into a single model yielding the probability of the compound of interest being present/absent in the sample. This approach differs from the classical methods in two ways. First, it is probabilistic (instead of deterministic); hence, it computes the probability that the compound is (or is not) present in a sample. Second, it answers the hypothesis "the compound is present", opposed to answering the question "the compound feature is present". This second difference implies a shift in the way data analysis is tackled, since the probability of interfering compounds (i.e., isomers and isobaric compounds) is also taken into account.
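The probabilistic (rather than deterministic) identification decision reduces to Bayes' rule; a minimal sketch with invented likelihood values (the paper's model derives them from the full instrument-to-data-analysis chain):

def posterior_present(prior, lik_present, lik_absent):
    # P(compound present | data) from the prior and the two likelihoods.
    num = prior * lik_present
    return num / (num + (1.0 - prior) * lik_absent)

# A feature 20x more likely under "present" still yields a modest
# posterior when the prior probability of the compound is low.
print(posterior_present(prior=0.05, lik_present=0.80, lik_absent=0.04))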
NASA Astrophysics Data System (ADS)
Goodman, Steven N.
1989-11-01
This dissertation explores the use of a mathematical measure of statistical evidence, the log likelihood ratio, in clinical trials. The methods and thinking behind the use of an evidential measure are contrasted with traditional methods of analyzing data, which depend primarily on a p-value as an estimate of the statistical strength of an observed data pattern. It is contended that neither the behavioral dictates of Neyman-Pearson hypothesis testing methods, nor the coherency dictates of Bayesian methods are realistic models on which to base inference. The use of the likelihood alone is applied to four aspects of trial design or conduct: the calculation of sample size, the monitoring of data, testing for the equivalence of two treatments, and meta-analysis--the combining of results from different trials. Finally, a more general model of statistical inference, using belief functions, is used to see if it is possible to separate the assessment of evidence from our background knowledge. It is shown that traditional and Bayesian methods can be modeled as two ends of a continuum of structured background knowledge, methods which summarize evidence at the point of maximum likelihood assuming no structure, and Bayesian methods assuming complete knowledge. Both schools are seen to be missing a concept of ignorance- -uncommitted belief. This concept provides the key to understanding the problem of sampling to a foregone conclusion and the role of frequency properties in statistical inference. The conclusion is that statistical evidence cannot be defined independently of background knowledge, and that frequency properties of an estimator are an indirect measure of uncommitted belief. Several likelihood summaries need to be used in clinical trials, with the quantitative disparity between summaries being an indirect measure of our ignorance. This conclusion is linked with parallel ideas in the philosophy of science and cognitive psychology.
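For binomial trial data, the log likelihood ratio advocated here as an evidential measure is a one-line computation (the response rates below are hypothetical):

import math

def log_lr(successes, n, p0, p1):
    # Log likelihood ratio for response rate p1 versus p0; the binomial
    # coefficient is common to both hypotheses and cancels.
    k = successes
    return k * math.log(p1 / p0) + (n - k) * math.log((1 - p1) / (1 - p0))

# 14 responders out of 40: evidence for a 40% rate over a 20% rate.
print(log_lr(14, 40, p0=0.2, p1=0.4))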
Bayesian Analysis of Biogeography when the Number of Areas is Large
Landis, Michael J.; Matzke, Nicholas J.; Moore, Brian R.; Huelsenbeck, John P.
2013-01-01
Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a “data-augmentation” approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program, BayArea. [ancestral area analysis; Bayesian biogeographic inference; data augmentation; historical biogeography; Markov chain Monte Carlo.] PMID:23736102
Evaluating Courses of Actions at the Strategic Planning Level
2013-03-01
NASA Astrophysics Data System (ADS)
Walker, David M.; Allingham, David; Lee, Heung Wing Joseph; Small, Michael
2010-02-01
Small world network models have been effective in capturing the variable behaviour of reported case data of the SARS coronavirus outbreak in Hong Kong during 2003. Simulations of these models have previously been realized using informed “guesses” of the proposed model parameters and tested for consistency with the reported data by surrogate analysis. In this paper we attempt to provide statistically rigorous parameter distributions using Approximate Bayesian Computation sampling methods. We find that such sampling schemes are a useful framework for fitting parameters of stochastic small world network models where simulation of the system is straightforward but expressing a likelihood is cumbersome.
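The skeleton of the rejection-sampling variant of Approximate Bayesian Computation is short; this sketch uses a toy one-parameter simulator as a stand-in for the stochastic small world network model:

import random

def abc_rejection(observed, simulate, prior_sampler, distance, eps, n_draws=10000):
    # Keep parameter draws whose simulated summary lands within eps
    # of the observed summary; the kept draws approximate the posterior.
    return [theta for theta in (prior_sampler() for _ in range(n_draws))
            if distance(simulate(theta), observed) < eps]

# Toy example: infer a binomial mean (the posterior concentrates near 7).
post = abc_rejection(
    observed=7,
    simulate=lambda lam: sum(random.random() < lam / 100 for _ in range(100)),
    prior_sampler=lambda: random.uniform(0, 20),
    distance=lambda a, b: abs(a - b),
    eps=1)
print("accepted:", len(post), " posterior mean ~", sum(post) / len(post))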
Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background
NASA Astrophysics Data System (ADS)
McEwen, J. D.; Feeney, S. M.; Peiris, H. V.; Wiaux, Y.; Ringeval, C.; Bouchet, F. R.
2017-12-01
Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high-energy scales. We develop a framework for cosmic string inference from observations of the CMB made over the celestial sphere, performing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension Gμ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations, we demonstrate the application of our framework and evaluate its performance. The method is sensitive to Gμ ∼ 5 × 10⁻⁷ for Nambu-Goto string simulations that include an integrated Sachs-Wolfe contribution only and do not include any recombination effects, before any parameters of the analysis are optimized. The sensitivity of the method compares favourably with other techniques applied to the same simulations.
Bayesian estimates of the incidence of rare cancers in Europe.
Botta, Laura; Capocaccia, Riccardo; Trama, Annalisa; Herrmann, Christian; Salmerón, Diego; De Angelis, Roberta; Mallone, Sandra; Bidoli, Ettore; Marcos-Gragera, Rafael; Dudek-Godeau, Dorota; Gatta, Gemma; Cleries, Ramon
2018-04-21
The RARECAREnet project has updated the estimates of the burden of the 198 rare cancers in each European country. Suspecting that scant data could affect the reliability of statistical analysis, we employed a Bayesian approach to estimate the incidence of these cancers. We analyzed about 2,000,000 rare cancers diagnosed in 2000-2007 provided by 83 population-based cancer registries from 27 European countries. We considered European incidence rates (IRs), calculated over all the data available in RARECAREnet, as a valid a priori to merge with country-specific observed data. Therefore we provided (1) Bayesian estimates of IRs and the yearly numbers of cases of rare cancers in each country; (2) the expected time (T) in years needed to observe one new case; and (3) practical criteria to decide when to use the Bayesian approach. Bayesian and classical estimates did not differ much; substantial differences (>10%) ranged from 77 rare cancers in Iceland to 14 in England. The smaller the population the larger the number of rare cancers needing a Bayesian approach. Bayesian estimates were useful for cancers with fewer than 150 observed cases in a country during the study period; this occurred mostly when the population of the country is small. For the first time the Bayesian estimates of IRs and the yearly expected numbers of cases for each rare cancer in each individual European country were calculated. Moreover, the indicator T is useful to convey incidence estimates for exceptionally rare cancers and in small countries; it far exceeds the professional lifespan of a medical doctor. Copyright © 2018 Elsevier Ltd. All rights reserved.
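One conjugate sketch of the merging idea (European rates as a prior for country-specific Poisson counts); the prior strength, counts, and populations below are invented for illustration, not the paper's calibration:

def bayes_incidence(eu_rate, prior_py, cases, person_years):
    # Gamma-Poisson update: prior Gamma(eu_rate*prior_py, prior_py), i.e.
    # prior_py person-years of "prior data" at the European rate, combined
    # with the observed country cases. Returns the posterior mean rate.
    return (eu_rate * prior_py + cases) / (prior_py + person_years)

rate = bayes_incidence(eu_rate=2e-6, prior_py=5e6, cases=3, person_years=2.4e6)
print("posterior rate per 100,000 person-years:", rate * 1e5)
print("expected years to one new case (T):", 1.0 / (rate * 3e5))  # population 300,000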
NASA Astrophysics Data System (ADS)
Han, Feng; Zheng, Yi
2018-06-01
Input uncertainty is a major source of error in watershed water quality (WWQ) modeling, and it remains challenging to address it in a rigorous Bayesian framework. This study develops the Bayesian Analysis of Input and Parametric Uncertainties (BAIPU), an approach for the joint analysis of input and parametric uncertainties through a tight coupling of Markov Chain Monte Carlo (MCMC) analysis and Bayesian Model Averaging (BMA). The formal likelihood function for this approach is derived for a lag-1 autocorrelated, heteroscedastic, and Skew Exponential Power (SEP) distributed error model. A series of numerical experiments were performed on a synthetic nitrate pollution case and on a real study case in the Newport Bay Watershed, California. The Soil and Water Assessment Tool (SWAT) and Differential Evolution Adaptive Metropolis (DREAM(ZS)) were used as the representative WWQ model and MCMC algorithm, respectively. The major findings are as follows: (1) the BAIPU can be implemented and used to appropriately identify the uncertain parameters and characterize the predictive uncertainty; (2) the compensation effect between the input and parametric uncertainties can seriously mislead modeling-based management decisions if the input uncertainty is not explicitly accounted for; (3) the BAIPU accounts for the interaction between the input and parametric uncertainties and therefore provides more accurate calibration and uncertainty results than a sequential analysis of the uncertainties; and (4) the BAIPU quantifies the credibility of different input assumptions on a statistical basis and can be implemented as an effective inverse modeling approach to the joint inference of parameters and inputs.
Development of a Bayesian Belief Network Runway Incursion and Excursion Model
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2014-01-01
In a previous work, a statistical analysis of runway incursion (RI) event data was conducted to ascertain the relevance of this data to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to several of the AvSP top ten TC and identified numerous primary causes and contributing factors of RI events. The statistical analysis served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events, also previously reported. Through literature searches and data analysis, this RI event network has now been extended to also model runway excursion (RE) events. These RI and RE event networks have been further modified and vetted by a Subject Matter Expert (SME) panel. The combined system-level BBN model will allow NASA to generically model the causes of RI and RE events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of runway safety incidents/accidents, and to improve runway safety in general. The development and structure of the BBN for both RI and RE events are documented in this paper.
NASA Astrophysics Data System (ADS)
Mugnes, J.-M.; Robert, C.
2015-11-01
Spectral analysis is a powerful tool to investigate stellar properties and it has been widely used for decades now. However, the methods considered to perform this kind of analysis are mostly based on iteration among a few diagnostic lines to determine the stellar parameters. While these methods are often simple and fast, they can lead to errors and large uncertainties due to the required assumptions. Here, we present a method based on Bayesian statistics to find simultaneously the best combination of effective temperature, surface gravity, projected rotational velocity, and microturbulence velocity, using all the available spectral lines. Different tests are discussed to demonstrate the strength of our method, which we apply to 54 mid-resolution spectra of field and cluster B stars obtained at the Observatoire du Mont-Mégantic. We compare our results with those found in the literature. Differences are seen which are well explained by the different methods used. We conclude that the B-star microturbulence velocities are often underestimated. We also confirm the trend that B stars in clusters are on average faster rotators than field B stars.
Reduced Order Modeling Methods for Turbomachinery Design
2009-03-01
Arenas, Miguel
2015-04-01
NGS technologies enable fast and cheap generation of genomic data. Nevertheless, ancestral genome inference is not so straightforward, owing to complex evolutionary processes acting on this material, such as inversions, translocations, and other genome rearrangements that, in addition to their implicit complexity, can co-occur and confound ancestral inferences. Recently, models of genome evolution that accommodate such complex genomic events have been emerging. This letter explores these novel evolutionary models and proposes their incorporation into robust statistical approaches based on computer simulations, such as approximate Bayesian computation, that may produce a more realistic evolutionary analysis of genomic data. Advantages and pitfalls in using these analytical methods are discussed. Potential applications of these ancestral genomic inferences are also pointed out.
Burroughs, N J; Pillay, D; Mutimer, D
1999-01-01
Bayesian analysis using a virus dynamics model is demonstrated to facilitate hypothesis testing of patterns in clinical time-series. Our Markov chain Monte Carlo implementation demonstrates that the viraemia time-series observed in two sets of hepatitis B patients on antiviral (lamivudine) therapy, chronic carriers and liver transplant patients, are significantly different, overcoming clinical trial design differences that question the validity of non-parametric tests. We show that lamivudine-resistant mutants grow faster in transplant patients than in chronic carriers, which probably explains the differences in emergence times and failure rates between these two sets of patients. Incorporation of dynamic models into Bayesian parameter analysis is of general applicability in medical statistics. PMID:10643081
Using data mining techniques to predict the severity of bicycle crashes.
Prati, Gabriele; Pietrantoni, Luca; Fraboni, Federico
2017-04-01
To investigate the factors predicting the severity of bicycle crashes in Italy, we used an observational study of official statistics. We applied two of the most widely used data mining techniques, the CHAID decision tree technique and Bayesian network analysis, to data provided by the Italian National Institute of Statistics on road crashes that occurred on the Italian road network during the period from 2011 to 2013. From the original database comprising a total of 575,093 road accidents, we extracted the 49,621 road accidents in which at least one cyclist was injured or killed. The CHAID decision tree technique was employed to establish the relationship between the severity of bicycle crashes and factors related to crash characteristics (type of collision and opponent vehicle), infrastructure characteristics (type of carriageway, road type, road signage, pavement type, and type of road segment), cyclists (gender and age), and environmental factors (time of the day, day of the week, month, pavement condition, and weather). CHAID analysis revealed that the most important predictors were, in decreasing order of importance, road type (0.30), crash type (0.24), age of cyclist (0.19), road signage (0.08), gender of cyclist (0.07), type of opponent vehicle (0.05), month (0.04), and type of road segment (0.02). These eight most important predictors of the severity of bicycle crashes were included as predictors of the target (i.e., severity of bicycle crashes) in Bayesian network analysis. Bayesian network analysis identified crash type (0.31), road type (0.19), and type of opponent vehicle (0.18) as the most important predictors of the severity of bicycle crashes. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.
Houseman, E Andres; Virji, M Abbas
2017-08-01
Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the auto-regressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some of the contemporary methods. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
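The LOD treatment described above (integrating over the left tail) has a compact likelihood expression; this sketch assumes independent lognormal measurements and omits the paper's autocorrelation and hierarchical structure:

import math

def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -math.log(sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z

def normal_logcdf(x, mu, sigma):
    return math.log(0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0)))))

def loglik_with_lod(log_values, lod, mu, sigma):
    # Observations recorded only as "< LOD" (None) contribute the left-tail
    # probability Phi((log LOD - mu)/sigma) instead of a density value.
    ll = 0.0
    for x in log_values:
        if x is None:
            ll += normal_logcdf(math.log(lod), mu, sigma)
        else:
            ll += normal_logpdf(x, mu, sigma)
    return ll

data = [math.log(3.2), None, math.log(5.1), None]  # None = below LOD
print(loglik_with_lod(data, lod=1.0, mu=1.0, sigma=0.8))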
Additive Genetic Variability and the Bayesian Alphabet
Gianola, Daniel; de los Campos, Gustavo; Hill, William G.; Manfredi, Eduardo; Fernando, Rohan
2009-01-01
The use of all available molecular markers in statistical models for prediction of quantitative traits has led to what could be termed a genomic-assisted selection paradigm in animal and plant breeding. This article provides a critical review of some theoretical and statistical concepts in the context of genomic-assisted genetic evaluation of animals and crops. First, relationships between the (Bayesian) variance of marker effects in some regression models and additive genetic variance are examined under standard assumptions. Second, the connection between marker genotypes and resemblance between relatives is explored, and linkages between a marker-based model and the infinitesimal model are reviewed. Third, issues associated with the use of Bayesian models for marker-assisted selection, with a focus on the role of the priors, are examined from a theoretical angle. The sensitivity of a Bayesian specification that has been proposed (called “Bayes A”) with respect to priors is illustrated with a simulation. Methods that can solve potential shortcomings of some of these Bayesian regression procedures are discussed briefly. PMID:19620397
Why Bayesian Psychologists Should Change the Way They Use the Bayes Factor.
Hoijtink, Herbert; van Kooten, Pascal; Hulsker, Koenraad
2016-01-01
The discussion following Bem's (2011) psi research highlights that applications of the Bayes factor in psychological research are not without problems. The first problem is the failure to translate subjective prior knowledge into subjective prior distributions. In the words of Savage (1961): "they make the Bayesian omelet without breaking the Bayesian egg." The second problem occurs if the Bayesian egg is not broken: the failure to choose default prior distributions such that the ensuing inferences are well calibrated. The third problem is the adherence to inadequate rules for the interpretation of the size of the Bayes factor. The current paper will elaborate these problems and show how to avoid them using the basic hypotheses and statistical model used in the first experiment described in Bem (2011). It will be argued that a thorough investigation of these problems in the context of more encompassing hypotheses and statistical models is called for if Bayesian psychologists want to add a well-founded Bayes factor to the tool kit of psychological researchers.
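The dependence of the Bayes factor on the prior, central to the problems listed above, is easy to exhibit in the simplest normal-mean setting (illustrative numbers, not Bem's data):

import math

def bf01(xbar, n, sigma, tau):
    # Bayes factor for H0: mu = 0 versus H1: mu ~ N(0, tau^2), where
    # xbar ~ N(mu, sigma^2/n); both marginal densities are normal in xbar.
    se2 = sigma ** 2 / n
    logpdf = lambda v: -0.5 * (math.log(2.0 * math.pi * v) + xbar ** 2 / v)
    return math.exp(logpdf(se2) - logpdf(se2 + tau ** 2))

# Widening the prior under H1 drives the Bayes factor toward the null.
for tau in (0.1, 0.5, 2.0, 10.0):
    print(f"tau = {tau:5.1f}  BF01 = {bf01(xbar=0.3, n=25, sigma=1.0, tau=tau):.2f}")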
NASA Astrophysics Data System (ADS)
Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.
2017-12-01
The basic model for the recognition of natural and anthropogenic objects using their spectral and textural features is described for the problem of hyperspectral airborne and spaceborne imagery processing. The model is based on improvements to the Bayesian classifier, a computational procedure for statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on the basis of empirical orthogonal functions. Examples are shown of the application of various modifications of the Bayesian classifier and the Support Vector Machine method. Examples are provided of comparing these classifiers with a metrical classifier that operates by finding the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the "K-weighted neighbors" method, which is close to the nonparametric Bayesian classifier.
A Bayesian sequential processor approach to spectroscopic portal system decisions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sale, K; Candy, J; Breitfeller, E
The development of faster, more reliable techniques to detect radioactive contraband in a portal-type scenario is an extremely important problem, especially in this era of constant terrorist threats. Towards this goal, the development of a model-based, Bayesian sequential data processor for the detection problem is discussed. In the sequential processor each datum (detector energy deposit and pulse arrival time) is used to update the posterior probability distribution over the space of model parameters. The nature of the sequential processor approach is that a detection is produced as soon as it is statistically justified by the data, rather than waiting for a fixed counting interval before any analysis is performed. In this paper the Bayesian model-based approach, the physics and signal processing models, and the decision functions are discussed along with the first results of our research.
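A toy version of the sequential decision logic (Poisson counts and a fixed posterior threshold stand in for the full physics and decision models of the paper; all rates and counts are invented):

import math, random

def sequential_detector(counts, bkg_rate, src_rate, prior=0.5, threshold=0.999):
    # Update P(source present) after each count and stop as soon as the
    # posterior crosses the threshold, instead of waiting a fixed interval.
    pois = lambda k, lam: math.exp(-lam) * lam ** k / math.factorial(k)
    p = prior
    for t, k in enumerate(counts, start=1):
        num = p * pois(k, bkg_rate + src_rate)
        p = num / (num + (1.0 - p) * pois(k, bkg_rate))
        if p > threshold:
            return t, p  # declare a detection at time step t
    return None, p

elevated = [random.randint(7, 12) for _ in range(30)]  # background mean is 5
print(sequential_detector(elevated, bkg_rate=5.0, src_rate=4.0))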
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jesus, J.F.; Valentim, R.; Andrade-Oliveira, F., E-mail: jfjesus@itapeva.unesp.br, E-mail: valentim.rodolfo@unifesp.br, E-mail: felipe.oliveira@port.ac.uk
Creation of Cold Dark Matter (CCDM), in the context of the Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Bayesian Evidence (BE). These criteria allow models to be compared on the basis of goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH₀ model, can be discarded from the current analysis. Three other scenarios are discarded, either because of poor fitting or because of an excess of free parameters. A method of increasing the Bayesian evidence through reparameterization, in order to reduce parameter degeneracy, is also developed.
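The information criteria named above are simple functions of the best-fit chi-square; with invented numbers (not the paper's fits), the sketch below shows how BIC penalizes an extra parameter more heavily than AIC for a supernova-sized sample:

import math

aic = lambda chi2, k: chi2 + 2 * k
bic = lambda chi2, k, n: chi2 + k * math.log(n)

n = 580  # rough size of a SNe Ia compilation (assumption for illustration)
# A model that improves chi2 by 2 at the cost of one extra free parameter:
print("Delta AIC:", aic(562.2, 2) - aic(564.2, 1))        # 0: no preference
print("Delta BIC:", bic(562.2, 2, n) - bic(564.2, 1, n))  # > 0: simpler model preferred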
The importance of proving the null.
Gallistel, C R
2009-04-01
Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: Compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose 3 common experimental questions: (a) Are 2 means the same? (b) Is performance at chance? (c) Are factors additive? (c) 2009 APA, all rights reserved
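A minimal version of the recommended sensitivity analysis: the odds for the null against a vague alternative, swept over the hypothesized maximum effect size (data values below are hypothetical):

import math

def odds_for_null(xbar, se, max_effect, grid=2000):
    # Odds p(data|H0)/p(data|H1) with H0: effect = 0 and H1: effect
    # uniform on [0, max_effect]; the marginal under H1 is a grid average.
    lik = lambda mu: math.exp(-0.5 * ((xbar - mu) / se) ** 2) / (se * math.sqrt(2.0 * math.pi))
    step = max_effect / grid
    marg = sum(lik(i * step) for i in range(grid + 1)) / (grid + 1)
    return lik(0.0) / marg

for L in (0.1, 0.5, 1.0, 2.0):
    print(f"max effect {L:4.1f}: odds for null = {odds_for_null(0.05, 0.2, L):.2f}")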
Tressoldi, Patrizio E.
2011-01-01
Starting from the famous phrase “extraordinary claims require extraordinary evidence,” we will present the evidence supporting the concept that human visual perception may have non-local properties, in other words, that it may operate beyond the space and time constraints of sensory organs, in order to discuss which criteria can be used to define evidence as extraordinary. This evidence has been obtained from seven databases related to six different protocols used to test the reality and the functioning of non-local perception, analyzed using both a frequentist and a new Bayesian meta-analysis statistical procedure. According to the frequentist meta-analysis, the null hypothesis can be rejected for all six protocols even if the effect sizes range from 0.007 to 0.28. According to the Bayesian meta-analysis, the Bayes factors provide strong evidence to support the alternative hypothesis (H1) over the null hypothesis (H0), but only for three out of the six protocols. We will discuss whether quantitative psychology can contribute to defining the criteria for the acceptance of new scientific ideas in order to avoid the inconclusive controversies between supporters and opponents. PMID:21713069
“Magnitude-based Inference”: A Statistical Review
Welsh, Alan H.; Knight, Emma J.
2015-01-01
Purpose We consider “magnitude-based inference” and its interpretation by examining in detail its use in the problem of comparing two means. Methods We extract from the spreadsheets, which are provided to users of the analysis (http://www.sportsci.org/), a precise description of how “magnitude-based inference” is implemented. We compare the implemented version of the method with general descriptions of it and interpret the method in familiar statistical terms. Results and Conclusions We show that “magnitude-based inference” is not a progressive improvement on modern statistics. The additional probabilities introduced are not directly related to the confidence interval but, rather, are interpretable either as P values for two different nonstandard tests (for different null hypotheses) or as approximate Bayesian calculations, which also lead to a type of test. We also discuss sample size calculations associated with “magnitude-based inference” and show that the substantial reduction in sample sizes claimed for the method (30% of the sample size obtained from standard frequentist calculations) is not justifiable so the sample size calculations should not be used. Rather than using “magnitude-based inference,” a better solution is to be realistic about the limitations of the data and use either confidence intervals or a fully Bayesian analysis. PMID:25051387
Bayesian Orbit Computation Tools for Objects on Geocentric Orbits
NASA Astrophysics Data System (ADS)
Virtanen, J.; Granvik, M.; Muinonen, K.; Oszkiewicz, D.
2013-08-01
We consider the space-debris orbital inversion problem via the concept of Bayesian inference. The methodology was put forward for the orbital analysis of solar system small bodies in the early 1990s [7] and results in a full solution of the statistical inverse problem given in terms of an a posteriori probability density function (PDF) for the orbital parameters. We demonstrate the applicability of our statistical orbital analysis software to Earth-orbiting objects, using both well-established Monte Carlo (MC) techniques (for a review, see e.g. [13]) and recently developed Markov-chain MC (MCMC) techniques (e.g., [9]). In particular, we exploit the novel virtual observation MCMC method [8], which is based on the characterization of the phase-space volume of orbital solutions before the actual MCMC sampling. Our statistical methods and the resulting PDFs immediately enable probabilistic impact predictions to be carried out. Furthermore, this can readily be done also for very sparse data sets and data sets of poor quality, provided that some a priori information on the observational uncertainty is available. For asteroids, impact probabilities with the Earth from the discovery night onwards have been provided, e.g., by [11] and [10]; the latter study includes the sampling of the observational-error standard deviation as a random variable.
Gamalo-Siebers, Margaret; Savic, Jasmina; Basu, Cynthia; Zhao, Xin; Gopalakrishnan, Mathangi; Gao, Aijun; Song, Guochen; Baygani, Simin; Thompson, Laura; Xia, H Amy; Price, Karen; Tiwari, Ram; Carlin, Bradley P
2017-07-01
Children represent a large underserved population of "therapeutic orphans," as an estimated 80% of children are treated off-label. However, pediatric drug development often faces substantial challenges, including economic, logistical, technical, and ethical barriers, among others. Among many efforts trying to remove these barriers, increased recent attention has been paid to extrapolation; that is, the leveraging of available data from adults or older age groups to draw conclusions for the pediatric population. The Bayesian statistical paradigm is natural in this setting, as it permits the combining (or "borrowing") of information across disparate sources, such as the adult and pediatric data. In this paper, authored by the pediatric subteam of the Drug Information Association Bayesian Scientific Working Group and Adaptive Design Working Group, we develop, illustrate, and provide suggestions on Bayesian statistical methods that could be used to design improved pediatric development programs that use all available information in the most efficient manner. A variety of relevant Bayesian approaches are described, several of which are illustrated through 2 case studies: extrapolating adult efficacy data to expand the labeling for Remicade to include pediatric ulcerative colitis and extrapolating adult exposure-response information for antiepileptic drugs to pediatrics. Copyright © 2017 John Wiley & Sons, Ltd.
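One widely discussed borrowing device is a power prior, in which adult data enter the pediatric analysis downweighted by a discount factor; the beta-binomial sketch below (with invented counts) is a generic illustration of the idea, not the paper's case-study analysis:

import math

def power_prior_posterior(x_ped, n_ped, x_adult, n_adult, a0):
    # Beta(1, 1) baseline prior; adult successes/failures are discounted
    # by a0 in [0, 1]: a0 = 0 ignores adult data, a0 = 1 pools it fully.
    a = 1 + a0 * x_adult + x_ped
    b = 1 + a0 * (n_adult - x_adult) + (n_ped - x_ped)
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd

for a0 in (0.0, 0.3, 1.0):
    m, s = power_prior_posterior(x_ped=9, n_ped=20, x_adult=120, n_adult=200, a0=a0)
    print(f"a0 = {a0:.1f}: posterior mean {m:.3f}, sd {s:.3f}")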
NASA Astrophysics Data System (ADS)
Freni, Gabriele; Mannina, Giorgio
In urban drainage modelling, uncertainty analysis is of undoubted necessity. However, uncertainty analysis in urban water-quality modelling is still in its infancy and only a few studies have been carried out. Therefore, several methodological aspects still need to be explored and clarified, especially regarding water quality modelling. The use of the Bayesian approach for uncertainty analysis has been stimulated by its rigorous theoretical framework and by the possibility of evaluating the impact of new knowledge on the modelling predictions. Nevertheless, the Bayesian approach relies on some restrictive hypotheses that are not present in less formal methods like the Generalised Likelihood Uncertainty Estimation (GLUE). One crucial point in the application of the Bayesian method is the formulation of a likelihood function that is conditioned by the hypotheses made regarding the model residuals. Statistical transformations, such as the Box-Cox equation, are generally used to ensure the homoscedasticity of residuals. However, this practice may affect the reliability of the analysis, leading to a wrong uncertainty estimation. The present paper aims to explore the influence of the Box-Cox equation for environmental water quality models. To this end, five cases were considered, one of which used the “real” residuals distribution (i.e. drawn from available data). The analysis was applied to the Nocella experimental catchment (Italy), an agricultural and semi-urbanised basin where two sewer systems, two wastewater treatment plants and a river reach were monitored during both dry and wet weather periods. The results show that the uncertainty estimation is greatly affected by the residual transformation and that a wrong assumption may also affect the evaluation of model uncertainty. The use of less formal methods always provides an overestimation of modelling uncertainty with respect to the Bayesian method, but this effect is reduced if a wrong assumption is made regarding the residuals distribution. If the residuals are not normally distributed, the uncertainty is over-estimated if the Box-Cox transformation is not applied or if a non-calibrated parameter is used.
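For reference, the transformation at issue is a one-liner (λ is the transformation parameter whose role the paper investigates):

import math

def box_cox(y, lam):
    # Box-Cox transformation used to homoscedasticize residuals:
    # (y**lam - 1)/lam for lam != 0, and log(y) in the limit lam -> 0.
    return math.log(y) if abs(lam) < 1e-12 else (y ** lam - 1.0) / lam

# Smaller lam compresses large values, changing how residual variance
# enters the likelihood function.
for lam in (1.0, 0.5, 0.0):
    print(lam, [round(box_cox(v, lam), 3) for v in (0.5, 5.0, 50.0)])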
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics
Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.
2015-01-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564
Data Analysis Techniques for Physical Scientists
NASA Astrophysics Data System (ADS)
Pruneau, Claude A.
2017-10-01
Preface; How to read this book; 1. The scientific method; Part I. Foundation in Probability and Statistics: 2. Probability; 3. Probability models; 4. Classical inference I: estimators; 5. Classical inference II: optimization; 6. Classical inference III: confidence intervals and statistical tests; 7. Bayesian inference; Part II. Measurement Techniques: 8. Basic measurements; 9. Event reconstruction; 10. Correlation functions; 11. The multiple facets of correlation functions; 12. Data correction methods; Part III. Simulation Techniques: 13. Monte Carlo methods; 14. Collision and detector modeling; List of references; Index.
Unmasking the masked Universe: the 2M++ catalogue through Bayesian eyes
NASA Astrophysics Data System (ADS)
Lavaux, Guilhem; Jasche, Jens
2016-01-01
This work describes a full Bayesian analysis of the Nearby Universe as traced by galaxies of the 2M++ survey. The analysis is run in two sequential steps. The first step self-consistently derives the luminosity-dependent galaxy biases, the power spectrum of matter fluctuations and matter density fields within a Gaussian statistics approximation. The second step makes a detailed analysis of the three-dimensional large-scale structures, assuming a fixed bias model and a fixed cosmology, and allows for the reconstruction of both the final density field and the initial conditions at z = 1000. From these, we derive fields that self-consistently extrapolate the observed large-scale structures. We give two examples of these extrapolations and their utility for the detection of structures: the visibility of the Sloan Great Wall, and the detection and characterization of the Local Void using DIVA, a Lagrangian-based technique to classify structures.
Testing non-minimally coupled inflation with CMB data: a Bayesian analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campista, Marcela; Benetti, Micol; Alcaniz, Jailson, E-mail: campista@on.br, E-mail: micolbenetti@on.br, E-mail: alcaniz@on.br
2017-09-01
We use the most recent cosmic microwave background (CMB) data to perform a Bayesian statistical analysis and discuss the observational viability of inflationary models with a non-minimal coupling, ξ, between the inflaton field and the Ricci scalar. We particularize our analysis to two examples of small and large field inflationary models, namely, the Coleman-Weinberg and the chaotic quartic potentials. We find that (i) the ξ parameter is closely correlated with the primordial amplitude; (ii) although improving the agreement with the CMB data in the r − n_s plane, where r is the tensor-to-scalar ratio and n_s the primordial spectral index, a non-null coupling is strongly disfavoured with respect to the minimally coupled standard ΛCDM model, since the upper bounds of the Bayes factor (odds) for the ξ parameter are greater than 150:1.
Rasmussen, Peter M.; Smith, Amy F.; Sakadžić, Sava; Boas, David A.; Pries, Axel R.; Secomb, Timothy W.; Østergaard, Leif
2017-01-01
Objective In vivo imaging of the microcirculation and network-oriented modeling have emerged as powerful means of studying microvascular function and understanding its physiological significance. Network-oriented modeling may provide the means of summarizing vast amounts of data produced by high-throughput imaging techniques in terms of key, physiological indices. To estimate such indices with sufficient certainty, however, network-oriented analysis must be robust to the inevitable presence of uncertainty due to measurement errors as well as model errors. Methods We propose the Bayesian probabilistic data analysis framework as a means of integrating experimental measurements and network model simulations into a combined and statistically coherent analysis. The framework naturally handles noisy measurements and provides posterior distributions of model parameters as well as physiological indices associated with uncertainty. Results We applied the analysis framework to experimental data from three rat mesentery networks and one mouse brain cortex network. We inferred distributions for more than five hundred unknown pressure and hematocrit boundary conditions. Model predictions were consistent with previous analyses, and remained robust when measurements were omitted from model calibration. Conclusion Our Bayesian probabilistic approach may be suitable for optimizing data acquisition and for analyzing and reporting large datasets acquired as part of microvascular imaging studies. PMID:27987383
Bayesian Inference: with ecological applications
Link, William A.; Barker, Richard J.
2010-01-01
This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference, with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies, as well as to students in advanced undergraduate statistics. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.
Johnson, Eric D; Tubau, Elisabet
2017-06-01
Presenting natural frequencies facilitates Bayesian inferences relative to using percentages. Nevertheless, many people, including highly educated and skilled reasoners, still fail to provide Bayesian responses to these computationally simple problems. We show that the complexity of relational reasoning (e.g., the structural mapping between the presented and requested relations) can help explain the remaining difficulties. With a non-Bayesian inference that required identical arithmetic but afforded a more direct structural mapping, performance was universally high. Furthermore, reducing the relational demands of the task through questions that directed reasoners to use the presented statistics, as compared with questions that prompted the representation of a second, similar sample, also significantly improved reasoning. Distinct error patterns were also observed between these presented- and similar-sample scenarios, which suggested differences in relational-reasoning strategies. On the other hand, while higher numeracy was associated with better Bayesian reasoning, higher-numerate reasoners were not immune to the relational complexity of the task. Together, these findings validate the relational-reasoning view of Bayesian problem solving and highlight the importance of considering not only the presented task structure, but also the complexity of the structural alignment between the presented and requested relations.
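The facilitation can be seen in the arithmetic itself; this sketch contrasts the two formats on the classic mammography numbers (illustrative values, not this study's stimuli):

# Percentage format: base rate 1%, sensitivity 80%, false-positive rate 9.6%.
p_d, sens, fpr = 0.01, 0.80, 0.096
bayes = p_d * sens / (p_d * sens + (1 - p_d) * fpr)

# Natural-frequency format: of 1000 people, 10 have the disease and 8 of
# them test positive; of the 990 without it, about 95 test positive.
nat = 8 / (8 + 95)

print(f"percentage format:        {bayes:.3f}")
print(f"natural-frequency format: {nat:.3f}")  # same answer, simpler mapping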
NASA Astrophysics Data System (ADS)
Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.
2013-04-01
Most of the measurement strategies suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring airborne particle concentrations (according to different metrics) in real time. Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time-resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature. These range from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and an appropriate and robust method is still being sought. In this context, this exploratory study investigates a statistical method for analysing time-resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, we used particle number concentration data from a workplace study that investigated the potential for exposure via inhalation during cleanout operations, by sandpapering, of a reactor producing nanocomposite thin films. In this workplace study, the background issue was addressed through near-field and far-field approaches, and several size-integrated and time-resolved devices were used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters, one measuring at the source of the released particles and the other measuring far-field in parallel. The Bayesian probabilistic approach allows probabilistic modelling of the data series, and the observed task is modelled in the form of probability distributions. The probability distributions derived from the time-resolved data obtained at the source can be compared with those derived from the time-resolved data obtained far-field, leading to a quantitative estimate of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of such results requires specific experience in statistics.
NASA Astrophysics Data System (ADS)
Iskandar, I.
2018-03-01
The exponential distribution is the most widely used distribution in reliability analysis. It is very suitable for representing the lengths of life in many cases and is available in a simple statistical form. The characteristic of this distribution is a constant hazard rate, and it arises as a special case of the Weibull family. In this paper we introduce the basic notions that constitute an exponential competing risks model in reliability analysis using a Bayesian approach and present the corresponding analytic methods. The cases are limited to models with independent causes of failure, and a non-informative prior distribution is used in our analysis. We describe the likelihood function, followed by the posterior function and the point, interval, hazard function, and reliability estimates. The net probability of failure if only one specific risk is present, the crude probability of failure due to a specific risk in the presence of other causes, and partial crude probabilities are also included.
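A sketch of the quantities named above under constant hazards (the posterior-mean shortcut and the data are assumptions for illustration, not the paper's worked example):

import math

def competing_risks_summary(total_time, failures_by_cause, horizon):
    # With independent exponential causes and a vague prior, the posterior
    # mean of each cause-specific rate is approximately d_j / T. The net
    # probability treats cause j as acting alone; the crude probability
    # accounts for the competing causes.
    rates = {j: d / total_time for j, d in failures_by_cause.items()}
    lam = sum(rates.values())
    return {j: {"net": 1 - math.exp(-lj * horizon),
                "crude": (lj / lam) * (1 - math.exp(-lam * horizon))}
            for j, lj in rates.items()}

print(competing_risks_summary(total_time=1000.0,
                              failures_by_cause={"wear": 8, "shock": 2},
                              horizon=50.0))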
Sokhey, Taegh; Gaebler-Spira, Deborah; Kording, Konrad P.
2017-01-01
Background It is important to understand the motor deficits of children with Cerebral Palsy (CP). Our understanding of this motor disorder can be enriched by computational models of motor control. One crucial stage in generating movement involves combining uncertain information from different sources, and deficits in this process could contribute to reduced motor function in children with CP. Healthy adults can integrate previously-learned information (prior) with incoming sensory information (likelihood) in a close-to-optimal way when estimating object location, consistent with the use of Bayesian statistics. However, there are few studies investigating how children with CP perform sensorimotor integration. We compare sensorimotor estimation in children with CP and age-matched controls using a model-based analysis to understand the process. Methods and findings We examined Bayesian sensorimotor integration in children with CP, aged between 5 and 12 years old, with Gross Motor Function Classification System (GMFCS) levels 1–3 and compared their estimation behavior with age-matched typically-developing (TD) children. We used a simple sensorimotor estimation task which requires participants to combine probabilistic information from different sources: a likelihood distribution (current sensory information) with a prior distribution (learned target information). In order to examine sensorimotor integration, we quantified how participants weighed statistical information from the two sources (prior and likelihood) and compared this to the statistical optimal weighting. We found that the weighing of statistical information in children with CP was as statistically efficient as that of TD children. Conclusions We conclude that Bayesian sensorimotor integration is not impaired in children with CP and therefore, does not contribute to their motor deficits. Future research has the potential to enrich our understanding of motor disorders by investigating the stages of motor processing set out by computational models. Therapeutic interventions should exploit the ability of children with CP to use statistical information. PMID:29186196
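The normative benchmark against which the children's weighting is compared is the standard Gaussian prior-likelihood combination, in which weights are inverse variances; a minimal sketch:

def combine(prior_mean, prior_var, obs, obs_var):
    # Statistically optimal estimate: weight the observation by the
    # relative reliability of the prior versus the sensory evidence.
    w = prior_var / (prior_var + obs_var)       # weight on the observation
    est = w * obs + (1 - w) * prior_mean
    var = prior_var * obs_var / (prior_var + obs_var)
    return est, var

# Blurrier feedback (larger obs_var) shifts the estimate toward the prior.
print(combine(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=4.0))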
Bayesian approach to inverse statistical mechanics.
Habeck, Michael
2014-05-01
Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.
Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan
2015-01-01
Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes, which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to or in many cases superior to the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.
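For orientation, the conventional f2 statistic that the proposed F2 parameter extends can be computed in a few lines; the profiles below are hypothetical:

```python
import numpy as np

def f2(reference, test):
    """Conventional similarity factor for two dissolution profiles
    measured at the same n time points (percent dissolved)."""
    R, T = np.asarray(reference, float), np.asarray(test, float)
    msd = np.mean((R - T) ** 2)          # mean squared difference
    return 50.0 * np.log10(100.0 / np.sqrt(1.0 + msd))

# Hypothetical profiles at 15/30/45/60 min; f2 >= 50 is the usual
# similarity criterion (mean differences within about 10%).
print(f2([35, 60, 80, 92], [38, 64, 83, 93]))   # ~75, i.e. similar
```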
2015-09-30
oil spills (unpublished data, Kellar). The second will be to conduct a more fine-scale analysis of the areas examined during this study. For this... REFERENCES: Carlin BP, Chib S (1995) Bayesian model choice via Markov-chain Monte-Carlo methods. Journal of the Royal Statistical Society
Theory-based Bayesian Models of Inductive Inference
2010-07-19
Subjective randomness and natural scene statistics. Psychonomic Bulletin & Review. http://cocosci.berkeley.edu/tom/papers/randscenes.pdf ... (in press). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review. http://cocosci.berkeley.edu/tom
NASA Astrophysics Data System (ADS)
Melendez, Jordan; Wesolowski, Sarah; Furnstahl, Dick
2017-09-01
Chiral effective field theory (EFT) predictions are necessarily truncated at some order in the EFT expansion, which induces an error that must be quantified for robust statistical comparisons to experiment. A Bayesian model yields posterior probability distribution functions for these errors based on expectations of naturalness encoded in Bayesian priors and the observed order-by-order convergence pattern of the EFT. As a general example of a statistical approach to truncation errors, the model was applied to chiral EFT for neutron-proton scattering using various semi-local potentials of Epelbaum, Krebs, and Meißner (EKM). Here we discuss how our model can learn correlation information from the data and how to perform Bayesian model checking to validate that the EFT is working as advertised. Supported in part by NSF PHY-1614460 and DOE NUCLEI SciDAC DE-SC0008533.
Thomson, James R; Kimmerer, Wim J; Brown, Larry R; Newman, Ken B; Mac Nally, Ralph; Bennett, William A; Feyrer, Frederick; Fleishman, Erica
2010-07-01
We examined trends in abundance of four pelagic fish species (delta smelt, longfin smelt, striped bass, and threadfin shad) in the upper San Francisco Estuary, California, USA, over 40 years using Bayesian change point models. Change point models identify times of abrupt or unusual changes in absolute abundance (step changes) or in rates of change in abundance (trend changes). We coupled Bayesian model selection with linear regression splines to identify biotic or abiotic covariates with the strongest associations with abundances of each species. We then refitted change point models conditional on the selected covariates to explore whether those covariates could explain statistical trends or change points in species abundances. We also fitted a multispecies change point model that identified change points common to all species. All models included hierarchical structures to model data uncertainties, including observation errors and missing covariate values. There were step declines in abundances of all four species in the early 2000s, with a likely common decline in 2002. Abiotic variables, including water clarity, position of the 2‰ isohaline (X2), and the volume of freshwater exported from the estuary, explained some variation in species' abundances over the time series, but no selected covariates could explain statistically the post-2000 change points for any species.
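A stripped-down sketch of the step-change idea, assuming a single series with Gaussian noise of known spread, flat priors on the two segment means, and a uniform prior on the change location; the paper's hierarchical multispecies models are far richer, and the data here are simulated:

```python
import numpy as np

def changepoint_posterior(y, sigma):
    """Posterior over a single step-change location tau, assuming
    y[:tau] and y[tau:] are Gaussian with unknown means (flat priors)
    and known sd sigma; uniform prior over tau."""
    y = np.asarray(y, float)
    n = len(y)
    logp = np.full(n, -np.inf)
    for tau in range(1, n):            # change between tau-1 and tau
        s1, s2 = y[:tau], y[tau:]
        rss = ((s1 - s1.mean())**2).sum() + ((s2 - s2.mean())**2).sum()
        # log marginal likelihood after integrating out the two means
        logp[tau] = -0.5 * np.log(len(s1) * len(s2)) - rss / (2 * sigma**2)
    logp -= logp.max()
    p = np.exp(logp)
    return p / p.sum()

# Hypothetical abundance index with a step decline at index 30.
rng = np.random.default_rng(0)
y = np.r_[rng.normal(5, 1, 30), rng.normal(3, 1, 10)]
post = changepoint_posterior(y, sigma=1.0)
print(post.argmax(), post.max())       # most probable change year, mass
```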
Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference
Hines, Keegan E.; Bankston, John R.; Aldrich, Richard W.
2015-01-01
The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. PMID:25650922
The Accuracy of GBM GRB Localizations
NASA Astrophysics Data System (ADS)
Briggs, Michael Stephen; Connaughton, V.; Meegan, C.; Hurley, K.
2010-03-01
We report a study of the accuracy of GBM GRB localizations, analyzing three types of localizations: those produced automatically by the GBM Flight Software on board the spacecraft, those produced automatically with ground software in near real time, and localizations produced with human guidance. The two types of automatic locations are distributed in near real time via GCN Notices; the human-guided locations are distributed on timescales of many minutes or hours using GCN Circulars. This work uses a Bayesian analysis that models the distribution of the GBM total location error by comparing GBM locations to more accurate locations obtained with other instruments. Reference locations are obtained from Swift, Super-AGILE, the LAT, and with the IPN. We model the GBM total location errors as having systematic errors in addition to the statistical errors and use the Bayesian analysis to constrain the systematic errors.
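The quadrature structure of such an error model can be caricatured in one dimension: given offsets between GBM and reference positions with known statistical errors, place a grid posterior on a single systematic term added in quadrature. All numbers are invented, and real localization errors live on the sphere rather than on a line:

```python
import numpy as np

# Hypothetical offsets (deg) between GBM and reference locations,
# with per-burst statistical 1-sigma errors.
offsets = np.array([3.1, 5.4, 2.2, 7.8, 4.0])
stat = np.array([2.0, 3.5, 1.5, 4.0, 2.5])

# Grid posterior for one systematic error added in quadrature,
# assuming zero-mean Gaussian offsets and a flat prior on sigma_sys.
sig_sys = np.linspace(0.01, 15, 1000)
var_tot = stat[:, None]**2 + sig_sys[None, :]**2
loglik = (-0.5 * offsets[:, None]**2 / var_tot
          - 0.5 * np.log(var_tot)).sum(axis=0)
post = np.exp(loglik - loglik.max())
post /= post.sum() * (sig_sys[1] - sig_sys[0])
print(sig_sys[post.argmax()])   # posterior mode of the systematic term
```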
Variations on Bayesian Prediction and Inference
2016-05-09
There are a number of statistical inference problems that are not generally formulated via a full probability model... For the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood, which can be an obstacle
Ghosh, Sujit K
2010-01-01
Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of the Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
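The prior-to-posterior mechanics described here are easiest to see in the simplest conjugate case; a sketch with invented numbers:

```python
from scipy import stats

# Prior knowledge: a success probability believed to be near 0.3
# (e.g., from previously collected data) encoded as Beta(3, 7).
prior_a, prior_b = 3, 7

# Current data: 18 successes in 40 trials.
successes, trials = 18, 40

# Conjugacy: Beta prior x binomial likelihood -> Beta posterior.
post = stats.beta(prior_a + successes, prior_b + trials - successes)
print(post.mean(), post.interval(0.95))  # posterior mean, 95% interval
```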
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimates of seismic sources and structures in Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By combining the new Bayesian techniques and the structural model, together with meaningful uncertainties for each of the processes, more quantitative monitoring and discrimination of seismic events becomes possible.
To P or Not to P: Backing Bayesian Statistics.
Buchinsky, Farrel J; Chadha, Neil K
2017-12-01
In biomedical research, it is imperative to differentiate chance variation from truth before we generalize what we see in a sample of subjects to the wider population. For decades, we have relied on null hypothesis significance testing, where we calculate P values for our data to decide whether to reject a null hypothesis. This methodology is subject to substantial misinterpretation and errant conclusions. Instead of working backward by calculating the probability of our data if the null hypothesis were true, Bayesian statistics allow us instead to work forward, calculating the probability of our hypothesis given the available data. This methodology gives us a mathematical means of incorporating our "prior probabilities" from previous study data (if any) to produce new "posterior probabilities." Bayesian statistics tell us how confidently we should believe what we believe. It is time to embrace and encourage their use in our otolaryngology research.
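The "working forward" arithmetic fits in one function; here sensitivity stands for the probability of a positive finding when the hypothesis is true, and every number is illustrative:

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """P(hypothesis | positive finding) by Bayes' theorem."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Illustrative values: a hypothesis with 20% prior probability, a study
# that detects a true effect 80% of the time but yields a spurious
# 'positive' 5% of the time.
print(posterior_probability(prior=0.20, sensitivity=0.80,
                            false_positive_rate=0.05))  # 0.80
```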
NASA Astrophysics Data System (ADS)
Licquia, Timothy C.; Newman, Jeffrey A.
2016-11-01
The exponential scale length (L_d) of the Milky Way's (MW's) disk is a critical parameter for describing the global physical size of our Galaxy, important both for interpreting other Galactic measurements and helping us to understand how our Galaxy fits into extragalactic contexts. Unfortunately, current estimates span a wide range of values and are often statistically incompatible with one another. Here, we perform a Bayesian meta-analysis to determine an improved, aggregate estimate for L_d, utilizing a mixture-model approach to account for the possibility that any one measurement has not properly accounted for all statistical or systematic errors. Within this machinery, we explore a variety of ways of modeling the nature of problematic measurements, and then employ a Bayesian model averaging technique to derive net posterior distributions that incorporate any model-selection uncertainty. Our meta-analysis combines 29 different (15 visible and 14 infrared) photometric measurements of L_d available in the literature; these involve a broad assortment of observational data sets, MW models and assumptions, and methodologies, all tabulated herein. Analyzing the visible and infrared measurements separately yields estimates for L_d of 2.71 (+0.22/-0.20) kpc and 2.51 (+0.15/-0.13) kpc, respectively, whereas considering them all combined yields 2.64 ± 0.13 kpc. The ratio between the visible and infrared scale lengths determined here is very similar to that measured in external spiral galaxies. We use these results to update the model of the Galactic disk from our previous work, constraining its stellar mass to be 4.8 (+1.5/-1.1) × 10^10 M_⊙, and the MW's total stellar mass to be 5.7 (+1.5/-1.1) × 10^10 M_⊙.
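The mixture-model ingredient can be caricatured in a few lines: each published value is treated as either reliable (quoted error correct) or problematic (error understated by some factor), and the two cases are averaged per measurement before multiplying across measurements. The measurements, mixture weight, and inflation factor below are invented, not the paper's:

```python
import numpy as np

# Hypothetical scale-length measurements (kpc) and quoted 1-sigma errors.
L = np.array([2.3, 2.6, 3.5, 2.5, 2.8])
s = np.array([0.2, 0.3, 0.4, 0.2, 0.3])

p_good, inflate = 0.7, 3.0            # prior fraction of reliable values
grid = np.linspace(1.5, 4.5, 1201)    # grid over the true L_d

def norm_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd)**2) / (np.sqrt(2 * np.pi) * sd)

# Per-measurement mixture likelihood, multiplied across measurements.
lik = np.ones_like(grid)
for Li, si in zip(L, s):
    lik *= (p_good * norm_pdf(grid, Li, si)
            + (1 - p_good) * norm_pdf(grid, Li, inflate * si))

post = lik / (lik.sum() * (grid[1] - grid[0]))   # flat prior on L_d
print(np.sum(grid * post) * (grid[1] - grid[0])) # posterior mean
```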
Quinolizidine alkaloids from Lupinus lanatus
NASA Astrophysics Data System (ADS)
Neto, Alexandre T.; Oliveira, Carolina Q.; Ilha, Vinicius; Pedroso, Marcelo; Burrow, Robert A.; Dalcol, Ionara I.; Morel, Ademir F.
2011-10-01
In this study, one new quinolizidine alkaloid, lanatine A (1), together with three other known alkaloids, 13-α-trans-cinnamoyloxylupanine (2), 13-α-hydroxylupanine (3), and (-)-multiflorine (4), were isolated from the aerial parts of Lupinus lanatus (Fabaceae). The structures of alkaloids 1-4 were elucidated by spectroscopic data analysis. The stereochemistry of 1 was determined by single-crystal X-ray analysis. Bayesian statistical analysis of the Bijvoet differences suggests the absolute stereochemistry of 1. In addition, the antimicrobial potential of alkaloids 1-4 is also reported.
Kathman, Steven J; Potts, Ryan J; Ayres, Paul H; Harp, Paul R; Wilson, Cody L; Garner, Charles D
2010-10-01
The mouse dermal assay has long been used to assess the dermal tumorigenicity of cigarette smoke condensate (CSC). This mouse skin model has been developed for use in carcinogenicity testing utilizing the SENCAR mouse as the standard strain. Though the model has limitations, it remains the most relevant method available to study the dermal tumor promoting potential of mainstream cigarette smoke. In the typical SENCAR mouse CSC bioassay, CSC is applied for 29 weeks following the application of a tumor initiator such as 7,12-dimethylbenz[a]anthracene (DMBA). Several endpoints are considered for analysis, including the percentage of animals with at least one mass, latency, and the number of masses per animal. In this paper, a relatively straightforward analytic model and procedure are presented for analyzing the time course of the incidence of masses. The procedure considered here takes advantage of Bayesian statistical techniques, which provide powerful methods for model fitting and simulation. Two datasets are analyzed to illustrate how the model fits the data, how well the model may perform in predicting data from such trials, and how the model may be used as a decision tool when comparing the dermal tumorigenicity of cigarette smoke condensate from multiple cigarette types. The analysis presented here was developed as a statistical decision tool for differentiating between two or more prototype products based on dermal tumorigenicity. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.
Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J
2015-07-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance than other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.
NASA Astrophysics Data System (ADS)
Yuksel, Kivanc; Chang, Xin; Skarbek, Władysław
2017-08-01
A novel smile recognition algorithm is presented, based on the extraction of 68 facial salient points (fp68) using an ensemble of regression trees. The smile detector exploits a linear Support Vector Machine (SVM) model, trained with a few hundred exemplar images in a 136-dimensional space. Strict statistical analysis shows that such a geometric detector depends strongly on the geometry of the mouth-opening area, measured by triangulation of the outer lip contour. To this end, two Bayesian detectors were developed and compared with the SVM detector. The first uses the mouth area in the 2D image, while the second refers to the mouth area in a 3D animated face model. The 3D modeling is based on the Candide-3 model and is performed in real time along with the three smile detectors and statistics estimators. The mouth-area Bayesian detectors exhibit high correlation with the fp68/SVM detector, in the range [0.8, 1.0], depending mainly on lighting conditions and individual features, with an advantage for the 3D technique, especially in hard lighting conditions.
Liew, Bernard X W; Drovandi, Christopher C; Clifford, Samuel; Keogh, Justin W L; Morris, Susan; Netto, Kevin
2018-01-01
There is convincing evidence for the benefits of resistance training on vertical jump improvements, but little evidence to guide optimal training prescription. The inability to detect small between-modality effects may partially reflect the use of ANOVA statistics. This study represents the results of a sub-study from a larger project investigating the effects of two resistance training methods on load carriage running energetics. Bayesian statistics were used to compare the effectiveness of isoinertial resistance training against speed-power training to change countermovement jump (CMJ) and squat jump (SJ) height, and joint energetics. Active adults were randomly allocated to either a six-week isoinertial program (n = 16; calf raises, leg press, and lunge) or a speed-power training program (n = 14; countermovement jumps, hopping, with hip flexor training to target pre-swing running energetics). Primary outcome variables included jump height and joint power. Bayesian mixed modelling and Functional Data Analysis were used, where significance was determined by the 95% Bayesian credible interval (CrI) excluding zero. The gain in CMJ height after isoinertial training was 1.95 cm (95% CrI [0.85-3.04] cm) greater than the gain after speed-power training, but the gain in SJ height was similar between groups. In the CMJ, isoinertial training produced a larger increase in power absorption at the hip by a mean 0.018% (equivalent to 35 W) (95% CrI [0.007-0.03]), knee by 0.014% (equivalent to 27 W) (95% CrI [0.006-0.02]) and foot by 0.011% (equivalent to 21 W) (95% CrI [0.005-0.02]) compared to speed-power training. Short-term isoinertial training improved CMJ height more than speed-power training. The principal adaptive difference between training modalities was at the level of hip, knee and foot power absorption.
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework known as approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific parameter value with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with unidentifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named the “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate the posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection based on the Bayes factor. PMID:25089832
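For contrast with population annealing, the plain rejection form of approximate Bayesian computation, which annealing improves by lowering the tolerance across resampled populations, looks as follows; the model, prior, and tolerance are toy choices:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(2.0, 1.0, 50)          # 'observed' data, true mean 2

def simulate(theta, size=50):
    return rng.normal(theta, 1.0, size)  # forward model of the system

def distance(x, y):
    return abs(x.mean() - y.mean())      # summary-statistic distance

# ABC rejection: keep prior draws whose simulations land near the data.
accepted = []
while len(accepted) < 1000:
    theta = rng.uniform(-5, 5)           # draw from the prior
    if distance(simulate(theta), data) < 0.1:
        accepted.append(theta)

post = np.array(accepted)
print(post.mean(), np.percentile(post, [2.5, 97.5]))
```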
Probability, statistics, and computational science.
Beerenwinkel, Niko; Siebourg, Juliane
2012-01-01
In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.
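As one concrete instance of the inference algorithms referred to, the forward recursion computes the likelihood of an observation sequence under a fully specified hidden Markov model; the toy parameters below are ours:

```python
import numpy as np

# Toy 2-state HMM: transitions A, emissions B over two symbols, start pi.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])
obs = [0, 0, 1, 1, 1]              # observed symbol indices

# Forward algorithm with rescaling: alpha[k] ~ P(obs so far, state = k).
alpha = pi * B[:, obs[0]]
c = alpha.sum()
loglik = np.log(c)
alpha /= c
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]  # predict, then weight by emission
    c = alpha.sum()
    loglik += np.log(c)
    alpha /= c
print(loglik)                      # log P(obs | model)
```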
Bayesian analysis of energy and count rate data for detection of low count rate radioactive sources
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klumpp, John
We propose a radiation detection system which generates its own discrete sampling distribution based on past measurements of background. The advantage to this approach is that it can take into account variations in background with respect to time, location, energy spectra, detector-specific characteristics (i.e. different efficiencies at different count rates and energies), etc. This would therefore be a 'machine learning' approach, in which the algorithm updates and improves its characterization of background over time. The system would have a 'learning mode,' in which it measures and analyzes background count rates, and a 'detection mode,' in which it compares measurements from an unknown source against its unique background distribution. By characterizing and accounting for variations in the background, general purpose radiation detectors can be improved with little or no increase in cost. The statistical and computational techniques to perform this kind of analysis have already been developed. The necessary signal analysis can be accomplished using existing Bayesian algorithms which account for multiple channels, multiple detectors, and multiple time intervals. Furthermore, Bayesian machine-learning techniques have already been developed which, with trivial modifications, can generate appropriate decision thresholds based on the comparison of new measurements against a nonparametric sampling distribution. (authors)
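A scalar caricature of the two modes: learn a posterior for the background rate from past counts, then ask how surprising a new measurement would be under background alone. The proposed system does this jointly across channels, detectors, and time intervals; the counts here are simulated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# 'Learning mode': accumulate background counts in fixed time bins.
background = rng.poisson(lam=12.0, size=500)

# Gamma posterior for the background rate from a flat prior:
# rate | counts ~ Gamma(sum(counts) + 1, rate = n_bins).
bkg_post = stats.gamma(a=background.sum() + 1, scale=1.0 / len(background))

# 'Detection mode': a new measurement in one bin.
new_count = 21

# Posterior predictive tail: how often would background alone produce
# a count at least this large?
lam = bkg_post.rvs(100_000, random_state=rng)
p_tail = np.mean(stats.poisson(lam).sf(new_count - 1))
print(p_tail)    # small value -> flag a possible source
```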
Information-Decay Pursuit of Dynamic Parameters in Student Models
1994-04-01
simple worked-through example). Commercially available computer programs for structuring and using Bayesian inference include ERGO (Noetic Systems, Inc., 1991). ... Tukey, J.W. (1977). Data Analysis and Regression: A Second Course in Statistics. Reading, MA: Addison-Wesley. ... Noetic Systems, Inc. (1991). ERGO ...
Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A
2009-10-01
Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.
2010-01-01
Background Methods for the calculation and application of quantitative electromyographic (EMG) statistics for the characterization of EMG data detected from forearm muscles of individuals with and without pain associated with repetitive strain injury are presented. Methods A classification procedure using a multi-stage application of Bayesian inference is presented that characterizes a set of motor unit potentials acquired using needle electromyography. The utility of this technique in characterizing EMG data obtained from both normal individuals and those presenting with symptoms of "non-specific arm pain" is explored and validated. The efficacy of the Bayesian technique is compared with simple voting methods. Results The aggregate Bayesian classifier presented is found to perform with accuracy equivalent to that of majority voting on the test data, with an overall accuracy greater than 0.85. Theoretical foundations of the technique are discussed, and are related to the observations found. Conclusions Aggregation of motor unit potential conditional probability distributions estimated using quantitative electromyographic analysis, may be successfully used to perform electrodiagnostic characterization of "non-specific arm pain." It is expected that these techniques will also be able to be applied to other types of electrodiagnostic data. PMID:20156353
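A toy version of such aggregation, contrasting majority voting with combining per-motor-unit-potential (MUP) probabilities as independent likelihood ratios under equal priors; the probabilities are illustrative, not study data:

```python
import numpy as np

# Hypothetical P(disorder | MUP_i) for ten MUPs from one muscle,
# estimated from quantitative EMG features (illustrative values).
p_disorder = np.array([0.62, 0.55, 0.48, 0.71, 0.58,
                       0.45, 0.66, 0.52, 0.60, 0.49])

# Majority voting: fraction of MUPs individually labeled 'disorder'.
vote = (p_disorder > 0.5).mean()

# Bayesian aggregation: treat each MUP as an independent likelihood
# ratio on top of an equal (0.5) prior, combined in log-odds form.
log_odds = np.log(p_disorder / (1 - p_disorder)).sum()
p_agg = 1 / (1 + np.exp(-log_odds))
print(vote, p_agg)   # aggregation is more decisive than the raw vote
```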
Developing and Testing a Model to Predict Outcomes of Organizational Change
Gustafson, David H; Sainfort, François; Eichler, Mary; Adams, Laura; Bisognano, Maureen; Steudel, Harold
2003-01-01
Objective To test the effectiveness of a Bayesian model employing subjective probability estimates for predicting success and failure of health care improvement projects. Data Sources Experts' subjective assessment data for model development and independent retrospective data on 221 healthcare improvement projects in the United States, Canada, and the Netherlands collected between 1996 and 2000 for validation. Methods A panel of theoretical and practical experts and literature in organizational change were used to identify factors predicting the outcome of improvement efforts. A Bayesian model was developed to estimate probability of successful change using subjective estimates of likelihood ratios and prior odds elicited from the panel of experts. A subsequent retrospective empirical analysis of change efforts in 198 health care organizations was performed to validate the model. Logistic regression and ROC analysis were used to evaluate the model's performance using three alternative definitions of success. Data Collection For the model development, experts' subjective assessments were elicited using an integrative group process. For the validation study, a staff person intimately involved in each improvement project responded to a written survey asking questions about model factors and project outcomes. Results Logistic regression chi-square statistics and areas under the ROC curve demonstrated a high level of model performance in predicting success. Chi-square statistics were significant at the 0.001 level and areas under the ROC curve were greater than 0.84. Conclusions A subjective Bayesian model was effective in predicting the outcome of actual improvement projects. Additional prospective evaluations as well as testing the impact of this model as an intervention are warranted. PMID:12785571
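The arithmetic of the elicited model is prior odds updated by one subjective likelihood ratio per factor; a sketch with invented values:

```python
import numpy as np

def success_probability(prior_odds, likelihood_ratios):
    """Posterior P(success) from prior odds times one elicited
    likelihood ratio per organizational factor."""
    posterior_odds = prior_odds * np.prod(likelihood_ratios)
    return posterior_odds / (1 + posterior_odds)

# Hypothetical elicited values: baseline odds of success 1:2, with
# factors such as leadership support (2.5), staff tension (0.6),
# tension for change (1.8).
print(success_probability(0.5, [2.5, 0.6, 1.8]))   # ~0.57
```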
The researcher and the consultant: from testing to probability statements.
Hamra, Ghassan B; Stang, Andreas; Poole, Charles
2015-09-01
In the first instalment of this series, Stang and Poole provided an overview of Fisher significance testing (ST), Neyman-Pearson null hypothesis testing (NHT), and their unfortunate and unintended offspring, null hypothesis significance testing. In addition to elucidating the distinction between the first two and the evolution of the third, the authors alluded to alternative models of statistical inference; namely, Bayesian statistics. Bayesian inference has experienced a revival in recent decades, with many researchers advocating for its use as both a complement and an alternative to NHT and ST. This article will continue in the direction of the first instalment, providing practicing researchers with an introduction to Bayesian inference. Our work will draw on the examples and discussion of the previous dialogue.
Spectral likelihood expansions for Bayesian inference
NASA Astrophysics Data System (ADS)
Nagel, Joseph B.; Sudret, Bruno
2016-03-01
A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
On a full Bayesian inference for force reconstruction problems
NASA Astrophysics Data System (ADS)
Aucejo, M.; De Smet, O.
2018-05-01
In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time-invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability to account mathematically for the experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution, given the measured vibration field, the mechanical model and the regularization parameter, was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
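The generic sampler behind such posterior summaries can be sketched independently of the force reconstruction model; a random-walk Metropolis pass over a one-dimensional toy posterior (all names ours):

```python
import numpy as np

def metropolis(log_post, x0, n_steps=20_000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for a 1-D target density."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_post(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
            x, lp = prop, lp_prop
        samples[i] = x
    return samples[n_steps // 2:]                  # drop burn-in

# Toy posterior: standard normal.
draws = metropolis(lambda x: -0.5 * x**2, x0=3.0)
print(draws.mean(), np.median(draws),
      np.percentile(draws, [2.5, 97.5]))  # mean, median, 95% interval
```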
The Misidentified Identifiability Problem of Bayesian Knowledge Tracing
ERIC Educational Resources Information Center
Doroudi, Shayan; Brunskill, Emma
2017-01-01
In this paper, we investigate two purported problems with Bayesian Knowledge Tracing (BKT), a popular statistical model of student learning: "identifiability" and "semantic model degeneracy." In 2007, Beck and Chang stated that BKT is susceptible to an "identifiability problem"--various models with different…
Improved dynamical scaling analysis using the kernel method for nonequilibrium relaxation.
Echinaka, Yuki; Ozeki, Yukiyasu
2016-10-01
The dynamical scaling analysis for the Kosterlitz-Thouless transition in the nonequilibrium relaxation method is improved by the use of Bayesian statistics and the kernel method. This allows data to be fitted to a scaling function without any parametric model function, which makes the results more reliable and reproducible and enables automatic and faster parameter estimation. Building on this method, the bootstrap method is introduced and a numerical criterion for discriminating the transition type is proposed.
Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics
NASA Astrophysics Data System (ADS)
Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.
2018-06-01
Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2-D seismic reflection data processing flow focused on pre-stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching, to estimate the uncertainties of the depths of key horizons near the Deep Sea Drilling Project (DSDP) borehole 258 (DSDP-258) located in the Mentelle Basin, southwest of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from the DSDP-258 core was in accordance with the ±2σ posterior credibility intervals, and predictions for depths to key horizons were made for the two new drill sites, adjacent to the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images; these can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program, leg 369.
Bayesian analysis of U.S. hurricane climate
Elsner, James B.; Bossak, Brian H.
2001-01-01
Predictive climate distributions of U.S. landfalling hurricanes are estimated from observational records over the period 1851–2000. The approach is Bayesian, combining the reliable records of hurricane activity during the twentieth century with the less precise accounts of activity during the nineteenth century to produce a best estimate of the posterior distribution on the annual rates. The methodology provides a predictive distribution of future activity that serves as a climatological benchmark. Results are presented for the entire coast as well as for the Gulf Coast, Florida, and the East Coast. Statistics on the observed annual counts of U.S. hurricanes, both for the entire coast and by region, are similar within each of the three consecutive 50-yr periods beginning in 1851. However, evidence indicates that the records during the nineteenth century are less precise. Bayesian theory provides a rational approach for defining hurricane climate that uses all available information and that makes no assumption about whether the 150-yr record of hurricanes has been adequately or uniformly monitored. The analysis shows that the number of major hurricanes expected to reach the U.S. coast over the next 30 yr is 18 and the number of hurricanes expected to hit Florida is 20.
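The Poisson-gamma skeleton of an annual-count analysis like this is compact; the counts and prior below are simulated stand-ins, and the paper's contribution lies in how it combines records of unequal precision:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical annual landfall counts over 150 years.
counts = rng.poisson(1.7, size=150)

# Gamma(a0, b0) prior on the annual rate; Poisson counts give a
# Gamma(a0 + sum(counts), b0 + n_years) posterior.
a0, b0 = 1.0, 0.1
a, b = a0 + counts.sum(), b0 + len(counts)

# Predictive distribution of the total over the next 30 years:
# simulate rates from the posterior, then Poisson totals.
lam = stats.gamma(a, scale=1.0 / b).rvs(100_000, random_state=rng)
totals = rng.poisson(30.0 * lam)
print(totals.mean(), np.percentile(totals, [5, 95]))
```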
GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.
Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H
2010-04-01
A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
An introduction to using Bayesian linear regression with clinical data.
Baldwin, Scott A; Larson, Michael J
2017-11-01
Statistical training in psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well as how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
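A conjugate-normal sketch of the kind of regression the tutorial builds (ERN on trait anxiety), with simulated data standing in for the EEG study and the noise variance assumed known for simplicity; the article itself provides R code:

```python
import numpy as np

rng = np.random.default_rng(11)

# Simulated stand-ins: trait anxiety (standardized) and ERN amplitude.
anxiety = rng.normal(size=80)
ern = -1.0 + 0.4 * anxiety + rng.normal(0, 1.0, size=80)

X = np.column_stack([np.ones_like(anxiety), anxiety])
sigma2 = 1.0                   # noise variance, assumed known here
prior_cov = np.eye(2) * 10.0   # vague N(0, 10 I) prior on (intercept, slope)

# Conjugate update: posterior covariance and mean of the coefficients.
post_cov = np.linalg.inv(np.linalg.inv(prior_cov) + X.T @ X / sigma2)
post_mean = post_cov @ (X.T @ ern / sigma2)

slope_sd = np.sqrt(post_cov[1, 1])
print(post_mean[1], post_mean[1] - 1.96 * slope_sd,
      post_mean[1] + 1.96 * slope_sd)   # slope and 95% interval
```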
Statistical modelling of networked human-automation performance using working memory capacity.
Ahmed, Nisar; de Visser, Ewart; Shaw, Tyler; Mohamed-Ameen, Amira; Campbell, Mark; Parasuraman, Raja
2014-01-01
This study examines the challenging problem of modelling the interaction between individual attentional limitations and decision-making performance in networked human-automation system tasks. Analysis of real experimental data from a task involving networked supervision of multiple unmanned aerial vehicles by human participants shows that both task load and network message quality affect performance, but that these effects are modulated by individual differences in working memory (WM) capacity. These insights were used to assess three statistical approaches for modelling and making predictions with real experimental networked supervisory performance data: classical linear regression, non-parametric Gaussian processes and probabilistic Bayesian networks. It is shown that each of these approaches can help designers of networked human-automated systems cope with various uncertainties in order to accommodate future users by linking expected operating conditions and performance from real experimental data to observable cognitive traits like WM capacity. Practitioner Summary: Working memory (WM) capacity helps account for inter-individual variability in operator performance in networked unmanned aerial vehicle supervisory tasks. This is useful for reliable performance prediction near experimental conditions via linear models; robust statistical prediction beyond experimental conditions via Gaussian process models and probabilistic inference about unknown task conditions/WM capacities via Bayesian network models.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nessi, G. T. von; Hole, M. J.; Svensson, J.
2012-01-15
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based on the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint European tokamak [Svensson and Werner, Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++ [Appel et al., "A unified approach to equilibrium reconstruction," Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Astrostatistical Analysis in Solar and Stellar Physics
NASA Astrophysics Data System (ADS)
Stenning, David Craig
This dissertation focuses on developing statistical models and methods to address data-analytic challenges in astrostatistics---a growing interdisciplinary field fostering collaborations between statisticians and astrophysicists. The astrostatistics projects we tackle can be divided into two main categories: modeling solar activity and Bayesian analysis of stellar evolution. These categories form Part I and Part II of this dissertation, respectively. The first line of research we pursue involves classification and modeling of evolving solar features. Advances in space-based observatories are increasing both the quality and quantity of solar data, primarily in the form of high-resolution images. To analyze massive streams of solar image data, we develop a science-driven dimension reduction methodology to extract scientifically meaningful features from images. This methodology utilizes mathematical morphology to produce a concise numerical summary of the magnetic flux distribution in solar "active regions" that (i) is far easier to work with than the source images, (ii) encapsulates scientifically relevant information in a more informative manner than existing schemes (i.e., manual classification schemes), and (iii) is amenable to sophisticated statistical analyses. In a related line of research, we perform a Bayesian analysis of the solar cycle using multiple proxy variables, such as sunspot numbers. We take advantage of patterns and correlations among the proxy variables to model solar activity using data from proxies that have become available more recently, while also taking advantage of the long history of observations of sunspot numbers. This model is an extension of the Yu et al. (2012) Bayesian hierarchical model for the solar cycle that used the sunspot numbers alone. Since proxies have different temporal coverage, we devise a multiple imputation scheme to account for missing data. We find that incorporating multiple proxies reveals important features of the solar cycle that are missed when the model is fit using only the sunspot numbers. In Part II of this dissertation we focus on two related lines of research involving Bayesian analysis of stellar evolution. We first focus on modeling multiple stellar populations in star clusters. It has long been assumed that all star clusters are comprised of single stellar populations---stars that formed at roughly the same time from a common molecular cloud. However, recent studies have produced evidence that some clusters host multiple populations, which has far-reaching scientific implications. We develop a Bayesian hierarchical model for multiple-population star clusters, extending earlier statistical models of stellar evolution (e.g., van Dyk et al. 2009, Stein et al. 2013). We also devise an adaptive Markov chain Monte Carlo algorithm to explore the complex posterior distribution. We use numerical studies to demonstrate that our method can recover parameters of multiple-population clusters, and also show how model misspecification can be diagnosed. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We also explore statistical properties of the estimators and determine that the influence of the prior distribution does not diminish with larger sample sizes, leading to non-standard asymptotics. In a final line of research, we present the first-ever attempt to estimate the carbon fraction of white dwarfs.
This quantity has important implications for both astrophysics and fundamental nuclear physics, but is currently unknown. We use a numerical study to demonstrate that assuming an incorrect value for the carbon fraction leads to incorrect white-dwarf ages of star clusters. Finally, we present our attempt to estimate the carbon fraction of the white dwarfs in the well-studied star cluster 47 Tucanae.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows that when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the MCMC procedure of SAS are provided.
Of bugs and birds: Markov Chain Monte Carlo for hierarchical modeling in wildlife research
Link, W.A.; Cam, E.; Nichols, J.D.; Cooch, E.G.
2002-01-01
Markov chain Monte Carlo (MCMC) is a statistical innovation that allows researchers to fit far more complex models to data than is feasible using conventional methods. Despite its widespread use in a variety of scientific fields, MCMC appears to be underutilized in wildlife applications. This may be due to a misconception that MCMC requires the adoption of a subjective Bayesian analysis, or perhaps simply to its lack of familiarity among wildlife researchers. We introduce the basic ideas of MCMC and software BUGS (Bayesian inference using Gibbs sampling), stressing that a simple and satisfactory intuition for MCMC does not require extraordinary mathematical sophistication. We illustrate the use of MCMC with an analysis of the association between latent factors governing individual heterogeneity in breeding and survival rates of kittiwakes (Rissa tridactyla). We conclude with a discussion of the importance of individual heterogeneity for understanding population dynamics and designing management plans.
NASA Astrophysics Data System (ADS)
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using the Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate a training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to the sensitivity analysis, insensitive parameters are screened out of the Bayesian inversion of the MODFLOW model, further reducing computational effort. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.
Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale
2016-08-01
Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty. © The Author(s) 2016.
Bayesian approach to non-Gaussian field statistics for diffusive broadband terahertz pulses.
Pearce, Jeremy; Jian, Zhongping; Mittleman, Daniel M
2005-11-01
We develop a closed-form expression for the probability distribution function for the field components of a diffusive broadband wave propagating through a random medium. We consider each spectral component to provide an individual observation of a random variable, the configurationally averaged spectral intensity. Since the intensity determines the variance of the field distribution at each frequency, this random variable serves as the Bayesian prior that determines the form of the non-Gaussian field statistics. This model agrees well with experimental results.
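In symbols (notation ours, with normalization conventions left open), the construction reads:

```latex
% Field component E at one frequency: Gaussian given the averaged
% spectral intensity I, which carries the Bayesian prior p(I).
p(E) \;=\; \int_0^{\infty} \mathcal{N}\!\left(E \mid 0,\, \sigma^2(I)\right) p(I)\,\mathrm{d}I,
\qquad \sigma^2(I) \propto I .
```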
Probabilistic Model for Untargeted Peak Detection in LC-MS Using Bayesian Statistics.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2015-07-21
We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography-mass spectrometry (LC-MS). The probabilistic result allows the user to make a final decision about which points in a chromatogram are affected by a chromatographic peak and which ones are only affected by noise. The use of probabilities contrasts with the traditional approach, in which a binary answer is given based on a threshold. With the Bayesian peak detection presented here, the probability values can instead be propagated into other preprocessing steps, increasing (or decreasing) the importance of chromatographic regions in the final results. The present work uses the statistical theory of component overlap from Davis and Giddings (Davis, J. M.; Giddings, J. Anal. Chem. 1983, 55, 418-424) as the prior probability in the Bayesian formulation. The algorithm was tested on LC-MS Orbitrap data and was able to successfully distinguish chemical noise from actual peaks without any data preprocessing.
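As a crude illustration of per-point probabilistic peak detection (our simplification, not the published algorithm), the sketch below assigns each chromatogram point a posterior probability of belonging to a peak by comparing a noise-only likelihood against a peak-present likelihood under an invented Gaussian noise model and prior.

# Simplified per-point Bayesian peak detection: compare a "noise-only"
# likelihood against a "peak present" likelihood at each chromatogram point.
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(200)
signal = 50 * np.exp(-0.5 * ((t - 120) / 4.0) ** 2)   # one true peak
y = signal + rng.normal(0, 3.0, t.size)               # noisy chromatogram

sigma_noise, prior_peak = 3.0, 0.05                   # assumed known / elicited

# Crude plug-in: under the peak hypothesis, the observed rise is explained.
amp = np.clip(y, 0, None)
loglik_noise = -0.5 * (y / sigma_noise) ** 2
loglik_peak = -0.5 * ((y - amp) / sigma_noise) ** 2

log_odds = (loglik_peak - loglik_noise
            + np.log(prior_peak) - np.log(1 - prior_peak))
p_peak = 1.0 / (1.0 + np.exp(-log_odds))              # posterior probability

print("max posterior peak probability near t=120:", p_peak[115:125].max())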
Built environment and Property Crime in Seattle, 1998-2000: A Bayesian Analysis.
Matthews, Stephen A; Yang, Tse-Chuan; Hayslett-McCall, Karen L; Ruback, R Barry
2010-06-01
The past decade has seen a rapid growth in the use of a spatial perspective in studies of crime. In part this growth has been driven by the availability of georeferenced data, and the tools to analyze and visualize them: geographic information systems (GIS), spatial analysis, and spatial statistics. In this paper we use exploratory spatial data analysis (ESDA) tools and Bayesian models to help better understand the spatial patterning and predictors of property crime in Seattle, Washington for 1998-2000, including a focus on built environment variables. We present results for aggregate property crime data as well as models for specific property crime types: residential burglary, nonresidential burglary, theft, auto theft, and arson. ESDA confirms the presence of spatial clustering of property crime and we seek to explain these patterns using spatial Poisson models implemented in WinBUGS. Our results indicate that built environment variables were significant predictors of property crime; the presence of a highway was an especially strong predictor of auto theft and burglary.
Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler.
Jin, Ick Hoon; Yuan, Ying; Liang, Faming
2013-10-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the intractable normalizing constant and model degeneracy. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the intractable normalizing constant and model degeneracy issues encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as an MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, a synthetic molecule network, and the dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
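The cancellation exploited by exchange-type samplers can be shown on the simplest possible one-parameter "ERGM", a model with only an edge-count statistic (equivalent to an Erdős–Rényi graph, for which exact auxiliary simulation is trivial). The toy sketch below is ours: it demonstrates the basic exchange move under a flat prior and omits the adaptive, parallel-chain machinery of the paper.

# Exchange algorithm for a one-parameter, edge-count-only "ERGM":
# p(y | theta) is proportional to exp(theta * edges(y)); auxiliary networks
# are exact draws, and the intractable constants cancel in the accept ratio.
import numpy as np

rng = np.random.default_rng(4)
n = 20                                        # nodes
m = n * (n - 1) // 2                          # possible edges

def sample_graph(theta):                      # exact draw: edges i.i.d. Bernoulli
    p = 1.0 / (1.0 + np.exp(-theta))
    return rng.binomial(m, p)                 # sufficient statistic: edge count

s_obs = sample_graph(-1.0)                    # "observed" network statistic

theta, chain = 0.0, []
for _ in range(20000):
    theta_new = theta + rng.normal(0, 0.3)
    s_aux = sample_graph(theta_new)           # auxiliary draw at the proposal
    # Normalizing constants cancel in this ratio (flat prior assumed).
    log_alpha = (theta_new - theta) * (s_obs - s_aux)
    if np.log(rng.uniform()) < log_alpha:
        theta = theta_new
    chain.append(theta)
print("posterior mean of theta:", np.mean(chain[5000:]))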
Hofer, Jeffrey D; Rauk, Adam P
2017-02-01
The purpose of this work was to develop a straightforward and robust approach to analyze and summarize the ability of content uniformity data to meet different criteria. A robust Bayesian statistical analysis methodology is presented which provides a concise and easily interpretable visual summary of the content uniformity analysis results. The visualization displays individual batch analysis results and shows whether there is high confidence that different content uniformity criteria could be met a high percentage of the time in the future. The 3 tests assessed are as follows: (a) United States Pharmacopeia Uniformity of Dosage Units <905>, (b) a specific ASTM E2810 Sampling Plan 1 criterion to potentially be used for routine release testing, and (c) another specific ASTM E2810 Sampling Plan 2 criterion to potentially be used for process validation. The approach shown here could readily be used to create similar result summaries for other potential criteria. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
Functional Interaction Network Construction and Analysis for Disease Discovery.
Wu, Guanming; Haw, Robin
2017-01-01
Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, thereby providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of the data through network modules, and increasing statistical power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of human genes, together with ReactomeFIViz, an app for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe in detail how this functional interaction network is constructed: integrating multiple external data sources, extracting functional interactions from human-curated pathway databases, training a Naïve Bayesian classifier, predicting interactions with the trained classifier, and finally constructing the functional interaction database. We also provide an example of how to use ReactomeFIViz to perform network-based data analysis for a list of genes.
Bayesian data analysis in population ecology: motivations, methods, and benefits
Dorazio, Robert
2016-01-01
During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
Wafer, Lucas; Kloczewiak, Marek; Luo, Yin
2016-07-01
Analytical ultracentrifugation-sedimentation velocity (AUC-SV) is often used to quantify high molar mass species (HMMS) present in biopharmaceuticals. Although these species are often present in trace quantities, they have received significant attention due to their potential immunogenicity. Commonly, AUC-SV data is analyzed as a diffusion-corrected, sedimentation coefficient distribution, or c(s), using SEDFIT to numerically solve Lamm-type equations. SEDFIT also utilizes maximum entropy or Tikhonov-Phillips regularization to further allow the user to determine relevant sample information, including the number of species present, their sedimentation coefficients, and their relative abundance. However, this methodology has several, often unstated, limitations, which may impact the final analysis of protein therapeutics. These include regularization-specific effects, artificial "ripple peaks," and spurious shifts in the sedimentation coefficients. In this investigation, we experimentally verified that an explicit Bayesian approach, as implemented in SEDFIT, can largely correct for these effects. Clear guidelines on how to implement this technique and interpret the resulting data, especially for samples containing micro-heterogeneity (e.g., differential glycosylation), are also provided. In addition, we demonstrated how the Bayesian approach can be combined with F statistics to draw more accurate conclusions and rigorously exclude artifactual peaks. Numerous examples with an antibody and an antibody-drug conjugate were used to illustrate the strengths and drawbacks of each technique.
NASA Astrophysics Data System (ADS)
Echeverria, Alex; Silva, Jorge F.; Mendez, Rene A.; Orchard, Marcos
2016-10-01
Context. The best precision that can be achieved to estimate the location of a stellar-like object is a topic of permanent interest in the astrometric community. Aims: We analyze bounds for the best position estimation of a stellar-like object on a CCD detector array in a Bayesian setting where the position is unknown, but where we have access to a prior distribution. In contrast to a parametric setting where we estimate a parameter from observations, the Bayesian approach estimates a random object (i.e., the position is a random variable) from observations that are statistically dependent on the position. Methods: We characterize the Bayesian Cramér-Rao (CR) bound on the minimum mean square error (MMSE) of the best estimator of the position of a point source on a linear CCD-like detector, as a function of the properties of the detector, the source, and the background. Results: We quantify and analyze the increase in astrometric performance from the use of a prior distribution of the object position, which is not available in the classical parametric setting. This gain is shown to be significant for various observational regimes, in particular in the case of faint objects or when the observations are taken under poor conditions. Furthermore, we present numerical evidence that the MMSE estimator of this problem tightly achieves the Bayesian CR bound. This is a remarkable result, demonstrating that all the performance gains presented in our analysis can be achieved with the MMSE estimator. Conclusions: The Bayesian CR bound can be used as a benchmark indicator of the expected maximum positional precision of a set of astrometric measurements in which prior information can be incorporated. This bound can be achieved through the conditional mean estimator, in contrast to the parametric case where no unbiased estimator precisely reaches the CR bound.
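The conditional-mean (MMSE) estimator discussed here is easy to exhibit numerically. The following sketch, ours with invented detector numbers, grids the posterior of a source position on a 1-D array of Poisson pixel counts with a Gaussian PSF and a Gaussian positional prior, and reads off the posterior mean.

# Grid-based MMSE (conditional-mean) estimate of a source position on a
# 1-D "CCD": Poisson pixel counts, Gaussian PSF, Gaussian prior on position.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(5)
pix = np.arange(32)                            # pixel centers
x_true, flux, fwhm, bkg = 14.3, 200.0, 3.0, 5.0
sig = fwhm / 2.355

def rate(x):                                   # expected counts per pixel
    return bkg + flux * np.exp(-0.5 * ((pix - x) / sig) ** 2)

counts = rng.poisson(rate(x_true))             # one observed frame

grid = np.linspace(5, 27, 2001)                # candidate positions
log_prior = -0.5 * ((grid - 15.0) / 2.0) ** 2  # prior: N(15, 2^2)
log_like = np.array([poisson.logpmf(counts, rate(x)).sum() for x in grid])
log_post = log_prior + log_like
post = np.exp(log_post - log_post.max())
post /= post.sum()

x_mmse = (grid * post).sum()                   # conditional mean = MMSE
print("true:", x_true, " MMSE estimate:", round(x_mmse, 3))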
Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics
NASA Astrophysics Data System (ADS)
Abe, Sumiyoshi
2014-11-01
The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.
ERIC Educational Resources Information Center
Martuza, Victor R.; Engel, John D.
Results from classical power analysis (Brewer, 1972) suggest that a researcher should not set α = p (when p is less than α) in an a posteriori fashion when a study yields statistically significant results, because of a resulting decrease in power. The purpose of the present report is to use Bayesian theory in examining the validity of this…
Bayesian statistical techniques have proven useful in clinical and environmental epidemiological applications to evaluate and integrate available information, and in regulatory applications such as the National Ambient Air Quality Assessment for Nitrogen Oxides. A recent special...
A Bayesian Approach to Interactive Retrieval
ERIC Educational Resources Information Center
Tague, Jean M.
1973-01-01
A probabilistic model for interactive retrieval is presented. Bayesian statistical decision theory principles are applied: use of prior and sample information about the relationship of document descriptions to query relevance; maximization of expected value of a utility function, to the problem of optimally restructuring search strategies in an…
Personalized Multi-Student Improvement Based on Bayesian Cybernetics
ERIC Educational Resources Information Center
Kaburlasos, Vassilis G.; Marinagi, Catherine C.; Tsoukalas, Vassilis Th.
2008-01-01
This work presents innovative cybernetics (feedback) techniques based on Bayesian statistics for drawing questions from an Item Bank towards personalized multi-student improvement. A novel software tool, namely "Module for Adaptive Assessment of Students" (or, "MAAS" for short), implements the proposed (feedback) techniques. In conclusion, a pilot…
Scliar, Marilia O; Gouveia, Mateus H; Benazzo, Andrea; Ghirotto, Silvia; Fagundes, Nelson J R; Leal, Thiago P; Magalhães, Wagner C S; Pereira, Latife; Rodrigues, Maira R; Soares-Souza, Giordano B; Cabrera, Lilia; Berg, Douglas E; Gilman, Robert H; Bertorelle, Giorgio; Tarazona-Santos, Eduardo
2014-09-30
Archaeological evidence documents millennia of cultural contact between the Peruvian Coast-Andes and the Amazon Yunga, a transitional rainforest region between the Andes and Lower Amazonia. To clarify the relationships between cultural and biological evolution of these populations, in particular between Amazon Yungas and Andeans, we used DNA-sequence data, a model-based Bayesian approach and several statistical validations to infer a set of demographic parameters. We found that the genetic diversity of the Shimaa (an Amazon Yunga population) is a subset of that of Quechuas from the Central Andes. Using the Isolation-with-Migration population genetics model, we inferred that the Shimaa ancestors were a small subgroup that split less than 5300 years ago (after the development of complex societies) from an ancestral Andean population. After the split, the most plausible scenario compatible with our results is that the ancestors of Shimaas moved toward the Peruvian Amazon Yunga and incorporated the culture and language of some of their neighbors, but not a substantial amount of their genes. We validated our results using Approximate Bayesian Computations, posterior predictive tests and the analysis of pseudo-observed datasets. We present a case study in which model-based Bayesian approaches, combined with necessary statistical validations, shed light on the prehistoric demographic relationship between Andeans and a population from the Amazon Yunga. Our results offer a testable model for the peopling of this large transitional environmental region between the Andes and the Lower Amazonia. However, studies on larger samples and involving more populations of these regions are necessary to confirm whether the predominant Andean biological origin of the Shimaas is the rule, and not the exception.
NASA Astrophysics Data System (ADS)
Eadie, Gwendolyn M.; Springford, Aaron; Harris, William E.
2017-02-01
We present a hierarchical Bayesian method for estimating the total mass and mass profile of the Milky Way Galaxy. The new hierarchical Bayesian approach further improves the framework presented by Eadie et al. and Eadie and Harris and builds upon the preliminary reports by Eadie et al. The method uses a distribution function f(E, L) to model the Galaxy and kinematic data from satellite objects, such as globular clusters (GCs), to trace the Galaxy’s gravitational potential. A major advantage of the method is that it not only includes complete and incomplete data simultaneously in the analysis, but also incorporates measurement uncertainties in a coherent and meaningful way. We first test the hierarchical Bayesian framework, which includes measurement uncertainties, using the same data and power-law model assumed in Eadie and Harris and find the results are similar but more strongly constrained. Next, we take advantage of the new statistical framework and incorporate all possible GC data, finding a cumulative mass profile with Bayesian credible regions. This profile implies a mass within 125 kpc of 4.8 × 10^11 M⊙ with a 95% Bayesian credible region of (4.0–5.8) × 10^11 M⊙. Our results also provide estimates of the true specific energies of all the GCs. By comparing these estimated energies to the measured energies of GCs with complete velocity measurements, we observe that (the few) remote tracers with complete measurements may play a large role in determining a total mass estimate of the Galaxy. Thus, our study stresses the need for more remote tracers with complete velocity measurements.
Compromise decision support problems for hierarchical design involving uncertainty
NASA Astrophysics Data System (ADS)
Vadde, S.; Allen, J. K.; Mistree, F.
1994-08-01
In this paper an extension to the traditional compromise Decision Support Problem (DSP) formulation is presented. Bayesian statistics is used in the formulation to model uncertainties associated with the information being used. In an earlier paper a compromise DSP that accounts for uncertainty using fuzzy set theory was introduced. The Bayesian Decision Support Problem is described in this paper. The method for hierarchical design is demonstrated by using this formulation to design a portal frame. The results are discussed and comparisons are made with those obtained using the fuzzy DSP. Finally, the efficacy of incorporating Bayesian statistics into the traditional compromise DSP formulation is discussed and some pending research issues are described. Our emphasis in this paper is on the method rather than the results per se.
Derbridge, Jonathan J; Merkle, Jerod A; Bucci, Melanie E; Callahan, Peggy; Koprowski, John L; Polfus, Jean L; Krausman, Paul R
2015-01-01
Stable isotope analysis of diet has become a common tool in conservation research. However, the multiple sources of uncertainty inherent in this analysis framework involve consequences that have not been thoroughly addressed. Uncertainty arises from the choice of trophic discrimination factors, and for Bayesian stable isotope mixing models (SIMMs), the specification of prior information; the combined effect of these aspects has not been explicitly tested. We used a captive feeding study of gray wolves (Canis lupus) to determine the first experimentally-derived trophic discrimination factors of C and N for this large carnivore of broad conservation interest. Using the estimated diet in our controlled system and data from a published study on wild wolves and their prey in Montana, USA, we then investigated the simultaneous effect of discrimination factors and prior information on diet reconstruction with Bayesian SIMMs. Discrimination factors for gray wolves and their prey were 1.97‰ for δ13C and 3.04‰ for δ15N. Specifying wolf discrimination factors, as opposed to the commonly used red fox (Vulpes vulpes) factors, made little practical difference to estimates of wolf diet, but prior information had a strong effect on bias, precision, and accuracy of posterior estimates. Without specifying prior information in our Bayesian SIMM, it was not possible to produce SIMM posteriors statistically similar to the estimated diet in our controlled study or the diet of wild wolves. Our study demonstrates the critical effect of prior information on estimates of animal diets using Bayesian SIMMs, and suggests species-specific trophic discrimination factors are of secondary importance. When using stable isotope analysis to inform conservation decisions researchers should understand the limits of their data. It may be difficult to obtain useful information from SIMMs if informative priors are omitted and species-specific discrimination factors are unavailable.
Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum
2011-01-01
Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…
Is probabilistic bias analysis approximately Bayesian?
MacLehose, Richard F.; Gustafson, Paul
2011-01-01
Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. While the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well. PMID:22157311
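The iterative sampling referred to above is straightforward to sketch. Below is a minimal Monte Carlo bias analysis of our own construction: sensitivity and specificity are drawn from invented beta priors, the observed 2x2 table is back-corrected, and the distribution of the corrected odds ratio is summarized.

# Monte Carlo probabilistic bias analysis for exposure misclassification:
# draw (sensitivity, specificity) from priors, back-correct the 2x2 table,
# and accumulate the distribution of the corrected odds ratio.
import numpy as np

rng = np.random.default_rng(6)
# Observed (misclassified) counts: a, b = exposed cases/controls; c, d = unexposed.
a, b, c, d = 120, 80, 380, 420

ors = []
for _ in range(20000):
    se = rng.beta(80, 20)                      # prior on sensitivity (~0.8)
    sp = rng.beta(95, 5)                       # prior on specificity (~0.95)
    # Back-calculate the true exposed counts among cases and controls.
    A = (a - (1 - sp) * (a + c)) / (se + sp - 1)
    B = (b - (1 - sp) * (b + d)) / (se + sp - 1)
    C, D = (a + c) - A, (b + d) - B
    if min(A, B, C, D) > 0:                    # keep only admissible corrections
        ors.append((A * D) / (B * C))

ors = np.array(ors)
print("median corrected OR:", np.median(ors).round(2),
      " 95% interval:", np.percentile(ors, [2.5, 97.5]).round(2))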
ERIC Educational Resources Information Center
Yuan, Ying; MacKinnon, David P.
2009-01-01
In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…
NASA Astrophysics Data System (ADS)
Olugboji, T. M.; Lekic, V.; McDonough, W.
2017-07-01
We present a new approach for evaluating existing crustal models using ambient noise data sets and its associated uncertainties. We use a transdimensional hierarchical Bayesian inversion approach to invert ambient noise surface wave phase dispersion maps for Love and Rayleigh waves using measurements obtained from Ekström (2014). Spatiospectral analysis shows that our results are comparable to a linear least squares inverse approach (except at higher harmonic degrees), but the procedure has additional advantages: (1) it yields an autoadaptive parameterization that follows Earth structure without making restricting assumptions on model resolution (regularization or damping) and data errors; (2) it can recover non-Gaussian phase velocity probability distributions while quantifying the sources of uncertainties in the data measurements and modeling procedure; and (3) it enables statistical assessments of different crustal models (e.g., CRUST1.0, LITHO1.0, and NACr14) using variable resolution residual and standard deviation maps estimated from the ensemble. These assessments show that in the stable old crust of the Archean, the misfits are statistically negligible, requiring no significant update to crustal models from the ambient noise data set. In other regions of the U.S., significant updates to regionalization and crustal structure are expected especially in the shallow sedimentary basins and the tectonically active regions, where the differences between model predictions and data are statistically significant.
Bayesian change-point analysis reveals developmental change in a classic theory of mind task.
Baker, Sara T; Leslie, Alan M; Gallistel, C R; Hood, Bruce M
2016-12-01
Although learning and development reflect changes situated in an individual brain, most discussions of behavioral change are based on the evidence of group averages. Our reliance on group-averaged data creates a dilemma. On the one hand, we need to use traditional inferential statistics. On the other hand, group averages are highly ambiguous when we need to understand change in the individual; the average pattern of change may characterize all, some, or none of the individuals in the group. Here we present a new method for statistically characterizing developmental change in each individual child we study. Using false-belief tasks, fifty-two children in two cohorts were repeatedly tested for varying lengths of time between 3 and 5 years of age. Using a novel Bayesian change point analysis, we determined both the presence and, just as importantly, the absence of change in individual longitudinal cumulative records. Whenever the analysis supports a change conclusion, it identifies in that child's record the most likely point at which change occurred. Results show striking variability in patterns of change and stability across individual children. We then group the individuals by their various patterns of change or no change. The resulting patterns provide scarce support for sudden changes in competence and shed new light on the concepts of "passing" and "failing" in developmental studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
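For a flavor of how such an analysis can work, here is a small conjugate sketch of ours (not the authors' method): a single child's simulated pass/fail record is scored with Beta-Bernoulli marginal likelihoods, comparing a no-change model against a change after each possible trial.

# Change-point analysis of one child's pass(1)/fail(0) record: compare the
# marginal likelihood of "no change" against a single change at each trial.
import numpy as np
from scipy.special import betaln

rng = np.random.default_rng(7)
y = np.r_[rng.binomial(1, 0.2, 25), rng.binomial(1, 0.8, 25)]  # change at 25

def log_ml(seg):                               # Beta(1,1)-Bernoulli marginal
    k, n = seg.sum(), seg.size
    return betaln(1 + k, 1 + n - k) - betaln(1, 1)

log_m0 = log_ml(y)                             # no-change model
log_m1 = np.array([log_ml(y[:t]) + log_ml(y[t:])
                   for t in range(1, y.size)]) # change after trial t

# Bayes factor for change vs no change (uniform prior over change points).
log_bf = np.logaddexp.reduce(log_m1) - np.log(log_m1.size) - log_m0
print("log Bayes factor for a change:", round(log_bf, 2))
print("most likely change point:", int(np.argmax(log_m1)) + 1)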
Spatio-temporal Genetic Structuring of Leishmania major in Tunisia by Microsatellite Analysis
Harrabi, Myriam; Bettaieb, Jihène; Ghawar, Wissem; Toumi, Amine; Zaâtour, Amor; Yazidi, Rihab; Chaâbane, Sana; Chalghaf, Bilel; Hide, Mallorie; Bañuls, Anne-Laure; Ben Salah, Afif
2015-01-01
In Tunisia, cases of zoonotic cutaneous leishmaniasis caused by Leishmania major are increasing and spreading from the south-west to new areas in the center. To improve the current knowledge on L. major evolution and population dynamics, we performed multi-locus microsatellite typing of human isolates from Tunisian governorates where the disease is endemic (Gafsa, Kairouan and Sidi Bouzid governorates) and collected during two periods: 1991–1992 and 2008–2012. Analysis (F-statistics and Bayesian model-based approach) of the genotyping results of isolates collected in Sidi Bouzid in 1991–1992 and 2008–2012 shows that, over two decades, in the same area, Leishmania parasites evolved by generating genetically differentiated populations. The genetic patterns of 2008–2012 isolates from the three governorates indicate that L. major populations did not spread gradually from the south to the center of Tunisia, according to a geographical gradient, suggesting that human activities might be the source of the disease expansion. The genotype analysis also suggests previous (Bayesian model-based approach) and current (F-statistics) flows of genotypes between governorates and districts. Human activities as well as reservoir dynamics and the effects of environmental changes could explain how the disease progresses. This study provides new insights into the evolution and spread of L. major in Tunisia that might improve our understanding of the parasite flow between geographically and temporally distinct populations. PMID:26302440
TOWARDS A BAYESIAN PERSPECTIVE ON STATISTICAL DISCLOSURE LIMITATION
National statistical offices and other organizations collect data on individual subjects (person, businesses, organizations), typically while assuring the subject that data pertaining to them will be held confidential. These data provide the raw material for statistical data pro...
Bayesian analysis of biogeography when the number of areas is large.
Landis, Michael J; Matzke, Nicholas J; Moore, Brian R; Huelsenbeck, John P
2013-11-01
Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a "data-augmentation" approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program BayArea.
A critique of statistical hypothesis testing in clinical research
Raha, Somik
2011-01-01
Many have documented the difficulty of using the current paradigm of Randomized Controlled Trials (RCTs) to test and validate the effectiveness of alternative medical systems such as Ayurveda. This paper critiques the applicability of RCTs for all clinical knowledge-seeking endeavors, of which Ayurveda research is a part. This is done by examining statistical hypothesis testing, the underlying foundation of RCTs, from a practical and philosophical perspective. In the philosophical critique, the two main worldviews of probability are those of the Bayesian and the frequentist. The frequentist worldview is a special case of the Bayesian worldview, requiring the unrealistic assumptions of knowing nothing about the universe and believing that all observations are unrelated to each other. Many have claimed that the first belief is necessary for science, and this claim is debunked by comparing variations in learning with different prior beliefs. Moving beyond the Bayesian and frequentist worldviews, the notion of hypothesis testing itself is challenged on the grounds that a hypothesis is an unclear distinction, and assigning a probability to an unclear distinction is an exercise that does not lead to clarity of action. This critique is of the theory itself and not any particular application of statistical hypothesis testing. A decision-making frame is proposed as a way of both addressing this critique and transcending ideological debates on probability. An example of a Bayesian decision-making approach is shown as an alternative to statistical hypothesis testing, utilizing data from a past clinical trial that studied the effect of Aspirin on heart attacks in a sample population of doctors. Because a major reason for the prevalence of RCTs in academia is legislation requiring them, the ethics of legislating the use of statistical methods for clinical research is also examined. PMID:22022152
Bayesian analysis of the kinetics of quantal transmitter secretion at the neuromuscular junction.
Saveliev, Anatoly; Khuzakhmetova, Venera; Samigullin, Dmitry; Skorinkin, Andrey; Kovyazina, Irina; Nikolsky, Eugeny; Bukharaeva, Ellya
2015-10-01
The timing of transmitter release from nerve endings is considered nowadays as one of the factors determining the plasticity and efficacy of synaptic transmission. In the neuromuscular junction, the moments of release of individual acetylcholine quanta are related to the synaptic delays of uniquantal endplate currents recorded under conditions of lowered extracellular calcium. Using Bayesian modelling, we performed a statistical analysis of synaptic delays in mouse neuromuscular junction with different patterns of rhythmic nerve stimulation and when the entry of calcium ions into the nerve terminal was modified. We have obtained a statistical model of the release timing which is represented as the summation of two independent statistical distributions. The first of these is the exponentially modified Gaussian distribution. The mixture of normal and exponential components in this distribution can be interpreted as a two-stage mechanism of early and late periods of phasic synchronous secretion. The parameters of this distribution depend on both the stimulation frequency of the motor nerve and the calcium ions' entry conditions. The second distribution was modelled as quasi-uniform, with parameters independent of nerve stimulation frequency and calcium entry. Two different probability density functions for the distribution of synaptic delays suggest at least two independent processes controlling the time course of secretion, one of them potentially involving two stages. The relative contribution of these processes to the total number of mediator quanta released depends differently on the motor nerve stimulation pattern and on calcium ion entry into nerve endings.
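The two-component delay distribution described here can be explored directly in scipy, which implements the exponentially modified Gaussian as exponnorm (shape K = tau/sigma). The sketch below is our illustration with invented parameters: simulate phasic ex-Gaussian delays plus a quasi-uniform late component, then recover the ex-Gaussian parameters from the early part of the record.

# Synaptic-delay model sketch: a mixture of an exponentially modified
# Gaussian (phasic secretion) and a quasi-uniform late-release component.
import numpy as np
from scipy.stats import exponnorm

rng = np.random.default_rng(8)

mu, sigma, tau = 0.5, 0.1, 0.3                 # ms; ex-Gaussian parameters
n_phasic, n_late = 900, 100

phasic = rng.normal(mu, sigma, n_phasic) + rng.exponential(tau, n_phasic)
late = rng.uniform(0, 5.0, n_late)             # quasi-uniform delayed release
delays = np.r_[phasic, late]

# Crude fit of the dominant ex-Gaussian component (shape K = tau / sigma);
# restricting to the early window limits contamination by the late component.
K, loc, scale = exponnorm.fit(delays[delays < 2.0])
print("fitted mu ~", round(loc, 3), " sigma ~", round(scale, 3),
      " tau ~", round(K * scale, 3))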
Enhancing pediatric clinical trial feasibility through the use of Bayesian statistics.
Huff, Robin A; Maca, Jeff D; Puri, Mala; Seltzer, Earl W
2017-11-01
Background: Pediatric clinical trials commonly experience recruitment challenges, including a limited number of patients and investigators, inclusion/exclusion criteria that further reduce the patient pool, and a competitive research landscape created by pediatric regulatory commitments. To overcome these challenges, innovative approaches are needed. Methods: This article explores the use of Bayesian statistics to improve pediatric trial feasibility, using pediatric Type-2 diabetes as an example. Data for six therapies approved for adults were used to perform simulations to determine the impact on pediatric trial size. Results: When the number of adult patients contributing to the simulation was assumed to be the same as the number of patients to be enrolled in the pediatric trial, the pediatric trial size was reduced by 75-78% when compared with a frequentist statistical approach, but was associated with a 34-45% false-positive rate. In subsequent simulations, greater control was exerted over the false-positive rate by decreasing the contribution of the adult data. A 30-33% reduction in trial size was achieved when false-positives were held to less than 10%. Conclusion: Reducing the trial size through the use of Bayesian statistics would facilitate completion of pediatric trials, enabling drugs to be labeled appropriately for children.
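The kind of borrowing being simulated can be sketched with a conjugate power-prior toy model, ours rather than the article's actual simulation design: adult data enter the pediatric posterior downweighted by a factor a0, and the choice of a0 plays the same role as the "contribution of the adult data" above in trading off trial size against false positives.

# Power-prior borrowing sketch (normal outcome, known variance): the adult
# likelihood enters the pediatric posterior downweighted by a factor a0.
from math import erfc

sigma = 1.0                                    # assumed common outcome s.d.
adult_n, adult_mean = 300, 0.40                # adult treatment-effect data
ped_n, ped_mean = 60, 0.35                     # small pediatric trial
a0 = 0.3                                       # discount on adult information

# Conjugate update: precisions add, means combine with a0-weighted adult data.
prec = (ped_n + a0 * adult_n) / sigma ** 2
post_mean = (ped_n * ped_mean + a0 * adult_n * adult_mean) / (ped_n + a0 * adult_n)
post_sd = prec ** -0.5

print("posterior effect: %.3f +/- %.3f" % (post_mean, post_sd))
# Posterior probability that the treatment effect is positive.
print("P(effect > 0): %.4f" % (0.5 * erfc(-post_mean / (post_sd * 2 ** 0.5))))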
Theory-based Bayesian models of inductive learning and reasoning.
Tenenbaum, Joshua B; Griffiths, Thomas L; Kemp, Charles
2006-07-01
Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. We argue that both components are necessary to explain the nature, use and acquisition of human knowledge, and we introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.
Data free inference with processed data products
Chowdhary, K.; Najm, H. N.
2014-07-12
Here, we consider the context of probabilistic inference of model parameters given error bars or confidence intervals on model output values, when the data is unavailable. We introduce a class of algorithms in a Bayesian framework, relying on maximum entropy arguments and approximate Bayesian computation methods, to generate consistent data with the given summary statistics. Once we obtain consistent data sets, we pool the respective posteriors, to arrive at a single, averaged density on the parameters. This approach allows us to perform accurate forward uncertainty propagation consistent with the reported statistics.
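A rejection-ABC version of this idea fits in a few lines. The sketch below is ours and deliberately simple: only a reported sample mean and 95% confidence interval are available, pseudo-datasets are simulated from candidate parameters, and a parameter is retained when the simulated summaries reproduce the reported ones.

# Data-free inference sketch: only a reported mean and 95% CI are available.
# Generate pseudo-data consistent with the summary, then do rejection ABC.
import numpy as np

rng = np.random.default_rng(9)
rep_mean, rep_lo, rep_hi, n = 2.1, 1.7, 2.5, 30    # reported statistics
rep_se = (rep_hi - rep_lo) / (2 * 1.96)

accepted = []
for _ in range(200000):
    mu = rng.uniform(0, 5)                         # prior on the parameter
    sim = rng.normal(mu, rep_se * np.sqrt(n), n)   # simulated raw data
    # Accept if the simulated summaries reproduce the reported ones.
    if (abs(sim.mean() - rep_mean) < 0.05
            and abs(sim.std(ddof=1) / np.sqrt(n) - rep_se) < 0.02):
        accepted.append(mu)

accepted = np.array(accepted)
print("ABC posterior mean: %.2f, 95%% interval: %s"
      % (accepted.mean(), np.percentile(accepted, [2.5, 97.5]).round(2)))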
Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals
Fourment, Mathieu; Claywell, Brian C; Dinh, Vu; McCoy, Connor; Matsen IV, Frederick A; Darling, Aaron E
2018-01-01
Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop “guided” proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy. PMID:29186587
Henschel, Volkmar; Engel, Jutta; Hölzel, Dieter; Mansmann, Ulrich
2009-02-10
Multivariate analysis of interval-censored event data based on classical likelihood methods is notoriously cumbersome, and likelihood inference for models that additionally include random effects is not available at all. Existing algorithms pose practical problems, such as matrix inversion, slow convergence, and no assessment of statistical uncertainty. MCMC procedures combined with imputation are used to implement hierarchical models for interval-censored data within a Bayesian framework. Two examples from clinical practice demonstrate the handling of clustered interval-censored event times as well as multilayer random effects for inter-institutional quality assessment. The software developed is called survBayes and is freely available at CRAN. The proposed software supports the solution of complex analyses in many fields of clinical epidemiology as well as health services research.
Quantum-Like Representation of Non-Bayesian Inference
NASA Astrophysics Data System (ADS)
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are experimental studies whose statistical data cannot be described by classical probability theory, and the process of decision making generating these data cannot be reduced to classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making have been proposed. Our previous work represented classical Bayesian inference in a natural way within the framework of quantum mechanics. Using this representation, in this paper we discuss non-Bayesian (irrational) inference that is biased by effects like quantum interference. Further, we describe the "psychological factor" disturbing "rationality" as an "environment" correlated with the "main system" of usual Bayesian inference.
Cyber-T web server: differential analysis of high-throughput data.
Kayala, Matthew A; Baldi, Pierre
2012-07-01
The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001;17:509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived from pooling measurements associated with probes in the same neighborhood. This approach flexibly addresses problems associated with low replication levels and technology biases, not only for DNA microarrays, but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several useful new additions and improvements. Several preprocessing normalization options, including logarithmic and variance-stabilizing normalization (VSN) transforms, are included. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple-test correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.
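The regularized variance at the heart of this approach has a simple generic form: a weighted combination of a background variance with each gene's empirical variance. The sketch below is our paraphrase, not the server's code; for brevity the background is a median across genes rather than Cyber-T's expression-level neighborhood.

# Regularized t-statistic in the spirit of Cyber-T: shrink each gene's
# variance toward a background estimate pooled across genes.
import numpy as np

rng = np.random.default_rng(10)
n_genes, n_rep = 1000, 3
data_a = rng.normal(0.0, 1.0, (n_genes, n_rep))
data_b = rng.normal(0.0, 1.0, (n_genes, n_rep))
data_b[:50] += 1.5                               # 50 truly changed genes

def reg_var(x, nu0=10):
    s2 = x.var(axis=1, ddof=1)
    # Background variance: here simply the median across genes; Cyber-T pools
    # over a window of genes with similar expression level.
    s2_bg = np.median(s2)
    n = x.shape[1]
    return (nu0 * s2_bg + (n - 1) * s2) / (nu0 + n - 1)

v = reg_var(data_a) / n_rep + reg_var(data_b) / n_rep
t = (data_b.mean(1) - data_a.mean(1)) / np.sqrt(v)
print("mean |t|, changed vs unchanged genes:",
      np.abs(t[:50]).mean().round(2), np.abs(t[50:]).mean().round(2))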
Bayesian Estimation in the One-Parameter Latent Trait Model.
1980-03-01
Computer-Based Model Calibration and Uncertainty Analysis: Terms and Concepts
2015-07-01
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets come the inherent challenges of new methods of statistical analysis and modeling. Because a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
When decision heuristics and science collide.
Yu, Erica C; Sprenger, Amber M; Thomas, Rick P; Dougherty, Michael R
2014-04-01
The ongoing discussion among scientists about null-hypothesis significance testing and Bayesian data analysis has led to speculation about the practices and consequences of "researcher degrees of freedom." This article advances this debate by asking the broader questions that we, as scientists, should be asking: How do scientists make decisions in the course of doing research, and what is the impact of these decisions on scientific conclusions? We asked practicing scientists to collect data in a simulated research environment, and our findings show that some scientists use data collection heuristics that deviate from prescribed methodology. Monte Carlo simulations show that data collection heuristics based on p values lead to biases in estimated effect sizes and Bayes factors and to increases in both false-positive and false-negative rates, depending on the specific heuristic. We also show that using Bayesian data collection methods does not eliminate these biases. Thus, our study highlights the little appreciated fact that the process of doing science is a behavioral endeavor that can bias statistical description and inference in a manner that transcends adherence to any particular statistical framework.
Tom, Jennifer A.; Sinsheimer, Janet S.; Suchard, Marc A.
2015-01-01
Massive datasets in the gigabyte and terabyte range combined with the availability of increasingly sophisticated statistical tools yield analyses at the boundary of what is computationally feasible. Compromising in the face of this computational burden by partitioning the dataset into more tractable sizes results in stratified analyses, removed from the context that justified the initial data collection. In a Bayesian framework, these stratified analyses generate intermediate realizations, often compared using point estimates that fail to account for the variability within and correlation between the distributions these realizations approximate. However, although the initial concession to stratify generally precludes the more sensible analysis using a single joint hierarchical model, we can circumvent this outcome and capitalize on the intermediate realizations by extending the dynamic iterative reweighting MCMC algorithm. In doing so, we reuse the available realizations by reweighting them with importance weights, recycling them into a now tractable joint hierarchical model. We apply this technique to intermediate realizations generated from stratified analyses of 687 influenza A genomes spanning 13 years allowing us to revisit hypotheses regarding the evolutionary history of influenza within a hierarchical statistical framework. PMID:26681992
A Bayesian approach to reliability and confidence
NASA Technical Reports Server (NTRS)
Barnes, Ron
1989-01-01
The historical evolution of NASA's interest in quantitative measures of reliability assessment is outlined. The introduction of some quantitative methodologies into the Vehicle Reliability Branch of the Safety, Reliability and Quality Assurance (SR and QA) Division at Johnson Space Center (JSC) was noted along with the development of the Extended Orbiter Duration--Weakest Link study which will utilize quantitative tools for a Bayesian statistical analysis. Extending the earlier work of NASA sponsor, Richard Heydorn, researchers were able to produce a consistent Bayesian estimate for the reliability of a component and hence by a simple extension for a system of components in some cases where the rate of failure is not constant but varies over time. Mechanical systems in general have this property since the reliability usually decreases markedly as the parts degrade over time. While they have been able to reduce the Bayesian estimator to a simple closed form for a large class of such systems, the form for the most general case needs to be attacked by the computer. Once a table is generated for this form, researchers will have a numerical form for the general solution. With this, the corresponding probability statements about the reliability of a system can be made in the most general setting. Note that the utilization of uniform Bayesian priors represents a worst case scenario in the sense that as researchers incorporate more expert opinion into the model, they will be able to improve the strength of the probability calculations.
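For the pass/fail, constant-failure-rate case, the uniform-prior calculation mentioned above reduces to a closed-form Beta posterior. The sketch below, ours with invented numbers, shows the kind of probability statements about reliability that result.

# Bayesian reliability with a uniform prior: after s successes in n trials,
# the reliability R has posterior Beta(s + 1, n - s + 1).
from scipy.stats import beta

n, s = 50, 48                                   # trials and successes
post = beta(s + 1, n - s + 1)

print("posterior mean reliability:", round(post.mean(), 4))
print("95% lower credible bound  :", round(post.ppf(0.05), 4))
print("P(R > 0.9)                :", round(post.sf(0.9), 4))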
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Bayesian Analysis of High Dimensional Classification
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Subhadeep; Liang, Faming
2009-12-01
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. In these cases, there is considerable interest in searching for sparse models in the high-dimensional regression/classification setting. We first discuss two common challenges for analyzing high-dimensional data. The first is the curse of dimensionality: the complexity of many existing algorithms scales exponentially with the dimensionality of the space, so that the algorithms soon become computationally intractable and therefore inapplicable in many real applications. The second is multicollinearity among the predictors, which severely slows down the algorithms. In order to make Bayesian analysis operational in high dimensions, we propose a novel Hierarchical Stochastic Approximation Monte Carlo (HSAMC) algorithm, which overcomes the curse of dimensionality and the multicollinearity of predictors in high dimensions, and which possesses a self-adjusting mechanism for avoiding local minima separated by high energy barriers. Models and methods are illustrated by simulations inspired by the field of genomics. Numerical results indicate that HSAMC can serve as a general model-selection sampler in high-dimensional complex model spaces.
Modeling Error Distributions of Growth Curve Models through Bayesian Methods
ERIC Educational Resources Information Center
Zhang, Zhiyong
2016-01-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is…
Application of Bayesian Methods for Detecting Fraudulent Behavior on Tests
ERIC Educational Resources Information Center
Sinharay, Sandip
2018-01-01
Producers and consumers of test scores are increasingly concerned about fraudulent behavior before and during the test. There exist several statistical or psychometric methods for detecting fraudulent behavior on tests. This paper provides a review of the Bayesian approaches among them. Four hitherto-unpublished real data examples are provided to…
A baker's dozen of new particle flows for nonlinear filters, Bayesian decisions and transport
NASA Astrophysics Data System (ADS)
Daum, Fred; Huang, Jim
2015-05-01
We describe a baker's dozen of new particle flows to compute Bayes' rule for nonlinear filters, Bayesian decisions and learning as well as transport. Several of these new flows were inspired by transport theory, but others were inspired by physics or statistics or Markov chain Monte Carlo methods.
Ramachandran, Parameswaran; Sánchez-Taltavull, Daniel; Perkins, Theodore J
2017-01-01
Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing, with its unique statistical properties, became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca. PMID:28817636
Bayesian characterization of uncertainty in species interaction strengths.
Wolf, Christopher; Novak, Mark; Gitelman, Alix I
2017-06-01
Considerable effort has been devoted to the estimation of species interaction strengths. This effort has focused primarily on statistical significance testing and obtaining point estimates of parameters that contribute to interaction strength magnitudes, leaving the characterization of uncertainty associated with those estimates unconsidered. We consider a means of characterizing the uncertainty of a generalist predator's interaction strengths by formulating an observational method for estimating a predator's prey-specific per capita attack rates as a Bayesian statistical model. This formulation permits the explicit incorporation of multiple sources of uncertainty. A key insight is the informative nature of several so-called non-informative priors that have been used in modeling the sparse data typical of predator feeding surveys. We introduce to ecology a new neutral prior and provide evidence for its superior performance. We use a case study to consider the attack rates in a New Zealand intertidal whelk predator, and we illustrate not only that Bayesian point estimates can be made to correspond with those obtained by frequentist approaches, but also that estimation uncertainty as described by 95% intervals is more useful and biologically realistic using the Bayesian method. In particular, unlike in bootstrap confidence intervals, the lower bounds of the Bayesian posterior intervals for attack rates do not include zero when a predator-prey interaction is in fact observed. We conclude that the Bayesian framework provides a straightforward, probabilistic characterization of interaction strength uncertainty, enabling future considerations of both the deterministic and stochastic drivers of interaction strength and their impact on food webs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marzouk, Youssef
Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in the support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computational expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesian inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale sequential data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.
NASA Astrophysics Data System (ADS)
Hincks, Ian; Granade, Christopher; Cory, David G.
2018-01-01
The analysis of photon count data from the standard nitrogen vacancy (NV) measurement process is treated as a statistical inference problem. This has applications toward gaining better and more rigorous error bars for tasks such as parameter estimation (e.g. magnetometry), tomography, and randomized benchmarking. We start by providing a summary of the standard phenomenological model of the NV optical process in terms of Lindblad jump operators. This model is used to derive random variables describing emitted photons during measurement, to which finite visibility, dark counts, and imperfect state preparation are added. NV spin-state measurement is then stated as an abstract statistical inference problem consisting of an underlying biased coin obstructed by three Poisson rates. Relevant frequentist and Bayesian estimators are provided, discussed, and quantitatively compared. We show numerically that the risk of the maximum likelihood estimator is well approximated by the Cramér-Rao bound, for which we provide a simple formula. Of the estimators, we in particular promote the Bayes estimator, owing to its slightly better risk performance, and straightforward error propagation into more complex experiments. This is illustrated on experimental data, where quantum Hamiltonian learning is performed and cross-validated in a fully Bayesian setting, and compared to a more traditional weighted least squares fit.
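A toy version of the inference problem, assuming just two Poisson emission rates and ignoring dark counts and imperfect preparation (all rates hypothetical), with a grid-based Bayes estimator under a uniform prior:

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
# Toy NV readout model: each shot emits Poisson(alpha1) photons in the bright
# state and Poisson(alpha0) in the dark state; p is the bright-state probability.
alpha0, alpha1, p_true, shots = 2.0, 6.0, 0.7, 500
bright = rng.random(shots) < p_true
counts = rng.poisson(np.where(bright, alpha1, alpha0))

# Grid-based posterior for p under a uniform prior on [0, 1].
grid = np.linspace(0.0, 1.0, 1001)
lik0 = poisson.pmf(counts[:, None], alpha0)   # shape (shots, 1)
lik1 = poisson.pmf(counts[:, None], alpha1)
log_post = np.sum(np.log(grid * lik1 + (1 - grid) * lik0), axis=0)
post = np.exp(log_post - log_post.max())
post /= post.sum()
print("Bayes posterior mean for p:", np.sum(grid * post))
```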
Love, Jeffrey J.
2012-01-01
Statistical analysis is made of rare, extreme geophysical events recorded in historical data -- counting the number of events $k$ with sizes that exceed chosen thresholds during specific durations of time $\\tau$. Under transformations that stabilize data and model-parameter variances, the most likely Poisson-event occurrence rate, $k/\\tau$, applies for frequentist inference and, also, for Bayesian inference with a Jeffreys prior that ensures posterior invariance under changes of variables. Frequentist confidence intervals and Bayesian (Jeffreys) credibility intervals are approximately the same and easy to calculate: $(1/\\tau)[(\\sqrt{k} - z/2)^{2},(\\sqrt{k} + z/2)^{2}]$, where $z$ is a parameter that specifies the width, $z=1$ ($z=2$) corresponding to $1\\sigma$, $68.3\\%$ ($2\\sigma$, $95.4\\%$). If only a few events have been observed, as is usually the case for extreme events, then these "error-bar" intervals might be considered to be relatively wide. From historical records, we estimate most likely long-term occurrence rates, 10-yr occurrence probabilities, and intervals of frequentist confidence and Bayesian credibility for large earthquakes, explosive volcanic eruptions, and magnetic storms.
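The quoted interval is simple enough to compute directly; a minimal sketch of the variance-stabilized formula above:

```python
import math

def poisson_rate_interval(k, tau, z=1.0):
    """Approximate frequentist-confidence / Bayesian (Jeffreys) credibility
    interval for a Poisson occurrence rate, per the formula quoted above:
    (1/tau) * [(sqrt(k) - z/2)^2, (sqrt(k) + z/2)^2]."""
    lo = max(math.sqrt(k) - z / 2.0, 0.0) ** 2 / tau
    hi = (math.sqrt(k) + z / 2.0) ** 2 / tau
    return lo, hi

# e.g. k = 4 extreme events observed over tau = 100 years, 1-sigma (68.3%) width
print(poisson_rate_interval(4, 100.0, z=1.0))  # ~ (0.0225, 0.0625) events/yr
```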
Efficient Posterior Probability Mapping Using Savage-Dickey Ratios
Penny, William D.; Ridgway, Gerard R.
2013-01-01
Statistical Parametric Mapping (SPM) is the dominant paradigm for mass-univariate analysis of neuroimaging data. More recently, a Bayesian approach termed Posterior Probability Mapping (PPM) has been proposed as an alternative. PPM offers two advantages: (i) inferences can be made about effect size thus lending a precise physiological meaning to activated regions, (ii) regions can be declared inactive. This latter facility is most parsimoniously provided by PPMs based on Bayesian model comparisons. To date these comparisons have been implemented by an Independent Model Optimization (IMO) procedure which separately fits null and alternative models. This paper proposes a more computationally efficient procedure based on Savage-Dickey approximations to the Bayes factor, and Taylor-series approximations to the voxel-wise posterior covariance matrices. Simulations show the accuracy of this Savage-Dickey-Taylor (SDT) method to be comparable to that of IMO. Results on fMRI data show excellent agreement between SDT and IMO for second-level models, and reasonable agreement for first-level models. This Savage-Dickey test is a Bayesian analogue of the classical SPM-F and allows users to implement model comparison in a truly interactive manner. PMID:23533640
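To illustrate the Savage-Dickey idea itself (not the SPM implementation, which adds Taylor-series approximations), the Bayes factor for a normal mean with a conjugate normal prior is the ratio of posterior to prior density at zero; a minimal sketch assuming known unit variance:

```python
import numpy as np
from scipy import stats

def savage_dickey_bf01(x, prior_sd=1.0, sigma=1.0):
    """Savage-Dickey density ratio BF01 for H0: mu = 0 vs H1: mu ~ N(0, prior_sd^2),
    with data x_i ~ N(mu, sigma^2): BF01 = posterior density at 0 / prior density at 0."""
    n = len(x)
    post_var = 1.0 / (1.0 / prior_sd**2 + n / sigma**2)   # conjugate normal update
    post_mean = post_var * (np.sum(x) / sigma**2)
    post_at_0 = stats.norm.pdf(0.0, loc=post_mean, scale=np.sqrt(post_var))
    prior_at_0 = stats.norm.pdf(0.0, loc=0.0, scale=prior_sd)
    return post_at_0 / prior_at_0

rng = np.random.default_rng(0)
print(savage_dickey_bf01(rng.normal(0.3, 1.0, size=50)))  # < 1 favours the alternative
```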
Khana, Diba; Rossen, Lauren M; Hedegaard, Holly; Warner, Margaret
2018-01-01
Hierarchical Bayes models have been used in disease mapping to examine small-scale geographic variation. State-level geographic variation for less common causes of mortality has been reported; however, county-level variation is rarely examined. Due to concerns about statistical reliability and confidentiality, county-level mortality rates based on fewer than 20 deaths are suppressed under the statistical reliability criteria of the Division of Vital Statistics, National Center for Health Statistics (NCHS), precluding an examination of spatio-temporal variation in less common causes of mortality, such as suicide rates (SRs), at the county level using direct estimates. Existing Bayesian spatio-temporal modeling strategies can be applied via Integrated Nested Laplace Approximation (INLA) in R to a large number of rare mortality outcomes to enable examination of spatio-temporal variation on smaller geographic scales such as counties. This method allows examination of spatio-temporal variation across the entire U.S., even where the data are sparse. We used mortality data from 2005-2015 to explore spatio-temporal variation in SRs, as one particular application of the Bayesian spatio-temporal modeling strategy in R-INLA to predict year- and county-specific SRs. Specifically, hierarchical Bayesian spatio-temporal models were implemented with spatially structured and unstructured random effects, correlated time effects, time-varying confounders, and space-time interaction terms in the software R-INLA, borrowing strength across both counties and years to produce smoothed county-level SRs. Model-based estimates of SRs were mapped to explore geographic variation.
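A much-simplified, non-spatial analogue of such small-area smoothing (not R-INLA itself) is empirical-Bayes Gamma-Poisson shrinkage of sparse county counts; all numbers below are hypothetical:

```python
import numpy as np

# Gamma(alpha, rate=beta) prior on the rate, Poisson(pop * rate) likelihood:
# the posterior mean (alpha + deaths) / (beta + pop) shrinks sparse counties
# toward the prior rate. The paper's models add spatial and temporal structure.
deaths = np.array([0, 3, 1, 12, 2])            # hypothetical county counts
pop = np.array([8e3, 5e4, 2e4, 3e5, 6e4])      # person-years at risk
alpha, beta = 2.0, 2.0e4                       # prior mean rate = alpha/beta = 10 per 100k

post_rate = (alpha + deaths) / (beta + pop)    # posterior mean of each county's rate
print(post_rate * 1e5)                         # smoothed rates per 100,000
```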
Plant selection for ethnobotanical uses on the Amalfi Coast (Southern Italy).
Savo, V; Joy, R; Caneva, G; McClatchey, W C
2015-07-15
Many ethnobotanical studies have investigated selection criteria for medicinal and non-medicinal plants. In this paper we test several statistical methods using different ethnobotanical datasets in order to 1) define to what extent the nature of the datasets can affect the interpretation of results, and 2) determine whether the selection for different plant uses is based on phylogeny or on other selection criteria. We considered three different ethnobotanical datasets: two datasets of medicinal plants and a dataset of non-medicinal plants (handicraft production, domestic and agro-pastoral practices), and two floras of the Amalfi Coast. We performed residual analysis from linear regression, the binomial test and the Bayesian approach for calculating under-used and over-used plant families within ethnobotanical datasets. Percentages of agreement were calculated to compare the results of the analyses. We also analyzed the relationship between plant selection and phylogeny, chorology, life form and habitat using the chi-square test. Pearson's residuals for each of the significant chi-square analyses were examined to investigate alternative hypotheses of plant selection criteria. The results of the three statistical methods differed within the same dataset, and between datasets and floras, but with some similarities. In the two medicinal datasets, only Lamiaceae was identified in both floras as an over-used family by all three statistical methods. All statistical methods in one flora agreed that Malvaceae was over-used and Poaceae under-used, but this was not consistent with the results for the second flora, in which one statistical result was non-significant. All other families showed some discrepancy in significance across methods or floras. Significant over- or under-use was observed in only a minority of cases. The chi-square analyses were significant for phylogeny, life form and habitat. Pearson's residuals indicated a non-random selection of woody species for non-medicinal uses and an under-use of plants of temperate forests for medicinal uses. Our study showed that selection criteria for plant uses (including medicinal) are not always based on phylogeny. The comparison of different statistical methods (regression, binomial and Bayesian) under different conditions led to the conclusion that the most conservative results are obtained using regression analysis.
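The binomial test used here is straightforward to reproduce; a sketch with hypothetical counts, testing whether a family is over-represented among used plants relative to its share of the flora:

```python
from scipy.stats import binomtest

# Hypothetical counts: is Lamiaceae over-represented among medicinal species
# relative to its share of the local flora?
used_in_family, used_total = 18, 240        # used species in family / all used species
flora_in_family, flora_total = 25, 1100     # family size in flora / flora size
expected = flora_in_family / flora_total    # expected proportion under random selection

result = binomtest(used_in_family, used_total, expected, alternative='greater')
print(result.pvalue)                        # small p suggests over-use of the family
```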
A SAS Interface for Bayesian Analysis with WinBUGS
ERIC Educational Resources Information Center
Zhang, Zhiyong; McArdle, John J.; Wang, Lijuan; Hamagami, Fumiaki
2008-01-01
Bayesian methods are becoming very popular despite some practical difficulties in implementation. To assist in the practical application of Bayesian methods, we show how to implement Bayesian analysis with WinBUGS as part of a standard set of SAS routines. This implementation procedure is first illustrated by fitting a multiple regression model…
NASA Astrophysics Data System (ADS)
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and is almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness, we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the output of this simple non-linear SDE model shows a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with a growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem into the problem of simulating the dynamics of a statistical mechanics system, and give us access to the most sophisticated methods developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automated manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
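The forward model is easy to simulate even though inference is hard; a minimal Euler-Maruyama sketch of a linear-reservoir SDE with state-proportional noise (all parameter values hypothetical, and the Bayesian/HMC inference step is not shown):

```python
import numpy as np

rng = np.random.default_rng(42)
# Toy linear-reservoir SDE: dV = (r - V/k) dt + beta * V dW, discharge Q = V/k.
r, k, beta = 1.0, 10.0, 0.1        # constant input, retention time, noise scale
dt, n = 0.01, 50_000
V = np.empty(n)
V[0] = r * k                        # start at the deterministic steady state
for t in range(1, n):
    dW = rng.normal(0.0, np.sqrt(dt))
    V[t] = V[t-1] + (r - V[t-1] / k) * dt + beta * V[t-1] * dW

Q = V / k
skew = ((Q - Q.mean()) ** 3).mean() / Q.std() ** 3
print("discharge mean, std, skew:", Q.mean(), Q.std(), skew)  # skewed, fat-tailed output
```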
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Meta-analysis of the effect of natural frequencies on Bayesian reasoning.
McDowell, Michelle; Jacobs, Perke
2017-12-01
The natural frequency facilitation effect describes the finding that people are better able to solve descriptive Bayesian inference tasks when represented as joint frequencies obtained through natural sampling, known as natural frequencies, than as conditional probabilities. The present meta-analysis reviews 20 years of research seeking to address when, why, and for whom natural frequency formats are most effective. We review contributions from research associated with the 2 dominant theoretical perspectives, the ecological rationality framework and nested-sets theory, and test potential moderators of the effect. A systematic review of relevant literature yielded 35 articles representing 226 performance estimates. These estimates were statistically integrated using a bivariate mixed-effects model that yields summary estimates of average performances across the 2 formats and estimates of the effects of different study characteristics on performance. These study characteristics range from moderators representing individual characteristics (e.g., numeracy, expertise), to methodological differences (e.g., use of incentives, scoring criteria) and features of problem representation (e.g., short menu format, visual aid). Short menu formats (less computationally complex representations showing joint-events) and visual aids demonstrated some of the strongest moderation effects, improving performance for both conditional probability and natural frequency formats. A number of methodological factors (e.g., exposure to both problem formats) were also found to affect performance rates, emphasizing the importance of a systematic approach. We suggest how research on Bayesian reasoning can be strengthened by broadening the definition of successful Bayesian reasoning to incorporate choice and process and by applying different research methodologies. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
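The format manipulation is easy to see in a worked example; the classic mammography problem (textbook numbers, not from this meta-analysis) computed both ways:

```python
# Classic mammography problem (hypothetical textbook numbers) in both formats.
prevalence, sensitivity, false_pos = 0.01, 0.80, 0.096

# Conditional-probability (Bayes' rule) format:
p_pos = sensitivity * prevalence + false_pos * (1 - prevalence)
print(sensitivity * prevalence / p_pos)             # ~0.0776

# Natural-frequency format: imagine 1000 women sampled from the population.
n = 1000
sick = round(n * prevalence)                        # 10 women with the disease
sick_pos = round(sick * sensitivity)                # 8 of them test positive
healthy_pos = round((n - sick) * false_pos)         # 95 healthy positives
print(sick_pos / (sick_pos + healthy_pos))          # 8 / 103 ~ 0.0777
```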
UNIFORMLY MOST POWERFUL BAYESIAN TESTS
Johnson, Valen E.
2014-01-01
Uniformly most powerful tests are statistical hypothesis tests that provide the greatest power against a fixed null hypothesis among all tests of a given size. In this article, the notion of uniformly most powerful tests is extended to the Bayesian setting by defining uniformly most powerful Bayesian tests to be tests that maximize the probability that the Bayes factor, in favor of the alternative hypothesis, exceeds a specified threshold. Like their classical counterpart, uniformly most powerful Bayesian tests are most easily defined in one-parameter exponential family models, although extensions outside of this class are possible. The connection between uniformly most powerful tests and uniformly most powerful Bayesian tests can be used to provide an approximate calibration between p-values and Bayes factors. Finally, issues regarding the strong dependence of resulting Bayes factors and p-values on sample size are discussed. PMID:24659829
Natanegara, Fanni; Neuenschwander, Beat; Seaman, John W; Kinnersley, Nelson; Heilmann, Cory R; Ohlssen, David; Rochester, George
2014-01-01
Bayesian applications in medical product development have recently gained popularity. Despite many advances in Bayesian methodology and computations, increase in application across the various areas of medical product development has been modest. The DIA Bayesian Scientific Working Group (BSWG), which includes representatives from industry, regulatory agencies, and academia, has adopted the vision to ensure Bayesian methods are well understood, accepted more broadly, and appropriately utilized to improve decision making and enhance patient outcomes. As Bayesian applications in medical product development are wide ranging, several sub-teams were formed to focus on various topics such as patient safety, non-inferiority, prior specification, comparative effectiveness, joint modeling, program-wide decision making, analytical tools, and education. The focus of this paper is on the recent effort of the BSWG Education sub-team to administer a Bayesian survey to statisticians across 17 organizations involved in medical product development. We summarize results of this survey, from which we provide recommendations on how to accelerate progress in Bayesian applications throughout medical product development. The survey results support findings from the literature and provide additional insight on regulatory acceptance of Bayesian methods and information on the need for a Bayesian infrastructure within an organization. The survey findings support the claim that only modest progress in areas of education and implementation has been made recently, despite substantial progress in Bayesian statistical research and software availability. Copyright © 2013 John Wiley & Sons, Ltd.
The statistical analysis of circadian phase and amplitude in constant-routine core-temperature data
NASA Technical Reports Server (NTRS)
Brown, E. N.; Czeisler, C. A.
1992-01-01
Accurate estimation of the phases and amplitude of the endogenous circadian pacemaker from constant-routine core-temperature series is crucial for making inferences about the properties of the human biological clock from data collected under this protocol. This paper presents a set of statistical methods based on a harmonic-regression-plus-correlated-noise model for estimating the phases and the amplitude of the endogenous circadian pacemaker from constant-routine core-temperature data. The methods include a Bayesian Monte Carlo procedure for computing the uncertainty in these circadian functions. We illustrate the techniques with a detailed study of a single subject's core-temperature series and describe their relationship to other statistical methods for circadian data analysis. In our laboratory, these methods have been successfully used to analyze more than 300 constant routines and provide a highly reliable means of extracting phase and amplitude information from core-temperature data.
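The harmonic-regression core of such methods (without the correlated-noise model or the Bayesian Monte Carlo step described in the abstract) fits phase and amplitude by linear least squares on sine and cosine regressors; a minimal sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic core-temperature series: fixed period, known amplitude and phase.
period_h = 24.2
t = np.arange(0, 40, 1 / 6)                 # 40 h sampled every 10 min
true_amp, true_phase = 0.4, 2.0             # degrees C, radians
w = 2 * np.pi / period_h
y = 37.0 + true_amp * np.cos(w * t - true_phase) + rng.normal(0, 0.05, t.size)

# Harmonic regression: y = b0 + a*cos(wt) + b*sin(wt); amplitude and phase from (a, b).
X = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
b0, a, b = np.linalg.lstsq(X, y, rcond=None)[0]
print("amplitude:", np.hypot(a, b), "phase (rad):", np.arctan2(b, a))
```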
A stochastic approach to uncertainty quantification in residual moveout analysis
NASA Astrophysics Data System (ADS)
Johng-Ay, T.; Landa, E.; Dossou-Gbété, S.; Bordes, L.
2015-06-01
Oil and gas exploration and production usually rely on the interpretation of a single seismic image, which is obtained from observed data. However, the statistical nature of seismic data and the various approximations and assumptions are sources of uncertainties which may corrupt the evaluation of parameters. The quantification of these uncertainties is a major issue, and is meant to support decisions that have important social and commercial implications. Residual moveout analysis, which is an important step in seismic data processing, is usually performed by a deterministic approach. In this paper we discuss a Bayesian approach to the uncertainty analysis.
NASA Technical Reports Server (NTRS)
Buntine, Wray
1991-01-01
Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.
2014-01-01
Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
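A minimal sketch of the Q-profile idea in the plain meta-analysis setting (the paper's meta-regression version adjusts the degrees of freedom and uses a Newton-Raphson scheme); the data below are hypothetical:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

def q_profile_ci(y, v, level=0.95):
    """Q-profile confidence interval for the between-study variance tau^2 in a
    random-effects meta-analysis. y: study effect estimates, v: within-study variances."""
    k = len(y)
    def Q(tau2):
        w = 1.0 / (v + tau2)
        mu = np.sum(w * y) / np.sum(w)
        return np.sum(w * (y - mu) ** 2)
    # Q(tau2) decreases in tau2; invert it at chi-square quantiles with k-1 df.
    hi_target, lo_target = chi2.ppf([(1 + level) / 2, (1 - level) / 2], df=k - 1)
    def solve(target):
        if Q(0.0) <= target:
            return 0.0                      # bound truncated at zero
        return brentq(lambda t2: Q(t2) - target, 0.0, 1e4)
    return solve(hi_target), solve(lo_target)

y = np.array([0.10, 0.30, 0.35, 0.65, 0.45, 0.15])
v = np.array([0.03, 0.03, 0.05, 0.07, 0.03, 0.04])
print(q_profile_ci(y, v))
```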
Inverse problems and computational cell metabolic models: a statistical approach
NASA Astrophysics Data System (ADS)
Calvetti, D.; Somersalo, E.
2008-07-01
In this article, we give an overview of the Bayesian modelling of metabolic systems at the cellular and subcellular level. The models are based on detailed description of key biochemical reactions occurring in tissue, which may in turn be compartmentalized into cytosol and mitochondria, and of transports between the compartments. The classical deterministic approach which models metabolic systems as dynamical systems with Michaelis-Menten kinetics, is replaced by a stochastic extension where the model parameters are interpreted as random variables with an appropriate probability density. The inverse problem of cell metabolism in this setting consists of estimating the density of the model parameters. After discussing some possible approaches to solving the problem, we address the issue of how to assess the reliability of the predictions of a stochastic model by proposing an output analysis in terms of model uncertainties. Visualization modalities for organizing the large amount of information provided by the Bayesian dynamic sensitivity analysis are also illustrated.
Three Insights from a Bayesian Interpretation of the One-Sided "P" Value
ERIC Educational Resources Information Center
Marsman, Maarten; Wagenmakers, Eric-Jan
2017-01-01
P values have been critiqued on several grounds but remain entrenched as the dominant inferential method in the empirical sciences. In this article, we elaborate on the fact that in many statistical models, the one-sided "P" value has a direct Bayesian interpretation as the approximate posterior mass for values lower than zero. The…
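The correspondence is easy to verify for a normal model with a flat prior, where the posterior mass below zero equals the one-sided p value exactly (up to the z approximation used here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(0.25, 1.0, size=40)
se = x.std(ddof=1) / np.sqrt(len(x))

# Classical one-sided p value for H0: mu <= 0 (z approximation).
p_one_sided = stats.norm.sf(x.mean() / se)

# Posterior mass below zero under a flat prior: mu | x ~ N(mean, se^2).
posterior_mass_below_zero = stats.norm.cdf(0.0, loc=x.mean(), scale=se)

print(p_one_sided, posterior_mass_below_zero)   # identical
```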
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-08
... DEPARTMENT OF HEALTH AND HUMAN SERVICES Food and Drug Administration [Docket No. FDA-2006-D-0410] (formerly Docket No. 2006D-0191) Guidance for Industry and Food and Drug Administration; Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials; Availability AGENCY: Food and Drug...
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
ERIC Educational Resources Information Center
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models. It can be called from the command line, R, Python, Matlab, or Julia, and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Structure Learning in Bayesian Sensorimotor Integration
Genewein, Tim; Hez, Eduard; Razzaghpanah, Zeynab; Braun, Daniel A.
2015-01-01
Previous studies have shown that sensorimotor processing can often be described by Bayesian learning, in particular the integration of prior and feedback information depending on its degree of reliability. Here we test the hypothesis that the integration process itself can be tuned to the statistical structure of the environment. We exposed human participants to a reaching task in a three-dimensional virtual reality environment where we could displace the visual feedback of their hand position in a two dimensional plane. When introducing statistical structure between the two dimensions of the displacement, we found that over the course of several days participants adapted their feedback integration process in order to exploit this structure for performance improvement. In control experiments we found that this adaptation process critically depended on performance feedback and could not be induced by verbal instructions. Our results suggest that structural learning is an important meta-learning component of Bayesian sensorimotor integration. PMID:26305797
Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.
Houpt, Joseph W; Bittner, Jennifer L
2018-07-01
Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Toman, Blaza; Nelson, Michael A.; Lippa, Katrice A.
2016-10-01
Chemical purity assessment using quantitative 1H-nuclear magnetic resonance spectroscopy is a method based on ratio references of mass and signal intensity of the analyte species to that of chemical standards of known purity. As such, it is an example of a calculation using a known measurement equation with multiple inputs. Though multiple samples are often analyzed during purity evaluations in order to assess measurement repeatability, the uncertainty evaluation must also account for contributions from inputs to the measurement equation. Furthermore, there may be other uncertainty components inherent in the experimental design, such as independent implementation of multiple calibration standards. As such, the uncertainty evaluation is not purely bottom up (based on the measurement equation) or top down (based on the experimental design), but inherently contains elements of both. This hybrid form of uncertainty analysis is readily implemented with Bayesian statistical analysis. In this article we describe this type of analysis in detail and illustrate it using data from an evaluation of chemical purity and its uncertainty for a folic acid material.
Statistical Methods for Turbine Blade Dynamics
2008-09-30
Statistical methods are investigated for two vibration problems regarding a one-dimensional beam and a three-dimensional plate structure.
Modeling the Space Debris Environment with MASTER-2009 and ORDEM2010
NASA Technical Reports Server (NTRS)
Flegel, S.; Gelhaus, J.; Wiedemann, C.; Mockel, M.; Vorsmann, P.; Krisko, P.; Xu, Y. -L.; Horstman, M. F.; Opiela, J. N.; Matney, M.;
2010-01-01
Spacecraft analysis using ORDEM2010 uses a high-fidelity population model to compute risk to on-orbit assets. The ORDEM2010 GUI allows visualization of spacecraft flux in 2-D and 1-D. The population was produced using a Bayesian statistical approach with measured and modeled environment data. Validation for sizes < 1 mm was performed using Shuttle window and radiator impact measurements. Validation for sizes > 1 mm is ongoing.
A product Pearson-type VII density distribution
NASA Astrophysics Data System (ADS)
Nadarajah, Saralees; Kotz, Samuel
2008-01-01
The Pearson-type VII distributions (containing the Student's t distributions) are becoming increasingly prominent and are being considered as competitors to the normal distribution. Motivated by real examples in decision sciences, Bayesian statistics, probability theory and physics, a new Pearson-type VII distribution is introduced by taking the product of two Pearson-type VII pdfs. Various structural properties of this distribution are derived, including its cdf, moments, mean deviation about the mean, mean deviation about the median, entropy, asymptotic distribution of the extreme order statistics, maximum likelihood estimates and the Fisher information matrix. Finally, an application to a Bayesian testing problem is illustrated.
NASA Technical Reports Server (NTRS)
Hailperin, Max
1993-01-01
This thesis provides design and analysis of techniques for global load balancing on ensemble architectures running soft-real-time object-oriented applications with statistically periodic loads. It focuses on estimating the instantaneous average load over all the processing elements. The major contribution is the use of explicit stochastic process models for both the loading and the averaging itself. These models are exploited via statistical time-series analysis and Bayesian inference to provide improved average load estimates, and thus to facilitate global load balancing. This thesis explains the distributed algorithms used and provides some optimality results. It also describes the algorithms' implementation and gives performance results from simulation. These results show that our techniques allow more accurate estimation of the global system loading, resulting in fewer object migrations than with local methods. Our method is shown to provide superior performance, relative not only to static load-balancing schemes but also to many adaptive methods.
A new prior for bayesian anomaly detection: application to biosurveillance.
Shen, Y; Cooper, G F
2010-01-01
Bayesian anomaly detection computes posterior probabilities of anomalous events by combining prior beliefs and evidence from data. However, the specification of prior probabilities can be challenging. This paper describes a Bayesian prior in the context of disease outbreak detection. The goal is to provide a meaningful, easy-to-use prior that yields a posterior probability of an outbreak that performs at least as well as a standard frequentist approach. If this goal is achieved, the resulting posterior could be usefully incorporated into a decision analysis about how to act in light of a possible disease outbreak. This paper describes a Bayesian method for anomaly detection that combines learning from data with a semi-informative prior probability over patterns of anomalous events. A univariate version of the algorithm is presented here for ease of illustration of the essential ideas. The paper describes the algorithm in the context of disease-outbreak detection, but it is general and can be used in other anomaly detection applications. For this application, the semi-informative prior specifies that an increased count over baseline is expected for the variable being monitored, such as the number of respiratory chief complaints per day at a given emergency department. The semi-informative prior is derived from the baseline prior, which is estimated from historical data. The evaluation reported here used semi-synthetic data to evaluate the detection performance of the proposed Bayesian method and a control chart method, which is a standard frequentist algorithm that is closest to the Bayesian method in terms of the type of data it uses. The disease-outbreak detection performance of the Bayesian method was statistically significantly better than that of the control chart method when proper baseline periods were used to estimate the baseline behavior to avoid seasonal effects. When using longer baseline periods, the Bayesian method performed as well as the control chart method. The time complexity of the Bayesian algorithm is linear in the number of observed events being monitored, due to a novel, closed-form derivation that is introduced in the paper. This paper introduces a novel prior probability for Bayesian outbreak detection that is expressive, easy to apply, computationally efficient, and performs as well as or better than a standard frequentist method.
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.
Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa
2017-01-01
TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th out of the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes a step further by incorporating past knowledge into the model, using a Bayesian approach with informative priors. The Bayesian approach is gaining popularity in data analysis, but most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted under the classical approach and under Bayesian approaches with both non-informative and informative priors, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative priors, the South Africa General Household Survey datasets for the years 2011 to 2013 are used to construct priors for the 2014 model.
Violent crime in San Antonio, Texas: an application of spatial epidemiological methods.
Sparks, Corey S
2011-12-01
Violent crimes are rarely considered a public health problem or investigated using epidemiological methods. But patterns of violent crime and other health conditions are often affected by similar characteristics of the built environment. In this paper, methods and perspectives from spatial epidemiology are used in an analysis of violent crimes in San Antonio, TX. Bayesian statistical methods are used to examine the contextual influence of several aspects of the built environment. Additionally, spatial regression models using Bayesian model specifications are used to examine spatial patterns of violent crime risk. Results indicate that the determinants of violent crime depend on the model specification, but are primarily related to the built environment and neighborhood socioeconomic conditions. Results are discussed within the context of a rapidly growing urban area with a diverse population. Copyright © 2011 Elsevier Ltd. All rights reserved.
A Method of Face Detection with Bayesian Probability
NASA Astrophysics Data System (ADS)
Sarker, Goutam
2010-10-01
The objective of face detection is to identify all images which contain a face, irrespective of its orientation, illumination conditions, etc. This is a hard problem, because faces are highly variable in size, shape, lighting conditions, etc. Many methods have been designed and developed to detect faces in a single image. The present paper is based on one `Appearance Based Method' which relies on learning the facial and non-facial features from image examples. This in turn is based on statistical analysis of examples and counter-examples of facial images, and employs a Bayesian conditional classification rule to estimate the probability that an image frame contains a face (or non-face). The detection rate of the present system is very high, and the numbers of false positive and false negative detections are correspondingly low.
Integrating ecosystem studies: A Bayesian comparison of hypotheses
Adkison, Milo D.; Ballachey, Brenda E.; Bodkin, James L.; Holland-Bartels, Leslie E.; Funk, F.; Quinn, T.J.; Heifetz, J.; Ianelli, J.N.; Powers, J.E.; Schweigert, J.F.; Sullivan, P.J.; Zhang, C.-I.
1998-01-01
Ecosystem studies are difficult to interpret because of the complexity and number of pathways that may affect a phenomenon of interest. It is not possible to study all aspects of a problem; thus subjective judgment is required to weigh what has been observed in the context of components that were not studied but may have been important. This subjective judgment is usually a poorly documented and ad hoc addendum to a statistical analysis of the data. We present a Bayesian methodology for documenting, quantifying, and incorporating these necessary subjective elements into an ecosystem study. The end product of this methodology is the probability of each of the competing hypotheses. As an example, this method is applied to an ecosystem study designed to discriminate among competing hypotheses for a low abundance of sea otters at a previously oiled site in Prince William Sound, Alaska.
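The end product, posterior probabilities of competing hypotheses, follows from Bayes' rule once priors and data likelihoods are quantified; a minimal sketch with invented numbers loosely echoing the sea otter example:

```python
import numpy as np

# Hypothetical illustration: three competing hypotheses for low sea otter
# abundance, with prior weights from documented expert judgment and the
# likelihood of the observed data under each hypothesis.
priors = np.array([0.40, 0.35, 0.25])        # e.g. food limitation, lingering oil, emigration
likelihoods = np.array([0.02, 0.10, 0.01])   # P(data | hypothesis), from component studies

posterior = priors * likelihoods
posterior /= posterior.sum()
print(posterior)                              # probability of each hypothesis given the data
```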
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey B.; Raymond, C.; Smrekar, S.; Millbury, C.
2004-01-01
This viewgraph presentation reviews a Bayesian approach to the inversion of gravity and magnetic data, with specific application to the Ismenius area of Mars. Many inverse problems encountered in geophysics and planetary science are well known to be non-unique (e.g., the inversion of gravity data for the density structure of a body). In hopes of reducing the non-uniqueness of solutions, there has been interest in the joint analysis of data. An example is the joint inversion of gravity and magnetic data, under the assumption that the same physical anomalies generate both the observed magnetic and gravitational anomalies. In this talk, we formulate the joint analysis of different types of data in a Bayesian framework and apply the formalism to the inference of the density and remanent magnetization structure for a local region in the Ismenius area of Mars. The Bayesian approach allows prior information or constraints on the solutions to be incorporated in the inversion, with the "best" solutions those whose forward predictions most closely match the data while remaining consistent with the assumed constraints. The application of this framework to the inversion of gravity and magnetic data on Mars reveals two typical challenges: the forward predictions of the data depend linearly on some of the quantities of interest and non-linearly on others (termed the "linear" and "non-linear" variables, respectively). For observations with Gaussian noise, a Bayesian approach to inversion for "linear" variables reduces to a linear filtering problem, with an explicitly computable "error" matrix. However, for models whose forward predictions have non-linear dependencies, inference is no longer given by such a simple linear problem, and moreover, the uncertainty in the solution is no longer completely specified by a computable "error" matrix. It is therefore important to develop methods for sampling from the full Bayesian posterior to provide a complete and statistically consistent picture of model uncertainty and what has been learned from observations. We will discuss advanced numerical techniques for this purpose, including Markov chain Monte Carlo sampling.
On Some Assumptions of the Null Hypothesis Statistical Testing
ERIC Educational Resources Information Center
Patriota, Alexandre Galvão
2017-01-01
Bayesian and classical statistical approaches are based on different types of logical principles. In order to avoid mistaken inferences and misguided interpretations, the practitioner must respect the inference rules embedded into each statistical method. Ignoring these principles leads to the paradoxical conclusions that the hypothesis…
Howard Stauffer; Nadav Nur
2005-01-01
The papers included in the Advances in Statistics section of the Partners in Flight (PIF) 2002 Proceedings represent a small sample of statistical topics of current importance to Partners In Flight research scientists: hierarchical modeling, estimation of detection probabilities, and Bayesian applications. Sauer et al. (this volume) examines a hierarchical model...
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
A default Bayesian hypothesis test for mediation.
Nuijten, Michèle B; Wetzels, Ruud; Matzke, Dora; Dolan, Conor V; Wagenmakers, Eric-Jan
2015-03-01
In order to quantify the relationship between multiple variables, researchers often carry out a mediation analysis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estimation and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301-322, 2009), who outlined a Bayesian parameter estimation procedure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the Jeffreys-Zellner-Siow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo techniques implemented in JAGS. All Bayesian tests are implemented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014).
Tutorial: Asteroseismic Data Analysis with DIAMONDS
NASA Astrophysics Data System (ADS)
Corsaro, Enrico
Since the advent of space-based photometric missions such as CoRoT and NASA's Kepler, asteroseismology has acquired a central role in our understanding of stellar physics. The Kepler spacecraft, especially, is still releasing excellent photometric observations that contain a large amount of information not yet investigated. To exploit the full potential of these data, sophisticated and robust analysis tools are now essential, so that further constraints on stellar structure and evolutionary models can be obtained. In addition, extracting detailed asteroseismic properties for many stars can yield new insights on their correlations to fundamental stellar properties and dynamics. After a brief introduction to the Bayesian notion of probability, I describe the code Diamonds for Bayesian parameter estimation and model comparison by means of the nested sampling Monte Carlo (NSMC) algorithm. NSMC constitutes an efficient and powerful method, in replacement of standard Markov chain Monte Carlo, very suitable for the high-dimensional and multimodal problems that are typical of detailed asteroseismic analyses, such as the fitting and mode identification of individual oscillation modes in stars (known as peak-bagging). Diamonds is able to provide robust results for statistical inferences involving tens of individual oscillation modes, while at the same time preserving considerable computational efficiency for identifying the solution. In the tutorial, I will present the fitting of the stellar background signal and the peak-bagging analysis of the oscillation modes in a red-giant star, providing an example of using Bayesian evidence to assess the significance of the fitted oscillation peaks.
Risk analysis of emergent water pollution accidents based on a Bayesian Network.
Tang, Caihong; Yi, Yujun; Yang, Zhifeng; Sun, Jie
2016-01-01
To guarantee the security of water quality in water transfer channels, especially in open channels, analysis of potential emergent pollution sources in the water transfer process is critical. It is also indispensable for forewarning and protection against emergent pollution accidents. Bridges above open channels with large amounts of truck traffic are the main locations where emergent accidents could occur. A Bayesian Network model, which consists of six root nodes and three middle-layer nodes, was developed in this paper and employed to identify the possibility of potential pollution risk. Dianbei Bridge is reviewed as a typical bridge on an open channel of the Middle Route of the South to North Water Transfer Project where emergent traffic accidents could occur. This study focuses on the risk of water pollution caused by leakage of pollutants into the water. The risk of potential traffic accidents at the Dianbei Bridge implies a risk of water pollution in the canal. Based on survey data, statistical analysis, and domain specialist knowledge, a Bayesian Network model was established. The human factor in emergent accidents has been considered in this model. Additionally, this model has been employed to describe the probability of accidents and the risk level, and the sensitive causes of pollution accidents have been deduced. The case in which sensitive factors are in the state most likely to lead to accidents has also been simulated. Copyright © 2015 Elsevier Ltd. All rights reserved.
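A minimal sketch of the idea, reduced to a three-node chain rather than the paper's six-root-node network, with all probabilities hypothetical:

```python
# Three-node chain TruckAccident -> Leakage -> WaterPollution, evaluated by the
# chain rule. A full Bayesian Network would combine several such root causes.
p_accident = 1e-3                 # truck accident on the bridge, per crossing
p_leak_given_accident = 0.30      # hazardous cargo leaks into the open channel
p_pollution_given_leak = 0.80     # water quality standard is exceeded

p_pollution = p_accident * p_leak_given_accident * p_pollution_given_leak
print(p_pollution)                # 2.4e-4 per crossing
```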
NASA Astrophysics Data System (ADS)
Gogu, C.; Haftka, R.; LeRiche, R.; Molimard, J.; Vautrin, A.; Sankar, B.
2008-11-01
The basic formulation of the least squares method, based on the L2 norm of the misfit, is still widely used today for identifying elastic material properties from experimental data. An alternative statistical approach is the Bayesian method. We seek here situations with significant difference between the material properties found by the two methods. For a simple three bar truss example we illustrate three such situations in which the Bayesian approach leads to more accurate results: different magnitude of the measurements, different uncertainty in the measurements and correlation among measurements. When all three effects add up, the Bayesian approach can have a large advantage. We then compared the two methods for identification of elastic constants from plate vibration natural frequencies.
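One of the three situations, correlated measurements, is easy to illustrate: a sketch comparing plain least squares with a Gaussian-posterior (generalized least squares) estimate under correlated noise, with all values synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy linear identification problem y = X @ theta + noise with AR(1)-like
# correlated noise: one situation where the Bayesian estimate can differ
# appreciably from the plain L2 fit.
X = rng.normal(size=(30, 2))
theta_true = np.array([2.0, -1.0])
lags = np.abs(np.subtract.outer(np.arange(30), np.arange(30)))
Sigma = 0.2 * 0.9 ** lags                              # noise covariance
y = X @ theta_true + rng.multivariate_normal(np.zeros(30), Sigma)

ols = np.linalg.lstsq(X, y, rcond=None)[0]             # basic least squares
W = np.linalg.inv(Sigma)                               # noise precision
prior_prec = np.eye(2) / 10.0                          # weak Gaussian prior on theta
post_cov = np.linalg.inv(X.T @ W @ X + prior_prec)     # Gaussian posterior covariance
post_mean = post_cov @ (X.T @ W @ y)
print("OLS:", ols, "posterior mean:", post_mean)
```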
NASA Astrophysics Data System (ADS)
Granade, Christopher; Combes, Joshua; Cory, D. G.
2016-03-01
In recent years, Bayesian methods have been proposed as a solution to a wide range of issues in quantum state and process tomography. State-of-the-art Bayesian tomography solutions suffer from three problems: numerical intractability, a lack of informative prior distributions, and an inability to track time-dependent processes. Here, we address all three problems. First, we use modern statistical methods, as pioneered by Huszár and Houlsby (2012 Phys. Rev. A 85 052120) and by Ferrie (2014 New J. Phys. 16 093035), to make Bayesian tomography numerically tractable. Our approach allows for practical computation of Bayesian point and region estimators for quantum states and channels. Second, we propose the first priors on quantum states and channels that allow for including useful experimental insight. Finally, we develop a method that allows tracking of time-dependent states and estimates the drift and diffusion processes affecting a state. We provide source code and animated visual examples for our methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Le; Timbie, Peter T.; Bunn, Emory F.
In this paper, we present a new Bayesian semi-blind approach for foreground removal in observations of the 21 cm signal measured by interferometers. The technique, which we call H I Expectation-Maximization Independent Component Analysis (HIEMICA), is an extension of the Independent Component Analysis technique developed for two-dimensional (2D) cosmic microwave background maps to three-dimensional (3D) 21 cm cosmological signals measured by interferometers. This technique provides a fully Bayesian inference of power spectra and maps and separates the foregrounds from the signal based on the diversity of their power spectra. Relying only on the statistical independence of the components, this approach can jointly estimate the 3D power spectrum of the 21 cm signal, as well as the 2D angular power spectrum and the frequency dependence of each foreground component, without any prior assumptions about the foregrounds. This approach has been tested extensively by applying it to mock data from interferometric 21 cm intensity mapping observations under idealized assumptions of instrumental effects. We also discuss the impact when the noise properties are not known completely. As a first step toward solving the 21 cm power spectrum analysis problem, we compare the semi-blind HIEMICA technique to the commonly used Principal Component Analysis. Under the same idealized circumstances, the proposed technique provides significantly improved recovery of the power spectrum. This technique can be applied in a straightforward manner to all 21 cm interferometric observations, including epoch of reionization measurements, and can be extended to single-dish observations as well.
Bayesian analysis of the astrobiological implications of life’s early emergence on Earth
Spiegel, David S.; Turner, Edwin L.
2012-01-01
Life arose on Earth sometime in the first few hundred million years after the young planet had cooled to the point that it could support water-based organisms on its surface. The early emergence of life on Earth has been taken as evidence that the probability of abiogenesis is high, if starting from young Earth-like conditions. We revisit this argument quantitatively in a Bayesian statistical framework. By constructing a simple model of the probability of abiogenesis, we calculate a Bayesian estimate of its posterior probability, given the data that life emerged fairly early in Earth’s history and that, billions of years later, curious creatures noted this fact and considered its implications. We find that, given only this very limited empirical information, the choice of Bayesian prior for the abiogenesis probability parameter has a dominant influence on the computed posterior probability. Although terrestrial life's early emergence provides evidence that life might be abundant in the universe if early-Earth-like conditions are common, the evidence is inconclusive and indeed is consistent with an arbitrarily low intrinsic probability of abiogenesis for plausible uninformative priors. Finding a single case of life arising independently of our lineage (on Earth, elsewhere in the solar system, or on an extrasolar planet) would provide much stronger evidence that abiogenesis is not extremely rare in the universe. PMID:22198766
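The prior sensitivity has a simple quantitative core, sketched below under an assumed Poisson model of abiogenesis: once one conditions on life having appeared in time for observers to evolve, the likelihood varies only weakly with the abiogenesis rate, so the posterior largely mirrors the prior. The time scales and the two priors are illustrative simplifications, not the paper's exact model.

```python
# Sketch of why the prior dominates: conditioning on the fact that observers
# eventually evolved (life had to appear by some time t_req) flattens the
# likelihood of "life appeared early" as a function of the rate lam.
import numpy as np

t_early, t_req = 0.5, 1.5        # Gyr: assumed emergence deadline values
lam = np.logspace(-3, 2, 1000)   # grid of abiogenesis rates (events per Gyr)

# P(life by t_early | lam, life by t_req): tends to t_early/t_req as lam -> 0
# and to 1 as lam -> infinity, so the data only weakly constrain lam.
like = (1 - np.exp(-lam * t_early)) / (1 - np.exp(-lam * t_req))

for name, prior in [("uniform in lam", np.ones_like(lam)),
                    ("log-uniform in lam", 1.0 / lam)]:
    post = like * prior
    post /= np.trapz(post, lam)                       # normalize on the grid
    p_low = np.trapz(post[lam < 0.1], lam[lam < 0.1]) # mass on "rare abiogenesis"
    print(f"{name:18s}: P(lam < 0.1 | data) = {p_low:.3f}")
```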
A Bayesian bird's eye view of ‘Replications of important results in social psychology’
Schönbrodt, Felix D.; Yao, Yuling; Gelman, Andrew; Wagenmakers, Eric-Jan
2017-01-01
We applied three Bayesian methods to reanalyse the preregistered contributions to the Social Psychology special issue ‘Replications of Important Results in Social Psychology’ (Nosek & Lakens. 2014 Registered reports: a method to increase the credibility of published results. Soc. Psychol. 45, 137–141. (doi:10.1027/1864-9335/a000192)). First, individual-experiment Bayesian parameter estimation revealed that for directed effect size measures, only three out of 44 central 95% credible intervals did not overlap with zero and fell in the expected direction. For undirected effect size measures, only four out of 59 credible intervals contained values greater than 0.10 (10% of variance explained) and only 19 intervals contained values larger than 0.05. Second, a Bayesian random-effects meta-analysis for all 38 t-tests showed that only one out of the 38 hierarchically estimated credible intervals did not overlap with zero and fell in the expected direction. Third, a Bayes factor hypothesis test was used to quantify the evidence for the null hypothesis against a default one-sided alternative. Only seven out of 60 Bayes factors indicated non-anecdotal support in favour of the alternative hypothesis (BF10>3), whereas 51 Bayes factors indicated at least some support for the null hypothesis. We hope that future analyses of replication success will embrace a more inclusive statistical approach by adopting a wider range of complementary techniques. PMID:28280547
Garrard, Lili; Price, Larry R.; Bott, Marjorie J.; Gajewski, Byron J.
2016-01-01
Item response theory (IRT) models provide an appropriate alternative to the classical ordinal confirmatory factor analysis (CFA) during the development of patient-reported outcome measures (PROMs). Current literature has identified the assessment of IRT model fit as both challenging and underdeveloped (Sinharay & Johnson, 2003; Sinharay, Johnson, & Stern, 2006). This study evaluates the performance of Ordinal Bayesian Instrument Development (OBID), a Bayesian IRT model with a probit link function approach, through applications in two breast cancer-related instrument development studies. The primary focus is to investigate an appropriate method for comparing Bayesian IRT models in PROMs development. An exact Bayesian leave-one-out cross-validation (LOO-CV) approach (Vehtari & Lampinen, 2002) is implemented to assess prior selection for the item discrimination parameter in the IRT model and subject content experts’ bias (in a statistical sense and not to be confused with psychometric bias as in differential item functioning) toward the estimation of item-to-domain correlations. Results support the utilization of content subject experts’ information in establishing evidence for construct validity when sample size is small. However, the incorporation of subject experts’ content information in the OBID approach can be sensitive to the level of expertise of the recruited experts. More stringent efforts need to be invested in the appropriate selection of subject experts to efficiently use the OBID approach and reduce potential bias during PROMs development. PMID:27667878
Bayesian depth estimation from monocular natural images.
Su, Che-Chun; Cormack, Lawrence K; Bovik, Alan C
2017-05-01
Estimating an accurate and naturalistic dense depth map from a single monocular photographic image is a difficult problem. Nevertheless, human observers have little difficulty understanding the depth structure implied by photographs. Two-dimensional (2D) images of the real-world environment contain significant statistical information regarding the three-dimensional (3D) structure of the world that the vision system likely exploits to compute perceived depth, monocularly as well as binocularly. Toward understanding how this might be accomplished, we propose a Bayesian model of monocular depth computation that recovers detailed 3D scene structures by extracting reliable, robust, depth-sensitive statistical features from single natural images. These features are derived using well-accepted univariate natural scene statistics (NSS) models and recent bivariate/correlation NSS models that describe the relationships between 2D photographic images and their associated depth maps. This is accomplished by building a dictionary of canonical local depth patterns from which NSS features are extracted as prior information. The dictionary is used to create a multivariate Gaussian mixture (MGM) likelihood model that associates local image features with depth patterns. A simple Bayesian predictor is then used to form spatial depth estimates. The depth results produced by the model, despite its simplicity, correlate well with ground-truth depths measured by a current-generation terrestrial light detection and ranging (LIDAR) scanner. Such a strong form of statistical depth information could be used by the visual system when creating overall estimated depth maps incorporating stereopsis, accommodation, and other conditions. Indeed, even in isolation, the Bayesian predictor delivers depth estimates that are competitive with state-of-the-art "computer vision" methods that utilize highly engineered image features and sophisticated machine learning algorithms.
ERIC Educational Resources Information Center
Vrieze, Scott I.
2012-01-01
This article reviews the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in model selection and the appraisal of psychological theory. The focus is on latent variable models, given their growing use in theory testing and construction. Theoretical statistical results in regression are discussed, and more important…
Howard B. Stauffer; Cynthia J. Zabel; Jeffrey R. Dunk
2005-01-01
We compared a set of competing logistic regression habitat selection models for Northern Spotted Owls (Strix occidentalis caurina) in California. The habitat selection models were estimated, compared, evaluated, and tested using multiple sample datasets collected on federal forestlands in northern California. We used Bayesian methods in interpreting...
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring
Carlos Carroll; Devin S. Johnson; Jeffrey R. Dunk; William J. Zielinski
2010-01-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and...
NASA Astrophysics Data System (ADS)
Gaál, Ladislav; Szolgay, Ján; Kohnová, Silvia; Hlavčová, Kamila; Viglione, Alberto
2010-01-01
The paper deals with at-site flood frequency estimation in the case when information on past hydrological events of extraordinary magnitude is also available. For the joint frequency analysis of systematic observations and historical data, the Bayesian framework is chosen, which, through adequately defined likelihood functions, allows for the incorporation of different sources of hydrological information, e.g., maximum annual flood peaks, historical events, as well as measurement errors. The distribution of the parameters of the fitted distribution function and the confidence intervals of the flood quantiles are derived by means of the Markov chain Monte Carlo (MCMC) simulation technique. The paper presents a sensitivity analysis related to the choice of the most influential parameters of the statistical model, which are the length of the historical period
Selecting Summary Statistics in Approximate Bayesian Computation for Calibrating Stochastic Models
Burr, Tom
2013-01-01
Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. ABC is becoming the “go-to” option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example. PMID:24288668
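A minimal rejection-ABC sketch makes the point about summary-statistic choice, using a toy Normal model instead of the mtDNA model: a sufficient statistic (the sample mean) yields a tight posterior for the location parameter, while an uninformative one (the sample variance) essentially returns the prior. All settings below are illustrative.

```python
# Rejection ABC: draw parameters from the prior, simulate data, and accept the
# draws whose chosen summary statistic lands near the observed summary.
import numpy as np

rng = np.random.default_rng(1)
n, mu_true = 50, 1.3
obs = rng.normal(mu_true, 1.0, size=n)

def abc(summary, eps, trials=100_000):
    s_obs = summary(obs)
    mu = rng.normal(0.0, 5.0, size=trials)             # draws from the prior
    sims = rng.normal(mu[:, None], 1.0, size=(trials, n))
    s_sim = summary(sims)
    return mu[np.abs(s_sim - s_obs) < eps]             # accepted parameter values

mean_stat = lambda x: np.mean(x, axis=-1)              # sufficient for mu
var_stat = lambda x: np.var(x, axis=-1)                # carries ~no info about mu

for name, stat in [("sample mean", mean_stat), ("sample variance", var_stat)]:
    post = abc(stat, eps=0.05)
    print(f"{name:16s} accepted={post.size:6d}  "
          f"post mean={post.mean():.2f}  post sd={post.std():.2f}")
```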
Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank
2018-01-01
In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error rates. Using data simulation to estimate the error rates of NHST results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0 hypothesis to a statistical H1 verification. Not only do RPS results feature significantly lower error rates than NHST results, RPS also addresses key deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability crisis. PMID:29740363
A Bayesian approach for parameter estimation and prediction using a computationally intensive model
Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...
2015-02-05
Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model η(θ), where θ denotes the uncertain, best input setting. Hence the statistical model is of the form y = η(θ) + ε, where ε accounts for measurement, and possibly other, error sources. When nonlinearity is present in η(·), the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model η(·). This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.
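The essential mechanics, sketched below under simplified assumptions, are: run the expensive model a handful of times on a design, fit a cheap emulator to those runs, and let MCMC query only the emulator. A cubic polynomial stands in for the Gaussian-process emulators typically used; the toy model and noise level are not the nuclear-physics application.

```python
# Emulator-based Bayesian calibration: replace an "expensive" model eta(theta)
# by a cheap response surface fitted to an ensemble of runs, then sample the
# posterior of theta with random-walk Metropolis using only the emulator.
import numpy as np

rng = np.random.default_rng(2)

def eta(theta):                      # pretend this takes hours per evaluation
    return np.sin(theta) + 0.5 * theta

theta_true, sigma = 1.2, 0.05
y_obs = eta(theta_true) + rng.normal(0, sigma)

# Small ensemble of model runs, then a cheap polynomial emulator.
design = np.linspace(0.0, 2.0, 8)
runs = eta(design)
coef = np.polyfit(design, runs, deg=3)
emulator = lambda t: np.polyval(coef, t)

def log_post(t):                     # flat prior on [0, 2], Gaussian likelihood
    if not 0.0 <= t <= 2.0:
        return -np.inf
    return -0.5 * ((y_obs - emulator(t)) / sigma) ** 2

t, lp, chain = 1.0, log_post(1.0), []
for _ in range(20_000):
    prop = t + rng.normal(0, 0.1)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        t, lp = prop, lp_prop
    chain.append(t)
post = np.array(chain[5_000:])
print(f"posterior mean {post.mean():.3f} +/- {post.std():.3f} (true {theta_true})")
```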
NASA Technical Reports Server (NTRS)
Hailperin, M.
1993-01-01
This thesis provides design and analysis of techniques for global load balancing on ensemble architectures running soft-real-time object-oriented applications with statistically periodic loads. It focuses on estimating the instantaneous average load over all the processing elements. The major contribution is the use of explicit stochastic process models for both the loading and the averaging itself. These models are exploited via statistical time-series analysis and Bayesian inference to provide improved average load estimates, and thus to facilitate global load balancing. This thesis explains the distributed algorithms used and provides some optimality results. It also describes the algorithms' implementation and gives performance results from simulation. These results show that the authors' techniques allow more accurate estimation of the global system loading, resulting in fewer object migrations than local methods. The authors' method is shown to provide superior performance, relative not only to static load-balancing schemes but also to many adaptive load-balancing methods. Results from a preliminary analysis of another system and from simulation with a synthetic load provide some evidence of more general applicability.
MacLean, Adam L; Harrington, Heather A; Stumpf, Michael P H; Byrne, Helen M
2016-01-01
The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical systems approaches such as nondimensionalization, steady state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine.
NASA Astrophysics Data System (ADS)
Bakoban, Rana A.
2017-08-01
The coefficient of variation (CV) has several applications in applied statistics. In this paper, we adopt Bayesian and non-Bayesian approaches for the estimation of the CV under type-II censored data from the extension exponential distribution (EED). Point and interval estimates of the CV are obtained using both maximum likelihood and parametric bootstrap techniques. A Bayesian approach based on an MCMC method is also presented. A real data set is presented and analyzed, and the results are used to assess the theoretical findings.
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC, based on using a linear regression to approximate the posterior distribution of the parameters conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: (1) the code is standalone and fully documented; (2) the program will automatically process multiple data sets and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets; (3) the program implements two different transformation methods for the regression step; (4) analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails; (5) the program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and is therefore a general tool for processing the results from any simulation; (6) the code is open-source and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and of testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
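The regression-adjustment step at the heart of this approach (in the style of Beaumont-type local-linear ABC) can be sketched in a few lines on a toy Normal model; the prior, summaries, and acceptance quantile below are illustrative, and this is not the ABCreg code itself.

```python
# Local-linear regression adjustment for ABC: keep simulations whose summaries
# fall closest to the observed ones, regress parameters on summaries, and shift
# each accepted draw to what it "would have been" at the observed summary.
import numpy as np

rng = np.random.default_rng(3)
n = 30
obs = rng.normal(1.0, 1.0, size=n)
s_obs = np.array([obs.mean(), obs.std()])

trials = 50_000
theta = rng.normal(0.0, 3.0, size=trials)               # prior draws for mu
sims = rng.normal(theta[:, None], 1.0, size=(trials, n))
s = np.column_stack([sims.mean(axis=1), sims.std(axis=1)])

# Keep the 1% of simulations nearest the observed summaries.
d = np.linalg.norm(s - s_obs, axis=1)
keep = d <= np.quantile(d, 0.01)
th_k, s_k = theta[keep], s[keep]

# Regress theta on the summaries, then adjust to the observed summary.
X = np.column_stack([np.ones(keep.sum()), s_k - s_obs])
beta, *_ = np.linalg.lstsq(X, th_k, rcond=None)
th_adj = th_k - (s_k - s_obs) @ beta[1:]

print(f"rejection-only posterior: {th_k.mean():.3f} sd {th_k.std():.3f}")
print(f"regression-adjusted:      {th_adj.mean():.3f} sd {th_adj.std():.3f}")
```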
Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W
2006-03-01
Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.
Bayesian Hierarchical Random Effects Models in Forensic Science.
Aitken, Colin G G
2018-01-01
Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now become sufficiently well-developed and widespread that it is timely to try to provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios) was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists worldwide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package, and references to SAILR are made as appropriate.
Bayesian Tracking of Emerging Epidemics Using Ensemble Optimal Statistical Interpolation
Cobb, Loren; Krishnamurthy, Ashok; Mandel, Jan; Beezley, Jonathan D.
2014-01-01
We present a preliminary test of the Ensemble Optimal Statistical Interpolation (EnOSI) method for the statistical tracking of an emerging epidemic, with a comparison to its popular relative for Bayesian data assimilation, the Ensemble Kalman Filter (EnKF). The spatial data for this test was generated by a spatial susceptible-infectious-removed (S-I-R) epidemic model of an airborne infectious disease. Both tracking methods in this test employed Poisson rather than Gaussian noise, so as to handle epidemic data more accurately. The EnOSI and EnKF tracking methods worked well on the main body of the simulated spatial epidemic, but the EnOSI was able to detect and track a distant secondary focus of infection that the EnKF missed entirely. PMID:25113590
NASA Astrophysics Data System (ADS)
Kumari, K.; Oberheide, J.
2017-12-01
Nonmigrating tidal diagnostics of SABER temperature observations in the ionospheric dynamo region reveal a large amount of variability on time scales of a few days to weeks. In this paper, we discuss the physical reasons for the observed short-term tidal variability using a novel approach based on information theory and Bayesian statistics. We diagnose short-term tidal variability as a function of season, QBO, ENSO, solar cycle, and other drivers using time-dependent probability density functions, Shannon entropy, and Kullback-Leibler divergence. The statistical significance of the approach and its predictive capability are exemplified using SABER tidal diagnostics, with emphasis on the responses to the QBO and solar cycle. Implications for F-region plasma density will be discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Host, Ole; Lahav, Ofer; Abdalla, Filipe B.
We present a showcase for deriving bounds on the neutrino masses from laboratory experiments and cosmological observations. We compare the frequentist and Bayesian bounds on the effective electron neutrino mass m_β which the KATRIN neutrino mass experiment is expected to obtain, using both an analytical likelihood function and Monte Carlo simulations of KATRIN. Assuming a uniform prior in m_β, we find that a null result yields an upper bound of about 0.17 eV at 90% confidence in the Bayesian analysis, to be compared with the frequentist KATRIN reference value of 0.20 eV. This is a significant difference when judged relative to the systematic and statistical uncertainties of the experiment. On the other hand, an input m_β = 0.35 eV, which is the KATRIN 5σ detection threshold, would be detected at virtually the same level. Finally, we combine the simulated KATRIN results with cosmological data in the form of present (post-WMAP) and future (simulated Planck) observations. If an input of m_β = 0.2 eV is assumed in our simulations, KATRIN alone excludes a zero neutrino mass at 2.2σ. Adding Planck data increases the probability of detection to a median 2.7σ. The analysis highlights the importance of combining cosmological and laboratory data on an equal footing.
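The Bayesian side of such a bound can be sketched directly: KATRIN-type experiments effectively measure m_β² with roughly Gaussian errors, and a uniform prior in m_β ≥ 0 turns a null result into an upper credible bound by integrating the posterior. The error scale below is an assumed round number, not the experiment's actual systematic budget.

```python
# Bayesian 90% upper bound on a neutrino mass from a Gaussian measurement of
# m^2 and a flat prior in m >= 0, by direct numerical integration on a grid.
import numpy as np

sigma = 0.02            # assumed std. dev. of the m^2 estimate, in eV^2
m2_hat = 0.0            # a "null result": best-fit m^2 at zero

m = np.linspace(0.0, 1.0, 20_000)                       # mass grid, eV
post = np.exp(-0.5 * ((m2_hat - m**2) / sigma) ** 2)    # likelihood x flat prior
cdf = np.cumsum(post)
cdf /= cdf[-1]
upper90 = m[np.searchsorted(cdf, 0.90)]
print(f"90% Bayesian upper bound: {upper90:.3f} eV")
```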
Influence of Averaging Preprocessing on Image Analysis with a Markov Random Field Model
NASA Astrophysics Data System (ADS)
Sakamoto, Hirotaka; Nakanishi-Ohno, Yoshinori; Okada, Masato
2018-02-01
This paper describes our investigations into the influence of averaging preprocessing on the performance of image analysis. Averaging preprocessing involves a trade-off: image averaging is often undertaken to reduce noise while the number of image data available for image analysis is decreased. We formulated a process of generating image data by using a Markov random field (MRF) model to achieve image analysis tasks such as image restoration and hyper-parameter estimation by a Bayesian approach. According to the notions of Bayesian inference, posterior distributions were analyzed to evaluate the influence of averaging. There are three main results. First, we found that the performance of image restoration with a predetermined value for hyper-parameters is invariant regardless of whether averaging is conducted. We then found that the performance of hyper-parameter estimation deteriorates due to averaging. Our analysis of the negative logarithm of the posterior probability, which is called the free energy based on an analogy with statistical mechanics, indicated that the confidence of hyper-parameter estimation remains higher without averaging. Finally, we found that when the hyper-parameters are estimated from the data, the performance of image restoration worsens as averaging is undertaken. We conclude that averaging adversely influences the performance of image analysis through hyper-parameter estimation.
Factors contributing to academic achievement: a Bayesian structure equation modelling study
NASA Astrophysics Data System (ADS)
Payandeh Najafabadi, Amir T.; Omidi Najafabadi, Maryam; Farid-Rohani, Mohammad Reza
2013-06-01
In Iran, high school graduates enter university after taking a very difficult entrance exam called the Konkoor. Therefore, only the top-performing students are admitted by universities to continue their bachelor's education in statistics. Surprisingly, most such students fall into the following categories: (1) they do not succeed in their education despite their excellent performance on the Konkoor and in high school; (2) they graduate with a grade point average (GPA) that is considerably lower than their high school GPA; (3) they continue their master's education in majors other than statistics; and (4) they try to find jobs unrelated to statistics. This article employs the well-known and powerful technique of Bayesian structural equation modelling (SEM) to study the academic success of recent graduates who have studied statistics at Shahid Beheshti University in Iran. This research: (i) considered academic success as a latent variable, measured by the GPA and other indicators of academic success (see below) of students in the target population; (ii) employed Bayesian SEM, which works properly for small sample sizes and ordinal variables; (iii) developed, based on the literature, five main factors that affect academic success; and (iv) considered several standard psychological tests and measured characteristics such as 'self-esteem' and 'anxiety'. We then study the impact of such factors on the academic success of the target population. Six factors that positively impact student academic success were identified, in the following order of relative impact (from greatest to least): 'Teaching-Evaluation', 'Learner', 'Environment', 'Family', 'Curriculum' and 'Teaching Knowledge'.
Nicoulaud-Gouin, V; Garcia-Sanchez, L; Giacalone, M; Attard, J C; Martin-Garin, A; Bois, F Y
2016-10-01
This paper addresses the methodological conditions, particularly experimental design and statistical inference, ensuring the identifiability of sorption parameters from breakthrough curves measured during stirred flow-through reactor experiments, also known as continuous flow stirred-tank reactor (CSTR) experiments. The equilibrium-kinetic (EK) sorption model was selected as a nonequilibrium parameterization embedding the K_d approach. Parameter identifiability was studied formally on the equations governing outlet concentrations. It was also studied numerically on 6 simulated CSTR experiments on a soil with known equilibrium-kinetic sorption parameters. EK sorption parameters cannot be identified from a single breakthrough curve of a CSTR experiment, because K_d,1 and k_- were diagnosed as collinear. For pairs of CSTR experiments, Bayesian inference allowed selection of the correct models of sorption and error among sorption alternatives. Bayesian inference was conducted with the SAMCAT software (Sensitivity Analysis and Markov Chain simulations Applied to Transfer models), which launched the simulations through the embedded simulation engine GNU-MCSim and automated their configuration and post-processing. Experimental designs consisting of varying flow rates between experiments reaching equilibrium at the contamination stage were found optimal, because they simultaneously gave accurate sorption parameters and predictions. Bayesian results were comparable to the maximum likelihood method, but they avoided convergence problems, the marginal likelihood allowed comparison of all models, and credible intervals directly gave the uncertainty of the sorption parameters θ. Although these findings are limited to the specific conditions studied here, in particular the considered sorption model, the chosen parameter values and error structure, they help in the conception and analysis of future CSTR experiments with radionuclides whose kinetic behaviour is suspected.
A Bayesian Approach to the Overlap Analysis of Epidemiologically Linked Traits.
Asimit, Jennifer L; Panoutsopoulou, Kalliope; Wheeler, Eleanor; Berndt, Sonja I; Cordell, Heather J; Morris, Andrew P; Zeggini, Eleftheria; Barroso, Inês
2015-12-01
Diseases often co-occur in individuals more often than expected by chance, and may be explained by shared underlying genetic etiology. A common approach to genetic overlap analyses is to use summary genome-wide association study data to identify single-nucleotide polymorphisms (SNPs) that are associated with multiple traits at a selected P-value threshold. However, P-values do not account for differences in power, whereas Bayes' factors (BFs) do, and may be approximated using summary statistics. We use simulation studies to compare the power of frequentist and Bayesian approaches for overlap analyses, and to decide on appropriate thresholds for comparison between the two methods. It is empirically illustrated that BFs have the advantage over P-values of a decreasing type I error rate as study size increases for single-disease associations. Consequently, the overlap analysis of traits from different-sized studies encounters issues in fair P-value threshold selection, whereas BFs are adjusted automatically. Extensive simulations show that Bayesian overlap analyses tend to have higher power than those that assess association strength with P-values, particularly in low-power scenarios. Calibration tables between BFs and P-values are provided for a range of sample sizes, as well as an approximation approach for sample sizes that are not in the calibration table. Although P-values are sometimes thought more intuitive, these tables assist in removing the opaqueness of Bayesian thresholds and may also be used in the selection of a BF threshold to meet a certain type I error rate. An application of our methods is used to identify variants associated with both obesity and osteoarthritis.
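In the Gaussian approximation that underlies such summary-statistic BFs, the Bayes factor is just a ratio of two normal densities, which makes the size-adjustment property easy to see: a fixed z-score (hence a fixed P-value) yields progressively weaker evidence against the null as the study grows. The prior variance W below is an assumed value for illustration.

```python
# Summary-statistic Bayes factor in the Gaussian approximation: with the
# estimate beta_hat ~ N(beta, V) and the alternative beta ~ N(0, W),
# BF10 = N(beta_hat; 0, V + W) / N(beta_hat; 0, V).
from math import exp, pi, sqrt

def npdf(x, sd):
    """Density of N(0, sd^2) at x."""
    return exp(-0.5 * (x / sd) ** 2) / (sd * sqrt(2 * pi))

def bf10(beta_hat, V, W=0.04):
    # Marginal likelihood under H1 (beta ~ N(0, W)) vs. H0 (beta = 0).
    return npdf(beta_hat, sqrt(V + W)) / npdf(beta_hat, sqrt(V))

z = 2.0                                  # fixed z-score, two-sided p ~ 0.046
for n in (500, 5_000, 50_000):
    V = 1.0 / n                          # squared standard error shrinks with n
    print(f"n={n:6d}  BF10 = {bf10(z * sqrt(V), V):.2f}")
```

With these assumed values, the same nominally significant z-score moves from weak evidence for the alternative at n = 500 to evidence favoring the null at n = 50,000, which is the behavior the calibration tables formalize.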
Jackson, George S.; Hillegonds, Darren J.; Muzikar, Paul; Goehring, Brent
2013-01-01
A 41Ca interlaboratory comparison between Lawrence Livermore National Laboratory (LLNL) and the Purdue Rare Isotope Laboratory (PRIME Lab) has been completed. Analysis of the ratios assayed by accelerator mass spectrometry (AMS) shows that there is no statistically significant difference between them. Further, Bayesian analysis shows that the uncertainties reported by both facilities are correct, with the possibility of a slight under-estimation by one laboratory. Finally, the chemistry procedures used by the two facilities to produce CaF2 for the cesium sputter ion source are robust and do not yield any significant differences in the final result. PMID:24179312
Sequential Inverse Problems Bayesian Principles and the Logistic Map Example
NASA Astrophysics Data System (ADS)
Duan, Lian; Farmer, Chris L.; Moroz, Irene M.
2010-09-01
Bayesian statistics provides a general framework for solving inverse problems, but is not without interpretation and implementation problems. This paper discusses difficulties arising from the fact that forward models are always in error to some extent. Using a simple example based on the one-dimensional logistic map, we argue that, when implementation problems are minimal, the Bayesian framework is quite adequate. In this paper the Bayesian Filter is shown to be able to recover excellent state estimates in the perfect model scenario (PMS) and to distinguish the PMS from the imperfect model scenario (IMS). Through a quantitative comparison of the way in which the observations are assimilated in both the PMS and the IMS scenarios, we suggest that one can, sometimes, measure the degree of imperfection.
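A bootstrap particle filter on the logistic map, sketched below with illustrative settings, reproduces the qualitative finding: state recovery is excellent when the assimilating model uses the true parameter (the PMS) and degrades under a mis-specified parameter (the IMS).

```python
# Minimal bootstrap particle filter on the logistic map x' = r x (1 - x),
# comparing assimilation with the true r against a mis-specified r.
import numpy as np

rng = np.random.default_rng(4)
r_true, obs_sd, T, N = 3.8, 0.05, 100, 2_000

# Truth and noisy observations.
x = 0.4
xs, ys = [], []
for _ in range(T):
    x = r_true * x * (1 - x)
    xs.append(x)
    ys.append(x + rng.normal(0, obs_sd))
xs, ys = np.array(xs), np.array(ys)

def run_filter(r_model):
    p = rng.uniform(0, 1, N)                       # particles
    est = []
    for y in ys:
        # Propagate with small regularizing jitter, then weight by likelihood.
        p = np.clip(r_model * p * (1 - p) + rng.normal(0, 0.01, N), 0, 1)
        w = np.exp(-0.5 * ((y - p) / obs_sd) ** 2) + 1e-12  # guard vs. underflow
        w /= w.sum()
        p = p[rng.choice(N, N, p=w)]               # multinomial resampling
        est.append(p.mean())
    return np.sqrt(np.mean((np.array(est) - xs) ** 2))

print("RMSE, perfect model   :", run_filter(3.8))
print("RMSE, imperfect model :", run_filter(3.6))
```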
Bayesian Networks for Modeling Dredging Decisions
2011-10-01
… (junction tree) algorithms developed by Lauritzen and Spiegelhalter (1988) and Jensen et al. (1990). Statistical inference is simply the process of … causality when constructing a Bayesian network (Kjaerulff and Madsen 2008; Darwiche 2009; Marcot et al. 2006). A knowledge representation approach is the
Photoacoustic discrimination of vascular and pigmented lesions using classical and Bayesian methods
NASA Astrophysics Data System (ADS)
Swearingen, Jennifer A.; Holan, Scott H.; Feldman, Mary M.; Viator, John A.
2010-01-01
Discrimination of pigmented and vascular lesions in skin can be difficult due to factors such as size, subungual location, and the nature of lesions containing both melanin and vascularity. Misdiagnosis may lead to precancerous or cancerous lesions not receiving proper medical care. To aid in the rapid and accurate diagnosis of such pathologies, we develop a photoacoustic system to determine the nature of skin lesions in vivo. By irradiating skin with two laser wavelengths, 422 and 530 nm, we induce photoacoustic responses, and the relative response at these two wavelengths indicates whether the lesion is pigmented or vascular. This response is due to the distinct absorption spectra of melanin and hemoglobin. In particular, pigmented lesions have ratios of photoacoustic amplitudes of approximately 1.4 to 1 at the two wavelengths, while vascular lesions have ratios of about 4.0 to 1. Furthermore, we consider two statistical methods for conducting classification of lesions: standard multivariate analysis classification techniques and a Bayesian-model-based approach. We study 15 human subjects with eight vascular and seven pigmented lesions. Using the classical method, we achieve a perfect classification rate, while the Bayesian approach has an error rate of 20%.
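The two-wavelength decision rule can be expressed as a small Bayes classifier: model the amplitude ratio for each lesion class as roughly Gaussian around the reported typical values (about 1.4 for pigmented and 4.0 for vascular) and compare posterior probabilities. The class spreads and equal priors below are assumptions for illustration, not fitted values from the study.

```python
# Two-class Bayes classifier on the 422/530 nm photoacoustic amplitude ratio.
from math import exp, pi, sqrt

def gauss(x, mu, sd):
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2 * pi))

def classify(ratio, sd_pig=0.4, sd_vasc=1.0, prior_pig=0.5):
    lp = gauss(ratio, 1.4, sd_pig) * prior_pig         # pigmented class
    lv = gauss(ratio, 4.0, sd_vasc) * (1 - prior_pig)  # vascular class
    p_pig = lp / (lp + lv)
    return ("pigmented" if p_pig > 0.5 else "vascular"), p_pig

for r in (1.2, 2.3, 3.5):
    label, p = classify(r)
    print(f"ratio {r:.1f} -> {label} (P(pigmented) = {p:.2f})")
```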
A Bayesian state-space approach for damage detection and classification
NASA Astrophysics Data System (ADS)
Dzunic, Zoran; Chen, Justin G.; Mobahi, Hossein; Büyüköztürk, Oral; Fisher, John W.
2017-11-01
The problem of automatic damage detection in civil structures is complex and requires a system that can interpret collected sensor data into meaningful information. We apply our recently developed switching Bayesian model for dependency analysis to the problems of damage detection and classification. The model relies on a state-space approach that accounts for noisy measurement processes and missing data, which also infers the statistical temporal dependency between measurement locations signifying the potential flow of information within the structure. A Gibbs sampling algorithm is used to simultaneously infer the latent states, parameters of the state dynamics, the dependence graph, and any changes in behavior. By employing a fully Bayesian approach, we are able to characterize uncertainty in these variables via their posterior distribution and provide probabilistic estimates of the occurrence of damage or a specific damage scenario. We also implement a single class classification method which is more realistic for most real world situations where training data for a damaged structure is not available. We demonstrate the methodology with experimental test data from a laboratory model structure and accelerometer data from a real world structure during different environmental and excitation conditions.
Markov Chain Monte Carlo Methods for Bayesian Data Analysis in Astronomy
NASA Astrophysics Data System (ADS)
Sharma, Sanjib
2017-08-01
Markov Chain Monte Carlo based Bayesian data analysis has now become the method of choice for analyzing and interpreting data in almost all disciplines of science. In astronomy, over the last decade, we have also seen a steady increase in the number of papers that employ Monte Carlo based Bayesian analysis. New, efficient Monte Carlo based methods are continuously being developed and explored. In this review, we first explain the basics of Bayesian theory and discuss how to set up data analysis problems within this framework. Next, we provide an overview of various Monte Carlo based methods for performing Bayesian data analysis. Finally, we discuss advanced ideas that enable us to tackle complex problems and thus hold great promise for the future. We also distribute downloadable computer software (available at https://github.com/sanjibs/bmcmc/) that implements some of the algorithms and examples discussed here.
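For readers new to the area, the core of the simplest such algorithm, random-walk Metropolis, fits in a few lines; the sketch below samples a correlated 2D Gaussian and is only a minimal illustration of what full packages (such as the one linked above) implement with many refinements.

```python
# Random-walk Metropolis sampling of a 2D correlated Gaussian target.
import numpy as np

rng = np.random.default_rng(5)
cov_inv = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))

def log_target(x):
    return -0.5 * x @ cov_inv @ x

x, lp = np.zeros(2), 0.0
samples, accept = [], 0
for _ in range(50_000):
    prop = x + rng.normal(0, 0.5, size=2)       # symmetric proposal
    lp_prop = log_target(prop)
    if np.log(rng.uniform()) < lp_prop - lp:    # Metropolis accept/reject
        x, lp, accept = prop, lp_prop, accept + 1
    samples.append(x)
s = np.array(samples[10_000:])                  # discard burn-in
print("acceptance rate:", accept / 50_000)
print("sample covariance:\n", np.cov(s.T))      # should approach [[1, .8], [.8, 1]]
```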
Development of a Bayesian Belief Network Runway Incursion Model
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2014-01-01
In a previous paper, a statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to perhaps several of the AvSP top ten TC. That data also identified several primary causes and contributing factors for RI events that served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events. The system-level BBN model will allow NASA to generically model the causes of RI events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of RI events in particular, and to improve runway safety in general. The development, structure and assessment of that BBN for RI events by a Subject Matter Expert panel are documented in this paper.
A Bayesian paradigm for decision-making in proof-of-concept trials.
Pulkstenis, Erik; Patra, Kaushik; Zhang, Jianliang
2017-01-01
Decision-making is central to every phase of drug development, and especially at the proof-of-concept stage, where risk and evidence must be weighed carefully, often in the presence of significant uncertainty. The decision whether to proceed to large, expensive Phase 3 trials has significant implications for patients and sponsors alike. Recent experience has shown that Phase 3 failure rates remain high. We present a flexible Bayesian quantitative decision-making paradigm that evaluates evidence relative to achieving a multilevel target product profile. A framework for operating characteristics is provided that allows the drug developer to design a proof-of-concept trial in light of its ability to support decision-making rather than merely achieve statistical significance. Operating characteristics are shown to be superior to traditional p-value-based methods. In addition, considerations related to sample size, application to interim futility analysis, and incorporation of prior historical information are discussed.
A Bayesian Measurement Error Model for Misaligned Radiographic Data
Lennox, Kristin P.; Glascoe, Lee G.
2013-09-06
An understanding of the inherent variability in micro-computed tomography (micro-CT) data is essential to tasks such as statistical process control and the validation of radiographic simulation tools. The data present unique challenges to variability analysis due to the relatively low resolution of radiographs, and also due to minor variations from run to run which can result in misalignment or magnification changes between repeated measurements of a sample. Positioning changes artificially inflate the variability of the data in ways that mask true physical phenomena. We present a novel Bayesian nonparametric regression model that incorporates both additive and multiplicative measurement error in addition to heteroscedasticity to address this problem. We also use this model to assess the effects of sample thickness and sample position on measurement variability for an aluminum specimen. Supplementary materials for this article are available online.
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors
Woldegerima, Woldegebriel Assefa
2017-01-01
TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th out of the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper steps further by incorporating past knowledge into the model, using a Bayesian approach with informative priors. The Bayesian statistics approach is growing popular in data analyses. However, most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted with the classical approach and with the Bayesian approach under both non-informative and informative priors, using the South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative priors, the South Africa General Household Survey datasets for the years 2011 to 2013 are used to set up priors for the 2014 model. PMID:28257437
Power in Bayesian Mediation Analysis for Small Sample Research
Miočević, Milica; MacKinnon, David P.; Levy, Roy
2018-01-01
It was suggested that Bayesian methods have potential for increasing power in mediation analysis (Koopman, Howe, Hollenbeck, & Sin, 2015; Yuan & MacKinnon, 2009). This paper compares the power of Bayesian credibility intervals for the mediated effect to the power of normal theory, distribution of the product, percentile, and bias-corrected bootstrap confidence intervals at N ≤ 200. Bayesian methods with diffuse priors have power comparable to the distribution of the product and bootstrap methods, and Bayesian methods with informative priors had the most power. Varying degrees of precision of prior distributions were also examined. Increased precision led to greater power only when N ≥ 100 and the effects were small, N < 60 and the effects were large, and N < 200 and the effects were medium. An empirical example from psychology illustrated a Bayesian analysis of the single mediator model from prior selection to interpreting results. PMID:29662296
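One of the comparison methods, the distribution-of-the-product interval for the mediated effect ab, lends itself to a compact Monte Carlo power sketch; the effect sizes, sample sizes, and replication counts below are illustrative choices, not the paper's full simulation design.

```python
# Monte Carlo power estimate for the mediated effect a*b using a simulated
# distribution-of-the-product interval: fit the two mediation regressions,
# draw from the sampling distributions of a-hat and b-hat, and check whether
# the 95% interval of the product excludes zero.
import numpy as np

rng = np.random.default_rng(6)

def ols(X, y):
    """Return OLS coefficients and their standard errors."""
    XtXi = np.linalg.inv(X.T @ X)
    beta = XtXi @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    return beta, np.sqrt(s2 * np.diag(XtXi))

def power(n, a=0.3, b=0.3, reps=500, draws=2_000):
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)            # mediator model
        y = b * m + rng.normal(size=n)            # outcome model
        ba, sa = ols(np.column_stack([np.ones(n), x]), m)
        bb, sb = ols(np.column_stack([np.ones(n), m, x]), y)
        prod = rng.normal(ba[1], sa[1], draws) * rng.normal(bb[1], sb[1], draws)
        lo, hi = np.quantile(prod, [0.025, 0.975])
        hits += (lo > 0) or (hi < 0)
    return hits / reps

for n in (50, 100, 200):
    print(f"n={n:3d}  power ~= {power(n):.2f}")
```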
Kennedy, Paula L; Woodbury, Allan D
2002-01-01
In ground water flow and transport modeling, the heterogeneous nature of porous media has a considerable effect on the resulting flow and solute transport. Some method of generating the heterogeneous field from a limited dataset of uncertain measurements is required. Bayesian updating is one such method: it interpolates from an uncertain dataset using the statistics of the underlying probability distribution function. In this paper, Bayesian updating was used to determine the heterogeneous natural log transmissivity field for a carbonate and a sandstone aquifer in southern Manitoba. It was determined that the transmissivity in m2/s followed a natural log normal distribution for both aquifers, with means of -7.2 and -8.0 for the carbonate and sandstone aquifers, respectively. The variograms were calculated using an estimator developed by Li and Lake (1994). Fractal nature was not evident in the variogram from either aquifer. The Bayesian updating of the heterogeneous field provided good results even in cases where little data was available. A large transmissivity zone in the sandstone aquifer was created by the Bayesian procedure, which is not a reflection of any deterministic consideration, but a natural outcome of updating a prior probability distribution function with observations. The statistical model returns a result that is very reasonable, namely one that is homogeneous in regions where little or no information is available to alter an initial state. No long-range correlation trends or fractal behavior of the log-transmissivity field was observed in either aquifer over a distance of about 300 km.
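A minimal sketch of this kind of Bayesian updating, assuming a one-dimensional Gaussian prior on ln(T) with an exponential covariance, conditions the field on a few hypothetical point measurements; the grid, correlation length, noise level, and observed values are illustrative, not the Manitoba data. Away from the data the posterior reverts to the prior mean, mirroring the behavior described above.

```python
# Gaussian conditioning (Bayesian updating) of a 1-D log-transmissivity field
# on sparse point measurements with an assumed exponential covariance.
import numpy as np

xg = np.linspace(0.0, 100.0, 201)            # grid of locations (km)
mu0, var, L = -7.2, 1.0, 15.0                # prior mean ln(T), variance, range

def cov(a, b):
    return var * np.exp(-np.abs(a[:, None] - b[None, :]) / L)

x_obs = np.array([20.0, 25.0, 70.0])         # hypothetical measurement locations
z_obs = np.array([-6.0, -6.3, -8.5])         # observed ln(T) values
noise = 0.05 * np.eye(len(x_obs))            # measurement-error variance

K = cov(x_obs, x_obs) + noise
W = np.linalg.solve(K, cov(x_obs, xg))       # kriging-style weights
post_mean = mu0 + W.T @ (z_obs - mu0)
post_var = var - np.sum(cov(x_obs, xg) * W, axis=0)

for xi in (22.0, 50.0, 95.0):
    i = np.argmin(np.abs(xg - xi))
    print(f"x={xi:5.1f} km  mean={post_mean[i]:6.2f}  sd={np.sqrt(post_var[i]):.2f}")
```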
Zeng, Irene Sui Lan; Lumley, Thomas
2018-01-01
Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than the number of features, and the use of Bayesian approaches when there is prior knowledge to be integrated, are also included in the commentary. For completeness of the review, a table of currently available software and packages for omics, drawn from 23 publications, is provided in the appendix.
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples (thoracic computerized tomography to detect aortic injury, and rapid prescreening of Papanicolaou smears to detect cytological abnormalities) to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty, and their ability to incorporate external evidence may be useful for imprecisely estimated parameters.
NASA Astrophysics Data System (ADS)
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2015-12-01
Models in biogeoscience involve uncertainties in observation data, model inputs, model structure, model processes and modeling scenarios. To accommodate different sources of uncertainty, multimodel analyses such as model combination, model selection, model elimination or model discrimination are becoming more popular. To illustrate the theoretical and practical challenges of multimodel analysis, we use an example from microbial soil respiration modeling. Global soil respiration releases more than ten times more carbon dioxide to the atmosphere than all anthropogenic emissions. Thus, improving our understanding of microbial soil respiration is essential for improving climate change models. This study focuses on a poorly understood phenomenon: soil microbial respiration pulses in response to episodic rainfall pulses (the "Birch effect"). We hypothesize that the "Birch effect" is generated by three mechanisms. To test our hypothesis, we developed and assessed five evolving microbial-enzyme models against field measurements from a semiarid savanna characterized by pulsed precipitation. The five models evolve step-wise, such that the first model includes none of the three mechanisms while the fifth includes all three. The basic component of Bayesian multimodel analysis is the estimation of the marginal likelihood, which ranks the candidate models by their overall likelihood with respect to the observation data. The first part of the study focuses on using this Bayesian scheme to discriminate between the five candidate models. The second part discusses some theoretical and practical challenges, mainly the effects of likelihood function selection and of the marginal likelihood estimation method on both model ranking and Bayesian model averaging. The study shows that making valid inferences from scientific data is not a trivial task, since we are uncertain not only about the candidate scientific models but also about the statistical methods used to discriminate between them.
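The core computation described here, ranking candidate models by their marginal likelihood, can be sketched in a few lines. The following Python example is not the authors' code: the data, the two toy priors and the noise level are all hypothetical. It estimates each model's marginal likelihood by simple Monte Carlo over the prior and converts the results to posterior model weights.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic "observations" of a respiration pulse (hypothetical stand-in data).
y = rng.normal(2.0, 0.5, size=20)

# Two toy candidate models differing only in their prior for the pulse mean:
# model A expects a weak pulse, model B a strong one.
priors = {"A": stats.norm(0.5, 1.0), "B": stats.norm(2.0, 1.0)}
sigma = 0.5  # assumed known observation noise

def log_marginal_likelihood(prior, n_draws=100_000):
    """Estimate log p(y | model) = log E_prior[ p(y | theta) ] by Monte Carlo."""
    theta = prior.rvs(n_draws, random_state=rng)
    # log-likelihood of the whole data set for each prior draw
    loglik = stats.norm.logpdf(y[:, None], loc=theta, scale=sigma).sum(axis=0)
    m = loglik.max()                      # log-mean-exp for numerical stability
    return m + np.log(np.exp(loglik - m).mean())

logml = {name: log_marginal_likelihood(p) for name, p in priors.items()}
w = np.exp(np.array(list(logml.values())) - max(logml.values()))
w /= w.sum()                              # posterior model weights (equal priors)
for (name, lm), weight in zip(logml.items(), w):
    print(f"model {name}: log ML = {lm:.1f}, posterior weight = {weight:.3f}")
```

The same weights feed directly into Bayesian model averaging, which is why the choice of likelihood function and of marginal-likelihood estimator matters for both ranking and averaging.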
Bayesian methods including nonrandomized study data increased the efficiency of postlaunch RCTs.
Schmidt, Amand F; Klugkist, Irene; Klungel, Olaf H; Nielen, Mirjam; de Boer, Anthonius; Hoes, Arno W; Groenwold, Rolf H H
2015-04-01
Findings from nonrandomized studies on safety or efficacy of treatment in patient subgroups may trigger postlaunch randomized clinical trials (RCTs). In the analysis of such RCTs, results from nonrandomized studies are typically ignored. This study explores the trade-off between bias and power of Bayesian RCT analysis incorporating information from nonrandomized studies. A simulation study was conducted to compare frequentist with Bayesian analyses using noninformative and informative priors in their ability to detect interaction effects. In simulated subgroups, the effect of a hypothetical treatment differed between subgroups (odds ratio 1.00 vs. 2.33). Simulations varied in sample size, proportions of the subgroups, and specification of the priors. As expected, the results for the informative Bayesian analyses were more biased than those from the noninformative Bayesian analysis or frequentist analysis. However, because of a reduction in posterior variance, informative Bayesian analyses were generally more powerful to detect an effect. In scenarios where the informative priors were in the opposite direction of the RCT data, type 1 error rates could be 100% and power 0%. Bayesian methods incorporating data from nonrandomized studies can meaningfully increase power of interaction tests in postlaunch RCTs. Copyright © 2015 Elsevier Inc. All rights reserved.
Estimation of Lithological Classification in Taipei Basin: A Bayesian Maximum Entropy Method
NASA Astrophysics Data System (ADS)
Wu, Meng-Ting; Lin, Yuan-Chien; Yu, Hwa-Lung
2015-04-01
In environmental and other scientific applications, a certain understanding of the lithological composition of the subsurface is required. Because of practical constraints, only a limited amount of data can be acquired. To determine the lithological distribution in a study area, many spatial statistical methods are used to estimate the lithological composition at unsampled points or grids. This study applied the Bayesian Maximum Entropy (BME) method, an emerging method in geological spatiotemporal statistics. The BME method can identify the spatiotemporal correlation of the data and combine not only hard data but also soft data to improve estimation. Lithological classification data are discrete categorical data; therefore, this research applied categorical BME to establish a complete three-dimensional lithological estimation model. The limited hard data from cores, together with the soft data generated from geological dating data and virtual wells, were used to estimate the three-dimensional lithological classification in the Taipei Basin. Keywords: Categorical Bayesian Maximum Entropy method, Lithological Classification, Hydrogeological Setting
Local coexistence of VO2 phases revealed by deep data analysis
Strelcov, Evgheni; Ievlev, Anton; Tselev, Alexander; ...
2016-07-07
We report a synergistic approach of micro-Raman spectroscopic mapping and deep data analysis to study the distribution of crystallographic phases and ferroelastic domains in a defected Al-doped VO2 microcrystal. Bayesian linear unmixing revealed an uneven distribution of the T phase, stabilized by surface defects and uneven local doping, that went undetected by other classical analysis techniques such as PCA and SIMPLISMA. This work demonstrates the impact of information recovery via statistical analysis and full mapping in spectroscopic studies of vanadium dioxide systems, which are commonly substituted by averaging or single point-probing approaches, both of which suffer from information misinterpretation due to low resolving power.
On the Origins of Suboptimality in Human Probabilistic Inference
Acerbi, Luigi; Vijayakumar, Sethu; Wolpert, Daniel M.
2014-01-01
Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior. PMID:24945142
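To make the inference problem concrete, here is a minimal sketch of grid-based Bayesian integration of an explicitly provided bimodal prior with a noisy cue. It is not the paper's code: the prior shape, noise level and target location are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-5, 5, 1001)  # grid of possible target locations

# An explicitly provided bimodal prior, analogous to the on-screen density.
prior = 0.6 * np.exp(-0.5 * ((x + 1.5) / 0.6) ** 2) \
      + 0.4 * np.exp(-0.5 * ((x - 2.0) / 0.8) ** 2)
prior /= prior.sum()

# Noisy cue about the target location.
target, cue_sigma = -1.2, 1.0
cue = target + rng.normal(0, cue_sigma)

# Bayesian integration on the grid: posterior is proportional to prior x likelihood.
likelihood = np.exp(-0.5 * ((cue - x) / cue_sigma) ** 2)
posterior = prior * likelihood
posterior /= posterior.sum()

# The optimal estimate under squared-error loss is the posterior mean.
print("cue:", round(cue, 2), " posterior mean:", round(float(x @ posterior), 2))
```

A "noisy posterior" observer of the kind the paper favors would, in effect, compute this posterior with imprecisely represented prior parameters and added decision noise.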
Flood quantile estimation at ungauged sites by Bayesian networks
NASA Astrophysics Data System (ADS)
Mediero, L.; Santillán, D.; Garrote, L.
2012-04-01
Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations of flood magnitude, but some site and basin characteristics are known. The most common technique is multiple regression analysis, which relates physical and climatic basin characteristics to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals of the estimated model. Bayesian networks are a probabilistic computational structure from the field of artificial intelligence; they have been widely and successfully applied in many scientific fields, such as medicine and informatics, but their application to hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as it captures non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows the different sources of estimation uncertainty to be taken into account, as they give a probability distribution as the result. A homogeneous region in the Tagus Basin was selected as a case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were then estimated by Bayesian networks. Bayesian networks need to be learnt from a sufficiently large data set; as observational data were scarce, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of the observed physical and climatic variables in the homogeneous region, and the synthetic flood quantiles were stochastically generated taking the regression equation as a basis. The learnt Bayesian network was validated by the reliability diagram, the Brier score and the ROC diagram, which are common measures in the validation of probabilistic forecasts. In summary, flood quantile estimation through Bayesian networks supplies information about the prediction uncertainty, as a probability distribution of discharges is given as the result. The Bayesian network model therefore has application as decision support for water resources planning and management.
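A toy illustration of the key point, that a learnt network returns a full distribution rather than a point estimate, is sketched below. The discretised variables and synthetic data are hypothetical; this is not the authors' model, only the counting step that "learning" a discrete conditional distribution amounts to.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Synthetic "basins": discretised area and design-rainfall classes (0=low, 1=mid, 2=high).
area = rng.integers(0, 3, n)
rain = rng.integers(0, 3, n)
# Flood-quantile class depends noisily on both parents (toy data-generating rule).
q = np.clip(area + rain + rng.integers(-1, 2, n), 0, 4)

# "Learning" the network: estimate P(q | area, rain) from the sample by counting.
counts = np.zeros((3, 3, 5))
np.add.at(counts, (area, rain, q), 1)
cpd = counts / counts.sum(axis=2, keepdims=True)

# Querying an ungauged site with known characteristics returns a full
# probability distribution over quantile classes, not a single point estimate.
print("P(quantile class | area=high, rain=mid):", np.round(cpd[2, 1], 3))
```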
Characterizing Giant Exoplanets through Multiwavelength Transit Observations: HD 189733b
NASA Astrophysics Data System (ADS)
Kar, Aman; Cole, Jackson Lane; Gardner, Cristilyn N.; Garver, Bethany Ray; Jarka, Kyla L.; McGough, Aylin Marie; PeQueen, David Jeffrey; Rivera, Daniel Ivan; Kasper, David; Jang-Condell, Hannah; Kobulnicky, Henry; Dale, Daniel
2018-01-01
Observing the transits of exoplanets in multiple wavelengths enables the characterization of their atmospheres. We used the Wyoming Infrared Observatory to obtain high precision photometry on HD 189733b, one of the most studied exoplanets. We employed the photometry package AIJ and Bayesian statistics in our analysis. Preliminary results suggest a wavelength dependence in the size of the exoplanet, indicative of scattering in the atmosphere. This work is supported by the National Science Foundation under REU grant AST 1560461.
Estimate of main local sources to ambient ultrafine particle number concentrations in an urban area
NASA Astrophysics Data System (ADS)
Rahman, Md Mahmudur; Mazaheri, Mandana; Clifford, Sam; Morawska, Lidia
2017-09-01
Quantifying and apportioning the contribution of a range of sources to ultrafine particles (UFPs, D < 100 nm) is a challenge due to the complex nature of urban environments. Although vehicular emissions have long been considered one of the major sources of ultrafine particles in urban areas, the contribution of other major urban sources is not yet fully understood. This paper aims to determine and quantify the contributions of local ground traffic, nucleated particle (NP) formation and distant non-traffic sources (e.g. airport, oil refineries, and seaport) to the total ambient particle number concentration (PNC) in a busy, inner-city area of Brisbane, Australia, using Bayesian statistical modelling and other exploratory tools. The Bayesian model was trained on PNC data from days on which NP formation was known not to have occurred, hourly traffic counts, solar radiation data, and a smooth daily trend. The model was applied to apportion and quantify the contributions of NP formation and local traffic and non-traffic sources to UFPs. The data analysis incorporated long-term measured time series of total PNC (D ≥ 6 nm), particle number size distributions (PSD, D = 8 to 400 nm), PM2.5, PM10, NOx, CO, meteorological parameters and traffic counts at a stationary monitoring site. The developed Bayesian model showed reliable predictive performance in quantifying the contribution of NP formation events to UFPs (up to 4 × 10^4 particles cm^-3), with significant day-to-day variability. The model identified potential NP formation and no-formation days based on PNC data and quantified the source contributions to UFPs. Exploratory statistical analyses show that total mean PNC during the middle of the day was up to 32% higher than during peak morning and evening traffic periods, which was associated with NP formation events. The majority of UFPs measured during the peak traffic and NP formation periods were between 30-100 nm and smaller than 30 nm, respectively. To date, this is the first application of a Bayesian model to apportion the contributions of different sources to UFPs, and therefore the importance of this study lies not only in its modelling outcomes but in demonstrating the applicability and advantages of this statistical approach to air pollution studies.
Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong
2015-01-01
Purpose This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain reported during history taking or physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based on the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status using Bayesian and non-Bayesian logistic regression models. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was, together with MSL and MSD, a significant predictor of one-session stone-free status after SWL. PMID:25902059
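For readers unfamiliar with the matching step, the following sketch estimates propensity scores with a logistic model and performs 1:1 nearest-neighbour matching with replacement. The data are synthetic; the covariate names mirror the study, but every number is invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 600

# Hypothetical covariates mirroring the study: age, maximal stone length (MSL)
# and mean stone density (MSD); "pain" plays the role of the exposure.
X = np.column_stack([rng.normal(50, 12, n),      # age (years)
                     rng.normal(9, 3, n),        # MSL (mm)
                     rng.normal(700, 150, n)])   # MSD (HU)
pain = rng.binomial(1, 1 / (1 + np.exp(-0.15 * (X[:, 1] - 9))))

# Propensity score: P(pain = 1 | covariates) from a logistic model
# (covariates standardised for numerical stability).
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ps = LogisticRegression(max_iter=1000).fit(Xs, pain).predict_proba(Xs)[:, 1]

# 1:1 nearest-neighbour matching (with replacement) of painless controls to cases.
cases = np.where(pain == 1)[0]
controls = np.where(pain == 0)[0]
matched = controls[np.abs(ps[controls][None, :] - ps[cases][:, None]).argmin(axis=1)]
print(len(cases), "cases matched; mean |PS difference| =",
      round(float(np.abs(ps[cases] - ps[matched]).mean()), 4))
```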
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km² hexagons), can increase the relevance of habitat models to multispecies conservation planning. Journal compilation © 2010 Society for Conservation Biology. No claim to original US government works.
Hewett, Paul; Bullock, William H
2014-01-01
For more than 20 years, CSX Transportation (CSXT) has collected exposure measurements from locomotive engineers and conductors who are potentially exposed to diesel emissions. The database included measurements for elemental and total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, carbon monoxide, and nitrogen dioxide. This database was statistically analyzed and summarized, and the resulting statistics and exposure profiles were compared to relevant occupational exposure limits (OELs) using both parametric and non-parametric descriptive and compliance statistics. Exposure ratings, using the American Industrial Hygiene Association (AIHA) exposure categorization scheme, were determined using both the compliance statistics and Bayesian Decision Analysis (BDA). The statistical analysis of the elemental carbon data (a marker for diesel particulate) strongly suggests that the majority of levels in the cabs of the lead locomotives (n = 156) were less than the California guideline of 0.020 mg/m³. The sample 95th percentile was roughly half the guideline, resulting in an AIHA exposure rating of category 2/3 (determined using BDA). The elemental carbon (EC) levels in the trailing locomotives tended to be greater than those in the lead locomotive; however, locomotive crews rarely ride in the trailing locomotive. Lead locomotive EC levels were similar to those reported by other investigators studying locomotive crew exposures and to levels measured in urban areas. Lastly, both the EC sample mean and 95% UCL were less than the Environmental Protection Agency (EPA) reference concentration of 0.005 mg/m³. With the exception of nitrogen dioxide, the overwhelming majority of the measurements for total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, and combustion gases in the cabs of CSXT locomotives were either non-detects or considerably less than the working OELs for the years represented in the database. When compared to the previous American Conference of Governmental Industrial Hygienists (ACGIH) threshold limit value (TLV) of 3 ppm, the nitrogen dioxide exposure profile merits a rating of AIHA exposure category 1. However, using the newly adopted TLV of 0.2 ppm, the exposure profile receives a rating of category 4. Further evaluation is recommended to determine the current status of nitrogen dioxide exposures. [Supplementary materials are available for this article. Go to the publisher's online edition of the Journal of Occupational and Environmental Hygiene for the following free supplemental resources: additional text on OELs, methods, results, and additional figures and tables.]
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software, ABrox, is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Finally, throughout the paper, we introduce ABrox using the accompanying graphical user interface.
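The essence of ABC model comparison is easy to convey in code. The sketch below is a bare-bones rejection scheme, not ABrox itself; the data, prior, tolerance and summary statistic are all illustrative choices. It approximates a Bayes factor for a two-sample mean difference by comparing how often each model reproduces the observed summary statistic.

```python
import numpy as np

rng = np.random.default_rng(4)
obs_a, obs_b = rng.normal(0, 1, 50), rng.normal(0.4, 1, 50)
obs_stat = obs_b.mean() - obs_a.mean()  # summary statistic of the observed data

def simulate(delta):
    """Simulate the summary statistic for a given true mean difference."""
    a, b = rng.normal(0, 1, 50), rng.normal(delta, 1, 50)
    return b.mean() - a.mean()

n_sim, eps = 50_000, 0.02
accepted = {"null": 0, "alt": 0}
for _ in range(n_sim):
    # Model "null": delta = 0; model "alt": delta drawn from a N(0, 1) prior.
    if abs(simulate(0.0) - obs_stat) < eps:
        accepted["null"] += 1
    if abs(simulate(rng.normal(0, 1)) - obs_stat) < eps:
        accepted["alt"] += 1

# With equal prior model probabilities, the ratio of acceptance rates
# approximates the Bayes factor between the two models.
print("approximate BF(alt vs null):", accepted["alt"] / max(accepted["null"], 1))
```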
Bayesian performance metrics of binary sensors in homeland security applications
NASA Astrophysics Data System (ADS)
Jannson, Tomasz P.; Forrester, Thomas C.
2008-04-01
Bayesian performance metrics, based on such parameters as prior probability, probability of detection (or accuracy), false alarm rate, and positive predictive value, characterize the performance of binary sensors, i.e., sensors that have only a binary response: true target/false target. Such binary sensors, very common in Homeland Security, produce an alarm that can be true or false. They include X-ray airport inspection, IED inspections, product quality control, cancer medical diagnosis, parts of ATR, and many others. In this paper, we analyze direct and inverse conditional probabilities in the context of Bayesian inference and binary sensors, using X-ray luggage inspection statistical results as a guideline.
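The inverse conditional probability at issue is just Bayes' rule. A minimal numeric sketch (the rates below are illustrative, not the paper's data) shows how a low prior drives the positive predictive value down even for an accurate sensor:

```python
def positive_predictive_value(prior, p_detect, p_false_alarm):
    """Bayes' rule: P(true target | alarm) for a binary sensor."""
    p_alarm = prior * p_detect + (1 - prior) * p_false_alarm
    return prior * p_detect / p_alarm

# Rare-threat regime typical of screening: even a good sensor yields a low PPV.
ppv = positive_predictive_value(prior=0.001, p_detect=0.95, p_false_alarm=0.05)
print(f"PPV = {ppv:.3f}")  # about 0.019, i.e. roughly 2% of alarms are true targets
```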
Lim, Cherry; Wannapinij, Prapass; White, Lisa; Day, Nicholas P J; Cooper, Ben S; Peacock, Sharon J; Limmathurotsakul, Direk
2013-01-01
Estimates of the sensitivity and specificity for new diagnostic tests based on evaluation against a known gold standard are imprecise when the accuracy of the gold standard is imperfect. Bayesian latent class models (LCMs) can be helpful under these circumstances, but the necessary analysis requires expertise in computational programming. Here, we describe open-access web-based applications that allow non-experts to apply Bayesian LCMs to their own data sets via a user-friendly interface. Applications for Bayesian LCMs were constructed on a web server using R and WinBUGS programs. The models provided (http://mice.tropmedres.ac) include two Bayesian LCMs: the two-tests in two-population model (Hui and Walter model) and the three-tests in one-population model (Walter and Irwig model). Both models are available with simplified and advanced interfaces. In the former, all settings for Bayesian statistics are fixed as defaults. Users input their data set into a table provided on the webpage. Disease prevalence and accuracy of diagnostic tests are then estimated using the Bayesian LCM, and provided on the web page within a few minutes. With the advanced interfaces, experienced researchers can modify all settings in the models as needed. These settings include correlation among diagnostic test results and prior distributions for all unknown parameters. The web pages provide worked examples with both models using the original data sets presented by Hui and Walter in 1980, and by Walter and Irwig in 1988. We also illustrate the utility of the advanced interface using the Walter and Irwig model on a data set from a recent melioidosis study. The results obtained from the web-based applications were comparable to those published previously. The newly developed web-based applications are open-access and provide an important new resource for researchers worldwide to evaluate new diagnostic tests.
Periodic benefit-risk assessment using Bayesian stochastic multi-criteria acceptability analysis
Li, Kan; Yuan, Shuai Sammy; Wang, William; Wan, Shuyan Sabrina; Ceesay, Paulette; Heyse, Joseph F.; Mt-Isa, Shahrul; Luo, Sheng
2018-01-01
Benefit-risk (BR) assessment is essential to ensure that the best decisions are made for a medical product in the clinical development process, regulatory marketing authorization, post-market surveillance, and coverage and reimbursement decisions. One challenge of BR assessment in practice is that the benefit and risk profile may keep evolving as new evidence accumulates. Regulators and the International Conference on Harmonization (ICH) recommend performing periodic benefit-risk evaluation reports (PBRERs) throughout a product's lifecycle. In this paper, we propose a general statistical framework for periodic benefit-risk assessment, in which Bayesian meta-analysis and stochastic multi-criteria acceptability analysis (SMAA) are combined to synthesize the accumulating evidence. The proposed approach allows us to compare the acceptability of different drugs dynamically and effectively, and accounts for the uncertainty of clinical measurements and the imprecise or incomplete preference information of decision makers. We apply our approach to two real examples in a post-hoc way for illustration purposes. The proposed method may easily be modified for other pre- and post-market settings, and can thus be an important complement to the current structured benefit-risk assessment (sBRA) framework, improving the transparency and consistency of the decision-making process. PMID:29505866
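The SMAA step can be illustrated with a small Monte Carlo sketch. The two-drug, two-criterion numbers below are hypothetical, not the paper's data; the sketch samples both measurement uncertainty and unknown preference weights to obtain rank acceptability indices.

```python
import numpy as np

rng = np.random.default_rng(5)
n_mc = 20_000

# Hypothetical per-drug (benefit, risk-acceptability) measurements with
# sampling uncertainty, on a common 0-1 "desirability" scale.
mean = np.array([[0.70, 0.60],   # drug A
                 [0.75, 0.45]])  # drug B
se = np.array([[0.05, 0.05],
               [0.05, 0.08]])

first_rank = np.zeros(2)
for _ in range(n_mc):
    perf = rng.normal(mean, se)        # uncertainty in the clinical measurements
    w = rng.dirichlet([1.0, 1.0])      # unknown benefit/risk preference weights
    scores = perf @ w                  # linear value function
    first_rank[scores.argmax()] += 1

# Rank-1 acceptability: probability that each drug is preferred overall.
print("rank-1 acceptability (A, B):", np.round(first_rank / n_mc, 3))
```

In the proposed framework, the performance means and uncertainties would come from the Bayesian meta-analysis rather than being fixed inputs as here.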
Performing Contrast Analysis in Factorial Designs: From NHST to Confidence Intervals and Beyond
Wiens, Stefan; Nilsson, Mats E.
2016-01-01
Because of the continuing debates about statistics, many researchers may feel confused about how to analyze and interpret data. Current guidelines in psychology advocate the use of effect sizes and confidence intervals (CIs). However, researchers may be unsure how to extract effect sizes from factorial designs. Contrast analysis is helpful because it can be used to test specific questions of central interest in studies with factorial designs. It weights several means and combines them into one or two sets that can be tested with t tests. The effect size produced by a contrast analysis is simply the difference between means. The CI of the effect size informs directly about direction, hypothesis exclusion, and the relevance of the effects of interest. However, any interpretation in terms of precision or likelihood requires the use of likelihood intervals or credible intervals (Bayesian). These various intervals, and even a Bayesian t test, can be obtained easily with free software. This tutorial reviews these methods to guide researchers in answering the following questions: When I analyze mean differences in factorial designs, where can I find the effects of central interest, and what can I learn about their effect sizes? PMID:29805179
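As a worked numerical sketch of the procedure (hypothetical cell means and sample sizes; an equal-n design with a pooled within-cell standard deviation is assumed), the interaction contrast of a 2x2 design can be tested and interval-estimated as follows:

```python
import numpy as np
from scipy import stats

# Cell means of a hypothetical 2x2 design, flattened as A1B1, A1B2, A2B1, A2B2,
# with equal cell size n and pooled within-cell standard deviation sd.
means = np.array([10.0, 12.0, 11.0, 16.0])
n, sd = 20, 4.0
weights = np.array([1, -1, -1, 1])                  # interaction contrast

L = weights @ means                                 # effect size: difference of means
se = sd * np.sqrt((weights ** 2 / n).sum())         # standard error of the contrast
df = 4 * (n - 1)                                    # pooled within-cell df
t = L / se
ci = L + np.array([-1, 1]) * stats.t.ppf(0.975, df) * se
print(f"L = {L:.2f}, t({df}) = {t:.2f}, p = {2 * stats.t.sf(abs(t), df):.4f}, "
      f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Because the contrast weights sum to zero within each set, L is directly interpretable as a difference between (combinations of) means, which is the effect size the tutorial recommends reporting.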
Combining information from multiple flood projections in a hierarchical Bayesian framework
NASA Astrophysics Data System (ADS)
Le Vine, Nataliya
2016-04-01
This study demonstrates, in the context of flood frequency analysis, the potential of a recently proposed hierarchical Bayesian approach to combine information from multiple models. The approach explicitly accommodates shared multimodel discrepancy as well as the probabilistic nature of the flood estimates, and treats the available models as a sample from a hypothetical complete (but unobserved) set of models. The methodology is applied to flood estimates from multiple hydrological projections (the Future Flows Hydrology data set) for 135 catchments in the UK. The advantages of the approach are shown to be: (1) to ensure an adequate "baseline" with which to compare future changes; (2) to reduce flood estimate uncertainty; (3) to maximize use of statistical information in circumstances where multiple weak predictions individually lack power, but collectively provide meaningful information; (4) to diminish the importance of model consistency when model biases are large; and (5) to explicitly consider the influence of the (model performance) stationarity assumption. Moreover, the analysis indicates that reducing shared model discrepancy is the key to further reduction of uncertainty in the flood frequency analysis. The findings are of value regarding how conclusions about changing exposure to flooding are drawn, and to flood frequency change attribution studies.
Comparison of statistical sampling methods with ScannerBit, the GAMBIT scanning module
NASA Astrophysics Data System (ADS)
Martinez, Gregory D.; McKay, James; Farmer, Ben; Scott, Pat; Roebber, Elinore; Putze, Antje; Conrad, Jan
2017-11-01
We introduce ScannerBit, the statistics and sampling module of the public, open-source global fitting framework GAMBIT. ScannerBit provides a standardised interface to different sampling algorithms, enabling the use and comparison of multiple computational methods for inferring profile likelihoods, Bayesian posteriors, and other statistical quantities. The current version offers random, grid, raster, nested sampling, differential evolution, Markov Chain Monte Carlo (MCMC) and ensemble Monte Carlo samplers. We also announce the release of a new standalone differential evolution sampler, Diver, and describe its design, usage and interface to ScannerBit. We subject Diver and three other samplers (the nested sampler MultiNest, the MCMC GreAT, and the native ScannerBit implementation of the ensemble Monte Carlo algorithm T-Walk) to a battery of statistical tests. For this we use a realistic physical likelihood function, based on the scalar singlet model of dark matter. We examine the performance of each sampler as a function of its adjustable settings, and the dimensionality of the sampling problem. We evaluate performance on four metrics: optimality of the best fit found, completeness in exploring the best-fit region, number of likelihood evaluations, and total runtime. For Bayesian posterior estimation at high resolution, T-Walk provides the most accurate and timely mapping of the full parameter space. For profile likelihood analysis in less than about ten dimensions, we find that Diver and MultiNest score similarly in terms of best fit and speed, outperforming GreAT and T-Walk; in ten or more dimensions, Diver substantially outperforms the other three samplers on all metrics.
The utility of Bayesian predictive probabilities for interim monitoring of clinical trials
Connor, Jason T.; Ayers, Gregory D; Alvarez, JoAnn
2014-01-01
Background Bayesian predictive probabilities can be used for interim monitoring of clinical trials to estimate the probability of observing a statistically significant treatment effect if the trial were to continue to its predefined maximum sample size. Purpose We explore settings in which Bayesian predictive probabilities are advantageous for interim monitoring compared to Bayesian posterior probabilities, p-values, conditional power, or group sequential methods. Results For interim analyses that address prediction hypotheses, such as futility monitoring and efficacy monitoring with lagged outcomes, only predictive probabilities properly account for the amount of data remaining to be observed in a clinical trial and have the flexibility to incorporate additional information via auxiliary variables. Limitations Computational burdens limit the feasibility of predictive probabilities in many clinical trial settings. The specification of prior distributions brings additional challenges for regulatory approval. Conclusions The use of Bayesian predictive probabilities enables the choice of logical interim stopping rules that closely align with the clinical decision making process. PMID:24872363
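To make the idea concrete, here is a minimal sketch of a predictive-probability computation for a single-arm trial with a binary outcome. All design numbers are hypothetical, and a Beta-binomial model with an exact one-sided final test is assumed; real trials would tailor both.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Interim state: x responses in n patients, maximum sample size N,
# success if the final test rejects H0: p <= p0 at one-sided level 0.05.
x, n, N, p0 = 14, 30, 60, 0.3
a, b = 1, 1  # Beta(1, 1) prior on the response rate

post = stats.beta(a + x, b + n - x)          # posterior given the interim data
n_mc, wins = 20_000, 0
for _ in range(n_mc):
    p = post.rvs(random_state=rng)           # draw a plausible response rate
    x_future = rng.binomial(N - n, p)        # simulate the remaining patients
    # final analysis: exact one-sided binomial test, P(X >= x_total | p0)
    p_value = stats.binom.sf(x + x_future - 1, N, p0)
    wins += p_value < 0.05

print("predictive probability of trial success:", wins / n_mc)
```

A futility rule would stop the trial if this probability falls below a prespecified threshold, which is exactly the "prediction hypothesis" framing the abstract describes.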
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Li, Ben; Sun, Zhaonan; He, Qing; Zhu, Yu; Qin, Zhaohui S.
2016-01-01
Motivation: Modern high-throughput biotechnologies such as microarrays are capable of producing a massive amount of information for each sample. However, in a typical high-throughput experiment, only a limited number of samples are assayed, leading to the classical 'large p, small n' problem. On the other hand, the rapid propagation of these high-throughput technologies has resulted in a substantial collection of data, often carried out on the same platform and using the same protocol. It is highly desirable to utilize this existing data when performing analysis and inference on a new dataset. Results: Utilizing existing data can be carried out in a straightforward fashion under the Bayesian framework, in which a repository of historical data can be exploited to build informative priors for use in new data analyses. In this work, using microarray data, we investigate the feasibility and effectiveness of deriving informative priors from historical data and using them in the problem of detecting differentially expressed genes. Through simulation and real data analysis, we show that the proposed strategy significantly outperforms existing methods, including the popular and state-of-the-art Bayesian hierarchical model-based approaches. Our work illustrates the feasibility and benefits of exploiting the increasingly available genomics big data in statistical inference, and presents a promising practical strategy for dealing with the 'large p, small n' problem. Availability and implementation: Our method is implemented in the R package IPBT, which is freely available from https://github.com/benliemory/IPBT. Contact: yuzhu@purdue.edu; zhaohui.qin@emory.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26519502
Validation of Bayesian analysis of compartmental kinetic models in medical imaging.
Sitek, Arkadiusz; Li, Quanzheng; El Fakhri, Georges; Alpert, Nathaniel M
2016-10-01
Kinetic compartmental analysis is frequently used to compute physiologically relevant quantitative values from time series of images. In this paper, a new approach based on Bayesian analysis to obtain information about these parameters is presented and validated. The closed form of the posterior distribution of kinetic parameters is derived with a hierarchical prior to model the standard deviation of normally distributed noise. Markov chain Monte Carlo methods are used for numerical estimation of the posterior distribution. Computer simulations of the kinetics of F18-fluorodeoxyglucose (FDG) are used to demonstrate drawing statistical inferences about kinetic parameters and to validate the theory and implementation. Additionally, point estimates of kinetic parameters and the covariance of those estimates are determined using the classical non-linear least squares approach. Posteriors obtained using the methods proposed in this work are accurate, as no significant deviation from the expected shape of the posterior was found (one-sided P>0.08). It is demonstrated that the results obtained by the standard non-linear least-squares methods fail to provide an accurate estimation of uncertainty for the same data set (P<0.0001). The results of this work validate the new methods using computer simulations of FDG kinetics, and show that in situations where the classical approach fails to estimate uncertainty accurately, Bayesian estimation provides accurate information about the uncertainties in the parameters. Although a particular example of FDG kinetics was used in the paper, the methods can be extended to different pharmaceuticals and imaging modalities. Copyright © 2016 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Identifiability of PBPK Models with Applications to ...
Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss different types of identifiability that occur in PBPK models and give reasons why they occur. We particularly focus on how the mathematical structure of a PBPK model and lack of appropriate data can lead to statistical models in which it is impossible to estimate at least some parameters precisely. Methods are reviewed which can determine whether a purely linear PBPK model is globally identifiable. We propose a theorem which determines when identifiability at a set of finite and specific values of the mathematical PBPK model (global discrete identifiability) implies identifiability of the statistical model. However, we are unable to establish conditions that imply global discrete identifiability, and conclude that the only safe approach to analysis of PBPK models involves Bayesian analysis with truncated priors. Finally, computational issues regarding posterior simulations of PBPK models are discussed. The methodology is very general and can be applied to numerous PBPK models which can be expressed as linear time-invariant systems. A real data set of a PBPK model for exposure to dimethyl arsinic acid (DMA(V)) is presented to illustrate the proposed methodology.
Holm Hansen, Christian; Warner, Pamela; Parker, Richard A; Walker, Brian R; Critchley, Hilary Od; Weir, Christopher J
2017-12-01
It is often unclear what specific adaptive trial design features lead to an efficient design that is also feasible to implement. This article describes the preparatory simulation study for a Bayesian response-adaptive dose-finding trial design. The Dexamethasone for Excessive Menstruation trial aims to assess the efficacy of dexamethasone in reducing excessive menstrual bleeding and to determine the best dose for further study. To maximise learning about the dose response, patients receive placebo or an active dose, with randomisation probabilities adapting based on evidence from patients already recruited. The dose-response relationship is estimated using a flexible Bayesian normal dynamic linear model. Several competing design options were considered, including the number of doses, the proportion assigned to placebo, the adaptation criterion, and the number and timing of adaptations. We performed a fractional factorial study using SAS software to simulate virtual trial data for candidate adaptive designs under a variety of scenarios and to invoke WinBUGS for Bayesian model estimation. We analysed the simulated trial results using normal linear models to estimate the effects of each design feature on empirical type I error and statistical power. Our readily implemented approach, using widely available statistical software, identified a final design that performed robustly across a range of potential trial scenarios.
Bayesian Analysis of the Cosmic Microwave Background
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey
2007-01-01
There is a wealth of cosmological information encoded in the spatial power spectrum of temperature anisotropies of the cosmic microwave background. Experiments designed to map the microwave sky are returning a flood of data (time streams of instrument response as a beam is swept over the sky) at several different frequencies (from 30 to 900 GHz), all with different resolutions and noise properties. The resulting analysis challenge is to estimate, and quantify our uncertainty in, the spatial power spectrum of the cosmic microwave background given the complexities of "missing data", foreground emission, and complicated instrumental noise. A Bayesian formulation of this problem allows consistent treatment of many complexities, including complicated instrumental noise and foregrounds, and can be numerically implemented with Gibbs sampling. Gibbs sampling has now been validated as an efficient, statistically exact, and practically useful method for low-resolution analyses (as demonstrated on WMAP 1- and 3-year temperature and polarization data). Development is continuing for Planck, where the goal is to exploit the unique capabilities of Gibbs sampling to directly propagate uncertainties in both foreground and instrument models through to the total uncertainty in cosmological parameters.
A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.
Dorie, Vincent; Harada, Masataka; Carnegie, Nicole Bohme; Hill, Jennifer
2016-09-10
When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Wöhling, Thomas; Nowak, Wolfgang
2014-05-01
Bayesian model averaging ranks the predictive capabilities of alternative conceptual models based on Bayes' theorem. The individual models are weighted with their posterior probability of being the best one in the considered set of models. Finally, their predictions are combined into a robust weighted average and the predictive uncertainty can be quantified. However, this rigorous procedure does not yet account for possible instabilities due to measurement noise in the calibration data set. This is a major drawback, since posterior model weights may suffer a lack of robustness related to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new statistical concept to account for measurement noise as a source of uncertainty for the weights in Bayesian model averaging. Our suggested upgrade reflects the limited information content of data for the purpose of model selection. It allows us to assess the significance of the determined posterior model weights, the confidence in model selection, and the accuracy of the quantified predictive uncertainty. Our approach rests on a brute-force Monte Carlo framework. We determine the robustness of model weights against measurement noise by repeatedly perturbing the observed data with random realizations of measurement error. Then, we analyze the induced variability in posterior model weights and introduce this "weighting variance" as an additional term in the overall prediction uncertainty analysis scheme. We further determine the theoretical upper limit of the performance of the model set which is imposed by measurement noise. As an extension of the merely relative model ranking, this analysis provides a measure of absolute model performance. To finally decide whether better data or longer time series are needed to ensure a robust basis for model selection, we resample the measurement time series and assess the convergence of model weights for increasing time series length. We illustrate our suggested approach with an application to model selection between different soil-plant models, following up on a study by Wöhling et al. (2013). Results show that measurement noise compromises the reliability of model ranking and causes a significant amount of weighting uncertainty if the calibration data time series is not long enough to compensate for its noisiness. This additional contribution to the overall predictive uncertainty is neglected without our approach. Thus, we strongly advocate including our suggested upgrade in the Bayesian model averaging routine.
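The brute-force core of the proposed analysis is easy to prototype. The Python sketch below uses a toy constant-vs-linear model pair and a hypothetical noise level, not the authors' soil-plant application: it perturbs the calibration data with fresh noise realizations and reports the induced variability ("weighting variance") of the posterior model weights.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Calibration data from a hypothetical "true" process, observed with noise.
t = np.linspace(0, 1, 25)
y_obs = 1.0 + 2.0 * t + rng.normal(0, 0.2, t.size)
sigma = 0.2  # assumed known measurement-error standard deviation

# Two competing conceptual models (constant vs linear), each with fitted predictions.
preds = {"constant": np.full_like(t, y_obs.mean()),
         "linear": np.polyval(np.polyfit(t, y_obs, 1), t)}

def bma_weights(y):
    """Posterior model weights from Gaussian likelihoods (equal model priors)."""
    ll = np.array([stats.norm.logpdf(y, m, sigma).sum() for m in preds.values()])
    w = np.exp(ll - ll.max())
    return w / w.sum()

# Repeatedly perturb the data with random measurement-error realizations and
# record the induced variability of the model weights.
ws = np.array([bma_weights(y_obs + rng.normal(0, sigma, t.size))
               for _ in range(2000)])
print("mean weights   (constant, linear):", np.round(ws.mean(axis=0), 3))
print("weighting std  (constant, linear):", np.round(ws.std(axis=0), 3))
```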
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
NASA Astrophysics Data System (ADS)
Barentsen, Geert; Vink, J. S.; Drew, J. E.; Sale, S. E.
2013-03-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main-sequence ages, masses, accretion rates and extinctions to be estimated using two widely available photometric survey data bases (Isaac Newton Telescope Photometric Hα Survey r'/Hα/i' and Two Micron All Sky Survey J-band magnitudes). Because our approach does not rely on spectroscopy, it can easily be applied to homogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov chain Monte Carlo method and provide PYTHON source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2264 (Cone Nebula), arriving at a median age of 3.0 Myr, an accretion fraction of 20 ± 2 per cent and a median accretion rate of 10^-8.4 M⊙ yr^-1. The Bayesian analysis formulated in this work delivers results which are in agreement with spectroscopic studies already in the literature, but achieves this with great efficiency by depending only on photometry. It is a significant step forward from previous photometric studies because the probabilistic approach ensures that nuisance parameters, such as extinction and distance, are fully included in the analysis with a clear picture on any degeneracies.
Bayesian evaluation of effect size after replicating an original study
van Aert, Robbie C. M.; van Assen, Marcel A. L. M.
2017-01-01
The vast majority of published results in the literature is statistically significant, which raises concerns about their reliability. The Reproducibility Project: Psychology (RPP) and the Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. The original study and replication were both statistically significant in 36.1% of cases in RPP and 68.8% in EE-RP, suggesting many null effects among the replicated studies. However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method called snapshot hybrid that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium, and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant. We first analytically approximate the method's performance and demonstrate the necessity of controlling for the original study's significance to enable the accumulation of evidence for a true zero effect. We then applied the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the included studies in EE-RP are generally larger than in RPP, but that the sample sizes of the included studies, especially in RPP, are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of the replication, akin to power analysis in null hypothesis significance testing, and present an easy-to-use web application (https://rvanaert.shinyapps.io/snapshot/) and R code for applying the method. PMID:28388646
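The snapshot idea of posterior probabilities over a small grid of effect sizes can be sketched compactly. The following example uses illustrative numbers and deliberately omits the publication-bias adjustment for the original study's significance that the full method includes; it only shows the basic posterior-model-probability computation from a replication estimate and its standard error.

```python
import numpy as np
from scipy import stats

# Replication result: observed standardized effect size and its standard error
# (hypothetical values).
d_rep, se_rep = 0.21, 0.12

# Four hypothesised true effects: zero, small, medium, large.
effects = {"zero": 0.0, "small": 0.2, "medium": 0.5, "large": 0.8}

# Likelihood of the replication estimate under each fixed true effect,
# combined with equal prior model probabilities.
lik = np.array([stats.norm.pdf(d_rep, mu, se_rep) for mu in effects.values()])
post = lik / lik.sum()
for name, p in zip(effects, post):
    print(f"P({name} effect | replication) = {p:.3f}")
```

With a larger replication sample, se_rep shrinks and the posterior probabilities concentrate on one effect-size category, which is the basis for the sample-size planning use described in the abstract.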
Analysis of nasopharyngeal carcinoma risk factors with Bayesian networks.
Aussem, Alex; de Morais, Sérgio Rodrigues; Corbex, Marilys
2012-01-01
We propose a new graphical framework for extracting the relevant dietary, social and environmental risk factors that are associated with an increased risk of nasopharyngeal carcinoma (NPC) in a case-control epidemiologic study of 1289 subjects and 150 risk factors. This framework builds on the use of Bayesian networks (BNs) for representing statistical dependencies between random variables. We discuss a novel constraint-based procedure, called Hybrid Parents and Children (HPC), that recursively builds a local graph including all the relevant features statistically associated with NPC, without having to find the whole BN first. The local graph is afterwards directed by the domain expert according to his knowledge. It provides a statistical profile of the recruited population and, at the same time, helps identify the risk factors associated with NPC. Extensive experiments on synthetic data sampled from known BNs show that HPC outperforms state-of-the-art algorithms from the recent literature. From a biological perspective, the present study confirms that chemical products, pesticides and domestic fume intake from incomplete combustion of coal and wood are significantly associated with NPC risk. These results suggest that industrial workers are often exposed to noxious chemicals and poisonous substances used in the course of manufacturing. This study also supports previous findings that the consumption of a number of preserved food items, such as house-made proteins and sheep fat, is a major risk factor for NPC. BNs are valuable data mining tools for the analysis of epidemiologic data. They can explicitly combine both expert knowledge from the field and information inferred from the data. These techniques therefore merit consideration as valuable alternatives to traditional multivariate regression techniques in epidemiologic studies. Copyright © 2011 Elsevier B.V. All rights reserved.
Water quality management using statistical analysis and time-series prediction model
NASA Astrophysics Data System (ADS)
Parmar, Kulwinder Singh; Bhardwaj, Rashmi
2014-12-01
This paper deals with water quality management using statistical analysis and a time-series prediction model. The monthly variation of water quality standards has been used to compare the statistical mean, median, mode, standard deviation, kurtosis, skewness, and coefficient of variation for the Yamuna River. The model was validated using R-squared, root mean square error, mean absolute percentage error, maximum absolute percentage error, mean absolute error, maximum absolute error, normalized Bayesian information criterion, Ljung-Box analysis, predicted values and confidence limits. Using an autoregressive integrated moving average (ARIMA) model, future values of the water quality parameters were estimated. It is observed that the predictive model is useful at the 95% confidence limits, and that the distribution is platykurtic for potential of hydrogen (pH), free ammonia, total Kjeldahl nitrogen, dissolved oxygen and water temperature (WT), and leptokurtic for chemical oxygen demand and biochemical oxygen demand. It is also observed that the predicted series closely tracks the original series, indicating an excellent fit. All parameters except pH and WT cross the prescribed limits of the World Health Organization/United States Environmental Protection Agency, and thus the water is not fit for drinking, agriculture or industrial use.
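For readers who want to reproduce this style of analysis, a minimal sketch using statsmodels is given below. A synthetic monthly series stands in for the river data, and the ARIMA order is an arbitrary illustrative choice rather than the one selected in the paper.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(8)

# Hypothetical monthly series standing in for a water-quality parameter (e.g. BOD):
# trend + annual seasonality + noise.
idx = pd.date_range("2010-01", periods=96, freq="MS")
y = pd.Series(5 + 0.01 * np.arange(96)
              + np.sin(np.arange(96) * 2 * np.pi / 12)
              + rng.normal(0, 0.3, 96), index=idx)

# Fit an illustrative ARIMA(1,1,1); BIC can be compared across candidate orders.
model = ARIMA(y, order=(1, 1, 1)).fit()
print("BIC:", round(model.bic, 1))

# 12-month-ahead forecast with 95% confidence limits.
fc = model.get_forecast(steps=12)
print(pd.concat([fc.predicted_mean, fc.conf_int(alpha=0.05)], axis=1).round(2))
```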
Bayesian statistics and Monte Carlo methods
NASA Astrophysics Data System (ADS)
Koch, K. R.
2018-03-01
The Bayesian approach allows an intuitive way to derive the methods of statistics. Probability is defined as a measure of the plausibility of statements or propositions. Three rules are sufficient to obtain the laws of probability. If the statements refer to the numerical values of variables, the so-called random variables, univariate and multivariate distributions follow. They lead to point estimation, by which unknown quantities, i.e. unknown parameters, are computed from measurements. The unknown parameters are random variables; in traditional statistics, which is not founded on Bayes' theorem, they are fixed quantities. Bayesian statistics therefore recommends itself for Monte Carlo methods, which generate random variates from given distributions. Monte Carlo methods can, of course, also be applied in traditional statistics. The unknown parameters are introduced as functions of the measurements, and the Monte Carlo methods give the covariance matrix and the expectation of these functions. A confidence region is derived in which the unknown parameters are situated with a given probability. Following a method of traditional statistics, hypotheses are tested by determining whether a value of an unknown parameter lies inside or outside the confidence region. The error propagation of a random vector by Monte Carlo methods is presented as an application. If the random vector results from a nonlinearly transformed vector, its covariance matrix and its expectation follow from the Monte Carlo estimate. This saves a considerable amount of derivatives to be computed, and errors of linearization are avoided; the Monte Carlo method is therefore efficient. If the functions of the measurements are given by a sum of two or more random vectors with different multivariate distributions, the resulting distribution is generally not known, and the Monte Carlo methods are then needed to obtain its covariance matrix and expectation.
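The error-propagation application described here takes only a few lines in practice. A minimal sketch follows, with an arbitrary illustrative mean vector, covariance matrix and nonlinear transformation; no derivatives or linearization are needed.

```python
import numpy as np

rng = np.random.default_rng(9)

# A measured random vector: expectation and covariance matrix (illustrative values).
mu = np.array([1.0, 2.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

def f(x):
    """Nonlinear transformation of the measurements."""
    return np.array([x[0] * x[1], np.sqrt(x[0] ** 2 + x[1] ** 2)])

# Monte Carlo error propagation: sample, transform, summarise.
samples = rng.multivariate_normal(mu, cov, size=100_000)
fx = np.apply_along_axis(f, 1, samples)
print("E[f(x)]  :", fx.mean(axis=0).round(4))
print("Cov[f(x)]:\n", np.cov(fx, rowvar=False).round(5))
```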
Exoplanet Biosignatures: A Framework for Their Assessment.
Catling, David C; Krissansen-Totton, Joshua; Kiang, Nancy Y; Crisp, David; Robinson, Tyler D; DasSarma, Shiladitya; Rushby, Andrew J; Del Genio, Anthony; Bains, William; Domagal-Goldman, Shawn
2018-04-20
Finding life on exoplanets from telescopic observations is an ultimate goal of exoplanet science. Life produces gases and other substances, such as pigments, which can have distinct spectral or photometric signatures. Whether or not life is found with future data must be expressed with probabilities, requiring a framework of biosignature assessment. We present a framework in which we advocate using biogeochemical "Exo-Earth System" models to simulate potential biosignatures in spectra or photometry. Given actual observations, simulations are used to find the Bayesian likelihoods of those data occurring for scenarios with and without life. The latter includes "false positives" wherein abiotic sources mimic biosignatures. Prior knowledge of factors influencing planetary inhabitation, including previous observations, is combined with the likelihoods to give the Bayesian posterior probability of life existing on a given exoplanet. Four components of observation and analysis are necessary. (1) Characterization of stellar (e.g., age and spectrum) and exoplanetary system properties, including "external" exoplanet parameters (e.g., mass and radius), to determine an exoplanet's suitability for life. (2) Characterization of "internal" exoplanet parameters (e.g., climate) to evaluate habitability. (3) Assessment of potential biosignatures within the environmental context (components 1-2), including corroborating evidence. (4) Exclusion of false positives. We propose that resulting posterior Bayesian probabilities of life's existence map to five confidence levels, ranging from "very likely" (90-100%) to "very unlikely" (<10%) inhabited. Key Words: Bayesian statistics-Biosignatures-Drake equation-Exoplanets-Habitability-Planetary science. Astrobiology 18, xxx-xxx.
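A minimal numerical sketch of the posterior combination step in the framework above: a prior for inhabitation is combined, via Bayes' rule, with likelihoods of the observed data under "life" and "no life" (the latter covering abiotic false-positive scenarios). All numbers here are illustrative, not outputs of Exo-Earth System models.

```python
prior_life = 0.1            # prior probability the planet is inhabited
like_data_given_life = 0.8  # likelihood of the observed spectrum if inhabited
like_data_given_none = 0.2  # likelihood if abiotic processes alone are acting

posterior = (prior_life * like_data_given_life) / (
    prior_life * like_data_given_life + (1 - prior_life) * like_data_given_none
)
# ~0.31 here; the framework would map this onto its five confidence levels.
print(f"P(life | data) = {posterior:.2f}")
```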
Bayesian Latent Class Models in Malaria Diagnosis
Gonçalves, Luzia; Subtil, Ana; de Oliveira, M. Rosário; do Rosário, Virgílio; Lee, Pei-Wen; Shaio, Men-Fang
2012-01-01
Aims The main focus of this study is to illustrate the importance of statistical analysis in evaluating the accuracy of malaria diagnostic tests without assuming a reference test, exploring a dataset (n = 3317) collected in São Tomé and Príncipe. Methods Bayesian Latent Class Models (without and with constraints) are used to estimate the malaria infection prevalence, together with the sensitivities, specificities and predictive values of three diagnostic tests (RDT, microscopy and PCR), in four subpopulations simultaneously, based on an analysis stratified by age group (<5, ≥5 years old) and fever status (febrile, afebrile). Results In afebrile individuals at least five years old, the posterior mean of the malaria infection prevalence is 3.2%, with a highest posterior density interval of [2.3–4.1]. The other three subpopulations (febrile individuals ≥5 years old, and afebrile or febrile children under 5 years) present a higher prevalence of around 10.3% [8.8–11.7]. In afebrile children under five years old, the sensitivity of microscopy is 50.5% [37.7–63.2]. In children under five, the estimated sensitivities/specificities of the RDT are 95.4% [90.3–99.5]/93.8% [91.6–96.0] (afebrile) and 94.1% [87.5–99.4]/97.5% [95.5–99.3] (febrile). In individuals at least five years old they are 96.0% [91.5–99.7]/98.7% [98.1–99.2] (afebrile) and 97.9% [95.3–99.8]/97.7% [96.6–98.6] (febrile). PCR yields the most reliable results in all four subpopulations. Conclusions The utility of this RDT in the field seems to be relevant. However, in all subpopulations, the data provide enough evidence to suggest caution with the positive predictive values of the RDT. Microscopy has poor sensitivity compared with the other tests, particularly in afebrile children under 5 years. These findings reveal the danger of statistical analyses that use microscopy as a reference test. Bayesian Latent Class Models provide a powerful tool to evaluate malaria diagnostic tests, taking into account different groups of interest. PMID:22844405
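A minimal sketch of a Bayesian latent class model for diagnostic tests without a gold standard, in the spirit of the analysis above: one population, three conditionally independent binary tests, Beta(1,1) priors, and a Gibbs sampler. A real analysis would stratify by age and fever status and may add constraints; the simulated data and priors here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated test results (columns standing in for RDT, microscopy, PCR).
N, true_prev = 2000, 0.10
d_true = rng.random(N) < true_prev
se_true = np.array([0.95, 0.55, 0.99])
sp_true = np.array([0.95, 0.99, 0.99])
T = np.where(d_true[:, None],
             rng.random((N, 3)) < se_true,
             rng.random((N, 3)) < 1 - sp_true).astype(int)

prev, se, sp = 0.5, np.full(3, 0.8), np.full(3, 0.8)
draws = []
for it in range(4000):
    # 1. Sample latent infection status given tests and current parameters.
    l1 = prev * np.prod(se**T * (1 - se)**(1 - T), axis=1)
    l0 = (1 - prev) * np.prod((1 - sp)**T * sp**(1 - T), axis=1)
    d = rng.random(N) < l1 / (l1 + l0)
    # 2. Conjugate Beta updates for prevalence, sensitivities, specificities.
    prev = rng.beta(1 + d.sum(), 1 + N - d.sum())
    se = rng.beta(1 + T[d].sum(axis=0), 1 + (1 - T[d]).sum(axis=0))
    sp = rng.beta(1 + (1 - T[~d]).sum(axis=0), 1 + T[~d].sum(axis=0))
    if it >= 1000:                      # discard burn-in
        draws.append([prev, *se, *sp])

post = np.array(draws)
print("posterior mean prevalence ~", post[:, 0].mean())
```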
Nonlinear and non-Gaussian Bayesian based handwriting beautification
NASA Astrophysics Data System (ADS)
Shi, Cao; Xiao, Jianguo; Xu, Canhui; Jia, Wenhua
2013-03-01
A framework is proposed in this paper to effectively and efficiently beautify handwriting by means of a novel nonlinear, non-Gaussian Bayesian algorithm. In the proposed framework, the format and size of the handwriting image are first normalized, and a typeface from the computer system is then applied to optimize the visual effect of the handwriting. Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters that translate, rotate and scale the typeface are controlled by the state equation, and the matching optimization between the handwriting and the transformed typeface is captured by the measurement equation. Finally, the new typeface, transformed from the original one so as to achieve the best nonlinear, non-Gaussian optimization, is the beautified handwriting. Experimental results demonstrate that the proposed framework provides a creative handwriting beautification methodology that improves visual acceptance.
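A generic bootstrap particle filter is a standard tool for the kind of nonlinear, non-Gaussian Bayesian dynamic model (state equation plus measurement equation) described above. The sketch below tracks a 1-D state; the transition and likelihood functions are illustrative stand-ins, not the paper's typeface model.

```python
import numpy as np

rng = np.random.default_rng(0)

def transition(particles):
    # State equation: random-walk drift on the (here 1-D) state.
    return particles + rng.normal(0, 0.1, size=particles.shape)

def likelihood(y, particles):
    # Measurement equation: Cauchy-like (non-Gaussian) observation noise.
    return 1.0 / (1.0 + (y - particles) ** 2)

n_particles, n_steps = 1000, 50
true_state = np.cumsum(rng.normal(0, 0.1, n_steps))
obs = true_state + rng.standard_t(df=2, size=n_steps) * 0.2   # heavy-tailed noise

particles = rng.normal(0, 1, n_particles)
estimates = []
for y in obs:
    particles = transition(particles)
    w = likelihood(y, particles)
    w /= w.sum()
    estimates.append(np.sum(w * particles))             # posterior-mean estimate
    idx = rng.choice(n_particles, n_particles, p=w)     # multinomial resampling
    particles = particles[idx]

print("final estimate ~", estimates[-1], "true ~", true_state[-1])
```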
Morelli, Maria Sole; Giannoni, Alberto; Passino, Claudio; Landini, Luigi; Emdin, Michele; Vanello, Nicola
2016-01-01
Irreducible electroencephalographic (EEG) artifacts are common, and removal of the corrupted segments from the analysis may be required. The present study aims at exploring the effects of different EEG Missing Data Segment (MDS) distributions on cross-correlation analysis involving EEG and physiological signals. The reliability of cross-correlation analysis, both at the single-subject and at the group level, was evaluated as a function of missing-data statistics using dedicated simulations. Moreover, a Bayesian-based approach for combining single-subject results at the group level by considering each subject's reliability was introduced. Starting from the above considerations, the cross-correlation function between EEG Global Field Power (GFP) in the delta band and end-tidal CO2 (PETCO2) during rest and voluntary breath-hold was evaluated in six healthy subjects. The simulated-data results at the single-subject level revealed a worsening of precision and accuracy in the cross-correlation analysis in the presence of MDS. At the group level, a large improvement in the reliability of the results with respect to single-subject analysis was observed. The proposed Bayesian approach showed a slight improvement with respect to simple averaging. Real-data results were discussed in light of the simulated-data tests and of current physiological findings. PMID:27809243
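A minimal sketch of lagged cross-correlation between two signals when one contains missing-data segments, computed from pairwise-complete samples only. This illustrates the analysis problem above; it is not the authors' pipeline, and the signals are synthetic.

```python
import numpy as np

def xcorr_with_gaps(x, y, max_lag):
    """Pearson correlation of x[t] against y[t - lag], pairwise-complete samples only."""
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        a, b = (x[lag:], y[:len(y) - lag]) if lag >= 0 else (x[:lag], y[-lag:])
        m = np.isfinite(a) & np.isfinite(b)
        out[lag] = np.corrcoef(a[m], b[m])[0, 1] if m.sum() > 2 else np.nan
    return out

rng = np.random.default_rng(3)
petco2 = rng.normal(size=500)                        # stand-in physiological signal
gfp = np.roll(petco2, 5) + rng.normal(0, 0.5, 500)   # "EEG power" lagging CO2 by 5 samples
gfp[100:160] = np.nan                                # a missing-data segment
cc = xcorr_with_gaps(gfp, petco2, max_lag=10)
best = max((k for k in cc if np.isfinite(cc[k])), key=lambda k: cc[k])
print("best lag:", best)                             # recovers the 5-sample delay
```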
Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S
2015-01-16
Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques are compared: a traditional method and a Bayesian mixed-model approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed-model approach assimilates information from replicate QMM assays, improving reliability and inter-assay homogeneity, and providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed-model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.
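A minimal sketch (assuming PyMC >= 5 and ArviZ are installed) of a Bayesian mixed-model calibration in the spirit described above: replicate assays share a common standard-curve slope, with assay-specific random intercepts relating the measured signal to log10 pathogen density. The priors, simulated standards and parameter values are illustrative, not the paper's model.

```python
import arviz as az
import numpy as np
import pymc as pm

rng = np.random.default_rng(7)
n_assays, n_standards = 4, 6
log_density = np.tile(np.linspace(0, 5, n_standards), n_assays)   # log10 standards
assay = np.repeat(np.arange(n_assays), n_standards)
signal = (30 - 3 * log_density                                    # shared standard curve
          + rng.normal(0, 0.5, n_assays)[assay]                   # assay-level shifts
          + rng.normal(0, 0.3, log_density.size))                 # within-assay noise

with pm.Model():
    intercept = pm.Normal("intercept", mu=30, sigma=10)
    slope = pm.Normal("slope", mu=0, sigma=10)
    tau = pm.HalfNormal("tau", sigma=2)                            # between-assay spread
    offset = pm.Normal("offset", mu=0, sigma=tau, shape=n_assays)  # random intercepts
    sigma = pm.HalfNormal("sigma", sigma=2)
    mu = intercept + offset[assay] + slope * log_density
    pm.Normal("obs", mu=mu, sigma=sigma, observed=signal)
    trace = pm.sample(1000, tune=1000, chains=2, random_seed=7)

print(az.summary(trace, var_names=["intercept", "slope", "tau", "sigma"]))
```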
Chan, Yvonne L; Schanzenbach, David; Hickerson, Michael J
2014-09-01
Methods that integrate population-level sampling from multiple taxa into a single community-level analysis are an essential addition to the comparative phylogeographic toolkit. Detecting how species within communities have demographically tracked each other in space and time is important for understanding the effects of future climate and landscape changes and the resulting acceleration of extinctions, biological invasions, and potential surges in adaptive evolution. Here, we present a statistical framework for such an analysis based on hierarchical approximate Bayesian computation (hABC) with the goal of detecting concerted demographic histories across an ecological assemblage. Our method combines population genetic data sets from multiple taxa into a single analysis to estimate: 1) the proportion of a community sample that demographically expanded in a temporally clustered pulse and 2) when the pulse occurred. To validate the accuracy and utility of this new approach, we use simulation cross-validation experiments and subsequently analyze an empirical data set of 32 avian populations from Australia that are hypothesized to have expanded from smaller refugia populations in the late Pleistocene. The method can accommodate data set heterogeneity such as variability in effective population size, mutation rates, and sample sizes across species and exploits the statistical strength from the simultaneous analysis of multiple species. This hABC framework used in a multitaxa demographic context can increase our understanding of the impact of historical climate change by determining what proportion of the community responded in concert or independently and can be used with a wide variety of comparative phylogeographic data sets as biota-wide DNA barcoding data sets accumulate. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
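A minimal sketch of rejection-based approximate Bayesian computation, the building block of the hierarchical ABC framework above: draw parameters from the prior, simulate data, and keep draws whose summary statistic lands close to the observed one. The demographic "model" below is a toy stand-in, not the authors' coalescent machinery.

```python
import numpy as np

rng = np.random.default_rng(11)

def simulate_summary(expansion_time):
    # Toy "genetic diversity" summary that depends on expansion time.
    return expansion_time * 0.1 + rng.normal(0, 0.05)

observed_summary = 0.30
accepted = []
for _ in range(100_000):
    t = rng.uniform(0, 10)                                    # prior on expansion time
    if abs(simulate_summary(t) - observed_summary) < 0.02:    # tolerance
        accepted.append(t)

posterior = np.array(accepted)
print(f"posterior mean expansion time ~ {posterior.mean():.2f} "
      f"({len(posterior)} accepted draws)")
```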
Toward improved analysis of concentration data: Embracing nondetects.
Shoari, Niloofar; Dubé, Jean-Sébastien
2018-03-01
Various statistical tests on concentration data serve to support decision-making regarding characterization and monitoring of contaminated media, assessing exposure to a chemical, and quantifying the associated risks. However, the routine statistical protocols cannot be directly applied because of challenges arising from nondetects or left-censored observations, which are concentration measurements below the detection limit of measuring instruments. Despite the existence of techniques based on survival analysis that can adjust for nondetects, these are seldom taken into account properly. A comprehensive review of the literature showed that management policies regarding the analysis of censored data do not always agree and that guidance from regulatory agencies may be outdated. Therefore, researchers and practitioners commonly resort to the most convenient way of tackling the censored-data problem by substituting nondetects with arbitrary constants prior to data analysis, although this is generally regarded as a bias-prone approach. Hoping to improve the interpretation of concentration data, the present article aims to familiarize researchers in different disciplines with the significance of left-censored observations and provides theoretical and computational recommendations (under both frequentist and Bayesian frameworks) for adequate analysis of censored data. In particular, the present article synthesizes key findings from previous research with respect to 3 noteworthy aspects of inferential statistics: estimation of descriptive statistics, hypothesis testing, and regression analysis. Environ Toxicol Chem 2018;37:643-656. © 2017 SETAC.
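A minimal sketch of maximum-likelihood estimation for left-censored (nondetect) concentration data, one of the survival-analysis-based alternatives to substitution discussed above: detects contribute density terms to the likelihood, while nondetects contribute the probability of falling below the detection limit. The lognormal assumption and all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
true_mu, true_sd, dl = 1.0, 0.8, 2.0             # log-scale parameters, detection limit
conc = rng.lognormal(true_mu, true_sd, 200)
detected = conc >= dl                            # below dl we only know "nondetect"

def neg_loglik(theta):
    mu, log_sd = theta
    sd = np.exp(log_sd)                          # keep sd positive
    ll_detects = norm.logpdf(np.log(conc[detected]), mu, sd).sum()
    ll_nondetects = (~detected).sum() * norm.logcdf(np.log(dl), mu, sd)
    return -(ll_detects + ll_nondetects)

fit = minimize(neg_loglik, x0=[0.0, 0.0])
print("censored MLE (log-mean, log-sd):", fit.x[0], np.exp(fit.x[1]))
# Contrast with substituting dl/2 for nondetects, the bias-prone shortcut:
substituted = np.where(detected, conc, dl / 2)
print("substitution log-mean:", np.log(substituted).mean())
```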
Du, Yuanwei; Guo, Yubin
2015-01-01
The intrinsic mechanism of multimorbidity is difficult to recognize, and prediction and diagnosis are accordingly difficult to carry out. Bayesian networks can help to diagnose multimorbidity in health care, but it is difficult to obtain the conditional probability table (CPT) because of the lack of clinical statistical data. Today, expert knowledge and experience are increasingly used in training Bayesian networks to help predict or diagnose diseases, but the CPT in such networks is usually irrational or ineffective because realistic constraints are ignored, especially in multimorbidity. To solve these problems, an evidential reasoning (ER) approach is employed to extract and fuse inference data from experts using a belief distribution and a recursive ER algorithm, based on which an evidential-reasoning method for constructing CPTs in a Bayesian network of multimorbidity is presented step by step. A multimorbidity numerical example is used to demonstrate the method and its feasibility. The Bayesian network can be determined once each expert provides an inference assessment according to his or her knowledge or experience. Our method is more effective than existing methods at extracting expert inference data accurately and fusing it effectively when constructing CPTs in a Bayesian network of multimorbidity.
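To illustrate the fusion step in spirit, here is a sketch that combines expert-supplied belief distributions into a single CPT row using a weighted linear opinion pool. This is a deliberately simplified stand-in for the recursive ER algorithm referenced above; the weights, probabilities and the proportional redistribution of unassigned belief are all illustrative choices.

```python
import numpy as np

# Each expert gives P(disease C | diseases A and B present) over {yes, no},
# possibly leaving some belief unassigned ("ignorance" mass).
expert_beliefs = np.array([
    [0.70, 0.20],   # expert 1: 10% unassigned
    [0.60, 0.30],   # expert 2: 10% unassigned
    [0.80, 0.15],   # expert 3:  5% unassigned
])
weights = np.array([0.5, 0.3, 0.2])            # expert reliabilities

pooled = weights @ expert_beliefs              # weighted linear pool
unassigned = 1.0 - pooled.sum()
pooled += unassigned * pooled / pooled.sum()   # redistribute ignorance proportionally
print("CPT row P(C | A=1, B=1) =", pooled / pooled.sum())
```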
Wagner, Tyler; Irwin, Brian J.; Bence, James R.; Hayes, Daniel B.
2016-01-01
Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
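A minimal sketch of the kind of power analysis discussed above: simulate monitoring data with a known linear trend plus noise, fit a regression each time, and count how often the trend is detected. The effect size, noise level and series lengths are illustrative; the pattern of low power for short series is the point.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(2)

def power(n_years, slope, sd, n_sim=2000, alpha=0.05):
    """Fraction of simulated series in which the linear trend test rejects."""
    years = np.arange(n_years)
    hits = 0
    for _ in range(n_sim):
        y = slope * years + rng.normal(0, sd, n_years)
        if linregress(years, y).pvalue < alpha:
            hits += 1
    return hits / n_sim

for n in (5, 10, 20, 30):
    print(f"{n:2d} years of monitoring: power ~ {power(n, slope=0.05, sd=0.5):.2f}")
```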
Learning what to expect (in visual perception)
Seriès, Peggy; Seitz, Aaron R.
2013-01-01
Expectations are known to greatly affect our experience of the world. A growing theory in computational neuroscience is that perception can be successfully described using Bayesian inference models and that the brain is "Bayes-optimal" under some constraints. In this context, expectations are particularly interesting, because they can be viewed as prior beliefs in the statistical inference process. A number of questions remain unanswered, however; for example: How fast do priors change over time? Are there limits to the complexity of the priors that can be learned? How do an individual's priors compare to the true scene statistics? Can we unlearn priors that are thought to correspond to natural scene statistics? Where and what are the neural substrates of priors? Focusing on the perception of visual motion, here we review recent studies from our laboratories and others addressing these issues. We discuss how these data on motion perception fit within the broader literature on perceptual Bayesian priors, perceptual expectations, and statistical and perceptual learning, and we review the possible neural basis of priors. PMID:24187536
Bayesian Statistical Analysis of Historical and Late Holocene Rates of Sea-Level Change
NASA Astrophysics Data System (ADS)
Cahill, Niamh; Parnell, Andrew; Kemp, Andrew; Horton, Benjamin
2014-05-01
A fundamental concern associated with climate change is the rate at which sea levels are rising. Studies of past sea level (particularly beyond the instrumental data range) allow modern sea-level rise to be placed in a more complete context. Considering this, we perform a Bayesian statistical analysis on historical and late Holocene rates of sea-level change. The data that form the input to the statistical model are tide-gauge measurements and proxy reconstructions from cores of coastal sediment. The aims are to estimate rates of sea-level rise, to determine when modern rates of sea-level rise began and to observe how these rates have been changing over time. Many current methods use simple linear regression to estimate rates. This is often inappropriate, as it is too rigid and can ignore uncertainties that arise as part of the data collection exercise, which can lead to overconfidence in the sea-level trends being characterized. The proposed Bayesian model places a Gaussian process prior on the rate process (i.e. the process that determines how rates of sea level change over time). The likelihood of the observed data is the integral of this process. When dealing with proxy reconstructions, this is set in an errors-in-variables framework so as to take account of age uncertainty. It is also necessary, in this case, for the model to account for glacio-isostatic adjustment, which introduces a covariance between individual age and sea-level observations. This method provides a flexible fit and allows direct estimation of the rate process with full consideration of all sources of uncertainty. Analysing tide-gauge datasets and proxy reconstructions in this way means that changing rates of sea level can be estimated more comprehensively and accurately than previously possible. The model captures the continuous and dynamic evolution of sea-level change, and the results show not only that modern sea levels are rising but that the rates of rise are continuously increasing. Analysis of a global tide-gauge record (Church and White, 2011) indicated that the rate of sea-level rise has increased continuously since 1880 AD and is currently 2.57 mm/yr (95% credible interval of 1.71 to 4.35 mm/yr). Application of the model to a proxy reconstruction from North Carolina (Kemp et al., 2011) indicated that the mean rate of rise in this locality since the middle of the 19th century (current rate of 2.66 mm/yr with a 95% credible interval of 1.29 to 4.59 mm/yr) is in agreement with the tide-gauge results and is unprecedented in at least the last 2000 years.
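A minimal numpy sketch of Gaussian-process regression, the modelling ingredient at the heart of the approach above. The full model places the GP on the rate process and integrates it, adds errors-in-variables for proxy ages, and handles glacio-isostatic adjustment; all of that machinery is omitted here, and the data, kernel hyperparameters and finite-difference rate are illustrative.

```python
import numpy as np

def sq_exp_kernel(a, b, amp=1.0, length=30.0):
    # Squared-exponential covariance between year vectors a and b.
    return amp**2 * np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

rng = np.random.default_rng(4)
t_obs = np.sort(rng.uniform(1850, 2010, 40))                    # observation years
y_obs = 2e-4 * (t_obs - 1850) ** 1.5 + rng.normal(0, 0.02, 40)  # toy sea level (m), accelerating

t_new = np.linspace(1850, 2010, 161)
K = sq_exp_kernel(t_obs, t_obs) + 0.02**2 * np.eye(40)          # noise on the diagonal
mean = sq_exp_kernel(t_new, t_obs) @ np.linalg.solve(K, y_obs)  # GP posterior mean
rate = np.gradient(mean, t_new) * 1000                          # mm/yr, finite differences
print(f"estimated present-day rate ~ {rate[-1]:.2f} mm/yr")
```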
Markov chain Monte Carlo estimation of quantum states
NASA Astrophysics Data System (ADS)
Diguglielmo, James; Messenger, Chris; Fiurášek, Jaromír; Hage, Boris; Samblowski, Aiko; Schmidt, Tabea; Schnabel, Roman
2009-03-01
We apply a Bayesian data analysis scheme known as the Markov chain Monte Carlo to the tomographic reconstruction of quantum states. This method yields a vector, known as the Markov chain, which contains the full statistical information concerning all reconstruction parameters including their statistical correlations with no a priori assumptions as to the form of the distribution from which it has been obtained. From this vector we can derive, e.g., the marginal distributions and uncertainties of all model parameters, and also of other quantities such as the purity of the reconstructed state. We demonstrate the utility of this scheme by reconstructing the Wigner function of phase-diffused squeezed states. These states possess non-Gaussian statistics and therefore represent a nontrivial case of tomographic reconstruction. We compare our results to those obtained through pure maximum-likelihood and Fisher information approaches.
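A minimal sketch of the Metropolis-Hastings algorithm, the engine behind the Markov chain Monte Carlo reconstruction described above. Here the target is a simple 2-D correlated Gaussian rather than a quantum-state posterior, but the chain itself carries the full parameter correlations, which is exactly the property the paper exploits.

```python
import numpy as np

rng = np.random.default_rng(6)
cov_inv = np.linalg.inv(np.array([[1.0, 0.8],
                                  [0.8, 1.0]]))

def log_target(x):
    # Unnormalized log density of the target distribution.
    return -0.5 * x @ cov_inv @ x

x = np.zeros(2)
chain = []
for _ in range(50_000):
    proposal = x + rng.normal(0, 0.5, 2)        # symmetric random-walk proposal
    if np.log(rng.random()) < log_target(proposal) - log_target(x):
        x = proposal                            # accept; otherwise keep current x
    chain.append(x)

chain = np.array(chain)[5000:]                  # discard burn-in
print("correlation recovered from the chain ~", np.corrcoef(chain.T)[0, 1])  # ~0.8
```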
A Bayesian perspective on Markovian dynamics and the fluctuation theorem
NASA Astrophysics Data System (ADS)
Virgo, Nathaniel
2013-08-01
One of E. T. Jaynes' most important achievements was to derive statistical mechanics from the maximum entropy (MaxEnt) method. I re-examine a relatively new result in statistical mechanics, the Evans-Searles fluctuation theorem, from a MaxEnt perspective. This is done in the belief that interpreting such results in Bayesian terms will lead to new advances in statistical physics. The version of the fluctuation theorem that I will discuss applies to discrete, stochastic systems that begin in a non-equilibrium state and relax toward equilibrium. I will show that for such systems the fluctuation theorem can be seen as a consequence of the fact that the equilibrium distribution must obey the property of detailed balance. Although the principle of detailed balance applies only to equilibrium ensembles, it puts constraints on the form of non-equilibrium trajectories. This will be made clear by taking a novel kind of Bayesian perspective, in which the equilibrium distribution is seen as a prior over the system's set of possible trajectories. Non-equilibrium ensembles are calculated from this prior using Bayes' theorem, with the initial conditions playing the role of the data. I will also comment on the implications of this perspective for the question of how to derive the second law.
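A minimal numerical sketch of the detailed-balance property discussed above: for a reversible transition matrix, the stationary distribution pi satisfies pi_i P_ij = pi_j P_ji for every pair of states. The 3-state birth-death chain below (always reversible) is illustrative.

```python
import numpy as np

P = np.array([[0.7, 0.3, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.4, 0.6]])         # birth-death chain: reversible by construction

# Stationary distribution: left eigenvector of P with eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()

flux = pi[:, None] * P                  # flux[i, j] = pi_i * P_ij
print("stationary distribution:", np.round(pi, 4))
print("detailed balance holds:", np.allclose(flux, flux.T))
```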
Bayesian analysis of multiple direct detection experiments
NASA Astrophysics Data System (ADS)
Arina, Chiara
2014-12-01
Bayesian methods offer a coherent and efficient framework for incorporating uncertainties into induction problems. In this article, we review how this approach applies to the analysis of dark matter direct detection experiments. In particular we discuss the exclusion limit of XENON100 and the debated hints of detection under the hypothesis of a WIMP signal. Within parameter inference, marginalizing consistently over uncertainties to extract robust posterior probability distributions, we find that the claimed tension between XENON100 and the other experiments can be partially alleviated in an isospin-violating scenario, while the elastic scattering model appears to be compatible with the frequentist statistical approach. We then move to model comparison, for which Bayesian methods are particularly well suited. Firstly, we investigate the annual modulation seen in CoGeNT data, finding weak evidence for a modulation. Modulation models due to other physics compare unfavorably with the WIMP models, paying the price for their excessive complexity. Secondly, we confront several coherent scattering models to determine the current best physical scenario compatible with the experimental hints. We find that exothermic and inelastic dark matter are moderately disfavored relative to the elastic scenario, while the isospin-violating model has similar evidence. Lastly, the Bayes factor gives inconclusive evidence for an incompatibility between the data sets of XENON100 and the hints of detection, whereas the same question assessed with goodness of fit would indicate a 2σ discrepancy. This suggests that more data are needed to settle the question.
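A minimal sketch of Bayesian model comparison via the evidence (marginal likelihood), the quantity behind the Bayes factors above: integrate the likelihood over each model's prior and take the ratio. The models, data and priors here are illustrative, not the dark-matter likelihoods themselves.

```python
import numpy as np
from scipy import integrate, stats

data = np.array([1.8, 2.2, 1.9, 2.5, 2.1])   # toy observations

def evidence(prior_lo, prior_hi):
    """Marginal likelihood of a N(mu, 1) model with a uniform prior on mu."""
    def integrand(mu):
        return np.exp(stats.norm.logpdf(data, mu, 1).sum()) / (prior_hi - prior_lo)
    z, _ = integrate.quad(integrand, prior_lo, prior_hi, points=[data.mean()])
    return z

z_narrow = evidence(0, 4)     # model with a tightly constrained parameter
z_wide = evidence(-20, 20)    # same likelihood, far more "wasted" prior volume
print("Bayes factor (narrow vs wide) ~", z_narrow / z_wide)   # ~10
```

The wide model pays an Occam penalty for its excessive prior range, mirroring how over-flexible modulation models fare unfavorably in the analysis above.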
Zhu, Xiang; Stephens, Matthew
2017-01-01
Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a "Regression with Summary Statistics" (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which can also be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and for detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise than, previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss. PMID:29399241
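A minimal numpy sketch of the key identity that makes summary-statistics regression possible: with standardized genotypes, the expected marginal (univariate) effect estimates equal the LD (correlation) matrix times the joint effects, so the joint effects can be recovered from summary statistics and an LD estimate. This illustrates the idea only; the RSS likelihood adds proper noise modelling, priors and MCMC.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 5000, 10
R_true = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # AR(1)-like LD
X = rng.multivariate_normal(np.zeros(p), R_true, size=n)
X = (X - X.mean(0)) / X.std(0)                       # standardize "genotypes"

beta_joint = np.zeros(p)
beta_joint[[2, 7]] = [0.3, -0.2]                     # two causal "SNPs"
y = X @ beta_joint + rng.normal(0, 1, n)

beta_marginal = X.T @ y / n                          # univariate summary statistics
R_hat = X.T @ X / n                                  # LD, e.g. from a reference panel
recovered = np.linalg.solve(R_hat, beta_marginal)    # joint effects ~ R^{-1} beta_marginal
print("recovered joint effects:", np.round(recovered, 2))
```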
ERIC Educational Resources Information Center
Chung, Gregory K. W. K.; Dionne, Gary B.; Kaiser, William J.
2006-01-01
Our research question was whether we could develop a feasible technique, using Bayesian networks, to diagnose gaps in student knowledge. Thirty-four college-age participants completed tasks designed to measure conceptual knowledge, procedural knowledge, and problem-solving skills related to circuit analysis. A Bayesian network was used to model…
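A minimal sketch of the diagnostic idea above: treat "student knows the concept" as a latent binary node and update it from observed task responses with Bayes' rule. The slip and guess parameters are illustrative, not those of the study's network, which models conceptual knowledge, procedural knowledge and problem-solving skills jointly.

```python
prior_know = 0.5   # prior probability the student has the concept
slip = 0.10        # P(wrong answer | knows the concept)
guess = 0.25       # P(correct answer | does not know the concept)

def update(prior, correct):
    """Posterior probability of knowing the concept after one response."""
    like_know = (1 - slip) if correct else slip
    like_not = guess if correct else (1 - guess)
    return prior * like_know / (prior * like_know + (1 - prior) * like_not)

p = prior_know
for response in [True, False, True]:   # a short sequence of task outcomes
    p = update(p, response)
    print(f"P(knows | evidence so far) = {p:.2f}")
```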