Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
Uniformly most powerful Bayesian tests
Johnson, Valen E.
2014-01-01
Uniformly most powerful tests are statistical hypothesis tests that provide the greatest power against a fixed null hypothesis among all tests of a given size. In this article, the notion of uniformly most powerful tests is extended to the Bayesian setting by defining uniformly most powerful Bayesian tests to be tests that maximize the probability that the Bayes factor, in favor of the alternative hypothesis, exceeds a specified threshold. Like their classical counterpart, uniformly most powerful Bayesian tests are most easily defined in one-parameter exponential family models, although extensions outside of this class are possible. The connection between uniformly most powerful tests and uniformly most powerful Bayesian tests can be used to provide an approximate calibration between p-values and Bayes factors. Finally, issues regarding the strong dependence of resulting Bayes factors and p-values on sample size are discussed. PMID:24659829
Bayesian Approaches to Imputation, Hypothesis Testing, and Parameter Estimation
ERIC Educational Resources Information Center
Ross, Steven J.; Mackey, Beth
2015-01-01
This chapter introduces three applications of Bayesian inference to common and novel issues in second language research. After a review of the critiques of conventional hypothesis testing, our focus centers on ways Bayesian inference can be used for dealing with missing data, for testing theory-driven substantive hypotheses without a default null…
A default Bayesian hypothesis test for mediation.
Nuijten, Michèle B; Wetzels, Ruud; Matzke, Dora; Dolan, Conor V; Wagenmakers, Eric-Jan
2015-03-01
In order to quantify the relationship between multiple variables, researchers often carry out a mediation analysis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estimation and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301-322, 2009), who outlined a Bayesian parameter estimation procedure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the Jeffreys-Zellner-Siow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo techniques implemented in JAGS. All Bayesian tests are implemented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014).
Revised standards for statistical evidence.
Johnson, Valen E
2013-11-26
Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25-50:1, and to 100-200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
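The correspondence between test size and evidence threshold described in this abstract can be illustrated with a small sketch. It assumes the simplest case, a one-sided z-test with known variance, where the uniformly most powerful Bayesian test with threshold gamma rejects when z exceeds sqrt(2 ln gamma); the function names are hypothetical and other test families would need their own formulas.

```python
import math
from statistics import NormalDist


def evidence_threshold(alpha: float) -> float:
    """Bayes-factor threshold gamma whose rejection region matches a
    one-sided z-test of size alpha (assumed known-variance normal case)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return math.exp(0.5 * z_alpha ** 2)


def matching_alpha(gamma: float) -> float:
    """Inverse mapping: size of the one-sided z-test that shares its
    rejection region with evidence threshold gamma."""
    z = math.sqrt(2.0 * math.log(gamma))
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    for alpha in (0.05, 0.005, 0.001):
        print(f"alpha = {alpha}: gamma is about {evidence_threshold(alpha):.1f}")
```

Under this assumed mapping, a size of 0.005 lands inside the 25-50:1 range quoted in the abstract and a size of 0.001 inside the 100-200:1 range, whereas the conventional 0.05 level corresponds to a much weaker threshold.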
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrews, Stephen A.; Sigeti, David E.
This is a set of slides about Bayesian hypothesis testing in settings where many hypotheses are tested. The conclusions are the following: The value of the Bayes factor obtained when using the median of the posterior marginal is almost the minimum value of the Bayes factor. The value of τ² that minimizes the Bayes factor is a reasonable choice for this parameter. This allows a likelihood ratio to be computed which is the least favorable to H0.
A critique of statistical hypothesis testing in clinical research
Raha, Somik
2011-01-01
Many have documented the difficulty of using the current paradigm of Randomized Controlled Trials (RCTs) to test and validate the effectiveness of alternative medical systems such as Ayurveda. This paper critiques the applicability of RCTs for all clinical knowledge-seeking endeavors, of which Ayurveda research is a part. This is done by examining statistical hypothesis testing, the underlying foundation of RCTs, from a practical and philosophical perspective. In the philosophical critique, the two main worldviews of probability are that of the Bayesian and the frequentist. The frequentist worldview is a special case of the Bayesian worldview requiring the unrealistic assumptions of knowing nothing about the universe and believing that all observations are unrelated to each other. Many have claimed that the first belief is necessary for science, and this claim is debunked by comparing variations in learning with different prior beliefs. Moving beyond the Bayesian and frequentist worldviews, the notion of hypothesis testing itself is challenged on the grounds that a hypothesis is an unclear distinction, and assigning a probability on an unclear distinction is an exercise that does not lead to clarity of action. This critique is of the theory itself and not any particular application of statistical hypothesis testing. A decision-making frame is proposed as a way of both addressing this critique and transcending ideological debates on probability. An example of a Bayesian decision-making approach is shown as an alternative to statistical hypothesis testing, utilizing data from a past clinical trial that studied the effect of Aspirin on heart attacks in a sample population of doctors. As a big reason for the prevalence of RCTs in academia is legislation requiring it, the ethics of legislating the use of statistical methods for clinical research is also examined. PMID:22022152
The researcher and the consultant: from testing to probability statements.
Hamra, Ghassan B; Stang, Andreas; Poole, Charles
2015-09-01
In the first instalment of this series, Stang and Poole provided an overview of Fisher significance testing (ST), Neyman-Pearson null hypothesis testing (NHT), and their unfortunate and unintended offspring, null hypothesis significance testing. In addition to elucidating the distinction between the first two and the evolution of the third, the authors alluded to alternative models of statistical inference; namely, Bayesian statistics. Bayesian inference has experienced a revival in recent decades, with many researchers advocating for its use as both a complement and an alternative to NHT and ST. This article will continue in the direction of the first instalment, providing practicing researchers with an introduction to Bayesian inference. Our work will draw on the examples and discussion of the previous dialogue.
A large scale test of the gaming-enhancement hypothesis.
Przybylski, Andrew K; Wang, John C
2016-01-01
A growing research literature suggests that regular electronic game play and game-based training programs may confer practically significant benefits to cognitive functioning. Most evidence supporting this idea, the gaming-enhancement hypothesis, has been collected in small-scale studies of university students and older adults. This research investigated the hypothesis in a general way with a large sample of 1,847 school-aged children. Our aim was to examine the relations between young people's gaming experiences and an objective test of reasoning performance. Using a Bayesian hypothesis testing approach, evidence for the gaming-enhancement and null hypotheses were compared. Results provided no substantive evidence supporting the idea that having preference for or regularly playing commercially available games was positively associated with reasoning ability. Evidence ranged from equivocal to very strong in support for the null hypothesis over what was predicted. The discussion focuses on the value of Bayesian hypothesis testing for investigating electronic gaming effects, the importance of open science practices, and pre-registered designs to improve the quality of future work.
Matthews, Luke J.; Tehrani, Jamie J.; Jordan, Fiona M.; Collard, Mark; Nunn, Charles L.
2011-01-01
Background: Archaeologists and anthropologists have long recognized that different cultural complexes may have distinct descent histories, but they have lacked analytical techniques capable of easily identifying such incongruence. Here, we show how Bayesian phylogenetic analysis can be used to identify incongruent cultural histories. We employ the approach to investigate Iranian tribal textile traditions. Methods: We used Bayes factor comparisons in a phylogenetic framework to test two models of cultural evolution: the hierarchically integrated system hypothesis and the multiple coherent units hypothesis. In the hierarchically integrated system hypothesis, a core tradition of characters evolves through descent with modification and characters peripheral to the core are exchanged among contemporaneous populations. In the multiple coherent units hypothesis, a core tradition does not exist. Rather, there are several cultural units consisting of sets of characters that have different histories of descent. Results: For the Iranian textiles, the Bayesian phylogenetic analyses supported the multiple coherent units hypothesis over the hierarchically integrated system hypothesis. Our analyses suggest that pile-weave designs represent a distinct cultural unit that has a different phylogenetic history compared to other textile characters. Conclusions: The results from the Iranian textiles are consistent with the available ethnographic evidence, which suggests that the commercial rug market has influenced pile-rug designs but not the techniques or designs incorporated in the other textiles produced by the tribes. We anticipate that Bayesian phylogenetic tests for inferring cultural units will be of great value for researchers interested in studying the evolution of cultural traits including language, behavior, and material culture. PMID:21559083
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
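The abstract does not spell out the rejection rule, so the following is only a sketch of one commonly used Bayesian FDR rule, not necessarily the authors' algorithm: reject the hypotheses with the smallest posterior null probabilities for as long as the average posterior null probability of the rejected set stays at or below a target level. The function name, the input format, and the example probabilities are all assumptions made for illustration.

```python
def bayesian_fdr_rejections(null_probs, alpha=0.05):
    """Return indices of rejected null hypotheses.

    null_probs[i] is the posterior probability that hypothesis i is null.
    The posterior expected FDR of a rejected set is the mean of its
    null probabilities; we grow the set while that mean is <= alpha.
    """
    # Visit hypotheses from least to most likely to be null.
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])
    rejected = []
    running_sum = 0.0
    for rank, i in enumerate(order, start=1):
        running_sum += null_probs[i]
        # The running mean is nondecreasing, so we can stop at the
        # first violation of the target level.
        if running_sum / rank > alpha:
            break
        rejected.append(i)
    return rejected


if __name__ == "__main__":
    # Hypothetical posterior null probabilities for five genes.
    probs = [0.01, 0.90, 0.03, 0.20, 0.02]
    print(sorted(bayesian_fdr_rejections(probs, alpha=0.05)))
```

With these made-up inputs the rule rejects the three hypotheses with the smallest null probabilities; adding the fourth would push the expected FDR above the 0.05 target.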
ERIC Educational Resources Information Center
Wilcox, Rand R.; Serang, Sarfaraz
2017-01-01
The article provides perspectives on p values, null hypothesis testing, and alternative techniques in light of modern robust statistical methods. Null hypothesis testing and "p" values can provide useful information provided they are interpreted in a sound manner, which includes taking into account insights and advances that have…
Moscoso del Prado Martín, Fermín
2013-12-01
I introduce the Bayesian assessment of scaling (BAS), a simple but powerful Bayesian hypothesis contrast methodology that can be used to test hypotheses on the scaling regime exhibited by a sequence of behavioral data. Rather than comparing parametric models, as typically done in previous approaches, the BAS offers a direct, nonparametric way to test whether a time series exhibits fractal scaling. The BAS provides a simpler and faster test than do previous methods, and the code for making the required computations is provided. The method also enables testing of finely specified hypotheses on the scaling indices, something that was not possible with the previously available methods. I then present 4 simulation studies showing that the BAS methodology outperforms the other methods used in the psychological literature. I conclude with a discussion of methodological issues on fractal analyses in experimental psychology.
Bayesian Methods for Determining the Importance of Effects
USDA-ARS's Scientific Manuscript database
Criticisms have plagued the frequentist null-hypothesis significance testing (NHST) procedure since the day it was created from the Fisher Significance Test and Hypothesis Test of Jerzy Neyman and Egon Pearson. Alternatives to NHST exist in frequentist statistics, but competing methods are also avai...
A large scale test of the gaming-enhancement hypothesis
Wang, John C.
2016-01-01
A growing research literature suggests that regular electronic game play and game-based training programs may confer practically significant benefits to cognitive functioning. Most evidence supporting this idea, the gaming-enhancement hypothesis, has been collected in small-scale studies of university students and older adults. This research investigated the hypothesis in a general way with a large sample of 1,847 school-aged children. Our aim was to examine the relations between young people’s gaming experiences and an objective test of reasoning performance. Using a Bayesian hypothesis testing approach, evidence for the gaming-enhancement and null hypotheses were compared. Results provided no substantive evidence supporting the idea that having preference for or regularly playing commercially available games was positively associated with reasoning ability. Evidence ranged from equivocal to very strong in support for the null hypothesis over what was predicted. The discussion focuses on the value of Bayesian hypothesis testing for investigating electronic gaming effects, the importance of open science practices, and pre-registered designs to improve the quality of future work. PMID:27896035
The frequentist implications of optional stopping on Bayesian hypothesis tests.
Sanborn, Adam N; Hills, Thomas T
2014-04-01
Null hypothesis significance testing (NHST) is the most commonly used statistical methodology in psychology. The probability of achieving a value as extreme or more extreme than the statistic obtained from the data is evaluated, and if it is low enough, the null hypothesis is rejected. However, because common experimental practice often clashes with the assumptions underlying NHST, these calculated probabilities are often incorrect. Most commonly, experimenters use tests that assume that sample sizes are fixed in advance of data collection but then use the data to determine when to stop; in the limit, experimenters can use data monitoring to guarantee that the null hypothesis will be rejected. Bayesian hypothesis testing (BHT) provides a solution to these ills because the stopping rule used is irrelevant to the calculation of a Bayes factor. In addition, there are strong mathematical guarantees on the frequentist properties of BHT that are comforting for researchers concerned that stopping rules could influence the Bayes factors produced. Here, we show that these guaranteed bounds have limited scope and often do not apply in psychological research. Specifically, we quantitatively demonstrate the impact of optional stopping on the resulting Bayes factors in two common situations: (1) when the truth is a combination of the hypotheses, such as in a heterogeneous population, and (2) when a hypothesis is composite (taking multiple parameter values), such as the alternative hypothesis in a t-test. We found that, for these situations, while the Bayesian interpretation remains correct regardless of the stopping rule used, the choice of stopping rule can, in some situations, greatly increase the chance of experimenters finding evidence in the direction they desire. We suggest ways to control these frequentist implications of stopping rules on BHT.
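To make the kind of guarantee discussed in this abstract concrete, the sketch below simulates optional stopping in the one setting where a simple bound is known to hold: when a point null is exactly true, the probability that the Bayes factor for the alternative ever reaches a threshold k is at most 1/k. The model (unit-variance normal data, a Normal(0, tau^2) prior on the mean under H1), the function names, and all settings are assumptions chosen for illustration; the heterogeneous-population and composite-truth scenarios studied in the article are not reproduced here.

```python
import math
import random


def bf10_normal(sum_x: float, n: int, tau: float = 1.0) -> float:
    """BF10 for H0: mu = 0 versus H1: mu ~ Normal(0, tau^2),
    given n observations from Normal(mu, 1) with total sum_x."""
    shrink = 1.0 + n * tau ** 2
    return math.exp(0.5 * tau ** 2 * sum_x ** 2 / shrink) / math.sqrt(shrink)


def crossing_rate(threshold=3.0, max_n=100, n_sims=2000, seed=1):
    """Fraction of simulated experiments, generated under H0, in which a
    researcher who checks after every observation sees BF10 >= threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total = 0.0
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)  # data generated with mu = 0
            if bf10_normal(total, n) >= threshold:
                hits += 1
                break  # stop as soon as the desired evidence appears
    return hits / n_sims


if __name__ == "__main__":
    rate = crossing_rate()
    print(f"share of runs reaching BF10 >= 3 under H0: {rate:.3f} (bound: 1/3)")
```

The simulated rate stays below the 1/3 bound even though the monitoring is as aggressive as possible, which is the comforting case; the article's point is that this protection is narrower than it looks once the data-generating truth departs from the two hypotheses being compared.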
A Bayesian Approach to the Paleomagnetic Conglomerate Test
NASA Astrophysics Data System (ADS)
Heslop, David; Roberts, Andrew P.
2018-02-01
The conglomerate test has served the paleomagnetic community for over 60 years as a means to detect remagnetizations. The test states that if a suite of clasts within a bed have uniformly random paleomagnetic directions, then the conglomerate cannot have experienced a pervasive event that remagnetized the clasts in the same direction. The current form of the conglomerate test is based on null hypothesis testing, which results in a binary "pass" (uniformly random directions) or "fail" (nonrandom directions) outcome. We have recast the conglomerate test in a Bayesian framework with the aim of providing more information concerning the level of support a given data set provides for a hypothesis of uniformly random paleomagnetic directions. Using this approach, we place the conglomerate test in a fully probabilistic framework that allows for inconclusive results when insufficient information is available to draw firm conclusions concerning the randomness or nonrandomness of directions. With our method, sample sets larger than those typically employed in paleomagnetism may be required to achieve strong support for a hypothesis of random directions. Given the potentially detrimental effect of unrecognized remagnetizations on paleomagnetic reconstructions, it is important to provide a means to draw statistically robust data-driven inferences. Our Bayesian analysis provides a means to do this for the conglomerate test.
To P or Not to P: Backing Bayesian Statistics.
Buchinsky, Farrel J; Chadha, Neil K
2017-12-01
In biomedical research, it is imperative to differentiate chance variation from truth before we generalize what we see in a sample of subjects to the wider population. For decades, we have relied on null hypothesis significance testing, where we calculate P values for our data to decide whether to reject a null hypothesis. This methodology is subject to substantial misinterpretation and errant conclusions. Instead of working backward by calculating the probability of our data if the null hypothesis were true, Bayesian statistics allow us instead to work forward, calculating the probability of our hypothesis given the available data. This methodology gives us a mathematical means of incorporating our "prior probabilities" from previous study data (if any) to produce new "posterior probabilities." Bayesian statistics tell us how confidently we should believe what we believe. It is time to embrace and encourage their use in our otolaryngology research.
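The step from "prior probabilities" to "posterior probabilities" mentioned in this abstract is Bayes' rule, which can be sketched in a few lines. The function name and every number below are hypothetical and serve only to show the arithmetic for a single hypothesis and its complement.

```python
def posterior_probability(prior, p_data_given_h, p_data_given_not_h):
    """Posterior probability of hypothesis H after observing the data.

    prior               -- probability of H before the new data
    p_data_given_h      -- probability of the data if H is true
    p_data_given_not_h  -- probability of the data if H is false
    """
    numerator = prior * p_data_given_h
    evidence = numerator + (1.0 - prior) * p_data_given_not_h
    return numerator / evidence


if __name__ == "__main__":
    # Hypothetical numbers: an even prior, and data four times more
    # probable under H than under its complement.
    after_first = posterior_probability(0.5, 0.8, 0.2)
    # The first posterior becomes the prior for a second, similar study.
    after_second = posterior_probability(after_first, 0.8, 0.2)
    print(after_first, after_second)
```

Feeding the first posterior back in as the next prior mirrors the sequential use of earlier study data that the abstract describes.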
ERIC Educational Resources Information Center
Marmolejo-Ramos, Fernando; Cousineau, Denis
2017-01-01
The number of articles showing dissatisfaction with the null hypothesis statistical testing (NHST) framework has been progressively increasing over the years. Alternatives to NHST have been proposed and the Bayesian approach seems to have achieved the highest amount of visibility. In this last part of the special issue, a few alternative…
Krypotos, Angelos-Miltiadis; Klugkist, Irene; Engelhard, Iris M.
2017-01-01
Threat conditioning procedures have allowed the experimental investigation of the pathogenesis of Post-Traumatic Stress Disorder. The findings of these procedures have also provided stable foundations for the development of relevant intervention programs (e.g. exposure therapy). Statistical inference of threat conditioning procedures is commonly based on p-values and Null Hypothesis Significance Testing (NHST). Nowadays, however, there is a growing concern about this statistical approach, as many scientists point to the various limitations of p-values and NHST. As an alternative, the use of Bayes factors and Bayesian hypothesis testing has been suggested. In this article, we apply this statistical approach to threat conditioning data. In order to enable the easy computation of Bayes factors for threat conditioning data we present a new R package named condir, which can be used either via the R console or via a Shiny application. This article provides both a non-technical introduction to Bayesian analysis for researchers using the threat conditioning paradigm, and the necessary tools for computing Bayes factors easily. PMID:29038683
A Bayesian bird's eye view of ‘Replications of important results in social psychology’
Schönbrodt, Felix D.; Yao, Yuling; Gelman, Andrew; Wagenmakers, Eric-Jan
2017-01-01
We applied three Bayesian methods to reanalyse the preregistered contributions to the Social Psychology special issue ‘Replications of Important Results in Social Psychology’ (Nosek & Lakens. 2014 Registered reports: a method to increase the credibility of published results. Soc. Psychol. 45, 137–141. (doi:10.1027/1864-9335/a000192)). First, individual-experiment Bayesian parameter estimation revealed that for directed effect size measures, only three out of 44 central 95% credible intervals did not overlap with zero and fell in the expected direction. For undirected effect size measures, only four out of 59 credible intervals contained values greater than 0.10 (10% of variance explained) and only 19 intervals contained values larger than 0.05. Second, a Bayesian random-effects meta-analysis for all 38 t-tests showed that only one out of the 38 hierarchically estimated credible intervals did not overlap with zero and fell in the expected direction. Third, a Bayes factor hypothesis test was used to quantify the evidence for the null hypothesis against a default one-sided alternative. Only seven out of 60 Bayes factors indicated non-anecdotal support in favour of the alternative hypothesis (BF10>3), whereas 51 Bayes factors indicated at least some support for the null hypothesis. We hope that future analyses of replication success will embrace a more inclusive statistical approach by adopting a wider range of complementary techniques. PMID:28280547
A Rational Analysis of the Selection Task as Optimal Data Selection.
ERIC Educational Resources Information Center
Oaksford, Mike; Chater, Nick
1994-01-01
Experimental data on human reasoning in hypothesis-testing tasks is reassessed in light of a Bayesian model of optimal data selection in inductive hypothesis testing. The rational analysis provided by the model suggests that reasoning in such tasks may be rational rather than subject to systematic bias. (SLD)
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software called ABrox is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene-Test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Last but not least, throughout the paper, we introduce ABrox using the accompanying graphical user interface.
Suggestions for presenting the results of data analyses
Anderson, David R.; Link, William A.; Johnson, Douglas H.; Burnham, Kenneth P.
2001-01-01
We give suggestions for the presentation of research results from frequentist, information-theoretic, and Bayesian analysis paradigms, followed by several general suggestions. The information-theoretic and Bayesian methods offer alternative approaches to data analysis and inference compared to traditionally used methods. Guidance is lacking on the presentation of results under these alternative procedures and on nontesting aspects of classical frequentist methods of statistical analysis. Null hypothesis testing has come under intense criticism. We recommend less reporting of the results of statistical tests of null hypotheses in cases where the null is surely false anyway, or where the null hypothesis is of little interest to science or management.
NASA Astrophysics Data System (ADS)
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatorics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
Bayesian Hypothesis Testing for Psychologists: A Tutorial on the Savage-Dickey Method
ERIC Educational Resources Information Center
Wagenmakers, Eric-Jan; Lodewyckx, Tom; Kuriyal, Himanshu; Grasman, Raoul
2010-01-01
In the field of cognitive psychology, the "p"-value hypothesis test has established a stranglehold on statistical reporting. This is unfortunate, as the "p"-value provides at best a rough estimate of the evidence that the data provide for the presence of an experimental effect. An alternative and arguably more appropriate measure of evidence is…
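The abstract is truncated, but the Savage-Dickey method named in the title can be illustrated with a minimal sketch: for a point null nested in the alternative, the Bayes factor in favor of the null equals the posterior density at the null value divided by the prior density at that value. The beta-binomial setting, the function names, and the example counts below are assumptions chosen because both densities are available in closed form; they are not taken from the tutorial.

```python
import math


def log_beta_fn(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_density = (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
    return math.exp(log_density - log_beta_fn(a, b))


def savage_dickey_bf01(k, n, theta0=0.5, a=1.0, b=1.0):
    """BF01 for H0: theta = theta0 versus H1: theta ~ Beta(a, b),
    after k successes in n binomial trials (density-ratio form)."""
    posterior_height = beta_pdf(theta0, a + k, b + n - k)
    prior_height = beta_pdf(theta0, a, b)
    return posterior_height / prior_height


def marginal_likelihood_bf01(k, n, theta0=0.5, a=1.0, b=1.0):
    """The same Bayes factor computed directly from the two marginal
    likelihoods (the binomial coefficient cancels)."""
    log_m0 = k * math.log(theta0) + (n - k) * math.log(1 - theta0)
    log_m1 = log_beta_fn(a + k, b + n - k) - log_beta_fn(a, b)
    return math.exp(log_m0 - log_m1)


if __name__ == "__main__":
    # Hypothetical data: 5 successes in 10 trials, uniform prior.
    print(savage_dickey_bf01(5, 10), marginal_likelihood_bf01(5, 10))
```

The two functions agree, which is the content of the Savage-Dickey identity in this conjugate case; in models without closed-form posteriors the posterior height would typically have to be estimated from samples instead.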
Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank
2018-01-01
In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis. PMID:29740363
Bug Distribution and Statistical Pattern Classification.
ERIC Educational Resources Information Center
Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.
1987-01-01
The rule space model permits measurement of cognitive skill acquisition and error diagnosis. Further discussion introduces Bayesian hypothesis testing and bug distribution. An illustration involves an artificial intelligence approach to testing fractions and arithmetic. (Author/GDC)
Bayesian adaptive phase II screening design for combination trials.
Cai, Chunyan; Yuan, Ying; Johnson, Valen E
2013-01-01
Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Simulation studies show that the proposed design substantially outperforms the conventional multiarm balanced factorial trial design. The proposed design yields a significantly higher probability for selecting the best treatment while allocating substantially more patients to efficacious treatments. The proposed design is most appropriate for the trials combining multiple agents and screening out the efficacious combination to be further investigated. The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while providing higher power to identify the best treatment at the end of the trial.
Hypothesis Testing as an Act of Rationality
NASA Astrophysics Data System (ADS)
Nearing, Grey
2017-04-01
Statistical hypothesis testing is ad hoc in two ways. First, setting probabilistic rejection criteria is, as Neyman (1957) put it, an act of will rather than an act of rationality. Second, physical theories like conservation laws do not inherently admit probabilistic predictions, and so we must use what are called epistemic bridge principles to connect model predictions with the actual methods of hypothesis testing. In practice, these bridge principles are likelihood functions, error functions, or performance metrics. I propose that the reason we are faced with these problems is because we have historically failed to account for a fundamental component of basic logic - namely the portion of logic that explains how epistemic states evolve in the presence of empirical data. This component of Cox' (1946) calculitic logic is called information theory (Knuth, 2005), and adding information theory to our hypothetico-deductive account of science yields straightforward solutions to both of the above problems. This also yields a straightforward method for dealing with Popper's (1963) problem of verisimilitude by facilitating a quantitative approach to measuring process isomorphism. In practice, this involves data assimilation. Finally, information theory allows us to reliably bound measures of epistemic uncertainty, thereby avoiding the problem of Bayesian incoherency under misspecified priors (Grünwald, 2006). I therefore propose solutions to four of the fundamental problems inherent in both hypothetico-deductive and/or Bayesian hypothesis testing. - Neyman (1957) Inductive Behavior as a Basic Concept of Philosophy of Science. - Cox (1946) Probability, Frequency and Reasonable Expectation. - Knuth (2005) Lattice Duality: The Origin of Probability and Entropy. - Grünwald (2006). Bayesian Inconsistency under Misspecification. - Popper (1963) Conjectures and Refutations: The Growth of Scientific Knowledge.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model's evidence the more support it receives among a collection of hypotheses as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009).
Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
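As a rough illustration of the point made in the abstract above, that the evidence averages the likelihood over the prior rather than maximizing it, the following Python sketch estimates the evidence by plain prior sampling for a toy one-observation normal model. It is not the GMIS estimator and does not use the DREAM sampler; the model, the function names, and all numerical settings are assumptions chosen so that the estimate can be checked against a closed form.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def evidence_by_prior_sampling(y, sigma, tau, n_draws=100_000, seed=0):
    """Estimate p(y) for y ~ N(mu, sigma^2) with prior mu ~ N(0, tau^2).

    The estimate is the average likelihood over draws from the prior
    (a toy estimator, assumed here for illustration only).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        mu = rng.gauss(0.0, tau)           # draw a parameter value from the prior
        total += normal_pdf(y, mu, sigma)  # likelihood of the datum at that value
    return total / n_draws


def exact_evidence(y, sigma, tau):
    """Closed form for this toy model: marginally, y ~ N(0, sigma^2 + tau^2)."""
    return normal_pdf(y, 0.0, math.sqrt(sigma ** 2 + tau ** 2))


def max_likelihood(y, sigma):
    """Likelihood at its maximum (mu = y), the quantity information criteria build on."""
    return normal_pdf(y, y, sigma)
```

Because an average can never exceed the maximum, the estimate is at most `max_likelihood(y, sigma)`, and a very diffuse prior (large `tau`) drives it toward zero. That is the Occam penalty the abstract refers to. Plain prior sampling of this kind is known to become inefficient when the posterior is much narrower than the prior, which is one motivation for more elaborate estimators.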
ERIC Educational Resources Information Center
Page, Robert; Satake, Eiki
2017-01-01
While interest in Bayesian statistics has been growing in statistics education, the treatment of the topic is still inadequate in both textbooks and the classroom. Because so many fields of study lead to careers that involve a decision-making process requiring an understanding of Bayesian methods, it is becoming increasingly clear that Bayesian…
Shi, Haolun; Yin, Guosheng
2018-02-21
Simon's two-stage design is one of the most commonly used methods in phase II clinical trials with binary endpoints. The design tests the null hypothesis that the response rate is less than an uninteresting level, versus the alternative hypothesis that the response rate is greater than a desirable target level. From a Bayesian perspective, we compute the posterior probabilities of the null and alternative hypotheses given that a promising result is declared in Simon's design. Our study reveals that because the frequentist hypothesis testing framework places its focus on the null hypothesis, a potentially efficacious treatment identified by rejecting the null under Simon's design could have only less than 10% posterior probability of attaining the desirable target level. Due to the indifference region between the null and alternative, rejecting the null does not necessarily mean that the drug achieves the desirable response level. To clarify such ambiguity, we propose a Bayesian enhancement two-stage (BET) design, which guarantees a high posterior probability of the response rate reaching the target level, while allowing for early termination and sample size saving in case that the drug's response rate is smaller than the clinically uninteresting level. Moreover, the BET design can be naturally adapted to accommodate survival endpoints. We conduct extensive simulation studies to examine the empirical performance of our design and present two trial examples as applications. © 2018, The International Biometric Society.
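The posterior quantity discussed in the abstract above can be sketched for a single-arm trial with a binary endpoint. This Python sketch is not the BET design itself: it only computes the posterior probability that the response rate exceeds a threshold, under an assumed uniform Beta(1, 1) prior. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(k + 1))


def posterior_prob_exceeds(responses, n_patients, target):
    """P(rate > target | data) under an assumed uniform Beta(1, 1) prior.

    The posterior is Beta(responses + 1, n_patients - responses + 1). For
    integer parameters its upper tail equals a binomial CDF:
    P(rate > t) = P(Binomial(n_patients + 1, t) <= responses).
    """
    return binom_cdf(responses, n_patients + 1, target)
```

For a made-up example with 12 responses in 40 patients, an uninteresting level of 0.2 and a target level of 0.4, the observed rate of 0.3 falls in the indifference region. In that case `posterior_prob_exceeds(12, 40, 0.2)` is much larger than `posterior_prob_exceeds(12, 40, 0.4)`, which mirrors the abstract's point that evidence against the null need not imply that the target level is reached.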
A Bayesian framework to estimate diversification rates and their variation through time and space
2011-01-01
Background: Patterns of species diversity are the result of speciation and extinction processes, and molecular phylogenetic data can provide valuable information to derive their variability through time and across clades. Bayesian Markov chain Monte Carlo methods offer a promising framework to incorporate phylogenetic uncertainty when estimating rates of diversification. Results: We introduce a new approach to estimate diversification rates in a Bayesian framework over a distribution of trees under various constant and variable rate birth-death and pure-birth models, and test it on simulated phylogenies. Furthermore, speciation and extinction rates and their posterior credibility intervals can be estimated while accounting for non-random taxon sampling. The framework is particularly suitable for hypothesis testing using Bayes factors, as we demonstrate analyzing dated phylogenies of Chondrostoma (Cyprinidae) and Lupinus (Fabaceae). In addition, we develop a model that extends the rate estimation to a meta-analysis framework in which different data sets are combined in a single analysis to detect general temporal and spatial trends in diversification. Conclusions: Our approach provides a flexible framework for the estimation of diversification parameters and hypothesis testing while simultaneously accounting for uncertainties in the divergence times and incomplete taxon sampling. PMID:22013891
Bayesian adaptive phase II screening design for combination trials
Cai, Chunyan; Yuan, Ying; Johnson, Valen E
2013-01-01
Background: Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Methods: Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Results: Simulation studies show that the proposed design substantially outperforms the conventional multiarm balanced factorial trial design. The proposed design yields a significantly higher probability for selecting the best treatment while allocating substantially more patients to efficacious treatments. Limitations: The proposed design is most appropriate for the trials combining multiple agents and screening out the efficacious combination to be further investigated. Conclusions: The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while providing higher power to identify the best treatment at the end of the trial. PMID:23359875
Bayes Factor Approaches for Testing Interval Null Hypotheses
ERIC Educational Resources Information Center
Morey, Richard D.; Rouder, Jeffrey N.
2011-01-01
Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing evidence for 2 hypotheses against each another. One issue…
Bayesian meta-analysis of Cronbach's coefficient alpha to evaluate informative hypotheses.
Okada, Kensuke
2015-12-01
This paper proposes a new method to evaluate informative hypotheses for meta-analysis of Cronbach's coefficient alpha using a Bayesian approach. The coefficient alpha is one of the most widely used reliability indices. In meta-analyses of reliability, researchers typically form specific informative hypotheses beforehand, such as 'alpha of this test is greater than 0.8' or 'alpha of one form of a test is greater than the others.' The proposed method enables direct evaluation of these informative hypotheses. To this end, a Bayes factor is calculated to evaluate the informative hypothesis against its complement. It allows researchers to summarize the evidence provided by previous studies in favor of their informative hypothesis. The proposed approach can be seen as a natural extension of the Bayesian meta-analysis of coefficient alpha recently proposed in this journal (Brannick and Zhang, 2013). The proposed method is illustrated through two meta-analyses of real data that evaluate different kinds of informative hypotheses on superpopulation: one is that alpha of a particular test is above the criterion value, and the other is that alphas among different test versions have ordered relationships. Informative hypotheses are supported from the data in both cases, suggesting that the proposed approach is promising for application. Copyright © 2015 John Wiley & Sons, Ltd.
Unification of field theory and maximum entropy methods for learning probability densities
NASA Astrophysics Data System (ADS)
Kinney, Justin B.
2015-09-01
The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory thus provides a natural test of the maximum entropy null hypothesis and, furthermore, returns an alternative (lower entropy) density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided.
Applications of Bayesian Statistics to Problems in Gamma-Ray Bursts
NASA Technical Reports Server (NTRS)
Meegan, Charles A.
1997-01-01
This presentation will describe two applications of Bayesian statistics to Gamma-Ray Bursts (GRBs). The first attempts to quantify the evidence for a cosmological versus galactic origin of GRBs using only the observations of the dipole and quadrupole moments of the angular distribution of bursts. The cosmological hypothesis predicts isotropy, while the galactic hypothesis is assumed to produce a uniform probability distribution over positive values for these moments. The observed isotropic distribution indicates that the Bayes factor for the cosmological hypothesis over the galactic hypothesis is about 300. Another application of Bayesian statistics is in the estimation of chance associations of optical counterparts with galaxies. The Bayesian approach is preferred to frequentist techniques here because the Bayesian approach easily accounts for galaxy mass distributions and because one can incorporate three disjoint hypotheses: (1) bursts come from galactic centers, (2) bursts come from galaxies in proportion to luminosity, and (3) bursts do not come from external galaxies. This technique was used in the analysis of the optical counterpart to GRB970228.
Bayesian Nonparametric Prediction and Statistical Inference
1989-09-07
Kadane, J. (1980), "Bayesian decision theory and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds... the random model and weighted least squares regression," in Evaluation of Econometric Models, ed. by J. Kmenta and J. Ramsey, Academic Press, 197-217... likelihood function. On the other hand, H. Jeffreys’s theory of hypothesis testing covers the most important situations in which the prior is not diffuse. See
A Primer on Bayesian Analysis for Experimental Psychopathologists
Krypotos, Angelos-Miltiadis; Blanken, Tessa F.; Arnaudova, Inna; Matzke, Dora; Beckers, Tom
2016-01-01
The principal goals of experimental psychopathology (EPP) research are to offer insights into the pathogenic mechanisms of mental disorders and to provide a stable ground for the development of clinical interventions. The main message of the present article is that those goals are better served by the adoption of Bayesian statistics than by the continued use of null-hypothesis significance testing (NHST). In the first part of the article we list the main disadvantages of NHST and explain why those disadvantages limit the conclusions that can be drawn from EPP research. Next, we highlight the advantages of Bayesian statistics. To illustrate, we then pit NHST and Bayesian analysis against each other using an experimental data set from our lab. Finally, we discuss some challenges when adopting Bayesian statistics. We hope that the present article will encourage experimental psychopathologists to embrace Bayesian statistics, which could strengthen the conclusions drawn from EPP research. PMID:28748068
Tressoldi, Patrizio E.
2011-01-01
Starting from the famous phrase “extraordinary claims require extraordinary evidence,” we will present the evidence supporting the concept that human visual perception may have non-local properties, in other words, that it may operate beyond the space and time constraints of sensory organs, in order to discuss which criteria can be used to define evidence as extraordinary. This evidence has been obtained from seven databases which are related to six different protocols used to test the reality and the functioning of non-local perception, analyzed using both a frequentist and a new Bayesian meta-analysis statistical procedure. According to a frequentist meta-analysis, the null hypothesis can be rejected for all six protocols even if the effect sizes range from 0.007 to 0.28. According to Bayesian meta-analysis, the Bayes factors provide strong evidence to support the alternative hypothesis (H1) over the null hypothesis (H0), but only for three out of the six protocols. We will discuss whether quantitative psychology can contribute to defining the criteria for the acceptance of new scientific ideas in order to avoid the inconclusive controversies between supporters and opponents. PMID:21713069
Ockham's razor and Bayesian analysis. [statistical theory for systems evaluation]
NASA Technical Reports Server (NTRS)
Jefferys, William H.; Berger, James O.
1992-01-01
'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
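The claim in the abstract above, that a hypothesis with fewer adjustable parameters gains support from the sharpness of its predictions, can be illustrated with a coin-type example. The sketch below is a textbook construction assumed for illustration and is not taken from the paper: it compares a point hypothesis with no free parameter against an alternative whose rate is free under a uniform prior.

```python
from math import comb


def bayes_factor_01(successes, n_trials, null_rate=0.5):
    """Bayes factor for H0: rate == null_rate against H1: rate ~ Uniform(0, 1).

    Under H0 the marginal likelihood is the binomial probability at null_rate.
    Under H1 it is the binomial likelihood averaged over the uniform prior,
    which equals 1 / (n_trials + 1) for every possible count.
    """
    m0 = (comb(n_trials, successes)
          * null_rate ** successes
          * (1.0 - null_rate) ** (n_trials - successes))
    m1 = 1.0 / (n_trials + 1)
    return m0 / m1
```

With 5 successes in 10 trials, `bayes_factor_01(5, 10)` is above 1, so the data favor the simpler hypothesis even though the flexible one also contains the rate 0.5; the flexible hypothesis spreads its predictions evenly over all eleven possible counts. With 10 successes in 10 trials, `bayes_factor_01(10, 10)` falls well below 1 and the flexible hypothesis is preferred.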
Steingroever, Helen; Pachur, Thorsten; Šmíra, Martin; Lee, Michael D
2018-06-01
The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.
NASA Astrophysics Data System (ADS)
Plant, N. G.; Thieler, E. R.; Gutierrez, B.; Lentz, E. E.; Zeigler, S. L.; Van Dongeren, A.; Fienen, M. N.
2016-12-01
We evaluate the strengths and weaknesses of Bayesian networks that have been used to address scientific and decision-support questions related to coastal geomorphology. We will provide an overview of coastal geomorphology research that has used Bayesian networks and describe what this approach can do and when it works (or fails to work). Over the past decade, Bayesian networks have been formulated to analyze the multi-variate structure and evolution of coastal morphology and associated human and ecological impacts. The approach relates observable system variables to each other by estimating discrete correlations. The resulting Bayesian networks make predictions that propagate errors, conduct inference via Bayes rule, or both. In scientific applications, the model results are useful for hypothesis testing, using confidence estimates to gauge the strength of tests, while applications to coastal resource management are aimed at decision-support, where the probabilities of desired ecosystem outcomes are evaluated. The range of Bayesian-network applications to coastal morphology includes emulation of high-resolution wave transformation models to make oceanographic predictions, morphologic response to storms and/or sea-level rise, groundwater response to sea-level rise and morphologic variability, habitat suitability for endangered species, and assessment of monetary or human-life risk associated with storms. All of these examples are based on vast observational data sets, numerical model output, or both. We will discuss the progression of our experiments, which has included testing whether the Bayesian-network approach can be implemented and is appropriate for addressing basic and applied scientific problems and evaluating the hindcast and forecast skill of these implementations. We will present and discuss calibration/validation tests that are used to assess the robustness of Bayesian-network models and we will compare these results to tests of other models.
This will demonstrate how Bayesian networks are used to extract new insights about coastal morphologic behavior, assess impacts to societal and ecological systems, and communicate probabilistic predictions to decision makers.
Bayesian wavelet PCA methodology for turbomachinery damage diagnosis under uncertainty
NASA Astrophysics Data System (ADS)
Xu, Shengli; Jiang, Xiaomo; Huang, Jinzhi; Yang, Shuhua; Wang, Xiaofang
2016-12-01
Centrifugal compressors often suffer from various defects such as impeller cracking, resulting in forced outage of the total plant. Damage diagnostics and condition monitoring of such a turbomachinery system have become an increasingly important and powerful tool to prevent potential failure in components and reduce unplanned forced outage and further maintenance costs, while improving reliability, availability and maintainability of a turbomachinery system. This paper presents a probabilistic signal processing methodology for damage diagnostics using multiple time history data collected from different locations of a turbomachine, considering data uncertainty and multivariate correlation. The proposed methodology is based on the integration of three advanced state-of-the-art data mining techniques: discrete wavelet packet transform, Bayesian hypothesis testing, and probabilistic principal component analysis. The multiresolution wavelet analysis approach is employed to decompose a time series signal into different levels of wavelet coefficients. These coefficients represent multiple time-frequency resolutions of a signal. Bayesian hypothesis testing is then applied to each level of wavelet coefficient to remove possible imperfections. The ratio of posterior odds Bayesian approach provides a direct means to assess whether there is imperfection in the decomposed coefficients, thus avoiding over-denoising. Power spectral density estimated by the Welch method is utilized to evaluate the effectiveness of the Bayesian wavelet cleansing method. Furthermore, the probabilistic principal component analysis approach is developed to reduce dimensionality of multiple time series and to address multivariate correlation and data uncertainty for damage diagnostics.
The proposed methodology and generalized framework are demonstrated with a set of sensor data collected from a real-world centrifugal compressor with impeller cracks, through both time series and contour analyses of vibration signal and principal components.
An objective Bayesian analysis of a crossover design via model selection and model averaging.
Li, Dandan; Sivaganesan, Siva
2016-11-10
Inference about the treatment effect in a crossover design has received much attention over time owing to the uncertainty in the existence of the carryover effect and its impact on the estimation of the treatment effect. Adding to this uncertainty is that the existence of the carryover effect and its size may depend on the presence of the treatment effect and its size. We consider estimation and hypothesis testing about the treatment effect in a two-period crossover design, assuming a normally distributed response variable, and use an objective Bayesian approach to test the hypothesis about the treatment effect and to estimate its size when it exists while accounting for the uncertainty about the presence of the carryover effect as well as the treatment and period effects. We evaluate and compare the performance of the proposed approach with a standard frequentist approach using simulated data and real data. Copyright © 2016 John Wiley & Sons, Ltd.
Unscaled Bayes factors for multiple hypothesis testing in microarray experiments.
Bertolino, Francesco; Cabras, Stefano; Castellanos, Maria Eugenia; Racugno, Walter
2015-12-01
Multiple hypothesis testing collects a series of techniques usually based on p-values as a summary of the available evidence from many statistical tests. In hypothesis testing, under a Bayesian perspective, the evidence for a specified hypothesis against an alternative, conditionally on data, is given by the Bayes factor. In this study, we approach multiple hypothesis testing based on both Bayes factors and p-values, regarding multiple hypothesis testing as a multiple model selection problem. To obtain the Bayes factors we assume default priors that are typically improper. In this case, the Bayes factor is usually undetermined due to the ratio of prior pseudo-constants. We show that ignoring prior pseudo-constants leads to unscaled Bayes factors, which do not invalidate the inferential procedure in multiple hypothesis testing, because they are used within a comparative scheme. In fact, using partial information from the p-values, we are able to approximate the sampling null distribution of the unscaled Bayes factor and use it within Efron's multiple testing procedure. The simulation study suggests that under a normal sampling model and even with small sample sizes, our approach provides false positive and false negative proportions that are lower than those of other common multiple hypothesis testing approaches based only on p-values. The proposed procedure is illustrated in two simulation studies, and the advantages of its use are shown in the analysis of two microarray experiments. © The Author(s) 2011.
A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.
Yu, Qingzhao; Zhu, Lin; Zhu, Han
2017-11-01
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently attribute newly recruited patients to different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence of changing the prior distributions on the design. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when total sample size is fixed, the proposed design can obtain greater power and/or require a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce required sample size. Copyright © 2017 John Wiley & Sons, Ltd.
On Some Assumptions of the Null Hypothesis Statistical Testing
ERIC Educational Resources Information Center
Patriota, Alexandre Galvão
2017-01-01
Bayesian and classical statistical approaches are based on different types of logical principles. In order to avoid mistaken inferences and misguided interpretations, the practitioner must respect the inference rules embedded into each statistical method. Ignoring these principles leads to the paradoxical conclusions that the hypothesis…
Kuiper, Rebecca M; Nederhoff, Tim; Klugkist, Irene
2015-05-01
In this paper, the performance of six types of techniques for comparisons of means is examined. These six emerge from the distinction between the method employed (hypothesis testing, model selection using information criteria, or Bayesian model selection) and the set of hypotheses that is investigated (a classical, exploration-based set of hypotheses containing equality constraints on the means, or a theory-based limited set of hypotheses with equality and/or order restrictions). A simulation study is conducted to examine the performance of these techniques. We demonstrate that, if one has specific, a priori specified hypotheses, confirmation (i.e., investigating theory-based hypotheses) has advantages over exploration (i.e., examining all possible equality-constrained hypotheses). Furthermore, examining reasonable order-restricted hypotheses has more power to detect the true effect/non-null hypothesis than evaluating only equality restrictions. Additionally, when investigating more than one theory-based hypothesis, model selection is preferred over hypothesis testing. Because of the first two results, we further examine the techniques that are able to evaluate order restrictions in a confirmatory fashion by examining their performance when the homogeneity of variance assumption is violated. Results show that the techniques are robust to heterogeneity when the sample sizes are equal. When the sample sizes are unequal, the performance is affected by heterogeneity. The size and direction of the deviations from the baseline, where there is no heterogeneity, depend on the effect size (of the means) and on the trend in the group variances with respect to the ordering of the group sizes. Importantly, the deviations are less pronounced when the group variances and sizes exhibit the same trend (e.g., are both increasing with group number). © 2014 The British Psychological Society.
Non-Bayesian Inference: Causal Structure Trumps Correlation
ERIC Educational Resources Information Center
Bes, Benedicte; Sloman, Steven; Lucas, Christopher G.; Raufaste, Eric
2012-01-01
The study tests the hypothesis that conditional probability judgments can be influenced by causal links between the target event and the evidence even when the statistical relations among variables are held constant. Three experiments varied the causal structure relating three variables and found that (a) the target event was perceived as more…
Comparisons of Means Using Exploratory and Confirmatory Approaches
ERIC Educational Resources Information Center
Kuiper, Rebecca M.; Hoijtink, Herbert
2010-01-01
This article discusses comparisons of means using exploratory and confirmatory approaches. Three methods are discussed: hypothesis testing, model selection based on information criteria, and Bayesian model selection. Throughout the article, an example is used to illustrate and evaluate the two approaches and the three methods. We demonstrate that…
Burroughs, N J; Pillay, D; Mutimer, D
1999-01-01
Bayesian analysis using a virus dynamics model is demonstrated to facilitate hypothesis testing of patterns in clinical time-series. Our Markov chain Monte Carlo implementation demonstrates that the viraemia time-series observed in two sets of hepatitis B patients on antiviral (lamivudine) therapy, chronic carriers and liver transplant patients, are significantly different, overcoming clinical trial design differences that question the validity of non-parametric tests. We show that lamivudine-resistant mutants grow faster in transplant patients than in chronic carriers, which probably explains the differences in emergence times and failure rates between these two sets of patients. Incorporation of dynamic models into Bayesian parameter analysis is of general applicability in medical statistics. PMID:10643081
Tracing the footsteps of Sherlock Holmes: cognitive representations of hypothesis testing.
Van Wallendael, L R; Hastie, R
1990-05-01
A well-documented phenomenon in opinion-revision literature is subjects' failure to revise probability estimates for an exhaustive set of mutually exclusive hypotheses in a complementary manner. However, prior research has not addressed the question of whether such behavior simply represents a misunderstanding of mathematical rules, or whether it is a consequence of a cognitive representation of hypotheses that is at odds with the Bayesian notion of a set relationship. Two alternatives to the Bayesian representation, a belief system (Shafer, 1976) and a system of independent hypotheses, were proposed, and three experiments were conducted to examine cognitive representations of hypothesis sets in the testing of multiple competing hypotheses. Subjects were given brief murder mysteries to solve and allowed to request various types of information about the suspects; after having received each new piece of information, subjects rated each suspect's probability of being the murderer. Presence and timing of suspect eliminations were varied in the first two experiments; the final experiment involved the varying of percentages of clues that referred to more than one suspect (for example, all of the female suspects). The noncomplementarity of opinion revisions remained a strong phenomenon in all conditions. Information-search data refuted the idea that subjects represented hypotheses as a Bayesian set; further study of the independent hypotheses theory and Shaferian belief functions as descriptive models is encouraged.
A Tutorial in Bayesian Potential Outcomes Mediation Analysis.
Miočević, Milica; Gonzalez, Oscar; Valente, Matthew J; MacKinnon, David P
2018-01-01
Statistical mediation analysis is used to investigate intermediate variables in the relation between independent and dependent variables. Causal interpretation of mediation analyses is challenging because randomization of subjects to levels of the independent variable does not rule out the possibility of unmeasured confounders of the mediator to outcome relation. Furthermore, commonly used frequentist methods for mediation analysis compute the probability of the data given the null hypothesis, which is not the probability of a hypothesis given the data as in Bayesian analysis. Under certain assumptions, applying the potential outcomes framework to mediation analysis allows for the computation of causal effects, and statistical mediation in the Bayesian framework gives indirect effects probabilistic interpretations. This tutorial combines causal inference and Bayesian methods for mediation analysis so the indirect and direct effects have both causal and probabilistic interpretations. Steps in Bayesian causal mediation analysis are shown in the application to an empirical example.
Why Current Statistics of Complementary Alternative Medicine Clinical Trials is Invalid.
Pandolfi, Maurizio; Carreras, Giulia
2018-06-07
It is not sufficiently known that frequentist statistics cannot provide direct information on the probability that the research hypothesis tested is correct. The error resulting from this misunderstanding is compounded when the hypotheses under scrutiny have precarious scientific bases, which, generally, those of complementary alternative medicine (CAM) are. In such cases, it is mandatory to use inferential statistics, considering the prior probability that the hypothesis tested is true, such as the Bayesian statistics. The authors show that, under such circumstances, no real statistical significance can be achieved in CAM clinical trials. In this respect, CAM trials involving human material are also hardly defensible from an ethical viewpoint.
Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing
2016-01-01
A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. ASTF is proposed to obtain weak fault features under background noise. It is based on statistical hypothesis testing in the frequency domain, which evaluates the similarity between a reference signal (noise signal) and the original signal and removes the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, the SPs that have high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment for rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006
Bayesian Posterior Odds Ratios: Statistical Tools for Collaborative Evaluations
ERIC Educational Resources Information Center
Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon
2018-01-01
To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…
NASA Astrophysics Data System (ADS)
Koch, Wolfgang
1996-05-01
Sensor data processing in a dense target/dense clutter environment is inevitably confronted with data association conflicts which correspond with the multiple hypothesis character of many modern approaches (MHT: multiple hypothesis tracking). In this paper we analyze the efficiency of retrodictive techniques that generalize standard fixed interval smoothing to MHT applications. 'Delayed estimation' based on retrodiction provides uniquely interpretable and accurate trajectories from ambiguous MHT output if a certain time delay is tolerated. In a Bayesian framework the theoretical background of retrodiction and its intimate relation to Bayesian MHT is sketched. By a simulated example with two closely-spaced targets, relatively low detection probabilities, and rather high false return densities, we demonstrate the benefits of retrodiction and quantitatively discuss the achievable track accuracies and the time delays involved for typical radar parameters.
CytoBayesJ: software tools for Bayesian analysis of cytogenetic radiation dosimetry data.
Ainsbury, Elizabeth A; Vinnikov, Volodymyr; Puig, Pedro; Maznyk, Nataliya; Rothkamm, Kai; Lloyd, David C
2013-08-30
A number of authors have suggested that a Bayesian approach may be most appropriate for analysis of cytogenetic radiation dosimetry data. In the Bayesian framework, probability of an event is described in terms of previous expectations and uncertainty. Previously existing, or prior, information is used in combination with experimental results to infer probabilities or the likelihood that a hypothesis is true. It has been shown that the Bayesian approach increases both the accuracy and quality assurance of radiation dose estimates. New software entitled CytoBayesJ has been developed with the aim of bringing Bayesian analysis to cytogenetic biodosimetry laboratory practice. CytoBayesJ takes a number of Bayesian or 'Bayesian like' methods that have been proposed in the literature and presents them to the user in the form of simple user-friendly tools, including testing for the most appropriate model for distribution of chromosome aberrations and calculations of posterior probability distributions. The individual tools are described in detail and relevant examples of the use of the methods and the corresponding CytoBayesJ software tools are given. In this way, the suitability of the Bayesian approach to biological radiation dosimetry is highlighted and its wider application encouraged by providing a user-friendly software interface and manual in English and Russian. Copyright © 2013 Elsevier B.V. All rights reserved.
Qian, Song S; Lyons, Regan E
2006-10-01
We present a Bayesian approach for characterizing background contaminant concentration distributions using data from sites that may have been contaminated. Our method, focused on estimation, resolves several technical problems of the existing hypothesis-testing-based methods sanctioned by the U.S. Environmental Protection Agency (USEPA), resulting in a simple and quick procedure for estimating background contaminant concentrations. The proposed Bayesian method is applied to two data sets from a federal facility regulated under the Resource Conservation and Recovery Act. The results are compared to background distributions identified using existing methods recommended by the USEPA. The two data sets represent low and moderate levels of censorship in the data. Although an unbiased estimator is elusive, we show that the proposed Bayesian estimation method will have a smaller bias than the USEPA-recommended method.
On the importance of avoiding shortcuts in applying cognitive models to hierarchical data.
Boehm, Udo; Marsman, Maarten; Matzke, Dora; Wagenmakers, Eric-Jan
2018-06-12
Psychological experiments often yield data that are hierarchically structured. A number of popular shortcut strategies in cognitive modeling do not properly accommodate this structure and can result in biased conclusions. To gauge the severity of these biases, we conducted a simulation study for a two-group experiment. We first considered a modeling strategy that ignores the hierarchical data structure. In line with theoretical results, our simulations showed that Bayesian and frequentist methods that rely on this strategy are biased towards the null hypothesis. Secondly, we considered a modeling strategy that takes a two-step approach by first obtaining participant-level estimates from a hierarchical cognitive model and subsequently using these estimates in a follow-up statistical test. Methods that rely on this strategy are biased towards the alternative hypothesis. Only hierarchical models of the multilevel data lead to correct conclusions. Our results are particularly relevant for the use of hierarchical Bayesian parameter estimates in cognitive modeling.
Structure Learning in Bayesian Sensorimotor Integration
Genewein, Tim; Hez, Eduard; Razzaghpanah, Zeynab; Braun, Daniel A.
2015-01-01
Previous studies have shown that sensorimotor processing can often be described by Bayesian learning, in particular the integration of prior and feedback information depending on its degree of reliability. Here we test the hypothesis that the integration process itself can be tuned to the statistical structure of the environment. We exposed human participants to a reaching task in a three-dimensional virtual reality environment where we could displace the visual feedback of their hand position in a two dimensional plane. When introducing statistical structure between the two dimensions of the displacement, we found that over the course of several days participants adapted their feedback integration process in order to exploit this structure for performance improvement. In control experiments we found that this adaptation process critically depended on performance feedback and could not be induced by verbal instructions. Our results suggest that structural learning is an important meta-learning component of Bayesian sensorimotor integration. PMID:26305797
Bayesian randomized clinical trials: From fixed to adaptive design.
Yin, Guosheng; Lam, Chi Kin; Shi, Haolun
2017-08-01
Randomized controlled studies are the gold standard for phase III clinical trials. Using α-spending functions to control the overall type I error rate, group sequential methods are well established and have been dominating phase III studies. Bayesian randomized design, on the other hand, can be viewed as a complement instead of competitive approach to the frequentist methods. For the fixed Bayesian design, the hypothesis testing can be cast in the posterior probability or Bayes factor framework, which has a direct link to the frequentist type I error rate. Bayesian group sequential design relies upon Bayesian decision-theoretic approaches based on backward induction, which is often computationally intensive. Compared with the frequentist approaches, Bayesian methods have several advantages. The posterior predictive probability serves as a useful and convenient tool for trial monitoring, and can be updated at any time as the data accrue during the trial. The Bayesian decision-theoretic framework possesses a direct link to the decision making in the practical setting, and can be modeled more realistically to reflect the actual cost-benefit analysis during the drug development process. Other merits include the possibility of hierarchical modeling and the use of informative priors, which would lead to a more comprehensive utilization of information from both historical and longitudinal data. From fixed to adaptive design, we focus on Bayesian randomized controlled clinical trials and make extensive comparisons with frequentist counterparts through numerical studies. Copyright © 2017 Elsevier Inc. All rights reserved.
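The role of the posterior predictive probability in trial monitoring can be illustrated with a minimal sketch. It is not the design studied in the article: it assumes a single-arm trial with a binary outcome and a conjugate Beta(a, b) prior, and the function names and default prior are hypothetical choices made for illustration only.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_prob_at_least(k: int, m: int, successes: int, failures: int,
                             a: float = 1.0, b: float = 1.0) -> float:
    """P(at least k successes among m future patients | data so far).

    Assumes a Beta(a, b) prior on the response rate (an illustrative
    assumption), so future successes follow a beta-binomial distribution.
    """
    a_post = a + successes   # posterior Beta parameters after the interim data
    b_post = b + failures
    total = 0.0
    for j in range(max(k, 0), m + 1):
        # Beta-binomial probability of exactly j future successes.
        log_p = log_beta(a_post + j, b_post + m - j) - log_beta(a_post, b_post)
        total += comb(m, j) * exp(log_p)
    return total
```

Because the posterior parameters are simple running counts, the same function can be called again at every interim look as data accrue, which is the property the abstract highlights.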
A systematic review of Bayesian articles in psychology: The last 25 years.
van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah
2017-06-01
Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
A Tutorial Introduction to Bayesian Models of Cognitive Development
2011-01-01
typewriter with an infinite amount of paper. There is a space of documents that it is capable of producing, which includes things like The Tempest and does...not include, say, a Vermeer painting or a poem written in Russian. This typewriter represents a means of generating the hypothesis space for a Bayesian...learner: each possible document that can be typed on it is a hypothesis, the infinite set of documents producible by the typewriter is the latent
Taroni, F; Biedermann, A; Bozza, S
2016-02-01
Many people regard the concept of hypothesis testing as fundamental to inferential statistics. Various schools of thought, in particular frequentist and Bayesian, have promoted radically different solutions for taking a decision about the plausibility of competing hypotheses. Comprehensive philosophical comparisons about their advantages and drawbacks are widely available and continue to span over large debates in the literature. More recently, controversial discussion was initiated by an editorial decision of a scientific journal [1] to refuse any paper submitted for publication containing null hypothesis testing procedures. Since the large majority of papers published in forensic journals propose the evaluation of statistical evidence based on the so called p-values, it is of interest to expose the discussion of this journal's decision within the forensic science community. This paper aims to provide forensic science researchers with a primer on the main concepts and their implications for making informed methodological choices. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Pyrodiversity promotes avian diversity over the decade following forest fire
Morgan W. Tingley; Viviana Ruiz-Gutiérrez; Robert L. Wilkerson; Christine A. Howell; Rodney B. Siegel
2016-01-01
An emerging hypothesis in fire ecology is that pyrodiversity increases species diversity. We test whether pyrodiversity (defined as the standard deviation of fire severity) increases avian biodiversity at two spatial scales, and whether and how this relationship may change in the decade following fire. We use a dynamic Bayesian community model applied to a multi-year...
ERIC Educational Resources Information Center
Martuza, Victor R.; Engel, John D.
Results from classical power analysis (Brewer, 1972) suggest that a researcher should not set α = p (when p is less than α) in a posteriori fashion when a study yields statistically significant results because of a resulting decrease in power. The purpose of the present report is to use Bayesian theory in examining the validity of this…
Bayes factor and posterior probability: Complementary statistical evidence to p-value.
Lin, Ruitao; Yin, Guosheng
2015-09-01
As a convention, a p-value is often computed in hypothesis testing and compared with the nominal level of 0.05 to determine whether to reject the null hypothesis. Although the smaller the p-value, the more significant the statistical test, it is difficult to perceive the p-value in a probability scale and quantify it as the strength of the data against the null hypothesis. In contrast, the Bayesian posterior probability of the null hypothesis has an explicit interpretation of how strong the data support the null. We make a comparison of the p-value and the posterior probability by considering a recent clinical trial. The results show that even when we reject the null hypothesis, there is still a substantial probability (around 20%) that the null is true. Not only should we examine whether the data would have rarely occurred under the null hypothesis, but we also need to know whether the data would be rare under the alternative. As a result, the p-value only provides one side of the information, for which the Bayes factor and posterior probability may offer complementary evidence. Copyright © 2015 Elsevier Inc. All rights reserved.
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.
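The Savage-Dickey ratio mentioned in the abstract expresses the Bayes factor for a point null as the posterior density at the null value divided by the prior density at that value. The sketch below is a toy illustration only, not the semiparametric procedure of the article: it assumes a normal mean with known standard deviation and a conjugate normal prior, and all names and defaults are hypothetical.

```python
from statistics import NormalDist


def savage_dickey_bf01(ybar: float, n: int, sigma: float,
                       theta0: float = 0.0,
                       prior_mean: float = 0.0,
                       prior_sd: float = 1.0) -> float:
    """BF01 for H0: theta = theta0 versus H1: theta ~ Normal(prior_mean, prior_sd).

    Toy setting (an assumption for illustration): n observations with sample
    mean ybar and known standard deviation sigma.
    """
    prior = NormalDist(prior_mean, prior_sd)
    # Conjugate normal update: precisions add, means are precision-weighted.
    post_precision = 1.0 / prior_sd**2 + n / sigma**2
    post_mean = (prior_mean / prior_sd**2 + n * ybar / sigma**2) / post_precision
    posterior = NormalDist(post_mean, post_precision ** -0.5)
    # Savage-Dickey: posterior density at theta0 over prior density at theta0.
    return posterior.pdf(theta0) / prior.pdf(theta0)
```

In this conjugate case the result can be checked against the ratio of marginal likelihoods of the sample mean under the two hypotheses, which is a useful sanity test before applying the idea to models where only posterior samples are available.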
Adaptive sequential Bayesian classification using Page's test
NASA Astrophysics Data System (ADS)
Lynch, Robert S., Jr.; Willett, Peter K.
2002-03-01
In this paper, the previously introduced Mean-Field Bayesian Data Reduction Algorithm is extended for adaptive sequential hypothesis testing utilizing Page's test. In general, Page's test is well understood as a method of detecting a permanent change in distribution associated with a sequence of observations. However, the relationship between detecting a change in distribution utilizing Page's test with that of classification and feature fusion is not well understood. Thus, the contribution of this work is based on developing a method of classifying an unlabeled vector of fused features (i.e., detect a change to an active statistical state) as quickly as possible given an acceptable mean time between false alerts. In this case, the developed classification test can be thought of as equivalent to performing a sequential probability ratio test repeatedly until a class is decided, with the lower log-threshold of each test being set to zero and the upper log-threshold being determined by the expected distance between false alerts. It is of interest to estimate the delay (or, related stopping time) to a classification decision (the number of time samples it takes to classify the target), and the mean time between false alerts, as a function of feature selection and fusion by the Mean-Field Bayesian Data Reduction Algorithm. Results are demonstrated by plotting the delay to declaring the target class versus the mean time between false alerts, and are shown using both different numbers of simulated training data and different numbers of relevant features for each class.
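The description of Page's test as a sequential probability ratio test that restarts whenever its statistic falls to a lower log-threshold of zero can be sketched as follows. This is a generic illustration, not the Mean-Field Bayesian Data Reduction Algorithm: the function name is hypothetical, and the per-sample log-likelihood ratios are assumed to be supplied by the caller.

```python
from typing import Iterable, Optional


def page_test(log_lrs: Iterable[float], upper: float) -> Optional[int]:
    """Return the 1-based sample index at which Page's statistic first
    reaches `upper`, or None if it never does.

    `log_lrs` holds per-sample log-likelihood ratios (changed state versus
    nominal state); how they are computed is left to the caller.
    """
    if upper <= 0:
        raise ValueError("upper log-threshold must be positive")
    stat = 0.0
    for n, llr in enumerate(log_lrs, start=1):
        # Clamping at zero is the restart: the lower log-threshold is 0.
        stat = max(0.0, stat + llr)
        if stat >= upper:
            return n  # declare the change (here, the class decision)
    return None
```

Raising `upper` lengthens the mean time between false alerts at the cost of a longer delay to decision, which is the trade-off the abstract reports plotting.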
Bayes in biological anthropology.
Konigsberg, Lyle W; Frankenberg, Susan R
2013-12-01
In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as primer to Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.
Linguistic Phylogenies Support Back-Migration from Beringia to Asia
Sicoli, Mark A.; Holton, Gary
2014-01-01
Recent arguments connecting Na-Dene languages of North America with Yeniseian languages of Siberia have been used to assert proof for the origin of Native Americans in central or western Asia. We apply phylogenetic methods to test support for this hypothesis against an alternative hypothesis that Yeniseian represents a back-migration to Asia from a Beringian ancestral population. We coded a linguistic dataset of typological features and used neighbor-joining network algorithms and Bayesian model comparison based on Bayes factors to test the fit between the data and the linguistic phylogenies modeling two dispersal hypotheses. Our results support that a Dene-Yeniseian connection more likely represents radiation out of Beringia with back-migration into central Asia than a migration from central or western Asia to North America. PMID:24621925
Validation of the thermal challenge problem using Bayesian Belief Networks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McFarland, John; Swiler, Laura Painton
The thermal challenge problem has been developed at Sandia National Laboratories as a testbed for demonstrating various types of validation approaches and prediction methods. This report discusses one particular methodology to assess the validity of a computational model given experimental data. This methodology is based on Bayesian Belief Networks (BBNs) and can incorporate uncertainty in experimental measurements, in physical quantities, and model uncertainties. The approach uses the prior and posterior distributions of model output to compute a validation metric based on Bayesian hypothesis testing (a Bayes' factor). This report discusses various aspects of the BBN, specifically in the context of the thermal challenge problem. A BBN is developed for a given set of experimental data in a particular experimental configuration. The development of the BBN and the method for "solving" the BBN to develop the posterior distribution of model output through Monte Carlo Markov Chain sampling is discussed in detail. The use of the BBN to compute a Bayes' factor is demonstrated.
A Test by Any Other Name: P Values, Bayes Factors, and Statistical Inference.
Stern, Hal S
2016-01-01
Procedures used for statistical inference are receiving increased scrutiny as the scientific community studies the factors associated with insuring reproducible research. This note addresses recent negative attention directed at p values, the relationship of confidence intervals and tests, and the role of Bayesian inference and Bayes factors, with an eye toward better understanding these different strategies for statistical inference. We argue that researchers and data analysts too often resort to binary decisions (e.g., whether to reject or accept the null hypothesis) in settings where this may not be required.
Ritual human sacrifice promoted and sustained the evolution of stratified societies.
Watts, Joseph; Sheehan, Oliver; Atkinson, Quentin D; Bulbulia, Joseph; Gray, Russell D
2016-04-14
Evidence for human sacrifice is found throughout the archaeological record of early civilizations, the ethnographic records of indigenous world cultures, and the texts of the most prolific contemporary religions. According to the social control hypothesis, human sacrifice legitimizes political authority and social class systems, functioning to stabilize such social stratification. Support for the social control hypothesis is largely limited to historical anecdotes of human sacrifice, where the causal claims have not been subject to rigorous quantitative cross-cultural tests. Here we test the social control hypothesis by applying Bayesian phylogenetic methods to a geographically and socially diverse sample of 93 traditional Austronesian cultures. We find strong support for models in which human sacrifice stabilizes social stratification once stratification has arisen, and promotes a shift to strictly inherited class systems. Whilst evolutionary theories of religion have focused on the functionality of prosocial and moral beliefs, our results reveal a darker link between religion and the evolution of modern hierarchical societies.
Two Bayesian tests of the GLOMOsys Model.
Field, Sarahanne M; Wagenmakers, Eric-Jan; Newell, Ben R; Zeelenberg, René; van Ravenzwaaij, Don
2016-12-01
Priming is arguably one of the key phenomena in contemporary social psychology. Recent retractions and failed replication attempts have led to a division in the field between proponents and skeptics and have reinforced the importance of confirming certain priming effects through replication. In this study, we describe the results of 2 preregistered replication attempts of 1 experiment by Förster and Denzler (2012). In both experiments, participants first processed letters either globally or locally, then were tested using a typicality rating task. Bayes factor hypothesis tests were conducted for both experiments: Experiment 1 (N = 100) yielded an indecisive Bayes factor of 1.38, indicating that the in-lab data are 1.38 times more likely to have occurred under the null hypothesis than under the alternative. Experiment 2 (N = 908) yielded a Bayes factor of 10.84, indicating strong support for the null hypothesis that global priming does not affect participants' mean typicality ratings. The failure to replicate this priming effect challenges existing support for the GLOMOsys model. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
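To make the reported Bayes factors concrete, the following minimal sketch converts a Bayes factor in favor of the null (BF01) into a posterior probability of the null. The function name and the default of equal prior probabilities are illustrative assumptions and are not taken from the study.

```python
def posterior_prob_null(bf01: float, prior_prob_null: float = 0.5) -> float:
    """Posterior P(H0 | data) from a Bayes factor BF01 and a prior P(H0).

    Assumption: exactly two hypotheses (H0 and H1) are being compared.
    """
    if bf01 <= 0:
        raise ValueError("Bayes factor must be positive")
    if not 0.0 < prior_prob_null < 1.0:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    prior_odds = prior_prob_null / (1.0 - prior_prob_null)
    posterior_odds = bf01 * prior_odds  # posterior odds = BF x prior odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Bayes factors quoted in the abstract; the equal prior is an assumption.
    for label, bf01 in [("Experiment 1", 1.38), ("Experiment 2", 10.84)]:
        print(label, round(posterior_prob_null(bf01), 2))
```

Under equal prior probabilities, a Bayes factor of 1.38 corresponds to a posterior probability of roughly 0.58 for the null, and 10.84 to roughly 0.92. A different prior would shift both values.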
Stewart, Heather; Massoudieh, Arash; Gellis, Allen C.
2015-01-01
A Bayesian chemical mass balance (CMB) approach was used to assess the contribution of potential sources for fluvial samples from Laurel Hill Creek in southwest Pennsylvania. The Bayesian approach provides joint probability density functions of the sources' contributions considering the uncertainties due to source and fluvial sample heterogeneity and measurement error. Both elemental profiles of sources and fluvial samples and 13C and 15N isotopes were used for source apportionment. The sources considered include stream bank erosion, forest, roads and agriculture (pasture and cropland). Agriculture was found to have the largest contribution, followed by stream bank erosion. Also, road erosion was found to have a significant contribution in three of the samples collected during lower-intensity rain events. The source apportionment was performed with and without isotopes. The results were largely consistent; however, the use of isotopes was found to slightly increase the uncertainty in most of the cases. The correlation analysis between the contributions of sources shows strong correlations between stream bank and agriculture, whereas roads and forest seem to be less correlated to other sources. Thus, the method was better able to estimate road and forest contributions independently. The hypothesis that the contributions of sources are not seasonally changing was tested by assuming that all ten fluvial samples had the same source contributions. This hypothesis was rejected, demonstrating a significant seasonal variation in the sources of sediments in the stream.
A Bayesian Method for Evaluating and Discovering Disease Loci Associations
Jiang, Xia; Barmada, M. Michael; Cooper, Gregory F.; Becich, Michael J.
2011-01-01
Background A genome-wide association study (GWAS) typically involves examining representative SNPs in individuals from some population. A GWAS data set can concern a million SNPs and may soon concern billions. Researchers investigate the association of each SNP individually with a disease, and it is becoming increasingly commonplace to also analyze multi-SNP associations. Techniques for handling so many hypotheses include the Bonferroni correction and recently developed Bayesian methods. These methods can encounter problems. Most importantly, they are not applicable to a complex multi-locus hypothesis which has several competing hypotheses rather than only a null hypothesis. A method that computes the posterior probability of complex hypotheses is a pressing need. Methodology/Findings We introduce the Bayesian network posterior probability (BNPP) method which addresses the difficulties. The method represents the relationship between a disease and SNPs using a directed acyclic graph (DAG) model, and computes the likelihood of such models using a Bayesian network scoring criterion. The posterior probability of a hypothesis is computed based on the likelihoods of all competing hypotheses. The BNPP can not only be used to evaluate a hypothesis that has previously been discovered or suspected, but also to discover new disease loci associations. The results of experiments using simulated and real data sets are presented. Our results concerning simulated data sets indicate that the BNPP exhibits both better evaluation and discovery performance than does a p-value based method. For the real data sets, previous findings in the literature are confirmed and additional findings are found. Conclusions/Significance We conclude that the BNPP resolves a pressing problem by providing a way to compute the posterior probability of complex multi-locus hypotheses. A researcher can use the BNPP to determine the expected utility of investigating a hypothesis further. 
Furthermore, we conclude that the BNPP is a promising method for discovering disease loci associations. PMID:21853025
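The statement that a hypothesis's posterior probability is computed from the likelihoods of all competing hypotheses can be sketched as follows. This is a generic illustration of Bayes' rule over a finite set of hypotheses, not the BNPP scoring criterion itself; the function name and the uniform default prior are assumptions.

```python
import math


def posterior_over_hypotheses(log_likelihoods, priors=None):
    """Normalize prior-weighted likelihoods into posterior probabilities.

    log_likelihoods: log marginal likelihood of the data under each hypothesis.
    priors: strictly positive prior probabilities (uniform if omitted).
    """
    n = len(log_likelihoods)
    if n == 0:
        raise ValueError("need at least one hypothesis")
    if priors is None:
        priors = [1.0 / n] * n
    if len(priors) != n:
        raise ValueError("one prior per hypothesis is required")
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical log likelihoods for three competing models.
    print(posterior_over_hypotheses([-1002.3, -1000.0, -1005.1]))
```

Working on the log scale matters here because likelihoods of large genotype data sets are far too small to represent directly as floating-point numbers.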
Concerns regarding a call for pluralism of information theory and hypothesis testing
Lukacs, P.M.; Thompson, W.L.; Kendall, W.L.; Gould, W.R.; Doherty, P.F.; Burnham, K.P.; Anderson, D.R.
2007-01-01
1. Stephens et al. (2005) argue for 'pluralism' in statistical analysis, combining null hypothesis testing and information-theoretic (I-T) methods. We show that I-T methods are more informative even in single variable problems and we provide an ecological example. 2. I-T methods allow inferences to be made from multiple models simultaneously. We believe multimodel inference is the future of data analysis, which cannot be achieved with null hypothesis-testing approaches. 3. We argue for a stronger emphasis on critical thinking in science in general and less reliance on exploratory data analysis and data dredging. Deriving alternative hypotheses is central to science; deriving a single interesting science hypothesis and then comparing it to a default null hypothesis (e.g. 'no difference') is not an efficient strategy for gaining knowledge. We think this single-hypothesis strategy has been relied upon too often in the past. 4. We clarify misconceptions presented by Stephens et al. (2005). 5. We think inference should be made about models, directly linked to scientific hypotheses, and their parameters conditioned on data, Prob(Hj | data). I-T methods provide a basis for this inference. Null hypothesis testing merely provides a probability statement about the data conditioned on a null model, Prob(data | H0). 6. Synthesis and applications. I-T methods provide a more informative approach to inference. I-T methods provide a direct measure of evidence for or against hypotheses and a means to consider simultaneously multiple hypotheses as a basis for rigorous inference. Progress in our science can be accelerated if modern methods can be used intelligently; this includes various I-T and Bayesian methods.
A Bayesian-frequentist two-stage single-arm phase II clinical trial design.
Dong, Gaohong; Shih, Weichung Joe; Moore, Dirk; Quan, Hui; Marcella, Stephen
2012-08-30
It is well known that both frequentist and Bayesian clinical trial designs have their own advantages and disadvantages. To have better properties inherited from these two types of designs, we developed a Bayesian-frequentist two-stage single-arm phase II clinical trial design. This design allows both early acceptance and rejection of the null hypothesis (H0). The measures (for example probability of trial early termination, expected sample size, etc.) of the design properties under both frequentist and Bayesian settings are derived. Moreover, under the Bayesian setting, the upper and lower boundaries are determined with predictive probability of trial success outcome. Given a beta prior and a sample size for stage I, based on the marginal distribution of the responses at stage I, we derived Bayesian Type I and Type II error rates. By controlling both frequentist and Bayesian error rates, the Bayesian-frequentist two-stage design has special features compared with other two-stage designs. Copyright © 2012 John Wiley & Sons, Ltd.
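The idea of a predictive probability of trial success under a beta prior can be illustrated with a beta-binomial calculation. The sketch below is a generic textbook construction, not the authors' design: the success criterion (a minimum total number of responders), the function names, and the Beta(1, 1) default prior are all assumptions.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, a: float, b: float) -> float:
    """P(K = k) when K ~ BetaBinomial(n, a, b)."""
    if k < 0 or k > n:
        return 0.0
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def predictive_prob_success(x1, n1, n2, r_total, a=1.0, b=1.0):
    """Probability that stage I plus stage II responders reach r_total.

    x1 of n1 stage I patients responded; n2 patients remain in stage II.
    The Beta(a, b) prior is updated with the stage I data, and stage II
    responders follow the resulting posterior predictive distribution.
    """
    a_post, b_post = a + x1, b + (n1 - x1)
    needed = r_total - x1  # additional responders still required
    if needed <= 0:
        return 1.0
    return math.fsum(
        beta_binomial_pmf(k, n2, a_post, b_post) for k in range(needed, n2 + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 4 of 15 responders so far, 20 patients to come,
    # and the trial counts as a success with at least 12 responders overall.
    print(predictive_prob_success(x1=4, n1=15, n2=20, r_total=12))
```

In a design of this kind, a low predictive probability after stage I would argue for stopping early for futility and a high one for stopping early for efficacy; the actual boundaries are design choices that the abstract does not specify.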
Dediu, Dan
2011-02-07
Language is a hallmark of our species and understanding linguistic diversity is an area of major interest. Genetic factors influencing the cultural transmission of language provide a powerful and elegant explanation for aspects of the present day linguistic diversity and a window into the emergence and evolution of language. In particular, it has recently been proposed that linguistic tone (the usage of voice pitch to convey lexical and grammatical meaning) is biased by two genes involved in brain growth and development, ASPM and Microcephalin. This hypothesis predicts that tone is a stable characteristic of language because of its 'genetic anchoring'. The present paper tests this prediction using a Bayesian phylogenetic framework applied to a large set of linguistic features and language families, using multiple software implementations, data codings, stability estimations, linguistic classifications and outgroup choices. The results of these different methods and datasets show a large agreement, suggesting that this approach produces reliable estimates of the stability of linguistic data. Moreover, linguistic tone is found to be stable across methods and datasets, providing suggestive support for the hypothesis of genetic influences on its distribution.
Bayesian evaluation of effect size after replicating an original study
van Aert, Robbie C. M.; van Assen, Marcel A. L. M.
2017-01-01
The vast majority of published results in the literature is statistically significant, which raises concerns about their reliability. The Reproducibility Project Psychology (RPP) and Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. The original study and replication were statistically significant in 36.1% in RPP and 68.8% in EE-RP suggesting many null effects among the replicated studies. However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method called snapshot hybrid that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant. We first analytically approximate the method's performance, and demonstrate the necessity to control for the original study’s significance to enable the accumulation of evidence for a true zero effect. Then we applied the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the included studies in EE-RP are generally larger than in RPP, but that the sample sizes of especially the included studies in RPP are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of the replication akin to power analysis in null hypothesis significance testing and present an easy-to-use web application (https://rvanaert.shinyapps.io/snapshot/) and R code for applying the method. PMID:28388646
Bayesian isotonic density regression
Wang, Lianming; Dunson, David B.
2011-01-01
Density regression models allow the conditional distribution of the response given predictors to change flexibly over the predictor space. Such models are much more flexible than nonparametric mean regression models with nonparametric residual distributions, and are well supported in many applications. A rich variety of Bayesian methods have been proposed for density regression, but it is not clear whether such priors have full support so that any true data-generating model can be accurately approximated. This article develops a new class of density regression models that incorporate stochastic-ordering constraints which are natural when a response tends to increase or decrease monotonely with a predictor. Theory is developed showing large support. Methods are developed for hypothesis testing, with posterior computation relying on a simple Gibbs sampler. Frequentist properties are illustrated in a simulation study, and an epidemiology application is considered. PMID:22822259
Ross, Cody T; Winterhalder, Bruce
2016-01-01
We conduct a revaluation of the Thornhill and Fincher research project on parasites using finely-resolved geographic data on parasite prevalence, individual-level sociocultural data, and multilevel Bayesian modeling. In contrast to the evolutionary psychological mechanisms linking parasites to human behavior and cultural characteristics proposed by Thornhill and Fincher, we offer an alternative hypothesis that structural racism and differential access to sanitation systems drive both variation in parasite prevalence and differential behaviors and cultural characteristics. We adopt a Bayesian framework to estimate parasite prevalence rates in 51 districts in eight Latin American countries using the disease status of 170,220 individuals tested for infection with the intestinal roundworm Ascaris lumbricoides (Hürlimann et al., []: PLoS Negl Trop Dis 5:e1404). We then use district-level estimates of parasite prevalence and individual-level social data from 5,558 individuals in the same 51 districts (Latinobarómetro, 2008) to assess claims of causal associations between parasite prevalence and sociocultural characteristics. We find, contrary to Thornhill and Fincher, that parasite prevalence is positively associated with preferences for democracy, negatively associated with preferences for collectivism, and not associated with violent crime rates or gender inequality. A positive association between parasite prevalence and religiosity, as in Fincher and Thornhill (: Behav Brain Sci 35:61-79), and a negative association between parasite prevalence and achieved education, as predicted by Eppig et al. (: Proc R S B: Biol Sci 277:3801-3808), become negative and unreliable when reasonable controls are included in the model. We find support for all predictions derived from our hypothesis linking structural racism to both parasite prevalence and cultural outcomes. 
We conclude that best practices in biocultural modeling require examining more than one hypothesis, retaining individual-level data and its associated variance whenever possible, and adopting multilevel techniques suited to the structuring of the data. © 2015 Wiley Periodicals, Inc.
How Recent History Affects Perception: The Normative Approach and Its Heuristic Approximation
Raviv, Ofri; Ahissar, Merav; Loewenstein, Yonatan
2012-01-01
There is accumulating evidence that prior knowledge about expectations plays an important role in perception. The Bayesian framework is the standard computational approach to explain how prior knowledge about the distribution of expected stimuli is incorporated with noisy observations in order to improve performance. However, it is unclear what information about the prior distribution is acquired by the perceptual system over short periods of time and how this information is utilized in the process of perceptual decision making. Here we address this question using a simple two-tone discrimination task. We find that the “contraction bias”, in which small magnitudes are overestimated and large magnitudes are underestimated, dominates the pattern of responses of human participants. This contraction bias is consistent with the Bayesian hypothesis in which the true prior information is available to the decision-maker. However, a trial-by-trial analysis of the pattern of responses reveals that the contribution of most recent trials to performance is overweighted compared with the predictions of a standard Bayesian model. Moreover, we study participants' performance in atypical distributions of stimuli and demonstrate substantial deviations from the ideal Bayesian detector, suggesting that the brain utilizes a heuristic approximation of the Bayesian inference. We propose a biologically plausible model, in which decision in the two-tone discrimination task is based on a comparison between the second tone and an exponentially-decaying average of the first tone and past tones. We show that this model accounts for both the contraction bias and the deviations from the ideal Bayesian detector hypothesis. These findings demonstrate the power of Bayesian-like heuristics in the brain, as well as their limitations in their failure to fully adapt to novel environments. PMID:23133343
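The heuristic described above, comparing the second tone with an exponentially decaying average of the current first tone and earlier tones, can be sketched in a few lines. The abstract does not give the model's exact form, so the update rule, the `decay` parameter, the noise-free decision, and the restriction of the history to first tones are all assumptions made for illustration.

```python
def simulate_responses(trials, decay=0.6):
    """Return, per trial, True if the model answers 'second tone is higher'.

    trials: list of (first_tone, second_tone) frequencies.
    decay:  weight given to the running history, in [0, 1); decay = 0
            reduces the model to a direct comparison of the two tones.
    """
    responses = []
    reference = None
    for first, second in trials:
        if reference is None:
            reference = first  # no history yet on the very first trial
        else:
            # Exponentially decaying average: older first tones get
            # geometrically smaller weights.
            reference = decay * reference + (1.0 - decay) * first
        responses.append(second > reference)
    return responses


if __name__ == "__main__":
    # Two trials around 800 Hz, then a high first tone followed by a
    # slightly lower second tone (the correct answer is 'lower').
    trials = [(800, 810), (800, 790), (1000, 990)]
    print(simulate_responses(trials, decay=0.6))  # history-weighted model
    print(simulate_responses(trials, decay=0.0))  # no history
```

On the last trial the history pulls the internal reference for the 1000 Hz tone down toward 800 Hz, so the history-weighted model wrongly reports the second tone as higher. This is the direction of error that a contraction bias predicts for large magnitudes.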
Sirota, Miroslav; Kostovičová, Lenka; Juanchich, Marie
2014-08-01
Knowing which properties of visual displays facilitate statistical reasoning bears practical and theoretical implications. Therefore, we studied the effect of one property of visual displays, iconicity (i.e., the resemblance of a visual sign to its referent), on Bayesian reasoning. Two main accounts of statistical reasoning predict different effects of iconicity on Bayesian reasoning. The ecological-rationality account predicts a positive iconicity effect, because more highly iconic signs resemble more individuated objects, which tap better into an evolutionarily designed frequency-coding mechanism that, in turn, facilitates Bayesian reasoning. The nested-sets account predicts a null iconicity effect, because iconicity does not affect the salience of a nested-sets structure, the factor facilitating Bayesian reasoning processed by a general reasoning mechanism. In two well-powered experiments (N = 577), we found no support for a positive iconicity effect across different iconicity levels that were manipulated in different visual displays (meta-analytical overall effect: log OR = -0.13, 95% CI [-0.53, 0.28]). A Bayes factor analysis provided strong evidence in favor of the null hypothesis (the null iconicity effect). Thus, these findings corroborate the nested-sets rather than the ecological-rationality account of statistical reasoning.
Performing Contrast Analysis in Factorial Designs: From NHST to Confidence Intervals and Beyond
Wiens, Stefan; Nilsson, Mats E.
2016-01-01
Because of the continuing debates about statistics, many researchers may feel confused about how to analyze and interpret data. Current guidelines in psychology advocate the use of effect sizes and confidence intervals (CIs). However, researchers may be unsure about how to extract effect sizes from factorial designs. Contrast analysis is helpful because it can be used to test specific questions of central interest in studies with factorial designs. It weighs several means and combines them into one or two sets that can be tested with t tests. The effect size produced by a contrast analysis is simply the difference between means. The CI of the effect size informs directly about direction, hypothesis exclusion, and the relevance of the effects of interest. However, any interpretation in terms of precision or likelihood requires the use of likelihood intervals or credible intervals (Bayesian). These various intervals and even a Bayesian t test can be obtained easily with free software. This tutorial reviews these methods to guide researchers in answering the following questions: When I analyze mean differences in factorial designs, where can I find the effects of central interest, and what can I learn about their effect sizes? PMID:29805179
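As an illustration of how a contrast weighs several means and reduces them to a single t test, the sketch below computes a planned contrast for independent groups. It assumes a between-subjects design with a common error variance; the function name and the example data are hypothetical, and within-subject factors would need a different error term.

```python
import math


def contrast_t_test(groups, weights):
    """Return (estimate, standard_error, t, df) for a planned contrast.

    groups:  list of lists, one list of observations per cell.
    weights: contrast weights, one per cell, summing to zero.
    """
    if len(groups) != len(weights):
        raise ValueError("one weight per group is required")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    means = [sum(g) / len(g) for g in groups]
    # Pooled within-group variance (the ANOVA mean square error).
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df = sum(len(g) for g in groups) - len(groups)
    mse = ss_within / df
    estimate = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w ** 2 / len(g) for w, g in zip(weights, groups)))
    return estimate, se, estimate / se, df


if __name__ == "__main__":
    # Does the third cell differ from the average of the first two?
    cells = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
    print(contrast_t_test(cells, [-0.5, -0.5, 1.0]))
```

With weights scaled this way, the estimate is directly a difference between means (here, the third cell's mean minus the average of the other two), which is the effect size the tutorial refers to. A confidence interval follows from the estimate, its standard error, and the t distribution with `df` degrees of freedom.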
Tropical insect diversity: evidence of greater host specialization in seed-feeding weevils.
Peguero, Guille; Bonal, Raúl; Sol, Daniel; Muñoz, Alberto; Sork, Victoria L; Espelta, Josep M
2017-08-01
Host specialization has long been hypothesized to explain the extraordinary diversity of phytophagous insects in the tropics. However, addressing this hypothesis has proved challenging because of the risk of overlooking rare interactions, and hence biasing specialization estimations, and the difficulty of separating the diversity component attributable to insect specialization from that related to host diversity. As a result, the host specialization hypothesis lacks empirical support for important phytophagous insect clades. Here, we test the hypothesis in a radiation of seed-feeding insects, acorn weevils (Curculio spp.), sampled in temperate and tropical regions (California and Nicaragua, respectively) with an equivalent pool of oak host species. Using DNA sequences from three low-copy genes, we delimited to species level 778 weevil larvae extracted from host seeds and assessed their phylogenetic relationships by Maximum Likelihood and Bayesian inference. We then reconstructed the oak-weevil food webs and examined differences in alpha, beta and gamma diversity using Hill numbers of effective species. We found a higher alpha, beta and gamma diversity of weevils in Nicaragua compared to California despite similar richness of host species at both local and regional level. By means of Bayesian mixed models, we also found that tropical weevil species were highly specialized both in terms of host range and interaction strength, whereas their temperate congeners had a broader taxonomic and phylogenetic host spectrum. Finally, in Nicaraguan species, larval body size was highly correlated with the size of the acorns infested, as would be expected by a greater host specialization, whereas in California this relationship was absent. Altogether, these lines of evidence support the host specialization hypothesis and suggest contrasting eco-evolutionary dynamics in tropical and temperate regions even in the absence of differences in host diversity. 
© 2017 by the Ecological Society of America.
Bayesian approach to MSD-based analysis of particle motion in live cells.
Monnier, Nilah; Guo, Syuan-Ming; Mori, Masashi; He, Jun; Lénárt, Péter; Bathe, Mark
2012-08-08
Quantitative tracking of particle motion using live-cell imaging is a powerful approach to understanding the mechanism of transport of biological molecules, organelles, and cells. However, inferring complex stochastic motion models from single-particle trajectories in an objective manner is nontrivial due to noise from sampling limitations and biological heterogeneity. Here, we present a systematic Bayesian approach to multiple-hypothesis testing of a general set of competing motion models based on particle mean-square displacements that automatically classifies particle motion, properly accounting for sampling limitations and correlated noise while appropriately penalizing model complexity according to Occam's Razor to avoid over-fitting. We test the procedure rigorously using simulated trajectories for which the underlying physical process is known, demonstrating that it chooses the simplest physical model that explains the observed data. Further, we show that computed model probabilities provide a reliability test for the downstream biological interpretation of associated parameter values. We subsequently illustrate the broad utility of the approach by applying it to disparate biological systems including experimental particle trajectories from chromosomes, kinetochores, and membrane receptors undergoing a variety of complex motions. This automated and objective Bayesian framework easily scales to large numbers of particle trajectories, making it ideal for classifying the complex motion of large numbers of single molecules and cells from high-throughput screens, as well as single-cell-, tissue-, and organism-level studies. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
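The quantity on which the competing motion models are compared is the mean-square displacement (MSD) curve of a trajectory. The sketch below shows only that input calculation, using a time average over one two-dimensional track; it is not the authors' Bayesian model-selection procedure, and the function name and uniform sampling interval are assumptions.

```python
def mean_square_displacement(track, max_lag):
    """Time-averaged MSD of one trajectory for lags 1..max_lag.

    track: list of (x, y) positions sampled at a uniform time step.
    Returns a list whose element i is the MSD at lag i + 1 (in steps).
    """
    n = len(track)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be between 1 and len(track) - 1")
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(n - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


if __name__ == "__main__":
    # A particle moving at constant velocity along x: MSD grows as lag**2.
    straight = [(float(t), 0.0) for t in range(6)]
    print(mean_square_displacement(straight, max_lag=3))
```

Displacements at different lags reuse the same positions, so the points of an MSD curve are correlated with one another. That is one reason the abstract stresses accounting for correlated noise rather than fitting each lag as an independent observation.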
Using the Detectability Index to Predict P300 Speller Performance
Mainsah, B.O.; Collins, L.M.; Throckmorton, C.S.
2017-01-01
Objective The P300 speller is a popular brain-computer interface (BCI) system that has been investigated as a potential communication alternative for individuals with severe neuromuscular limitations. To achieve acceptable accuracy levels for communication, the system requires repeated data measurements in a given signal condition to enhance the signal-to-noise ratio of elicited brain responses. These elicited brain responses, which are used as control signals, are embedded in noisy electroencephalography (EEG) data. The discriminability between target and non-target EEG responses defines a user’s performance with the system. A previous P300 speller model has been proposed to estimate system accuracy given a certain amount of data collection. However, the approach was limited to a static stopping algorithm, i.e. averaging over a fixed number of measurements, and the row-column paradigm. A generalized method that is also applicable to dynamic stopping algorithms and other stimulus paradigms is desirable. Approach We developed a new probabilistic model-based approach to predicting BCI performance, where performance functions can be derived analytically or via Monte Carlo methods. Within this framework, we introduce a new model for the P300 speller with the Bayesian dynamic stopping (DS) algorithm, by simplifying a multi-hypothesis to a binary hypothesis problem using the likelihood ratio test. Under a normality assumption, the performance functions for the Bayesian algorithm can be parameterized with the detectability index, a measure which quantifies the discriminability between target and non-target EEG responses. Main results Simulations with synthetic and empirical data provided initial verification of the proposed method of estimating performance with Bayesian DS using the detectability index. Analysis of results from previous online studies validated the proposed method. 
Significance The proposed method could serve as a useful tool to initially assess BCI performance without extensive online testing, in order to estimate the amount of data required to achieve a desired accuracy level. PMID:27705956
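Under a normality assumption with a shared variance, a detectability index is commonly defined as the difference between the target and non-target mean responses divided by their common standard deviation. The sketch below estimates it from two samples of classifier scores. The abstract does not state the paper's exact estimator, so the pooled-variance form, the function name, and the example scores are assumptions.

```python
import math
from statistics import mean, variance


def detectability_index(target_scores, nontarget_scores):
    """Estimate d = (mean_target - mean_nontarget) / pooled standard deviation.

    Assumes both score distributions are normal with a common variance.
    """
    n_t, n_n = len(target_scores), len(nontarget_scores)
    if n_t < 2 or n_n < 2:
        raise ValueError("each class needs at least two scores")
    pooled_var = (
        (n_t - 1) * variance(target_scores)
        + (n_n - 1) * variance(nontarget_scores)
    ) / (n_t + n_n - 2)
    return (mean(target_scores) - mean(nontarget_scores)) / math.sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical classifier scores for target and non-target flashes.
    print(detectability_index([2.0, 3.0, 4.0], [0.0, 1.0, 2.0]))
```

A larger index means the two response distributions overlap less, so fewer repeated measurements should be needed to reach a given accuracy.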
Using the detectability index to predict P300 speller performance
NASA Astrophysics Data System (ADS)
Mainsah, B. O.; Collins, L. M.; Throckmorton, C. S.
2016-12-01
Objective. The P300 speller is a popular brain-computer interface (BCI) system that has been investigated as a potential communication alternative for individuals with severe neuromuscular limitations. To achieve acceptable accuracy levels for communication, the system requires repeated data measurements in a given signal condition to enhance the signal-to-noise ratio of elicited brain responses. These elicited brain responses, which are used as control signals, are embedded in noisy electroencephalography (EEG) data. The discriminability between target and non-target EEG responses defines a user’s performance with the system. A previous P300 speller model has been proposed to estimate system accuracy given a certain amount of data collection. However, the approach was limited to a static stopping algorithm, i.e. averaging over a fixed number of measurements, and the row-column paradigm. A generalized method that is also applicable to dynamic stopping (DS) algorithms and other stimulus paradigms is desirable. Approach. We developed a new probabilistic model-based approach to predicting BCI performance, where performance functions can be derived analytically or via Monte Carlo methods. Within this framework, we introduce a new model for the P300 speller with the Bayesian DS algorithm, by simplifying a multi-hypothesis to a binary hypothesis problem using the likelihood ratio test. Under a normality assumption, the performance functions for the Bayesian algorithm can be parameterized with the detectability index, a measure which quantifies the discriminability between target and non-target EEG responses. Main results. Simulations with synthetic and empirical data provided initial verification of the proposed method of estimating performance with Bayesian DS using the detectability index. Analysis of results from previous online studies validated the proposed method. Significance. 
The proposed method could serve as a useful tool to initially assess BCI performance without extensive online testing, in order to estimate the amount of data required to achieve a desired accuracy level.
Rodgers, Joseph Lee
2016-01-01
The Bayesian-frequentist debate typically portrays these statistical perspectives as opposing views. However, both Bayesian and frequentist statisticians have expanded their epistemological basis away from a singular focus on the null hypothesis, to a broader perspective involving the development and comparison of competing statistical/mathematical models. For frequentists, statistical developments such as structural equation modeling and multilevel modeling have facilitated this transition. For Bayesians, the Bayes factor has facilitated this transition. The Bayes factor is treated in articles within this issue of Multivariate Behavioral Research. The current presentation provides brief commentary on those articles and more extended discussion of the transition toward a modern modeling epistemology. In certain respects, Bayesians and frequentists share common goals.
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical fundament and exceptional flexibility. However, limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
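The core of approximate Bayesian computation is rejection sampling without a likelihood: draw parameters from the prior, simulate data, and keep the draws whose simulated summary statistic lies close to the observed one. The toy sketch below illustrates only that loop. It swaps the coalescent simulator for a Gaussian one and uses a uniform prior and a sample-mean summary, none of which come from BaySICS.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws=5000, tolerance=0.5, seed=0):
    """Toy ABC rejection sampler for the mean of a unit-variance normal.

    Returns a list of (theta, distance) pairs for accepted draws.
    Assumed prior: theta ~ Uniform(0, 10). Assumed summary: sample mean.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)  # 1. draw a parameter from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # 2. simulate
        distance = abs(sum(simulated) / n_obs - observed_mean)  # 3. compare
        if distance <= tolerance:  # 4. keep draws that reproduce the summary
            accepted.append((theta, distance))
    return accepted


if __name__ == "__main__":
    kept = abc_rejection(observed_mean=4.0, n_obs=20)
    thetas = [theta for theta, _ in kept]
    print(len(kept), sum(thetas) / len(thetas))
```

The accepted parameter values approximate the posterior distribution. Running the same loop under two competing models and comparing their acceptance rates gives a rough estimate of the Bayes factor between them, provided both models use the same summary statistics and tolerance.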
The DNA database search controversy revisited: bridging the Bayesian-frequentist gap.
Storvik, Geir; Egeland, Thore
2007-09-01
Two different quantities have been suggested for quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule has been suggested through frequentist arguments and has been suggested by the American National Research Council and Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence.
Uncertainty Quantification of Hypothesis Testing for the Integrated Knowledge Engine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cuellar, Leticia
2012-05-31
The Integrated Knowledge Engine (IKE) is a tool of Bayesian analysis, based on Bayesian Belief Networks or Bayesian networks for short. A Bayesian network is a graphical model (directed acyclic graph) that allows representing the probabilistic structure of many variables assuming a localized type of dependency called the Markov property. The Markov property in this instance makes any node or random variable independent of any non-descendant node given information about its parents. A direct consequence of this property is that it is relatively easy to incorporate new evidence and derive the appropriate consequences, which in general is not an easy or feasible task. Typically we use Bayesian networks as predictive models for a small subset of the variables, either the leaf nodes or the root nodes. In IKE, since most applications deal with diagnostics, we are interested in predicting the likelihood of the root nodes given new observations on any of the children nodes. The root nodes represent the various possible outcomes of the analysis, and an important problem is to determine when we have gathered enough evidence to lean toward one of these particular outcomes. This document presents criteria to decide when the evidence gathered is sufficient to draw a particular conclusion or decide in favor of a particular outcome by quantifying the uncertainty in the conclusions that are drawn from the data. The material in this document is organized as follows: Section 2 briefly presents a forensics Bayesian network, and we explore evaluating the information provided by new evidence by looking first at the posterior distribution of the nodes of interest, and then at the corresponding posterior odds ratios. Section 3 presents a third alternative: Bayes factors. In Section 4 we finish by showing the relation between the posterior odds ratios and Bayes factors and showing examples of these cases, and in Section 5 we conclude by providing clear guidelines on how to use these for the type of Bayesian networks used in IKE.
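The relation between posterior odds ratios and Bayes factors mentioned for Section 4 of that report can be illustrated with a minimal sketch. The function names and numbers below are hypothetical and are not taken from the IKE report; the sketch assumes only the standard identity that posterior odds equal the Bayes factor times the prior odds for two competing outcomes.

```python
def bayes_factor(lik_h1, lik_h2):
    """Ratio of the likelihoods of the evidence under two competing outcomes."""
    return lik_h1 / lik_h2


def posterior_odds(prior_h1, prior_h2, lik_h1, lik_h2):
    """Posterior odds = Bayes factor * prior odds."""
    return bayes_factor(lik_h1, lik_h2) * (prior_h1 / prior_h2)


def odds_to_probability(odds):
    """Convert odds for H1 against H2 into P(H1), assuming only these two outcomes."""
    return odds / (1.0 + odds)


# Hypothetical numbers: H1 starts out unlikely (0.2 vs 0.8), but the new
# evidence is three times more probable under H1 than under H2.
odds = posterior_odds(prior_h1=0.2, prior_h2=0.8, lik_h1=0.9, lik_h2=0.3)
probability_h1 = odds_to_probability(odds)
```

Because the Bayes factor does not depend on the prior, it isolates what the evidence contributes, while the posterior odds show whether that contribution is enough to favor one outcome.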
Nagy, László G; Urban, Alexander; Orstadius, Leif; Papp, Tamás; Larsson, Ellen; Vágvölgyi, Csaba
2010-12-01
Recently developed comparative phylogenetic methods offer a wide spectrum of applications in evolutionary biology, although it is generally accepted that their statistical properties are incompletely known. Here, we examine and compare the statistical power of the ML and Bayesian methods with regard to selection of best-fit models of fruiting-body evolution and hypothesis testing of ancestral states on a real-life data set of a physiological trait (autodigestion) in the family Psathyrellaceae. Our phylogenies are based on the first multigene data set generated for the family. Two different coding regimes (binary and multistate) and two data sets differing in taxon sampling density are examined. The Bayesian method outperformed Maximum Likelihood with regard to statistical power in all analyses. This is particularly evident if the signal in the data is weak, i.e. in cases when the ML approach does not provide support to choose among competing hypotheses. Results based on binary and multistate coding differed only modestly, although it was evident that multistate analyses were less conclusive in all cases. It seems that increased taxon sampling density has favourable effects on inference of ancestral states, while model parameters are influenced to a smaller extent. The model best fitting our data implies that the rate of losses of deliquescence equals zero, although model selection in ML does not provide proper support to reject three of the four candidate models. The results also support the hypothesis that non-deliquescence (lack of autodigestion) has been ancestral in Psathyrellaceae, and that deliquescent fruiting bodies represent the preferred state, having evolved independently several times during evolution. Copyright © 2010 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Titus, Benjamin M.; Daly, Marymegan
2017-03-01
Specialist and generalist life histories are expected to result in contrasting levels of genetic diversity at the population level, and symbioses are expected to lead to patterns that reflect a shared biogeographic history and co-diversification. We test these assumptions using mtDNA sequencing and a comparative phylogeographic approach for six co-occurring crustacean species that are symbiotic with sea anemones on western Atlantic coral reefs, yet vary in their host specificities: four are host specialists and two are host generalists. We first conducted species discovery analyses to delimit cryptic lineages, followed by classic population genetic diversity analyses for each delimited taxon, and then reconstructed the demographic history for each taxon using traditional summary statistics, Bayesian skyline plots, and approximate Bayesian computation to test for signatures of recent and concerted population expansion. The genetic diversity values recovered here contravene the expectations of the specialist-generalist variation hypothesis and classic population genetics theory; all specialist lineages had greater genetic diversity than generalists. Demography suggests recent population expansions in all taxa, although Bayesian skyline plots and approximate Bayesian computation suggest the timing and magnitude of these events were idiosyncratic. These results do not meet the a priori expectation of concordance among symbiotic taxa and suggest that intrinsic aspects of species biology may contribute more to phylogeographic history than extrinsic forces that shape whole communities. The recovery of two cryptic specialist lineages adds an additional layer of biodiversity to this symbiosis and contributes to an emerging pattern of cryptic speciation in the specialist taxa. Our results underscore the differences in the evolutionary processes acting on marine systems from the terrestrial processes that often drive theory. 
Finally, we continue to highlight the Florida Reef Tract as an important biodiversity hotspot.
Hayes, Brett K; Hawkins, Guy E; Newell, Ben R
2016-05-01
Four experiments examined the locus of impact of causal knowledge on consideration of alternative hypotheses in judgments under uncertainty. Two possible loci were examined; overcoming neglect of the alternative when developing a representation of a judgment problem and improving utilization of statistics associated with the alternative hypothesis. In Experiment 1, participants could search for information about the various components of Bayes's rule in a diagnostic problem. A majority failed to spontaneously search for information about an alternative hypothesis, but this bias was reduced when a specific alternative hypothesis was mentioned before search. No change in search patterns was found when a generic alternative cause was mentioned. Experiments 2a and 2b broadly replicated these patterns when participants rated or made binary judgments about the relevance of each of the Bayesian components. In contrast, Experiment 3 showed that when participants were given the likelihood of the data given a focal hypothesis p(D|H) and an alternative hypothesis p(D|¬H), they gave estimates of p(H|D) that were consistent with Bayesian principles. Additional causal knowledge had relatively little impact on such judgments. These results show that causal knowledge primarily affects neglect of the alternative hypothesis at the initial stage of problem representation. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
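To make the role of the alternative hypothesis concrete, the following minimal sketch computes p(H|D) from p(D|H), p(D|¬H) and a prior p(H) using Bayes's rule. The numbers are hypothetical and are not the stimuli used in the experiments; the point is only that the same p(D|H) leads to different posteriors depending on p(D|¬H).

```python
def posterior(prior_h, p_d_given_h, p_d_given_not_h):
    """Bayes's rule for a focal hypothesis H and its complement."""
    numerator = p_d_given_h * prior_h
    evidence = numerator + p_d_given_not_h * (1.0 - prior_h)
    return numerator / evidence


# Same likelihood under the focal hypothesis, different alternatives.
diagnostic = posterior(prior_h=0.5, p_d_given_h=0.8, p_d_given_not_h=0.4)
non_diagnostic = posterior(prior_h=0.5, p_d_given_h=0.8, p_d_given_not_h=0.8)
```

When the data are equally likely under both hypotheses, the posterior stays at the prior, which is why neglecting p(D|¬H) can be so misleading.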
A Bayesian Model for the Prediction and Early Diagnosis of Alzheimer's Disease.
Alexiou, Athanasios; Mantzavinos, Vasileios D; Greig, Nigel H; Kamal, Mohammad A
2017-01-01
Alzheimer's disease treatment is still an open problem. The diversity of symptoms, the alterations in common pathophysiology, the existence of asymptomatic cases, and the different types of sporadic and familial Alzheimer's and their relation to other types of dementia and comorbidities have already created a myth-fear around the leading disease of the twenty-first century. Many recent failed clinical trials and novel medications have revealed early diagnosis as the most critical treatment solution, even though scientists tested the amyloid hypothesis and a few related drugs. Unfortunately, the latest studies have indicated that the disease begins at very young ages, thus making it difficult to determine the right time for proper treatment. Taking into consideration all these multivariate aspects and the unreliable factors working against an appropriate treatment, we focused our research on a non-classic statistical evaluation of the best-known and most widely accepted Alzheimer's biomarkers. In this paper, the code and a few experimental results of a computational Bayesian tool are reported; the tool is dedicated to the correlation and assessment of several Alzheimer's biomarkers in order to produce a probabilistic medical prognostic process. This new statistical software is executable in the Bayesian software WinBUGS, based on the latest Alzheimer's classification and the formulation of the known relative probabilities of the various biomarkers, correlated with Alzheimer's progression, through a set of discrete distributions. A user-friendly web page has been implemented to support medical doctors and researchers, who can upload Alzheimer's tests and receive statistics on the occurrence of Alzheimer's disease development or presence, due to abnormal testing in one or more biomarkers.
Too good to be true: when overwhelming evidence fails to convince.
Gunn, Lachlan J; Chapeau-Blondeau, François; McDonnell, Mark D; Davis, Bruce R; Allison, Andrew; Abbott, Derek
2016-03-01
Is it possible for a large sequence of measurements or observations, which support a hypothesis, to counterintuitively decrease our confidence? Can unanimous support be too good to be true? The assumption of independence is often made in good faith; however, rarely is consideration given to whether a systemic failure has occurred. Taking this into account can cause certainty in a hypothesis to decrease as the evidence for it becomes apparently stronger. We perform a probabilistic Bayesian analysis of this effect with examples based on (i) archaeological evidence, (ii) weighing of legal evidence and (iii) cryptographic primality testing. In this paper, we investigate the effects of small error rates in a set of measurements or observations. We find that even with very low systemic failure rates, high confidence is surprisingly difficult to achieve; in particular, we find that certain analyses of cryptographically important numerical tests are highly optimistic, underestimating their false-negative rate by as much as a factor of 2^80.
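The effect can be reproduced with a deliberately simplified toy model, which is an assumption made here for illustration and not the model analysed in the paper. Suppose that with a small probability `failure_rate` the measurement process is broken and every observation supports the hypothesis regardless of the truth; otherwise each observation is independent and correct with probability `accuracy`. All parameter names and values below are hypothetical.

```python
def posterior_after_unanimity(n, accuracy=0.9, failure_rate=0.01, prior=0.5):
    """P(H | all n observations support H) under the toy systemic-failure model."""
    # Probability of unanimous support if H is true: either the system
    # failed, or it works and all n observations happen to be correct.
    p_all_given_h = failure_rate + (1 - failure_rate) * accuracy ** n
    # Probability of unanimous support if H is false: either the system
    # failed, or it works and all n observations happen to be wrong.
    p_all_given_not_h = failure_rate + (1 - failure_rate) * (1 - accuracy) ** n
    numerator = p_all_given_h * prior
    return numerator / (numerator + p_all_given_not_h * (1 - prior))


# Confidence after 1, 3, 10 and 50 unanimous observations.
confidence = {n: posterior_after_unanimity(n) for n in (1, 3, 10, 50)}
```

In this toy model the posterior first rises with the number of agreeing observations and then falls back toward the prior, because a long unanimous run becomes better explained by a systemic failure than by many independent correct measurements.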
Emerging Concepts of Data Integration in Pathogen Phylodynamics.
Baele, Guy; Suchard, Marc A; Rambaut, Andrew; Lemey, Philippe
2017-01-01
Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504
Schirtzinger, Erin E.; Matsumoto, Tania; Eberhard, Jessica R.; Graves, Gary R.; Sanchez, Juan J.; Capelli, Sara; Müller, Heinrich; Scharpegge, Julia; Chambers, Geoffrey K.; Fleischer, Robert C.
2008-01-01
The question of when modern birds (Neornithes) first diversified has generated much debate among avian systematists. Fossil evidence generally supports a Tertiary diversification, whereas estimates based on molecular dating favor an earlier diversification in the Cretaceous period. In this study, we used an alternate approach, the inference of historical biogeographic patterns, to test the hypothesis that the initial radiation of the Order Psittaciformes (the parrots and cockatoos) originated on the Gondwana supercontinent during the Cretaceous. We utilized broad taxonomic sampling (representatives of 69 of the 82 extant genera and 8 outgroup taxa) and multilocus molecular character sampling (3,941 bp from mitochondrial DNA (mtDNA) genes cytochrome oxidase I and NADH dehydrogenase 2 and nuclear introns of rhodopsin intron 1, tropomyosin alpha-subunit intron 5, and transforming growth factor β-2) to generate phylogenetic hypotheses for the Psittaciformes. Analyses of the combined character partitions using maximum parsimony, maximum likelihood, and Bayesian criteria produced well-resolved and topologically similar trees in which the New Zealand taxa Strigops and Nestor (Psittacidae) were sister to all other psittaciforms and the cockatoo clade (Cacatuidae) was sister to a clade containing all remaining parrots (Psittacidae). Within this large clade of Psittacidae, some traditionally recognized tribes and subfamilies were monophyletic (e.g., Arini, Psittacini, and Loriinae), whereas several others were polyphyletic (e.g., Cyclopsittacini, Platycercini, Psittaculini, and Psittacinae). Ancestral area reconstructions using our Bayesian phylogenetic hypothesis and current distributions of genera supported the hypothesis of an Australasian origin for the Psittaciformes. 
Separate analyses of the timing of parrot diversification constructed with both Bayesian relaxed-clock and penalized likelihood approaches showed better agreement between geologic and diversification events in the chronograms based on a Cretaceous dating of the basal split within parrots than the chronograms based on a Tertiary dating of this split, although these data are more equivocal. Taken together, our results support a Cretaceous origin of Psittaciformes in Gondwana after the separation of Africa and the India/Madagascar block with subsequent diversification through both vicariance and dispersal. These well-resolved molecular phylogenies will be of value for comparative studies of behavior, ecology, and life history in parrots. PMID:18653733
The Scientific Method, Diagnostic Bayes, and How to Detect Epistemic Errors
NASA Astrophysics Data System (ADS)
Vrugt, J. A.
2015-12-01
In the past decades, Bayesian methods have found widespread application and use in environmental systems modeling. Bayes theorem states that the posterior probability, P(H|\hat{D}), of a hypothesis, H, is proportional to the product of the prior probability, P(H), of this hypothesis and the likelihood, L(H|\hat{D}), of the same hypothesis given the new/incoming observations, \hat{D}. In science and engineering, H often constitutes some numerical simulation model, D = F(x,.), which summarizes, using algebraic, empirical, and differential equations, state variables and fluxes, all our theoretical and/or practical knowledge of the system of interest, and x are the d unknown parameters which are subject to inference using some data, \hat{D}, of the observed system response. The Bayesian approach is intimately related to the scientific method and uses an iterative cycle of hypothesis formulation (model), experimentation and data collection, and theory/hypothesis refinement to elucidate the rules that govern the natural world. Unfortunately, model refinement has proven to be very difficult, in large part because of the poor diagnostic power of residual-based likelihood functions \citep{gupta2008}. This has inspired \cite{vrugt2013} to advocate the use of 'likelihood-free' inference using approximate Bayesian computation (ABC). This approach uses one or more summary statistics, S(\hat{D}), of the original data, \hat{D}, designed ideally to be sensitive only to one particular process in the model. Any mismatch between the observed and simulated summary metrics is then easily linked to a specific model component. A recurrent issue with the application of ABC is sufficiency of the summary statistics. In theory, S(.) should contain as much information as the original data itself, yet complex systems rarely admit sufficient statistics. 
In this article, we propose to combine the ideas of ABC and regular Bayesian inference to guarantee that no information is lost in diagnostic model evaluation. This hybrid approach, coined diagnostic Bayes, uses the summary metrics as prior distribution and original data in the likelihood function, or P(x|\hat{D}) ∝ P(x|S(\hat{D})) L(x|\hat{D}). A case study illustrates the ability of the proposed methodology to diagnose epistemic errors and provide guidance on model refinement.
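The likelihood-free idea behind ABC can be sketched with a basic rejection sampler. This is a generic textbook variant under stated assumptions, not the diagnostic Bayes method proposed in the abstract: the "model" is a hypothetical Gaussian with unknown mean, the summary statistic is the sample mean, the prior is uniform on [-5, 5], and all names and settings are illustrative.

```python
import random
import statistics


def simulate_summary(mu, n, rng):
    """Run the toy model once and return its summary statistic (sample mean)."""
    return statistics.fmean([rng.gauss(mu, 1.0) for _ in range(n)])


def abc_rejection(observed_summary, n=20, draws=5000, tolerance=0.2, seed=0):
    """Keep prior draws whose simulated summary lies close to the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a parameter from the prior
        simulated = simulate_summary(mu, n, rng)
        if abs(simulated - observed_summary) <= tolerance:
            accepted.append(mu)  # approximate posterior sample
    return accepted


# Hypothetical observed summary statistic S(\hat{D}) = 1.0.
posterior_sample = abc_rejection(observed_summary=1.0)
posterior_mean = statistics.fmean(posterior_sample)
```

No likelihood function is evaluated anywhere; the comparison of simulated and observed summaries takes its place. The accepted draws approximate P(x|S(\hat{D})), which is exactly the quantity that loses information when the summary statistic is not sufficient.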
Bayesian networks for evaluation of evidence from forensic entomology.
Andersson, M Gunnar; Sundström, Anders; Lindström, Anders
2013-09-01
In the aftermath of a CBRN incident, there is an urgent need to reconstruct events in order to bring the perpetrators to court and to take preventive actions for the future. The challenge is to discriminate, based on available information, between alternative scenarios. Forensic interpretation is used to evaluate to what extent results from the forensic investigation favor the prosecutors' or the defendants' arguments, using the framework of Bayesian hypothesis testing. Recently, several new scientific disciplines have been used in a forensic context. In the AniBioThreat project, the framework was applied to veterinary forensic pathology, tracing of pathogenic microorganisms, and forensic entomology. Forensic entomology is an important tool for estimating the postmortem interval in, for example, homicide investigations as a complement to more traditional methods. In this article we demonstrate the applicability of the Bayesian framework for evaluating entomological evidence in a forensic investigation through the analysis of a hypothetical scenario involving suspect movement of carcasses from a clandestine laboratory. Probabilities of different findings under the alternative hypotheses were estimated using a combination of statistical analysis of data, expert knowledge, and simulation, and entomological findings are used to update the beliefs about the prosecutors' and defendants' hypotheses and to calculate the value of evidence. The Bayesian framework proved useful for evaluating complex hypotheses using findings from several insect species, accounting for uncertainty about development rate, temperature, and precolonization. The applicability of the forensic statistic approach to evaluating forensic results from a CBRN incident is discussed.
Statistical Hypothesis Testing in Intraspecific Phylogeography: NCPA versus ABC
Templeton, Alan R.
2009-01-01
Nested clade phylogeographic analysis (NCPA) and approximate Bayesian computation (ABC) have been used to test phylogeographic hypotheses. Multilocus NCPA tests null hypotheses, whereas ABC discriminates among a finite set of alternatives. The interpretive criteria of NCPA are explicit and allow complex models to be built from simple components. The interpretive criteria of ABC are ad hoc and require the specification of a complete phylogeographic model. The conclusions from ABC are often influenced by implicit assumptions arising from the many parameters needed to specify a complex model. These complex models confound many assumptions so that biological interpretations are difficult. Sampling error is accounted for in NCPA, but ABC ignores important sources of sampling error, which creates pseudo-statistical power. NCPA generates the full sampling distribution of its statistics, but ABC only yields local probabilities, which in turn make it impossible to distinguish between a good fitting model, a non-informative model, and an over-determined model. Both NCPA and ABC use approximations, but convergences of the approximations used in NCPA are well defined whereas those in ABC are not. NCPA can analyze a large number of locations, but ABC cannot. Finally, the dimensionality of the tested hypotheses is known in NCPA, but not in ABC. As a consequence, the “probabilities” generated by ABC are not true probabilities and are statistically non-interpretable. Accordingly, ABC should not be used for hypothesis testing, but simulation approaches are valuable when used in conjunction with NCPA or other methods that do not rely on highly parameterized models. PMID:19192182
NASA Astrophysics Data System (ADS)
Rizzo, D. M.; Fytilis, N.; Stevens, L.
2012-12-01
Environmental managers are increasingly required to monitor and forecast long-term effects and vulnerability of biophysical systems to human-generated stresses. Ideally, a study involving both physical and biological assessments conducted concurrently (in space and time) could provide a better understanding of the mechanisms and complex relationships. However, costs and resources associated with monitoring the complex linkages between the physical, geomorphic and habitat conditions and the biological integrity of stream reaches are prohibitive. Researchers have used classification techniques to place individual streams and rivers into a broader spatial context (hydrologic or health condition). Such efforts require environmental managers to gather multiple forms of information - quantitative, qualitative and subjective. We research and develop a novel classification tool that combines self-organizing maps with a Naïve Bayesian classifier to direct resources to stream reaches most in need. The Vermont Agency of Natural Resources has developed and adopted protocols for physical stream geomorphic and habitat assessments throughout the state of Vermont. Separate from these assessments, the Vermont Department of Environmental Conservation monitors the biological communities and the water quality in streams. Our initial hypothesis is that the geomorphic reach assessments and water quality data may be leveraged to reduce error and uncertainty associated with predictions of biological integrity and stream health. We test our hypothesis using over 2500 Vermont stream reaches (~1371 stream miles) assessed by the two agencies. In the development of this work, we combine a Naïve Bayesian classifier with a modified Kohonen Self-Organizing Map (SOM). The SOM is an unsupervised artificial neural network that autonomously analyzes inherent dataset properties using input data only. It is typically used to cluster data into similar categories when a priori classes do not exist. 
The incorporation of a Bayesian classifier allows one to explicitly incorporate existing knowledge and expert opinion into the data analysis. Since classification plays a leading role in the future development of data-enabled science and engineering, such a computational tool is applicable to a variety of proactive adaptive watershed management applications.
On Bayesian methods of exploring qualitative interactions for targeted treatment.
Chen, Wei; Ghosh, Debashis; Raghunathan, Trivellore E; Norkin, Maxim; Sargent, Daniel J; Bepler, Gerold
2012-12-10
Providing personalized treatments designed to maximize benefits and minimize harms is of tremendous current medical interest. One problem in this area is the evaluation of the interaction between the treatment and other predictor variables. Treatment effects in subgroups having the same direction but different magnitudes are called quantitative interactions, whereas those having opposite directions in subgroups are called qualitative interactions (QIs). Identifying QIs is challenging because they are rare and usually unknown among many potential biomarkers. Meanwhile, subgroup analysis reduces the power of hypothesis testing and multiple subgroup analyses inflate the type I error rate. We propose a new Bayesian approach to search for QIs in a multiple regression setting with adaptive decision rules. We consider various regression models for the outcome. We illustrate this method in two examples of phase III clinical trials. The algorithm is straightforward and easy to implement using existing software packages. We provide sample code in Appendix A. Copyright © 2012 John Wiley & Sons, Ltd.
Meinzer, Caitlyn; Martin, Renee; Suarez, Jose I
2017-09-08
In phase II trials, the most efficacious dose is usually not known. Moreover, given limited resources, it is difficult to robustly identify a dose while also testing for a signal of efficacy that would support a phase III trial. Recent designs have sought to be more efficient by exploring multiple doses through the use of adaptive strategies. However, the added flexibility may potentially increase the risk of making incorrect assumptions and reduce the total amount of information available across the dose range as a function of imbalanced sample size. To balance these challenges, a novel placebo-controlled design is presented in which a restricted Bayesian response adaptive randomization (RAR) is used to allocate a majority of subjects to the optimal dose of active drug, defined as the dose with the lowest probability of poor outcome. However, the allocation between subjects who receive active drug or placebo is held constant to retain the maximum possible power for a hypothesis test of overall efficacy comparing the optimal dose to placebo. The design properties and optimization of the design are presented in the context of a phase II trial for subarachnoid hemorrhage. For a fixed total sample size, a trade-off exists between the ability to select the optimal dose and the probability of rejecting the null hypothesis. This relationship is modified by the allocation ratio between active and control subjects, the choice of RAR algorithm, and the number of subjects allocated to an initial fixed allocation period. While a responsive RAR algorithm improves the ability to select the correct dose, there is an increased risk of assigning more subjects to a worse arm as a function of ephemeral trends in the data. A subarachnoid treatment trial is used to illustrate how this design can be customized for specific objectives and available data. 
Bayesian adaptive designs are a flexible approach to addressing multiple questions surrounding the optimal dose for treatment efficacy within the context of limited resources. While the design is general enough to apply to many situations, future work is needed to address interim analyses and the incorporation of models for dose response.
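The allocation idea can be sketched as follows, with the caveat that the abstract does not give the details of the restricted RAR rule, so this is a plain, unrestricted variant under assumptions chosen for illustration. Each active dose gets a Beta(1, 1) prior on its probability of poor outcome, the probability that a dose is optimal (lowest poor-outcome probability) is estimated by Monte Carlo, and the placebo share is held constant. The function name, the one-third placebo share and the interim counts are all hypothetical.

```python
import random


def rar_allocation(poor, n, placebo_share=1 / 3, draws=20000, seed=0):
    """Return (placebo share, list of allocation shares for each active dose).

    poor[i] and n[i] are the number of poor outcomes and of subjects
    observed so far on active dose i.
    """
    rng = random.Random(seed)
    wins = [0] * len(n)
    for _ in range(draws):
        # One draw from each dose's Beta posterior on P(poor outcome).
        samples = [
            rng.betavariate(1 + poor[i], 1 + n[i] - poor[i])
            for i in range(len(n))
        ]
        wins[samples.index(min(samples))] += 1  # dose with the lowest draw
    p_optimal = [w / draws for w in wins]
    # Active subjects are split in proportion to P(dose is optimal);
    # the active-versus-placebo ratio itself never changes.
    active = [(1 - placebo_share) * p for p in p_optimal]
    return placebo_share, active


# Hypothetical interim data: three doses, ten subjects each.
placebo, active = rar_allocation(poor=[8, 5, 2], n=[10, 10, 10])
```

Holding `placebo_share` fixed mirrors the design goal stated in the abstract of preserving power for the comparison of the optimal dose with placebo, while the adaptive part only redistributes subjects among the active doses.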
The Dopaminergic Midbrain Encodes the Expected Certainty about Desired Outcomes.
Schwartenbeck, Philipp; FitzGerald, Thomas H B; Mathys, Christoph; Dolan, Ray; Friston, Karl
2015-10-01
Dopamine plays a key role in learning; however, its exact function in decision making and choice remains unclear. Recently, we proposed a generic model based on active (Bayesian) inference wherein dopamine encodes the precision of beliefs about optimal policies. Put simply, dopamine discharges reflect the confidence that a chosen policy will lead to desired outcomes. We designed a novel task to test this hypothesis, where subjects played a "limited offer" game in a functional magnetic resonance imaging experiment. Subjects had to decide how long to wait for a high offer before accepting a low offer, with the risk of losing everything if they waited too long. Bayesian model comparison showed that behavior strongly supported active inference, based on surprise minimization, over classical utility maximization schemes. Furthermore, midbrain activity, encompassing dopamine projection neurons, was accurately predicted by trial-by-trial variations in model-based estimates of precision. Our findings demonstrate that human subjects infer both optimal policies and the precision of those inferences, and thus support the notion that humans perform hierarchical probabilistic Bayesian inference. In other words, subjects have to infer both what they should do as well as how confident they are in their choices, where confidence may be encoded by dopaminergic firing. © The Author 2014. Published by Oxford University Press. PMID:25056572
In silico model-based inference: a contemporary approach for hypothesis testing in network biology
Klinke, David J.
2014-01-01
Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model-based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900’s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model-based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. PMID:25139179
In silico model-based inference: a contemporary approach for hypothesis testing in network biology.
Klinke, David J
2014-01-01
Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model-based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model-based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. © 2014 American Institute of Chemical Engineers.
Optimal predictions in everyday cognition: the wisdom of individuals or crowds?
Mozer, Michael C; Pashler, Harold; Homaei, Hadjar
2008-10-01
Griffiths and Tenenbaum (2006) asked individuals to make predictions about the duration or extent of everyday events (e.g., cake baking times), and reported that predictions were optimal, employing Bayesian inference based on veridical prior distributions. Although the predictions conformed strikingly to statistics of the world, they reflect averages over many individuals. On the conjecture that the accuracy of the group response is chiefly a consequence of aggregating across individuals, we constructed simple, heuristic approximations to the Bayesian model premised on the hypothesis that individuals have access merely to a sample of k instances drawn from the relevant distribution. The accuracy of the group response reported by Griffiths and Tenenbaum could be accounted for by supposing that individuals each utilize only two instances. Moreover, the variability of the group data is more consistent with this small-sample hypothesis than with the hypothesis that people utilize veridical or nearly veridical representations of the underlying prior distributions. Our analyses lead to a qualitatively different view of how individuals reason from past experience than the view espoused by Griffiths and Tenenbaum. 2008 Cognitive Science Society, Inc.
Fully Bayesian tests of neutrality using genealogical summary statistics.
Drummond, Alexei J; Suchard, Marc A
2008-10-31
Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size. Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for limited availability of theory and methods.
Sanchez, Gaëtan; Lecaignard, Françoise; Otman, Anatole; Maby, Emmanuel; Mattout, Jérémie
2016-01-01
The relatively young field of Brain-Computer Interfaces has promoted the use of electrophysiology and neuroimaging in real-time. In the meantime, cognitive neuroscience studies, which make extensive use of functional exploration techniques, have evolved toward model-based experiments and fine hypothesis testing protocols. Although these two developments are mostly unrelated, we argue that, brought together, they may trigger an important shift in the way experimental paradigms are being designed, which should prove fruitful to both endeavors. This change simply consists in using real-time neuroimaging in order to optimize advanced neurocognitive hypothesis testing. We refer to this new approach as the instantiation of an Active SAmpling Protocol (ASAP). As opposed to classical (static) experimental protocols, ASAP implements online model comparison, enabling the optimization of design parameters (e.g., stimuli) during the course of data acquisition. This follows the well-known principle of sequential hypothesis testing. What is radically new, however, is our ability to perform online processing of the huge amount of complex data that brain imaging techniques provide. This is all the more relevant at a time when physiological and psychological processes are beginning to be approached using more realistic, generative models which may be difficult to tease apart empirically. Based upon Bayesian inference, ASAP proposes a generic and principled way to optimize experimental design adaptively. In this perspective paper, we summarize the main steps in ASAP. Using synthetic data we illustrate its superiority in selecting the right perceptual model compared to a classical design. Finally, we briefly discuss its future potential for basic and clinical neuroscience as well as some remaining challenges.
Bayes factors for testing inequality constrained hypotheses: Issues with prior specification.
Mulder, Joris
2014-02-01
Several issues are discussed when testing inequality constrained hypotheses using a Bayesian approach. First, the complexity (or size) of the inequality constrained parameter spaces can be ignored. This is the case when using the posterior probability that the inequality constraints of a hypothesis hold, Bayes factors based on non-informative improper priors, and partial Bayes factors based on posterior priors. Second, the Bayes factor may not be invariant for linear one-to-one transformations of the data. This can be observed when using balanced priors which are centred on the boundary of the constrained parameter space with a diagonal covariance structure. Third, the information paradox can be observed. When testing inequality constrained hypotheses, the information paradox occurs when the Bayes factor of an inequality constrained hypothesis against its complement converges to a constant as the evidence for the first hypothesis accumulates while keeping the sample size fixed. This paradox occurs when using Zellner's g prior as a result of too much prior shrinkage. Therefore, two new methods are proposed that avoid these issues. First, partial Bayes factors are proposed based on transformed minimal training samples. These training samples result in posterior priors that are centred on the boundary of the constrained parameter space with the same covariance structure as in the sample. Second, a g prior approach is proposed by letting g go to infinity. This is possible because the Jeffreys-Lindley paradox is not an issue when testing inequality constrained hypotheses. A simulation study indicated that the Bayes factor based on this g prior approach converges fastest to the true inequality constrained hypothesis. © 2013 The British Psychological Society.
Derkarabetian, Shahan; Steinmann, David B.; Hedin, Marshal
2010-01-01
Background: Many cave-dwelling animal species display similar morphologies (troglomorphism) that have evolved convergently within and among lineages under the similar selective pressures imposed by cave habitats. Here we study such ecomorphological evolution in cave-dwelling Sclerobuninae harvestmen (Opiliones) from the western United States, providing general insights into morphological homoplasy, rates of morphological change, and the temporal context of cave evolution. Methodology/Principal Findings: We gathered DNA sequence data from three independent gene regions, and combined these data with Bayesian hypothesis testing, morphometrics analysis, study of penis morphology, and relaxed molecular clock analyses. Using multivariate morphometric analysis, we find that phylogenetically unrelated taxa have convergently evolved troglomorphism; alternative phylogenetic hypotheses involving less morphological convergence are not supported by Bayesian hypothesis testing. In one instance, this morphology is found in specimens from a high-elevation stony debris habitat, suggesting that troglomorphism can evolve in non-cave habitats. We discovered a strong positive relationship between troglomorphy index and relative divergence time, making it possible to predict taxon age from morphology. Most of our time estimates for the origin of highly-troglomorphic cave forms predate the Pleistocene. Conclusions/Significance: While several regions in the eastern and central United States are well-known hotspots for cave evolution, few modern phylogenetic studies have addressed the evolution of cave-obligate species in the western United States. Our integrative studies reveal the recurrent evolution of troglomorphism in a perhaps unexpected geographic region, at surprisingly deep time depths, and in sometimes surprising habitats. Because some newly discovered troglomorphic populations represent undescribed species, our findings stress the need for further biological exploration, integrative systematic research, and conservation efforts in western US cave habitats. PMID:20479884
NASA Technical Reports Server (NTRS)
Denning, Peter J.
1989-01-01
In 1983 and 1984, the Infrared Astronomical Satellite (IRAS) detected 5,425 stellar objects and measured their infrared spectra. In 1987 a program called AUTOCLASS used Bayesian inference methods to discover the classes present in these data and determine the most probable class of each object, revealing unknown phenomena in astronomy. AUTOCLASS has rekindled the old debate on the suitability of Bayesian methods, which are computationally intensive, interpret probabilities as plausibility measures rather than frequencies, and appear to depend on a subjective assessment of the probability of a hypothesis before the data were collected. Modern statistical methods have, however, recently been shown to also depend on subjective elements. These debates bring into question the whole tradition of scientific objectivity and offer scientists a new way to take responsibility for their findings and conclusions.
A Bayesian Nonparametric Approach to Test Equating
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2009-01-01
A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…
Furtado-Junior, I; Abrunhosa, F A; Holanda, F C A F; Tavares, M C S
2016-06-01
Fishing selectivity of the mangrove crab Ucides cordatus on the north coast of Brazil can be defined as the fisherman's ability to capture and select individuals of a certain size or sex (or a combination of these factors), which suggests an empirical selectivity. Considering this hypothesis, we calculated the selectivity curves for male and female crabs using the logit function of the logistic model in the formulation. The Bayesian inference consisted of obtaining the posterior distribution by applying the Markov chain Monte Carlo (MCMC) method in R using the OpenBUGS, BRugs, and R2WinBUGS libraries. The estimated average carapace widths at selection for males and females, compared with previous studies reporting the average carapace width at sexual maturity, allow us to confirm the hypothesis that most mature individuals do not suffer from fishing pressure, thus ensuring their sustainability.
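As a rough illustration of a logistic selectivity curve of the kind this abstract mentions, the sketch below evaluates a two-parameter logit model and the carapace width at which half of the individuals are retained. The parameterization logit(S) = a + b * width, the function names, and the numeric values are assumptions chosen for illustration; they are not estimates from the study, which fitted its model by MCMC.

```python
import math


def selectivity(width, a, b):
    """Probability of capture at a given carapace width under a logit model."""
    return 1.0 / (1.0 + math.exp(-(a + b * width)))


def width_at_half_selection(a, b):
    """Width at which selectivity equals 0.5, i.e. where a + b * width = 0."""
    return -a / b


# Hypothetical parameters (widths in mm): half selection at 60 mm.
a, b = -12.0, 0.2
w50 = width_at_half_selection(a, b)
curve = [(w, selectivity(w, a, b)) for w in range(40, 81, 10)]
```

In a Bayesian fit, `a` and `b` would be posterior draws rather than fixed numbers, so `width_at_half_selection` could be applied draw by draw to obtain a posterior distribution for the width at 50% selection.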
Bayesian Dose-Response Modeling in Sparse Data
NASA Astrophysics Data System (ADS)
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, we discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory.
Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a wrong parametric assumption. In this regard, we consider a robust experimental design which does not require any parametric assumption.
Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests
ERIC Educational Resources Information Center
Veldkamp, Bernard P.
2010-01-01
Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…
A Bayesian test for Hardy–Weinberg equilibrium of biallelic X-chromosomal markers
Puig, X; Ginebra, J; Graffelman, J
2017-01-01
The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy–Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy–Weinberg equilibrium with biallelic markers at the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male to female allele frequency ratio are computed, and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test, and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used. PMID:28900292
Vemulapalli, Vijetha; Qu, Jiaqi; Garren, Jeonifer M; Rodrigues, Leonardo O; Kiebish, Michael A; Sarangarajan, Rangaprasad; Narain, Niven R; Akmaev, Viatcheslav R
2016-11-01
Given the availability of extensive digitized healthcare data from medical records, claims and prescription information, it is now possible to use hypothesis-free, data-driven approaches to mine medical databases for novel insight. The goal of this analysis was to demonstrate the use of artificial intelligence based methods such as Bayesian networks to open up opportunities for creation of new knowledge in management of chronic conditions. Hospital level Medicare claims data containing discharge numbers for most common diagnoses were analyzed in a hypothesis-free manner using Bayesian networks learning methodology. While many interactions identified between discharge rates of diagnoses using this data set are supported by current medical knowledge, a novel interaction linking asthma and renal failure was discovered. This interaction is non-obvious and had not been looked at by the research and clinical communities in epidemiological or clinical data. A plausible pharmacological explanation of this link is proposed together with a verification of the risk significance by conventional statistical analysis. Potential clinical and molecular pathways defining the relationship between commonly used asthma medications and renal disease are discussed. The study underscores the need for further epidemiological research to validate this novel hypothesis. Validation will lead to advancement in clinical treatment of asthma and bronchitis, thereby improving patient outcomes and leading to long term cost savings. In summary, this study demonstrates that application of advanced artificial intelligence methods in healthcare has the potential to enhance the quality of care by discovering non-obvious, clinically relevant relationships and enabling timely care intervention. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Renes, Joseph M.
2017-10-01
We extend the recent bounds of Sason and Verdú relating Rényi entropy and Bayesian hypothesis testing (arXiv:1701.01974) to the quantum domain and show that they have a number of different applications. First, we obtain a sharper bound relating the optimal probability of correctly distinguishing elements of an ensemble of states to that of the pretty good measurement, and an analogous bound for optimal and pretty good entanglement recovery. Second, we obtain bounds relating optimal guessing and entanglement recovery to the fidelity of the state with a product state, which then leads to tight tripartite uncertainty and monogamy relations.
Visualizing the Bayesian 2-test case: The effect of tree diagrams on medical decision making.
Binder, Karin; Krauss, Stefan; Bruckmaier, Georg; Marienhagen, Jörg
2018-01-01
In medicine, diagnoses based on medical test results are probabilistic by nature. Unfortunately, cognitive illusions regarding the statistical meaning of test results are well documented among patients, medical students, and even physicians. There are two effective strategies that can foster insight into what is known as Bayesian reasoning situations: (1) translating the statistical information on the prevalence of a disease and the sensitivity and the false-alarm rate of a specific test for that disease from probabilities into natural frequencies, and (2) illustrating the statistical information with tree diagrams, for instance, or with other pictorial representation. So far, such strategies have only been empirically tested in combination for "1-test cases", where one binary hypothesis ("disease" vs. "no disease") has to be diagnosed based on one binary test result ("positive" vs. "negative"). However, in reality, often more than one medical test is conducted to derive a diagnosis. In two studies, we examined a total of 388 medical students from the University of Regensburg (Germany) with medical "2-test scenarios". Each student had to work on two problems: diagnosing breast cancer with mammography and sonography test results, and diagnosing HIV infection with the ELISA and Western Blot tests. In Study 1 (N = 190 participants), we systematically varied the presentation of statistical information ("only textual information" vs. "only tree diagram" vs. "text and tree diagram in combination"), whereas in Study 2 (N = 198 participants), we varied the kinds of tree diagrams ("complete tree" vs. "highlighted tree" vs. "pruned tree"). All versions were implemented in probability format (including probability trees) and in natural frequency format (including frequency trees). We found that natural frequency trees, especially when the question-related branches were highlighted, improved performance, but that none of the corresponding probabilistic visualizations did.
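The arithmetic behind a "2-test case" can be sketched in both formats the abstract contrasts. The example below is hedged: the prevalence, sensitivities, and false-alarm rates are hypothetical round numbers, not the values used in the studies, and the two tests are assumed to be conditionally independent given disease status. `Fraction` is used so the probability format and the natural frequency format can be compared exactly.

```python
from fractions import Fraction


def posterior_two_positive_tests(prevalence, sens1, fa1, sens2, fa2):
    """P(disease | both tests positive), probability format (Bayes' rule)."""
    true_pos = prevalence * sens1 * sens2
    false_pos = (1 - prevalence) * fa1 * fa2
    return true_pos / (true_pos + false_pos)


def frequency_tree(population, prevalence, sens1, fa1, sens2, fa2):
    """Natural frequency format: counts on the two 'positive, positive' branches."""
    sick = population * prevalence
    healthy = population - sick
    sick_both_positive = sick * sens1 * sens2
    healthy_both_positive = healthy * fa1 * fa2
    return sick_both_positive, healthy_both_positive


# Hypothetical values: 1% prevalence; test 1: 80% sensitivity, 10% false alarms;
# test 2: 95% sensitivity, 5% false alarms.
args = (Fraction(1, 100), Fraction(8, 10), Fraction(1, 10),
        Fraction(95, 100), Fraction(5, 100))
posterior = posterior_two_positive_tests(*args)
sick_pp, healthy_pp = frequency_tree(100_000, *args)
```

With these assumed numbers, 760 of 100,000 people are sick and test positive twice, and 495 are healthy and test positive twice, so the answer can be read off the frequency tree as 760 out of 1,255, the same value the probability formula gives.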
Gravity dependence of the effect of optokinetic stimulation on the subjective visual vertical.
Ward, Bryan K; Bockisch, Christopher J; Caramia, Nicoletta; Bertolini, Giovanni; Tarnutzer, Alexander Andrea
2017-05-01
Accurate and precise estimates of direction of gravity are essential for spatial orientation. According to Bayesian theory, multisensory vestibular, visual, and proprioceptive input is centrally integrated in a weighted fashion based on the reliability of the component sensory signals. For otolithic input, a decreasing signal-to-noise ratio was demonstrated with increasing roll angle. We hypothesized that the weights of vestibular (otolithic) and extravestibular (visual/proprioceptive) sensors are roll-angle dependent and predicted an increased weight of extravestibular cues with increasing roll angle, potentially following the Bayesian hypothesis. To probe this concept, the subjective visual vertical (SVV) was assessed in different roll positions (≤ ±120°, steps = 30°, n = 10) with/without presenting an optokinetic stimulus (velocity = ±60°/s). The optokinetic stimulus biased the SVV toward the direction of stimulus rotation for roll angles ≥ ±30° (P < 0.005). Offsets grew from 3.9 ± 1.8° (upright) to 22.1 ± 11.8° (±120° roll tilt, P < 0.001). Trial-to-trial variability increased with roll angle, demonstrating a nonsignificant increase when providing optokinetic stimulation. Variability and optokinetic bias were correlated (R² = 0.71, slope = 0.71, 95% confidence interval = 0.57-0.86). An optimal-observer model combining an optokinetic bias with vestibular input reproduced measured errors closely. These findings support the hypothesis of a weighted multisensory integration when estimating direction of gravity with optokinetic stimulation. Visual input was weighted more when vestibular input became less reliable, i.e., at larger roll-tilt angles. However, according to Bayesian theory, the variability of combined cues is always lower than the variability of each source cue. If the observed increase in variability, although nonsignificant, is true, either it must depend on an additional source of variability, added after SVV computation, or it would conflict with the Bayesian hypothesis. NEW & NOTEWORTHY: Applying a rotating optokinetic stimulus while recording the subjective visual vertical in different whole body roll angles, we noted the optokinetic-induced bias to correlate with the roll angle. These findings allow the hypothesis that the established optimal weighting of single-sensory cues depending on their reliability to estimate direction of gravity could be extended to a bias caused by visual self-motion stimuli. Copyright © 2017 the American Physiological Society.
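The reliability weighting invoked in this abstract can be illustrated with the standard textbook rule for combining independent Gaussian cues, in which each cue is weighted by its precision (inverse variance). This is a generic sketch, not the authors' optimal-observer model; the function name and the numbers are hypothetical.

```python
def combine_cues(estimates, variances):
    """Precision-weighted combination of independent Gaussian cues.

    Returns the combined estimate, its variance, and the weights used.
    """
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    weights = [p / total for p in precisions]
    combined = sum(w * e for w, e in zip(weights, estimates))
    # The combined variance is never larger than the smallest input variance.
    combined_variance = 1.0 / total
    return combined, combined_variance, weights


# Hypothetical cues in degrees: a noisy vestibular cue and a sharper visual cue.
estimate, variance, weights = combine_cues([0.0, 10.0], [4.0, 1.0])
```

Under these assumptions the less reliable cue receives the smaller weight and the combined variance falls below both input variances, which is the property the abstract appeals to when it notes that increased variability would conflict with the Bayesian hypothesis.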
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. 
We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls that have plagued previous theoretical movements.
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood
ERIC Educational Resources Information Center
Karabatsos, George
2017-01-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon…
Test of a hypothesis of realism in quantum theory using a Bayesian approach
NASA Astrophysics Data System (ADS)
Nikitin, N.; Toms, K.
2017-05-01
In this paper we propose a time-independent equality and time-dependent inequality, suitable for an experimental test of the hypothesis of realism. The derivation of these relations is based on the concept of conditional probability and on Bayes' theorem in the framework of Kolmogorov's axiomatics of probability theory. The equality obtained is intrinsically different from the well-known Greenberger-Horne-Zeilinger (GHZ) equality and its variants, because violation of the proposed equality might be tested in experiments with only two microsystems in a maximally entangled Bell state |Ψ⁻⟩, while a test of the GHZ equality requires at least three quantum systems in a special state |Ψ_GHZ⟩. The obtained inequality differs from Bell's, Wigner's, and Leggett-Garg inequalities, because it deals with spin s = 1/2 projections onto only two nonparallel directions at two different moments of time, while a test of the Bell and Wigner inequalities requires at least three nonparallel directions, and a test of the Leggett-Garg inequalities requires at least three distinct moments of time. Hence, the proposed inequality seems to open an additional experimental possibility to avoid the "contextuality loophole." Violation of the proposed equality and inequality is illustrated with the behavior of a pair of anticorrelated spins in an external magnetic field and also with the oscillations of flavor-entangled pairs of neutral pseudoscalar mesons.
NASA Astrophysics Data System (ADS)
Armal, S.; Devineni, N.; Khanbilvardi, R.
2017-12-01
This study presents a systematic analysis for identifying and attributing trends in the annual frequency of extreme rainfall events across the contiguous United States to climate change and climate variability modes. A Bayesian multilevel model is developed for 1,244 stations simultaneously to test the null hypothesis of no trend and verify two alternate hypotheses: Trend can be attributed to changes in global surface temperature anomalies, or to a combination of cyclical climate modes with varying quasi-periodicities and global surface temperature anomalies. The Bayesian multilevel model provides the opportunity to pool information across stations and reduce the parameter estimation uncertainty, hence identifying the trends better. The choice of the best alternate hypotheses is made based on Watanabe-Akaike Information Criterion, a Bayesian pointwise predictive accuracy measure. Statistically significant time trends are observed in 742 of the 1,244 stations. Trends in 409 of these stations can be attributed to changes in global surface temperature anomalies. These stations are predominantly found in the Southeast and Northeast climate regions. The trends in 274 of these stations can be attributed to the El Nino Southern Oscillations, North Atlantic Oscillation, Pacific Decadal Oscillation and Atlantic Multi-Decadal Oscillation along with changes in global surface temperature anomalies. These stations are mainly found in the Northwest, West and Southwest climate regions.
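The Watanabe-Akaike Information Criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition on the deviance scale, WAIC = -2 (lppd - p_waic), with the penalty taken as the per-observation sample variance of the log-likelihood across posterior draws. The function name `waic` and the input layout are illustrative assumptions, not taken from the study.

```python
import math
from statistics import variance
from typing import Sequence


def waic(log_lik: Sequence[Sequence[float]]) -> float:
    """WAIC on the deviance scale (lower is better).

    log_lik[s][i] is the log-likelihood of observation i under
    posterior draw s (hypothetical input layout for this sketch).
    """
    n_draws = len(log_lik)
    if n_draws < 2:
        raise ValueError("need at least two posterior draws")
    n_obs = len(log_lik[0])

    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance form)
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        peak = max(column)
        # log of the mean likelihood over draws, computed stably
        lppd += peak + math.log(sum(math.exp(v - peak) for v in column) / n_draws)
        # sample variance of the log-likelihood over draws
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic)
```

Under this convention, competing hypotheses fitted to the same data would be compared by their WAIC values, with the smaller value indicating better estimated pointwise predictive accuracy.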
Monden, Rei; de Vos, Stijn; Morey, Richard; Wagenmakers, Eric-Jan; de Jonge, Peter; Roest, Annelieke M
2016-12-01
The Food and Drug Administration (FDA) uses a p < 0.05 null-hypothesis significance testing framework to evaluate "substantial evidence" for drug efficacy. This framework only allows dichotomous conclusions and does not quantify the strength of evidence supporting efficacy. The efficacy of FDA-approved antidepressants for the treatment of anxiety disorders was re-evaluated in a Bayesian framework that quantifies the strength of the evidence. Data from 58 double-blind placebo-controlled trials were retrieved from the FDA for the second-generation antidepressants for the treatment of anxiety disorders. Bayes factors (BFs) were calculated for all treatment arms compared to placebo and were compared with the corresponding p-values and the FDA conclusion categories. BFs ranged from 0.07 to 131,400, indicating a range from no support to strong evidence for efficacy. Results also indicate a varying strength of evidence between the trials with p < 0.05. In sum, there were large differences in BFs across trials. Among trials providing "substantial evidence" according to the FDA, only 27 out of 59 dose groups obtained strong support for efficacy according to the typically used cutoff of BF ≥ 20. The Bayesian framework can provide valuable information on the strength of the evidence for drug efficacy. Copyright © 2016 John Wiley & Sons, Ltd.
Bayesian evidence for non-zero θ₁₃ and CP-violation in neutrino oscillations
NASA Astrophysics Data System (ADS)
Bergström, Johannes
2012-08-01
We present the Bayesian method for evaluating the evidence for a non-zero value of the leptonic mixing angle θ₁₃ and CP-violation in neutrino oscillation experiments. This is an application of the well-established method of Bayesian model selection, of which we give a concise and pedagogical overview. When comparing the hypothesis θ₁₃ = 0 with hypotheses where θ₁₃ > 0 using global data but excluding the recent reactor measurements, we obtain only a weak preference for a non-zero θ₁₃, even though the significance is over 3σ. We then add the reactor measurements one by one and show how the evidence for θ₁₃ > 0 quickly increases. When including the Double Chooz, Daya Bay, and RENO data, the evidence becomes overwhelming with a posterior probability of the hypothesis θ₁₃ = 0 below 10⁻¹¹. Owing to the small amount of information on the CP-phase δ, very similar evidences are obtained for the CP-conserving and CP-violating hypotheses. Hence, there is, not unexpectedly, neither evidence for nor against leptonic CP-violation. However, when future experiments aiming to search for CP-violation have started taking data, this question will be of great importance and the method described here can be used as an important complement to standard analyses.
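In Bayesian model selection, a posterior probability such as the one quoted for the null hypothesis follows from the Bayes factor and the prior model probabilities. The sketch below shows that conversion for two competing hypotheses. The function name `posterior_prob_null` and the default of equal prior probabilities are assumptions made for illustration, not details taken from the paper.

```python
def posterior_prob_null(bf10: float, prior_null: float = 0.5) -> float:
    """Posterior probability of H0 from the Bayes factor BF10.

    BF10 = p(data | H1) / p(data | H0). Only H0 and H1 are considered,
    so their prior probabilities sum to one (an assumption of this sketch).
    """
    if bf10 < 0:
        raise ValueError("a Bayes factor cannot be negative")
    if not 0.0 < prior_null < 1.0:
        raise ValueError("prior_null must lie strictly between 0 and 1")
    prior_odds_10 = (1.0 - prior_null) / prior_null
    posterior_odds_10 = bf10 * prior_odds_10  # posterior odds = BF x prior odds
    return 1.0 / (1.0 + posterior_odds_10)
```

With equal priors, a Bayes factor of 1 leaves the null at probability 0.5, and a Bayes factor of 20 in favour of the alternative leaves it at 1/21.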
Status and Power Do Not Modulate Automatic Imitation of Intransitive Hand Movements
Farmer, Harry; Carr, Evan W.; Svartdal, Marita; Winkielman, Piotr; Hamilton, Antonia F. de C.
2016-01-01
The tendency to mimic the behaviour of others is affected by a variety of social factors, and it has been argued that such “mirroring” is often unconsciously deployed as a means of increasing affiliation during interpersonal interactions. However, the relationship between automatic motor imitation and status/power is currently unclear. This paper reports five experiments that investigated whether social status (Experiments 1, 2, and 3) or power (Experiments 4 and 5) had a moderating effect on automatic imitation (AI) in finger-movement tasks, using a series of different manipulations. Experiments 1 and 2 manipulated the social status of the observed person using an associative learning task. Experiment 3 manipulated social status via perceived competence at a simple computer game. Experiment 4 manipulated participants’ power (relative to the actors) in a card-choosing task. Finally, Experiment 5 primed participants using a writing task, to induce the sense of being powerful or powerless. No significant interactions were found between congruency and social status/power in any of the studies. Additionally, Bayesian hypothesis testing indicated that the null hypothesis should be favoured over the experimental hypothesis in all five studies. These findings are discussed in terms of their implications for AI tasks, social effects on mimicry, and the hypothesis of mimicry as a strategic mechanism to promote affiliation. PMID:27096167
Ravinet, Mark; Harrod, Chris; Eizaguirre, Christophe; Prodöhl, Paulo A
2014-06-01
Repeated recolonization of freshwater environments following Pleistocene glaciations has played a major role in the evolution and adaptation of anadromous taxa. Located at the western fringe of Europe, Ireland and Britain were likely recolonized rapidly by anadromous fishes from the North Atlantic following the last glacial maximum (LGM). While the presence of unique mitochondrial haplotypes in Ireland suggests that a cryptic northern refugium may have played a role in recolonization, no explicit test of this hypothesis has been conducted. The three-spined stickleback is native and ubiquitous to aquatic ecosystems throughout Ireland, making it an excellent model species with which to examine the biogeographical history of anadromous fishes in the region. We used mitochondrial and microsatellite markers to examine the presence of divergent evolutionary lineages and to assess broad-scale patterns of geographical clustering among postglacially isolated populations. Our results confirm that Ireland is a region of secondary contact for divergent mitochondrial lineages and that endemic haplotypes occur in populations in Central and Southern Ireland. To test whether a putative Irish lineage arose from a cryptic Irish refugium, we used approximate Bayesian computation (ABC). However, we found no support for this hypothesis. Instead, the Irish lineage likely diverged from the European lineage as a result of postglacial isolation of freshwater populations by rising sea levels. These findings emphasize the need to rigorously test biogeographical hypotheses and contribute further evidence that postglacial processes may have shaped genetic diversity in temperate fauna. PMID:25360281
Robust constraint on cosmic textures from the cosmic microwave background.
Feeney, Stephen M; Johnson, Matthew C; Mortlock, Daniel J; Peiris, Hiranya V
2012-06-15
Fluctuations in the cosmic microwave background (CMB) contain information which has been pivotal in establishing the current cosmological model. These data can also be used to test well-motivated additions to this model, such as cosmic textures. Textures are a type of topological defect that can be produced during a cosmological phase transition in the early Universe, and which leave characteristic hot and cold spots in the CMB. We apply Bayesian methods to carry out a rigorous test of the texture hypothesis, using full-sky data from the Wilkinson Microwave Anisotropy Probe. We conclude that current data do not warrant augmenting the standard cosmological model with textures. We rule out at 95% confidence models that predict more than 6 detectable cosmic textures on the full sky.
When decision heuristics and science collide.
Yu, Erica C; Sprenger, Amber M; Thomas, Rick P; Dougherty, Michael R
2014-04-01
The ongoing discussion among scientists about null-hypothesis significance testing and Bayesian data analysis has led to speculation about the practices and consequences of "researcher degrees of freedom." This article advances this debate by asking the broader questions that we, as scientists, should be asking: How do scientists make decisions in the course of doing research, and what is the impact of these decisions on scientific conclusions? We asked practicing scientists to collect data in a simulated research environment, and our findings show that some scientists use data collection heuristics that deviate from prescribed methodology. Monte Carlo simulations show that data collection heuristics based on p values lead to biases in estimated effect sizes and Bayes factors and to increases in both false-positive and false-negative rates, depending on the specific heuristic. We also show that using Bayesian data collection methods does not eliminate these biases. Thus, our study highlights the little appreciated fact that the process of doing science is a behavioral endeavor that can bias statistical description and inference in a manner that transcends adherence to any particular statistical framework.
A Bayesian Approach to Model Selection in Hierarchical Mixtures-of-Experts Architectures.
Tanner, Martin A.; Peng, Fengchun; Jacobs, Robert A.
1997-03-01
There does not exist a statistical model that shows good performance on all tasks. Consequently, the model selection problem is unavoidable; investigators must decide which model is best at summarizing the data for each task of interest. This article presents an approach to the model selection problem in hierarchical mixtures-of-experts architectures. These architectures combine aspects of generalized linear models with those of finite mixture models in order to perform tasks via a recursive "divide-and-conquer" strategy. Markov chain Monte Carlo methodology is used to estimate the distribution of the architectures' parameters. One part of our approach to model selection attempts to estimate the worth of each component of an architecture so that relatively unused components can be pruned from the architecture's structure. A second part of this approach uses a Bayesian hypothesis testing procedure in order to differentiate inputs that carry useful information from nuisance inputs. Simulation results suggest that the approach presented here adheres to the dictum of Occam's razor; simple architectures that are adequate for summarizing the data are favored over more complex structures. Copyright 1997 Elsevier Science Ltd. All Rights Reserved.
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
Dembo, Mana; Radovčić, Davorka; Garvin, Heather M; Laird, Myra F; Schroeder, Lauren; Scott, Jill E; Brophy, Juliet; Ackermann, Rebecca R; Musiba, Chares M; de Ruiter, Darryl J; Mooers, Arne Ø; Collard, Mark
2016-08-01
Homo naledi is a recently discovered species of fossil hominin from South Africa. A considerable amount is already known about H. naledi but some important questions remain unanswered. Here we report a study that addressed two of them: "Where does H. naledi fit in the hominin evolutionary tree?" and "How old is it?" We used a large supermatrix of craniodental characters for both early and late hominin species and Bayesian phylogenetic techniques to carry out three analyses. First, we performed a dated Bayesian analysis to generate estimates of the evolutionary relationships of fossil hominins including H. naledi. Then we employed Bayes factor tests to compare the strength of support for hypotheses about the relationships of H. naledi suggested by the best-estimate trees. Lastly, we carried out a resampling analysis to assess the accuracy of the age estimate for H. naledi yielded by the dated Bayesian analysis. The analyses strongly supported the hypothesis that H. naledi forms a clade with the other Homo species and Australopithecus sediba. The analyses were more ambiguous regarding the position of H. naledi within the (Homo, Au. sediba) clade. A number of hypotheses were rejected, but several others were not. Based on the available craniodental data, Homo antecessor, Asian Homo erectus, Homo habilis, Homo floresiensis, Homo sapiens, and Au. sediba could all be the sister taxon of H. naledi. According to the dated Bayesian analysis, the most likely age for H. naledi is 912 ka. This age estimate was supported by the resampling analysis. Our findings have a number of implications. Most notably, they support the assignment of the new specimens to Homo, cast doubt on the claim that H. naledi is simply a variant of H. erectus, and suggest H. naledi is younger than has been previously proposed. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bayesian models for comparative analysis integrating phylogenetic uncertainty.
de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P
2012-06-28
Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602
Development of dynamic Bayesian models for web application test management
NASA Astrophysics Data System (ADS)
Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.
2018-03-01
The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.
NASA Astrophysics Data System (ADS)
Goodman, Steven N.
1989-11-01
This dissertation explores the use of a mathematical measure of statistical evidence, the log likelihood ratio, in clinical trials. The methods and thinking behind the use of an evidential measure are contrasted with traditional methods of analyzing data, which depend primarily on a p-value as an estimate of the statistical strength of an observed data pattern. It is contended that neither the behavioral dictates of Neyman-Pearson hypothesis testing methods, nor the coherency dictates of Bayesian methods are realistic models on which to base inference. The use of the likelihood alone is applied to four aspects of trial design or conduct: the calculation of sample size, the monitoring of data, testing for the equivalence of two treatments, and meta-analysis--the combining of results from different trials. Finally, a more general model of statistical inference, using belief functions, is used to see if it is possible to separate the assessment of evidence from our background knowledge. It is shown that traditional and Bayesian methods can be modeled as two ends of a continuum of structured background knowledge, methods which summarize evidence at the point of maximum likelihood assuming no structure, and Bayesian methods assuming complete knowledge. Both schools are seen to be missing a concept of ignorance--uncommitted belief. This concept provides the key to understanding the problem of sampling to a foregone conclusion and the role of frequency properties in statistical inference. The conclusion is that statistical evidence cannot be defined independently of background knowledge, and that frequency properties of an estimator are an indirect measure of uncommitted belief. Several likelihood summaries need to be used in clinical trials, with the quantitative disparity between summaries being an indirect measure of our ignorance. This conclusion is linked with parallel ideas in the philosophy of science and cognitive psychology.
A Bayesian network approach for modeling local failure in lung cancer
NASA Astrophysics Data System (ADS)
Oh, Jung Hun; Craft, Jeffrey; Lozi, Rawan Al; Vaidya, Manushka; Meng, Yifan; Deasy, Joseph O.; Bradley, Jeffrey D.; El Naqa, Issam
2011-03-01
Locally advanced non-small cell lung cancer (NSCLC) patients suffer from a high local failure rate following radiotherapy. Despite many efforts to develop new dose-volume models for early detection of tumor local failure, there was no reported significant improvement in their application prospectively. Based on recent studies of biomarker proteins' role in hypoxia and inflammation in predicting tumor response to radiotherapy, we hypothesize that combining physical and biological factors with a suitable framework could improve the overall prediction. To test this hypothesis, we propose a graphical Bayesian network framework for predicting local failure in lung cancer. The proposed approach was tested using two different datasets of locally advanced NSCLC patients treated with radiotherapy. The first dataset was collected retrospectively, which comprises clinical and dosimetric variables only. The second dataset was collected prospectively in which in addition to clinical and dosimetric information, blood was drawn from the patients at various time points to extract candidate biomarkers as well. Our preliminary results show that the proposed method can be used as an efficient method to develop predictive models of local failure in these patients and to interpret relationships among the different variables in the models. We also demonstrate the potential use of heterogeneous physical and biological variables to improve the model prediction. With the first dataset, we achieved better performance compared with competing Bayesian-based classifiers. With the second dataset, the combined model had a slightly higher performance compared to individual physical and biological models, with the biological variables making the largest contribution. Our preliminary results highlight the potential of the proposed integrated approach for predicting post-radiotherapy local failure in NSCLC patients.
Advances in Significance Testing for Cluster Detection
NASA Astrophysics Data System (ADS)
Coleman, Deidra Andrea
Over the past two decades, much attention has been given to data-driven project goals such as the Human Genome Project and the development of syndromic surveillance systems. A major component of these types of projects is analyzing the abundance of data. Detecting clusters within the data can be beneficial as it can lead to the identification of specified sequences of DNA nucleotides that are related to important biological functions or the locations of epidemics such as disease outbreaks or bioterrorism attacks. Cluster detection techniques require efficient and accurate hypothesis testing procedures. In this dissertation, we improve upon the hypothesis testing procedures for cluster detection by enhancing distributional theory and providing an alternative method for spatial cluster detection using syndromic surveillance data. In Chapter 2, we provide an efficient method to compute the exact distribution of the number and coverage of h-clumps of a collection of words. This method involves defining a Markov chain using a minimal deterministic automaton to reduce the number of states needed for computation. We allow words of the collection to contain other words of the collection making the method more general. We use our method to compute the distributions of the number and coverage of h-clumps in the Chi motif of H. influenzae. In Chapter 3, we provide an efficient algorithm to compute the exact distribution of multiple window discrete scan statistics for higher-order, multi-state Markovian sequences. This algorithm involves defining a Markov chain to efficiently keep track of probabilities needed to compute p-values of the statistic. We use our algorithm to identify cases where the available approximation does not perform well. We also use our algorithm to detect unusual clusters of made free throw shots by National Basketball Association players during the 2009-2010 regular season.
In Chapter 4, we give a procedure to detect outbreaks using syndromic surveillance data while controlling the Bayesian False Discovery Rate (BFDR). The procedure entails choosing an appropriate Bayesian model that captures the spatial dependency inherent in epidemiological data and considers all days of interest, selecting a test statistic based on a chosen measure that provides the magnitude of the maximal spatial cluster for each day, and identifying a cutoff value that controls the BFDR for rejecting the collective null hypothesis of no outbreak over a collection of days for a specified region. We use our procedure to analyze botulism-like syndrome data collected by the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT).
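The idea of choosing a cutoff that controls the Bayesian False Discovery Rate can be sketched generically. The example below is not the dissertation's procedure, which works through a spatial model and a cluster-magnitude statistic. It assumes instead that a posterior probability of the null (no outbreak) is already available for each day, and it rejects the largest set of days whose average posterior null probability stays at or below a target level. The name `bfdr_reject` and the input format are hypothetical.

```python
from typing import List, Sequence


def bfdr_reject(post_null: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices of hypotheses rejected while keeping the estimated BFDR <= alpha.

    post_null[i] is the posterior probability that null hypothesis i is true
    (an assumed input for this sketch). The estimated BFDR of a rejection set
    is the mean of post_null over that set.
    """
    # Consider hypotheses from least to most plausible null.
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])
    rejected: List[int] = []
    running_sum = 0.0
    for count, idx in enumerate(order, start=1):
        running_sum += post_null[idx]
        # The running mean is non-decreasing, so the first exceedance is final.
        if running_sum / count > alpha:
            break
        rejected.append(idx)
    return sorted(rejected)
```

For posterior null probabilities of 0.01, 0.90, 0.03, 0.20 and 0.02 with a target of 0.05, the sketch flags the first, third and fifth days, since adding the day with probability 0.20 would push the average to 0.065.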
Watts, Joseph; Greenhill, Simon J.; Atkinson, Quentin D.; Currie, Thomas E.; Bulbulia, Joseph; Gray, Russell D.
2015-01-01
Supernatural belief presents an explanatory challenge to evolutionary theorists: it is both costly and prevalent. One influential functional explanation claims that the imagined threat of supernatural punishment can suppress selfishness and enhance cooperation. Specifically, morally concerned supreme deities or ‘moralizing high gods’ (MHGs) have been argued to reduce free-riding in large social groups, enabling believers to build the kind of complex societies that define modern humanity. Previous cross-cultural studies claiming to support the MHG hypothesis rely on correlational analyses only and do not correct for the statistical non-independence of sampled cultures. Here we use a Bayesian phylogenetic approach with a sample of 96 Austronesian cultures to test the MHG hypothesis as well as an alternative supernatural punishment hypothesis that allows punishment by a broad range of moralizing agents. We find evidence that broad supernatural punishment drives political complexity, whereas MHGs follow political complexity. We suggest that the concept of MHGs diffused as part of a suite of traits arising from cultural exchange between complex societies. Our results show the power of phylogenetic methods to address long-standing debates about the origins and functions of religion in human society. PMID:25740888
Deductive Updating Is Not Bayesian
ERIC Educational Resources Information Center
Markovits, Henry; Brisson, Janie; de Chantal, Pier-Luc
2015-01-01
One of the major debates concerning the nature of inferential reasoning is between counterexample-based theories such as mental model theory and probabilistic theories. This study looks at conclusion updating after the addition of statistical information to examine the hypothesis that deductive reasoning cannot be explained by probabilistic…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benetti, Micol; Alcaniz, Jailson S.; Landau, Susana J., E-mail: micolbenetti@on.br, E-mail: slandau@df.uba.ar, E-mail: alcaniz@on.br
The hypothesis of the self-induced collapse of the inflaton wave function was proposed as responsible for the emergence of inhomogeneity and anisotropy at all scales. This proposal was studied within an almost de Sitter space-time approximation for the background, which led to a perfect scale-invariant power spectrum, and also for a quasi-de Sitter background, which allows one to distinguish departures from the standard approach due to the inclusion of the collapse hypothesis. In this work we perform a Bayesian model comparison for two different choices of the self-induced collapse in a full quasi-de Sitter expansion scenario. In particular, we analyze the possibility of detecting the imprint of these collapse schemes at low multipoles of the anisotropy temperature power spectrum of the Cosmic Microwave Background (CMB) using the most recent data provided by the Planck Collaboration. Our results show that one of the two collapse schemes analyzed provides the same Bayesian evidence as the minimal standard cosmological model ΛCDM, while the other scenario is weakly disfavoured with respect to the standard cosmology.
Gottscho, Andrew D; Marks, Sharyn B; Jennings, W Bryan
2014-01-01
The North American deserts were impacted by both Neogene plate tectonics and Quaternary climatic fluctuations, yet it remains unclear how these events influenced speciation in this region. We tested published hypotheses regarding the timing and mode of speciation, population structure, and demographic history of the Mojave Fringe-toed Lizard (Uma scoparia), a sand dune specialist endemic to the Mojave Desert of California and Arizona. We sampled 109 individual lizards representing 22 insular dune localities, obtained DNA sequences for 14 nuclear loci, and found that U. scoparia has low genetic diversity relative to the U. notata species complex, comparable to that of chimpanzees and southern elephant seals. Analyses of genotypes using Bayesian clustering algorithms did not identify discrete populations within U. scoparia. Using isolation-with-migration (IM) models and a novel coalescent-based hypothesis testing approach, we estimated that U. scoparia diverged from U. notata in the Pleistocene epoch. The likelihood ratio test and the Akaike Information Criterion consistently rejected nested speciation models that included parameters for migration and population growth of U. scoparia. We reject the Neogene vicariance hypothesis for the speciation of U. scoparia and define this species as a single evolutionarily significant unit for conservation purposes. PMID:25360285
Subliminal or not? Comparing null-hypothesis and Bayesian methods for testing subliminal priming.
Sand, Anders; Nilsson, Mats E
2016-08-01
A difficulty for reports of subliminal priming is demonstrating that participants who actually perceived the prime are not driving the priming effects. There are two conventional methods for testing this. One is to test whether a direct measure of stimulus perception is not significantly above chance on a group level. The other is to use regression to test if an indirect measure of stimulus processing is significantly above zero when the direct measure is at chance. Here we simulated samples in which we assumed that only participants who perceived the primes were primed by it. Conventional analyses applied to these samples had a very large error rate of falsely supporting subliminal priming. Calculating a Bayes factor for the samples very seldom falsely supported subliminal priming. We conclude that conventional tests are not reliable diagnostics of subliminal priming. Instead, we recommend that experimenters calculate a Bayes factor when investigating subliminal priming. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
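As an illustration of how a Bayes factor can quantify evidence that a direct measure is at chance, here is a minimal sketch. It is not the analysis used by Sand and Nilsson; it assumes a simple binomial setting with k correct responses in n two-alternative trials, a point null of 0.5, and a uniform prior on accuracy under the alternative. The function name `bf01_binomial_chance` is hypothetical.

```python
from math import comb


def bf01_binomial_chance(k: int, n: int) -> float:
    """Bayes factor BF01 for H0: theta = 0.5 against H1: theta ~ Uniform(0, 1).

    k is the number of correct responses out of n trials (illustrative setup,
    not taken from the abstract above).
    """
    # Likelihood of the data when accuracy is exactly at chance.
    likelihood_h0 = comb(n, k) * 0.5 ** n
    # Marginal likelihood under a uniform prior: the binomial likelihood
    # integrated over theta in [0, 1] equals 1 / (n + 1) for every k.
    marginal_h1 = 1.0 / (n + 1)
    return likelihood_h0 / marginal_h1


if __name__ == "__main__":
    # Values above 1 favour chance performance; values below 1 favour H1.
    print(bf01_binomial_chance(50, 100))
    print(bf01_binomial_chance(70, 100))
```

Unlike a non-significant p value, a BF01 well above 1 is positive evidence for chance-level performance, while a value near 1 signals that the data are uninformative.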
Cornejo-Romero, Amelia; Vargas-Mendoza, Carlos Fabián; Aguilar-Martínez, Gustavo F.; Medina-Sánchez, Javier; Rendón-Aguilar, Beatriz; Valverde, Pedro Luis; Zavala-Hurtado, Jose Alejandro; Serrato, Alejandra; Rivas-Arancibia, Sombra; Pérez-Hernández, Marco Aurelio; López-Ortega, Gerardo; Jiménez-Sierra, Cecilia
2017-01-01
Historical demographic changes of plant species adapted to New World arid environments could be consistent with either the Glacial Refugium Hypothesis (GRH), which posits that populations contracted to refuges during the cold-dry glacial and expanded in warm-humid interglacial periods, or with the Interglacial Refugium Hypothesis (IRH), which suggests that populations contracted during interglacials and expanded in glacial times. These contrasting hypotheses are developed in the present study for the giant columnar cactus Cephalocereus columna-trajani in the intertropical Mexican drylands where the effects of Late Quaternary climatic changes on phylogeography of cacti remain largely unknown. In order to determine if the historic demography and phylogeographic structure of the species are consistent with either hypothesis, sequences of the chloroplast regions psbA-trnH and trnT-trnL from 110 individuals from 10 populations comprising the full distribution range of this species were analysed. Standard estimators of genetic diversity and structure were calculated. The historic demography was analysed using a Bayesian approach and the palaeodistribution was derived from ecological niche modelling to determine if, in the arid environments of south-central Mexico, glacial-interglacial cycles drove the genetic divergence and diversification of this species. Results reveal low but statistically significant population differentiation (FST = 0.124, P < 0.001), although very clear geographic clusters are not formed. Genetic diversity, haplotype network and Approximate Bayesian Computation (ABC) demographic analyses suggest a population expansion estimated to have taken place in the Last Interglacial (123.04 kya, 95% CI 115.3–130.03). The species palaeodistribution is consistent with the ABC analyses and indicates that the potential area of palaeodistribution and climatic suitability were larger during the Last Interglacial and Holocene than in the Last Glacial Maximum.
Overall, these results suggest that C. columna-trajani experienced an expansion following the warm conditions of interglacials, in accordance with the GRH. PMID:28426818
Mardulyn, Patrick; Goffredo, Maria; Conte, Annamaria; Hendrickx, Guy; Meiswinkel, Rudolf; Balenghien, Thomas; Sghaier, Soufien; Lohr, Youssef; Gilbert, Marius
2013-05-01
Bluetongue (BT) is a commonly cited example of a disease with a distribution believed to have recently expanded in response to global warming. The BT virus is transmitted to ruminants by biting midges of the genus Culicoides, and it has been hypothesized that the emergence of BT in Mediterranean Europe during the last two decades is a consequence of the recent colonization of the region by Culicoides imicola and linked to climate change. To better understand the mechanism responsible for the northward spread of BT, we tested the hypothesis of a recent colonization of Italy by C. imicola, by obtaining samples from more than 60 localities across Italy, Corsica, Southern France, and Northern Africa (the hypothesized source point for the recent invasion of C. imicola), and by genotyping them with 10 newly identified microsatellite loci. The patterns of genetic variation within and among the sampled populations were characterized and used in a rigorous approximate Bayesian computation framework to compare three competing historical hypotheses related to the arrival and establishment of C. imicola in Italy. The hypothesis of an ancient presence of the insect vector was strongly favoured by this analysis, with an associated P ≥ 99%, suggesting that causes other than the northward range expansion of C. imicola may have supported the emergence of BT in southern Europe. Overall, this study illustrates the potential of molecular genetic markers for exploring the assumed link between climate change and the spread of diseases. © 2013 Blackwell Publishing Ltd.
Effects of Phasor Measurement Uncertainty on Power Line Outage Detection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Chen; Wang, Jianhui; Zhu, Hao
2014-12-01
Phasor measurement unit (PMU) technology provides an effective tool to enhance the wide-area monitoring systems (WAMSs) in power grids. Although extensive studies have been conducted to develop several PMU applications in power systems (e.g., state estimation, oscillation detection and control, voltage stability analysis, and line outage detection), the uncertainty aspects of PMUs have not been adequately investigated. This paper focuses on quantifying the impact of PMU uncertainty on power line outage detection and identification, in which a limited number of PMUs installed at a subset of buses are utilized to detect and identify the line outage events. Specifically, the line outage detection problem is formulated as a multi-hypothesis test, and a general Bayesian criterion is used for the detection procedure, in which the PMU uncertainty is analytically characterized. We further apply the minimum detection error criterion for the multi-hypothesis test and derive the expected detection error probability in terms of PMU uncertainty. The framework proposed provides fundamental guidance for quantifying the effects of PMU uncertainty on power line outage detection. Case studies are provided to validate our analysis and show how PMU uncertainty influences power line outage detection.
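The minimum detection error criterion for a multi-hypothesis test corresponds to the maximum a posteriori (MAP) rule: choose the hypothesis with the largest posterior probability. The sketch below illustrates that rule only; it is not the paper's formulation. It assumes each hypothesis (for example, "no outage" or "line k out") predicts a known measurement signature, and that PMU uncertainty is independent Gaussian noise with a common standard deviation `sigma`. The names `map_decision`, `signatures`, and `priors` are hypothetical.

```python
import math


def map_decision(y, signatures, priors, sigma):
    """Return the index of the MAP hypothesis under i.i.d. Gaussian noise.

    y          : observed measurement vector (e.g. phasor angle changes)
    signatures : expected measurement vector under each hypothesis (assumed known)
    priors     : prior probability of each hypothesis
    sigma      : noise standard deviation standing in for PMU uncertainty
    """
    scores = []
    for mu, prior in zip(signatures, priors):
        squared_error = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
        # Log posterior up to a constant shared by all hypotheses.
        scores.append(math.log(prior) - squared_error / (2.0 * sigma ** 2))
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    signatures = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy values
    priors = [1 / 3, 1 / 3, 1 / 3]
    print(map_decision([0.9, 0.1], signatures, priors, sigma=0.2))
```

In this toy setting a larger `sigma` flattens the likelihood term, so the priors carry more weight and neighbouring signatures become harder to tell apart, which is the qualitative effect of measurement uncertainty on detection error.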
A Bayesian Perspective on the Reproducibility Project: Psychology.
Etz, Alexander; Vandekerckhove, Joachim
2016-01-01
We revisit the results of the recent Reproducibility Project: Psychology by the Open Science Collaboration. We compute Bayes factors (a quantity that can be used to express comparative evidence for a hypothesis but also for the null hypothesis) for a large subset (N = 72) of the original papers and their corresponding replication attempts. In our computation, we take into account the likely scenario that publication bias had distorted the originally published results. Overall, 75% of studies gave qualitatively similar results in terms of the amount of evidence provided. However, the evidence was often weak (i.e., Bayes factor < 10). The majority of the studies (64%) did not provide strong evidence for either the null or the alternative hypothesis in either the original or the replication, and no replication attempts provided strong evidence in favor of the null. In all cases where the original paper provided strong evidence but the replication did not (15%), the sample size in the replication was smaller than the original. Where the replication provided strong evidence but the original did not (10%), the replication sample size was larger. We conclude that the apparent failure of the Reproducibility Project to replicate many target effects can be adequately explained by overestimation of effect sizes (or overestimation of evidence against the null hypothesis) due to small sample sizes and publication bias in the psychological literature. We further conclude that traditional sample sizes are insufficient and that a more widespread adoption of Bayesian methods is desirable.
Multiple optimality criteria support Ornithoscelida
NASA Astrophysics Data System (ADS)
Parry, Luke A.; Baron, Matthew G.; Vinther, Jakob
2017-10-01
A recent study of early dinosaur evolution using equal-weights parsimony recovered a scheme of dinosaur interrelationships and classification that differed from historical consensus in a single, but significant, respect: Ornithischia and Saurischia were not recovered as monophyletic sister-taxa, but rather Ornithischia and Theropoda formed a novel clade named Ornithoscelida. However, these analyses only used maximum parsimony, and numerous recent simulation studies have questioned the accuracy of parsimony under equal weights. Here, we provide additional support for this alternative hypothesis using a Bayesian implementation of the Mkv model, as well as through a number of additional parsimony analyses, including implied weighting. Using Bayesian inference and implied weighting, we recover the same fundamental topology for Dinosauria as the original study, with a monophyletic Ornithoscelida, demonstrating that the main suite of methods used in morphological phylogenetics recover this novel hypothesis. This result was further scrutinized through the systematic exclusion of different character sets. Novel characters from the original study (those not taken or adapted from previous phylogenetic studies) were found to be more important for resolving the relationships within Dinosauromorpha than the relationships within Dinosauria. Reanalysis of a modified version of the character matrix that supports the Ornithischia-Saurischia dichotomy under maximum parsimony also supports this hypothesis under implied weighting, but not under the Mkv model, with both Theropoda and Sauropodomorpha becoming paraphyletic with respect to Ornithischia.
Perez, M F; Bonatelli, I A S; Moraes, E M; Carstens, B C
2016-01-01
Pilosocereus machrisii and P. aurisetus are cactus species within the P. aurisetus complex, a group of eight cacti that are restricted to rocky habitats within the Neotropical savannas of eastern South America. Previous studies have suggested that diversification within this complex was driven by distributional fragmentation, isolation leading to allopatric differentiation, and secondary contact among divergent lineages. These events have been associated with Quaternary climatic cycles, leading to the hypothesis that the xerophytic vegetation patches which presently harbor these populations operate as refugia during the current interglacial. However, owing to limitations of the standard phylogeographic approaches used in these studies, this hypothesis was not explicitly tested. Here we use Approximate Bayesian Computation to refine the previous inferences and test the role of different events in the diversification of two species within P. aurisetus group. We used molecular data from chloroplast DNA and simple sequence repeats loci of P. machrisii and P. aurisetus, the two species with broadest distribution in the complex, in order to test if the diversification in each species was driven mostly by vicariance or by long-dispersal events. We found that both species were affected primarily by vicariance, with a refuge model as the most likely scenario for P. aurisetus and a soft vicariance scenario most probable for P. machrisii. These results emphasize the importance of distributional fragmentation in these species, and add support to the hypothesis of long-term isolation in interglacial refugia previously proposed for the P. aurisetus species complex diversification. PMID:27071846
Thorvaldsson, Valgeir; Skoog, Ingmar; Johansson, Boo
2017-03-01
Terminal decline (TD) refers to acceleration in within-person cognitive decline prior to death. The cognitive reserve hypothesis postulates that individuals with higher IQ are able to better tolerate age-related increase in brain pathologies. On average, they will exhibit a later onset of TD, but once they start to decline, their trajectory is steeper relative to those with lower IQ. We tested these predictions using data from initially nondemented individuals (n = 179) in the H70-study repeatedly measured at ages 70, 75, 79, 81, 85, 88, 90, 92, 95, 97, 99, and 100, or until death, on cognitive tests of perceptual-and-motor-speed and spatial and verbal ability. We quantified IQ using the Raven's Coloured Progressive Matrices (RCPM) test administrated at age 70. We fitted random change point TD models to the data, within a Bayesian framework, conditioned on IQ, age of death, education, and sex. In line with predictions, we found that 1 additional standard deviation on the IQ scale was associated with a delay in onset of TD by 1.87 (95% highest density interval [HDI; 0.20, 4.08]) years on speed, 1.96 (95% HDI [0.15, 3.54]) years on verbal ability, but only 0.88 (95% HDI [-0.93, 3.49]) year on spatial ability. Higher IQ was associated with steeper rate of decline within the TD phase on measures of speed and verbal ability, whereas results on spatial ability were nonconclusive. Our findings provide partial support for the cognitive reserve hypothesis and demonstrate that IQ can be a significant moderator of cognitive change trajectories in old age. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Tsionas, Mike G.; Michaelides, Panayotis G.
2017-09-01
We use a novel Bayesian inference procedure for the Lyapunov exponent in the dynamical system of returns and their unobserved volatility. In the dynamical system, computation of largest Lyapunov exponent by traditional methods is impossible as the stochastic nature has to be taken explicitly into account due to unobserved volatility. We apply the new techniques to daily stock return data for a group of six countries, namely USA, UK, Switzerland, Netherlands, Germany and France, from 2003 to 2014, by means of Sequential Monte Carlo for Bayesian inference. The evidence points to the direction that there is indeed noisy chaos both before and after the recent financial crisis. However, when a much simpler model is examined where the interaction between returns and volatility is not taken into consideration jointly, the hypothesis of chaotic dynamics does not receive much support by the data ("neglected chaos").
Almost but not quite 2D, Non-linear Bayesian Inversion of CSEM Data
NASA Astrophysics Data System (ADS)
Ray, A.; Key, K.; Bodin, T.
2013-12-01
The geophysical inverse problem can be elegantly stated in a Bayesian framework where a probability distribution can be viewed as a statement of information regarding a random variable. After all, the goal of geophysical inversion is to provide information on the random variables of interest - physical properties of the earth's subsurface. However, though it may be simple to postulate, a practical difficulty of fully non-linear Bayesian inversion is the computer time required to adequately sample the model space and extract the information we seek. As a consequence, in geophysical problems where evaluation of a full 2D/3D forward model is computationally expensive, such as marine controlled source electromagnetic (CSEM) mapping of the resistivity of seafloor oil and gas reservoirs, Bayesian studies have largely been conducted with 1D forward models. While the 1D approximation is indeed appropriate for exploration targets with planar geometry and geological stratification, it only provides a limited, site-specific idea of uncertainty in resistivity with depth. In this work, we extend our fully non-linear 1D Bayesian inversion to a 2D model framework, without requiring the usual regularization of model resistivities in the horizontal or vertical directions used to stabilize quasi-2D inversions. In our approach, we use the reversible jump Markov-chain Monte-Carlo (RJ-MCMC) or trans-dimensional method and parameterize the subsurface in a 2D plane with Voronoi cells. The method is trans-dimensional in that the number of cells required to parameterize the subsurface is variable, and the cells dynamically move around and multiply or combine as demanded by the data being inverted. This approach allows us to expand our uncertainty analysis of resistivity at depth to more than a single site location, allowing for interactions between model resistivities at different horizontal locations along a traverse over an exploration target. 
While the model is parameterized in 2D, we efficiently evaluate the forward response using 1D profiles extracted from the model at the common-midpoints of the EM source-receiver pairs. Since the 1D approximation is locally valid at different midpoint locations, the computation time is far lower than is required by a full 2D or 3D simulation. We have applied this method to both synthetic and real CSEM survey data from the Scarborough gas field on the Northwest shelf of Australia, resulting in a spatially variable quantification of resistivity and its uncertainty in 2D. This Bayesian approach results in a large database of 2D models that comprise a posterior probability distribution, which we can subset to test various hypotheses about the range of model structures compatible with the data. For example, we can subset the model distributions to examine the hypothesis that a resistive reservoir extends over a certain spatial extent. Depending on how this conditions other parts of the model space, light can be shed on the geological viability of the hypothesis. Since tackling spatially variable uncertainty and trade-offs in 2D and 3D is a challenging research problem, the insights gained from this work may prove valuable for subsequent full 2D and 3D Bayesian inversions.
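The idea of parameterizing a 2D section with Voronoi cells and then reading off a 1D profile beneath a source-receiver midpoint can be sketched as follows. This is an illustrative toy, not the authors' code: it assumes each cell is defined by a nucleus with a horizontal position, a depth, and a resistivity, that a point takes the resistivity of its nearest nucleus (plain Euclidean distance, with no axis scaling), and that the 1D profile is sampled at fixed depths. The names `resistivity_at` and `profile_at_midpoint` are hypothetical.

```python
def resistivity_at(x, z, nuclei):
    """Resistivity at (x, z): the value of the nearest Voronoi nucleus.

    nuclei is a list of (x, z, resistivity) tuples; units are illustrative.
    """
    nearest = min(nuclei, key=lambda c: (c[0] - x) ** 2 + (c[1] - z) ** 2)
    return nearest[2]


def profile_at_midpoint(x_mid, depths, nuclei):
    """Extract a 1D resistivity-versus-depth profile beneath x_mid."""
    return [resistivity_at(x_mid, z, nuclei) for z in depths]


if __name__ == "__main__":
    # Three cells: conductive overburden, a resistive body, and a third cell
    # further along the traverse (toy values).
    nuclei = [(0.0, 0.5, 1.0), (0.0, 2.0, 100.0), (10.0, 1.0, 5.0)]
    depths = [0.5, 1.0, 2.0]
    print(profile_at_midpoint(0.0, depths, nuclei))
    print(profile_at_midpoint(10.0, depths, nuclei))
```

In a trans-dimensional sampler the list of nuclei would grow, shrink, and move between iterations; each proposed model would then be scored by passing such profiles to a 1D forward solver, which is omitted here.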
Okano, Justin T; Robbins, Danielle; Palk, Laurence; Gerstoft, Jan; Obel, Niels; Blower, Sally
2016-07-01
Worldwide, approximately 35 million individuals are infected with HIV; about 25 million of these live in sub-Saharan Africa. WHO proposes using treatment as prevention (TasP) to eliminate HIV. Treatment suppresses viral load, decreasing the probability an individual transmits HIV. The elimination threshold is one new HIV infection per 1000 individuals. Here, we test the hypothesis that TasP can substantially reduce epidemics and eliminate HIV. We estimate the impact of TasP, between 1996 and 2013, on the Danish HIV epidemic in men who have sex with men (MSM), an epidemic UNAIDS has identified as a priority for elimination. We use a CD4-staged Bayesian back-calculation approach to estimate incidence, and the hidden epidemic (the number of HIV-infected undiagnosed MSM). To develop the back-calculation model, we use data from an ongoing nationwide population-based study: the Danish HIV Cohort Study. Incidence, and the hidden epidemic, decreased substantially after treatment was introduced in 1996. By 2013, incidence was close to the elimination threshold: 1·4 (median, 95% Bayesian credible interval [BCI] 0·4-2·1) new HIV infections per 1000 MSM and there were only 617 (264-858) undiagnosed MSM. Decreasing incidence and increasing treatment coverage were highly correlated; a treatment threshold effect was apparent. Our study is the first to show that TasP can substantially reduce a country's HIV epidemic, and bring it close to elimination. However, we have shown the effectiveness of TasP under optimal conditions: very high treatment coverage, and exceptionally high (98%) viral suppression rate. Unless these extremely challenging conditions can be met in sub-Saharan Africa, the WHO's global elimination strategy is unlikely to succeed. National Institute of Allergy and Infectious Diseases. Copyright © 2016 Elsevier Ltd. All rights reserved.
A Rational Analysis of Rule-Based Concept Learning
ERIC Educational Resources Information Center
Goodman, Noah D.; Tenenbaum, Joshua B.; Feldman, Jacob; Griffiths, Thomas L.
2008-01-01
This article proposes a new model of human concept learning that provides a rational analysis of learning feature-based concepts. This model is built upon Bayesian inference for a grammatically structured hypothesis space--a concept language of logical rules. This article compares the model predictions to human generalization judgments in several…
On the occurrence of false positives in tests of migration under an isolation with migration model
Hey, Jody; Chung, Yujin; Sethuraman, Arun
2015-01-01
The population genetic study of divergence is often done using a Bayesian genealogy sampler, like those implemented in IMa2 and related programs, and these analyses frequently include a likelihood-ratio test of the null hypothesis of no migration between populations. Cruickshank and Hahn (2014, Molecular Ecology, 23, 3133–3157) recently reported a high rate of false positive test results with IMa2 for data simulated with small numbers of loci under models with no migration and recent splitting times. We confirm these findings and discover that they are caused by a failure of the assumptions underlying likelihood ratio tests that arises when using marginal likelihoods for a subset of model parameters. We also show that for small data sets, with little divergence between samples from two populations, an excellent fit can often be found by a model with a low migration rate and recent splitting time and a model with a high migration rate and a deep splitting time. PMID:26456794
NASA Astrophysics Data System (ADS)
Roy, C.; Romanowicz, B. A.
2017-12-01
Monte Carlo methods are powerful approaches to solve nonlinear problems and are becoming very popular in Earth sciences. One reason is that, at first glance, no constraints or explicit regularization of model parameters are required. At second glance, one might realize that regularization is done through a prior. The choice of this prior, however, is subjective, and with its choice, unintended or undesired extra information can be injected into the problem. The principal criticism of Bayesian methods is that the prior can be "tuned" in order to get the expected solution. Consequently, detractors of the Bayesian method could easily argue that the solution is influenced by the form of the prior distribution, whose choice is subjective. Hence, models obtained with Monte Carlo methods are still highly debated. Here we investigate the influence of a priori constraints (i.e., fixed crustal discontinuities) on the posterior probability distributions of estimated parameters, that is, vertical polarized shear velocity VSV and radial anisotropy ξ, in a transdimensional Bayesian inversion for continental lithospheric structure. We follow upon the work of Calò et al. (2016), who jointly inverted converted phases (P to S) without deconvolution and surface wave dispersion data, to obtain 1-D radial anisotropic shear wave velocity profiles in the North American craton. We aim at verifying whether the strong lithospheric layering found in the stable part of the craton is robust with respect to artifacts that might be caused by the methodology used. We test the hypothesis that the observed midlithospheric discontinuities result from (1) fixed crustal discontinuities in the reference model and (2) a fixed Vp/Vs ratio. The synthetic tests on two Earth models show that a fixed Vp/Vs ratio does not introduce artificial layering, even if the assumed value is slightly wrong. This is an important finding for real data inversion where the true value is not always available or accurate.
However, fixing crustal discontinuities can lead to the introduction of spurious layering, and this is not recommended. Additionally, allowing the Vp/Vs ratio to vary does not help prevent that. Applying the modified approach resulting from these tests to two stations (FRB and FCC) in the North American craton, we confirm the presence of at least one midlithospheric low-velocity layer. We also confirm the difficulty of consistently detecting the lithosphere-asthenosphere boundary in the craton.
A Bayesian sequential design using alpha spending function to control type I error.
Zhu, Han; Yu, Qingzhao
2017-10-01
We propose in this article a Bayesian sequential design using alpha spending functions to control the overall type I error in phase III clinical trials. We provide algorithms to calculate critical values, power, and sample sizes for the proposed design. Sensitivity analysis is implemented to check the effects from different prior distributions, and conservative priors are recommended. We compare the power and actual sample sizes of the proposed Bayesian sequential design with different alpha spending functions through simulations. We also compare the power of the proposed method with frequentist sequential design using the same alpha spending function. Simulations show that, at the same sample size, the proposed method provides larger power than the corresponding frequentist sequential design. It also has larger power than traditional Bayesian sequential design which sets equal critical values for all interim analyses. When compared with other alpha spending functions, the O'Brien-Fleming alpha spending function has the largest power and is the most conservative in the sense that, at the same sample size, the null hypothesis is the least likely to be rejected at an early stage of a clinical trial. Finally, we show that adding a futility stopping step to the Bayesian sequential design can reduce the overall type I error and reduce the actual sample sizes.
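To make the notion of an alpha spending function concrete, the sketch below evaluates two standard Lan-DeMets spending functions: the O'Brien-Fleming type, alpha(t) = 2(1 - Phi(z_{alpha/2} / sqrt(t))), and the Pocock type, alpha(t) = alpha * ln(1 + (e - 1)t), where t is the information fraction. It shows only how cumulative type I error is allocated across interim looks; it does not reproduce the Bayesian critical-value algorithm of the article. The function names and the choice of four equally spaced looks are assumptions made for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def obf_alpha_spent(t: float, alpha: float = 0.05) -> float:
    """Cumulative alpha spent at information fraction t (O'Brien-Fleming type)."""
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / math.sqrt(t)))


def pocock_alpha_spent(t: float, alpha: float = 0.05) -> float:
    """Cumulative alpha spent at information fraction t (Pocock type)."""
    return alpha * math.log(1.0 + (math.e - 1.0) * t)


if __name__ == "__main__":
    # Four equally spaced looks (an assumption for this example).
    for t in (0.25, 0.5, 0.75, 1.0):
        print(t, obf_alpha_spent(t), pocock_alpha_spent(t))
```

Both functions spend the full alpha at t = 1, but the O'Brien-Fleming type spends far less at early looks, which is why early rejection of the null hypothesis is least likely under it.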
BOP2: Bayesian optimal design for phase II clinical trials with simple and complex endpoints.
Zhou, Heng; Lee, J Jack; Yuan, Ying
2017-09-20
We propose a flexible Bayesian optimal phase II (BOP2) design that is capable of handling simple (e.g., binary) and complicated (e.g., ordinal, nested, and co-primary) endpoints under a unified framework. We use a Dirichlet-multinomial model to accommodate different types of endpoints. At each interim, the go/no-go decision is made by evaluating a set of posterior probabilities of the events of interest, which is optimized to maximize power or minimize the number of patients under the null hypothesis. Unlike other existing Bayesian designs, the BOP2 design explicitly controls the type I error rate, thereby bridging the gap between Bayesian designs and frequentist designs. In addition, the stopping boundary of the BOP2 design can be enumerated prior to the onset of the trial. These features make the BOP2 design accessible to a wide range of users and regulatory agencies and particularly easy to implement in practice. Simulation studies show that the BOP2 design has favorable operating characteristics with higher power and lower risk of incorrectly terminating the trial than some existing Bayesian phase II designs. The software to implement the BOP2 design is freely available at www.trialdesign.org. Copyright © 2017 John Wiley & Sons, Ltd.
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield reached an R2 of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
NASA Astrophysics Data System (ADS)
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert-provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference', which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which make it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert-provided information, (2) it allows uncertainty and imprecision to be modeled separately, and (3) it presents a framework for fusing expert-provided information regarding the various inputs of the Bayesian inference algorithm. However, an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely large and computationally infeasible. In this paper, a novel approach to accelerating the fuzzy Bayesian inference algorithm is proposed, which is based on using approximate posterior distributions derived from surrogate modeling as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.
Kennett, James P.; Kennett, Douglas J.; Culleton, Brendan J.; Aura Tortosa, J. Emili; Bischoff, James L.; Bunch, Ted E.; Daniel, I. Randolph; Erlandson, Jon M.; Ferraro, David; Firestone, Richard B.; Goodyear, Albert C.; Israde-Alcántara, Isabel; Johnson, John R.; Jordá Pardo, Jesús F.; Kimbel, David R.; LeCompte, Malcolm A.; Lopinot, Neal H.; Mahaney, William C.; Moore, Andrew M. T.; Moore, Christopher R.; Ray, Jack H.; Stafford, Thomas W.; Tankersley, Kenneth Barnett; Wittke, James H.; Wolbach, Wendy S.; West, Allen
2015-01-01
The Younger Dryas impact hypothesis posits that a cosmic impact across much of the Northern Hemisphere deposited the Younger Dryas boundary (YDB) layer, containing peak abundances in a variable assemblage of proxies, including magnetic and glassy impact-related spherules, high-temperature minerals and melt glass, nanodiamonds, carbon spherules, aciniform carbon, platinum, and osmium. Bayesian chronological modeling was applied to 354 dates from 23 stratigraphic sections in 12 countries on four continents to establish a modeled YDB age range for this event of 12,835–12,735 Cal B.P. at 95% probability. This range overlaps that of a peak in extraterrestrial platinum in the Greenland Ice Sheet and of the earliest age of the Younger Dryas climate episode in six proxy records, suggesting a causal connection between the YDB impact event and the Younger Dryas. Two statistical tests indicate that both modeled and unmodeled ages in the 30 records are consistent with synchronous deposition of the YDB layer within the limits of dating uncertainty (∼100 y). The widespread distribution of the YDB layer suggests that it may serve as a datum layer. PMID:26216981
Context-dependent decision-making: a simple Bayesian model
Lloyd, Kevin; Leslie, David S.
2013-01-01
Many phenomena in animal learning can be explained by a context-learning process whereby an animal learns about different patterns of relationship between environmental variables. Differentiating between such environmental regimes or ‘contexts’ allows an animal to rapidly adapt its behaviour when context changes occur. The current work views animals as making sequential inferences about current context identity in a world assumed to be relatively stable but also capable of rapid switches to previously observed or entirely new contexts. We describe a novel decision-making model in which contexts are assumed to follow a Chinese restaurant process with inertia and full Bayesian inference is approximated by a sequential-sampling scheme in which only a single hypothesis about current context is maintained. Actions are selected via Thompson sampling, allowing uncertainty in parameters to drive exploration in a straightforward manner. The model is tested on simple two-alternative choice problems with switching reinforcement schedules and the results compared with rat behavioural data from a number of T-maze studies. The model successfully replicates a number of important behavioural effects: spontaneous recovery, the effect of partial reinforcement on extinction and reversal, the overtraining reversal effect, and serial reversal-learning effects. PMID:23427101
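The action-selection step named in the abstract above, Thompson sampling, can be illustrated on a two-alternative choice task. The following is a minimal sketch only: it assumes a single fixed context with Bernoulli rewards and uniform Beta(1, 1) priors, and it does not implement the context inference (Chinese restaurant process with inertia) of the described model. The function names `thompson_choice` and `run_two_arm_task` are hypothetical.

```python
import random


def thompson_choice(successes, failures, rng):
    """Sample one reward probability per arm from its Beta posterior
    and return the index of the arm with the largest sample."""
    samples = [rng.betavariate(s + 1, f + 1)  # Beta(1, 1) prior (assumption)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_two_arm_task(reward_probs, n_trials, seed=0):
    """Simulate a choice task with fixed Bernoulli reward probabilities."""
    rng = random.Random(seed)
    successes = [0] * len(reward_probs)
    failures = [0] * len(reward_probs)
    choices = []
    for _ in range(n_trials):
        arm = thompson_choice(successes, failures, rng)
        if rng.random() < reward_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        choices.append(arm)
    return choices, successes, failures


if __name__ == "__main__":
    choices, wins, losses = run_two_arm_task([0.8, 0.2], n_trials=500, seed=1)
    print("choices of arm 0:", choices.count(0), "of", len(choices))
```

Because each choice is driven by a posterior sample rather than a point estimate, arms with uncertain reward probabilities are still tried occasionally, which is the sense in which parameter uncertainty drives exploration.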
Phylogenetic evidence for cladogenetic polyploidization in land plants.
Zhan, Shing H; Drori, Michal; Goldberg, Emma E; Otto, Sarah P; Mayrose, Itay
2016-07-01
Polyploidization is a common and recurring phenomenon in plants and is often thought to be a mechanism of "instant speciation". Whether polyploidization is associated with the formation of new species (cladogenesis) or simply occurs over time within a lineage (anagenesis), however, has never been assessed systematically. We tested this hypothesis using phylogenetic and karyotypic information from 235 plant genera (mostly angiosperms). We first constructed a large database of combined sequence and chromosome number data sets using an automated procedure. We then applied likelihood models (ClaSSE) that estimate the degree of synchronization between polyploidization and speciation events in maximum likelihood and Bayesian frameworks. Our maximum likelihood analysis indicated that 35 genera supported a model that includes cladogenetic transitions over a model with only anagenetic transitions, whereas three genera supported a model that incorporates anagenetic transitions over one with only cladogenetic transitions. Furthermore, the Bayesian analysis supported a preponderance of cladogenetic change in four genera but did not support a preponderance of anagenetic change in any genus. Overall, these phylogenetic analyses provide the first broad confirmation that polyploidization is temporally associated with speciation events, suggesting that it is indeed a major speciation mechanism in plants, at least in some genera. © 2016 Botanical Society of America.
Efficient Bayesian inference for natural time series using ARFIMA processes
NASA Astrophysics Data System (ADS)
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-11-01
Many geophysical quantities, such as atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long memory (LM). LM implies that these quantities experience non-trivial temporal memory, which potentially not only enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LM. In this paper we present a modern and systematic approach to the inference of LM. We use the flexible autoregressive fractional integrated moving average (ARFIMA) model, which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LM, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g., short-memory effects) can be integrated over in order to focus on long-memory parameters and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data and the central England temperature (CET) time series, with favorable comparison to the standard estimators. For CET we also extend our method to seasonal long memory.
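The "fractional integrated" part of an ARFIMA model rests on the fractional differencing operator (1 - B)^d, whose binomial expansion has weights that follow a simple recursion. The sketch below illustrates only that standard expansion; it is not the approximate likelihood proposed in the abstract above, and the function names `frac_diff_weights` and `frac_diff` are hypothetical.

```python
def frac_diff_weights(d, n_terms):
    """Coefficients pi_k of (1 - B)^d = sum_k pi_k B^k.

    Uses the recursion pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k.
    """
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply the truncated fractional difference filter to a finite series."""
    w = frac_diff_weights(d, len(series))
    return [sum(w[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]


if __name__ == "__main__":
    # d = 1 recovers ordinary first differences (after the first value).
    print(frac_diff([1.0, 3.0, 6.0, 10.0], d=1))
    # For 0 < d < 0.5 the weights decay slowly, the signature of long memory.
    print(frac_diff_weights(0.4, 5))
```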
Nonparametric Bayesian clustering to detect bipolar methylated genomic loci.
Wu, Xiaowei; Sun, Ming-An; Zhu, Hongxiao; Xie, Hehuang
2015-01-16
With recent developments in sequencing technology, a large number of genome-wide DNA methylation studies have generated massive amounts of bisulfite sequencing data. The analysis of DNA methylation patterns helps researchers understand epigenetic regulatory mechanisms. Highly variable methylation patterns reflect stochastic fluctuations in DNA methylation, whereas well-structured methylation patterns imply deterministic methylation events. Among these methylation patterns, bipolar patterns are important as they may originate from allele-specific methylation (ASM) or cell-specific methylation (CSM). Utilizing nonparametric Bayesian clustering followed by hypothesis testing, we have developed a novel statistical approach to identify bipolar methylated genomic regions in bisulfite sequencing data. Simulation studies demonstrate that the proposed method achieves good performance in terms of specificity and sensitivity. We used the method to analyze data from mouse brain and human blood methylomes. The bipolar methylated segments detected are found to be highly consistent with the differentially methylated regions identified by using purified cell subsets. Bipolar DNA methylation often indicates epigenetic heterogeneity caused by ASM or CSM. With allele-specific events filtered out or appropriately taken into account, our proposed approach sheds light on the identification of cell-specific genes/pathways under strong epigenetic control in a heterogeneous cell population.
Probabilistic Cross-identification of Cosmic Events
NASA Astrophysics Data System (ADS)
Budavári, Tamás
2011-08-01
I discuss a novel approach to identifying cosmic events in separate and independent observations. The focus is on the true events, such as supernova explosions, that happen once and, hence, whose measurements are not repeatable. Their classification and analysis must make the best use of all available data. Bayesian hypothesis testing is used to associate streams of events in space and time. Probabilities are assigned to the matches by studying their rates of occurrence. A case study of Type Ia supernovae illustrates how to use light curves in the cross-identification process. Constraints from realistic light curves happen to be well approximated by Gaussians in time, which makes the matching process very efficient. Model-dependent associations are computationally more demanding but can further boost one's confidence.
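To make the Gaussian-in-time matching idea above concrete, here is a minimal sketch of a Bayes factor for whether two detections refer to the same event. It rests on assumptions not stated in the abstract: each detection time has an independent Gaussian error, the true event time has a flat prior over an observing window much longer than those errors, and only timing (no sky position or light-curve shape) is used. Under these assumptions the Bayes factor is the window length times a Gaussian density in the time difference. The function names are hypothetical.

```python
import math


def time_match_bayes_factor(t1, sigma1, t2, sigma2, window):
    """Bayes factor for 'same event' versus 'two unrelated events'.

    Same event:  integral of (1/window) N(t1|tau, s1^2) N(t2|tau, s2^2) dtau
                 = (1/window) * N(t1 - t2 | 0, s1^2 + s2^2)
    Unrelated:   (1/window)^2
    The ratio is window * N(t1 - t2 | 0, s1^2 + s2^2).
    """
    var = sigma1 ** 2 + sigma2 ** 2
    dt = t1 - t2
    density = math.exp(-0.5 * dt * dt / var) / math.sqrt(2.0 * math.pi * var)
    return window * density


def posterior_match_probability(bayes_factor, prior):
    """Convert a Bayes factor and a prior match probability to a posterior."""
    odds = bayes_factor * prior / (1.0 - prior)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    bf = time_match_bayes_factor(10.0, 1.0, 10.5, 1.0, window=100.0)
    print(bf, posterior_match_probability(bf, prior=0.01))
```

The closed form is what makes the matching cheap: each candidate pair costs one exponential, with no numerical integration.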
The inverse problem of brain energetics: ketone bodies as alternative substrates
NASA Astrophysics Data System (ADS)
Calvetti, D.; Occhipinti, R.; Somersalo, E.
2008-07-01
Little is known about brain energy metabolism under ketosis, although there is evidence that ketone bodies have a neuroprotective role in several neurological disorders. We investigate the inverse problem of estimating reaction fluxes and transport rates in the different cellular compartments of the brain, when the data amount to a few measured arterial-venous concentration differences. By using a recently developed methodology to perform Bayesian Flux Balance Analysis and a new five-compartment model of the astrocyte-glutamatergic neuron cellular complex, we are able to identify the preferred biochemical pathways during shortage of glucose and in the presence of ketone bodies in the arterial blood. The analysis is performed in a minimally biased way, therefore revealing the potential of this methodology for hypothesis testing.
Multigene analysis of lophophorate and chaetognath phylogenetic relationships.
Helmkampf, Martin; Bruchhaus, Iris; Hausdorf, Bernhard
2008-01-01
Maximum likelihood and Bayesian inference analyses of seven concatenated fragments of nuclear-encoded housekeeping genes indicate that Lophotrochozoa is monophyletic, i.e., the lophophorate groups Bryozoa, Brachiopoda and Phoronida are more closely related to molluscs and annelids than to Deuterostomia or Ecdysozoa. Lophophorates themselves, however, form a polyphyletic assemblage. The hypotheses that they are monophyletic and more closely allied to Deuterostomia than to Protostomia can be ruled out with both the approximately unbiased test and the expected likelihood weights test. The existence of Phoronozoa, a putative clade including Brachiopoda and Phoronida, has also been rejected. According to our analyses, phoronids instead share a more recent common ancestor with bryozoans than with brachiopods. Platyhelminthes is the sister group of Lophotrochozoa. Together these two constitute Spiralia. Although Chaetognatha appears as the sister group of Priapulida within Ecdysozoa in our analyses, alternative hypotheses concerning chaetognath relationships could not be rejected.
A New Method for Predicting Patient Survivorship Using Efficient Bayesian Network Learning
Jiang, Xia; Xue, Diyang; Brufsky, Adam; Khan, Seema; Neapolitan, Richard
2014-01-01
The purpose of this investigation is to develop and evaluate a new Bayesian network (BN)-based patient survivorship prediction method. The central hypothesis is that the method predicts patient survivorship well, while having the capability to handle high-dimensional data and be incorporated into a clinical decision support system (CDSS). We have developed EBMC_Survivorship (EBMC_S), which predicts survivorship for each year individually. EBMC_S is based on the EBMC BN algorithm, which has been shown to handle high-dimensional data. BNs have excellent architecture for decision support systems. In this study, we evaluate EBMC_S using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset, which concerns breast tumors. A 5-fold cross-validation study indicates that EBMC_S performs better than the Cox proportional hazard model and is comparable to the random survival forest method. We show that EBMC_S provides additional information such as sensitivity analyses, which covariates predict each year, and yearly areas under the ROC curve (AUROCs). We conclude that our investigation supports the central hypothesis. PMID:24558297
Watts, Joseph; Greenhill, Simon J; Atkinson, Quentin D; Currie, Thomas E; Bulbulia, Joseph; Gray, Russell D
2015-04-07
Supernatural belief presents an explanatory challenge to evolutionary theorists: it is both costly and prevalent. One influential functional explanation claims that the imagined threat of supernatural punishment can suppress selfishness and enhance cooperation. Specifically, morally concerned supreme deities or 'moralizing high gods' (MHGs) have been argued to reduce free-riding in large social groups, enabling believers to build the kind of complex societies that define modern humanity. Previous cross-cultural studies claiming to support the MHG hypothesis rely on correlational analyses only and do not correct for the statistical non-independence of sampled cultures. Here we use a Bayesian phylogenetic approach with a sample of 96 Austronesian cultures to test the MHG hypothesis as well as an alternative supernatural punishment hypothesis that allows punishment by a broad range of moralizing agents. We find evidence that broad supernatural punishment drives political complexity, whereas MHGs follow political complexity. We suggest that the concept of MHGs diffused as part of a suite of traits arising from cultural exchange between complex societies. Our results show the power of phylogenetic methods to address long-standing debates about the origins and functions of religion in human society. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Gronau, Quentin Frederik; Duizer, Monique; Bakker, Marjan; Wagenmakers, Eric-Jan
2017-09-01
Publication bias and questionable research practices have long been known to corrupt the published record. One method to assess the extent of this corruption is to examine the meta-analytic collection of significant p values, the so-called p-curve (Simonsohn, Nelson, & Simmons, 2014a). Inspired by statistical research on false-discovery rates, we propose a Bayesian mixture model analysis of the p-curve. Our mixture model assumes that significant p values arise either from the null hypothesis H₀ (when their distribution is uniform) or from the alternative hypothesis H₁ (when their distribution is accounted for by a simple parametric model). The mixture model estimates the proportion of significant results that originate from H₀, but it also estimates the probability that each specific p value originates from H₀. We apply our model to 2 examples. The first concerns the set of 587 significant p values for all t tests published in the 2007 volumes of Psychonomic Bulletin & Review and the Journal of Experimental Psychology: Learning, Memory, and Cognition; the mixture model reveals that p values higher than about .005 are more likely to stem from H₀ than from H₁. The second example concerns 159 significant p values from studies on social priming and 130 from yoked control studies. The results from the yoked controls confirm the findings from the first example, whereas the results from the social priming studies are difficult to interpret because they are sensitive to the prior specification. To maximize accessibility, we provide a web application that allows researchers to apply the mixture model to any set of significant p values. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
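The per-value classification described above follows from Bayes' rule once the two component densities and the mixture weight are fixed. The sketch below is illustrative only. The abstract does not specify the parametric form under H₁, so a Beta(a, 1) density truncated to (0, alpha) is assumed here purely for illustration, and the mixture weight `phi` is treated as known rather than estimated. The function name `prob_from_null` is hypothetical.

```python
def prob_from_null(p, phi, a, alpha=0.05):
    """Posterior probability that a significant p value came from H0.

    H0 component: uniform on (0, alpha), density 1 / alpha.
    H1 component: Beta(a, 1) truncated to (0, alpha) (an assumption),
                  density a * p**(a - 1) / alpha**a; a < 1 gives a
                  right-skewed curve that favours small p values.
    phi: mixture weight of the H0 component (treated as known here).
    """
    if not 0.0 < p < alpha:
        raise ValueError("p must be a significant p value in (0, alpha)")
    dens_h0 = 1.0 / alpha
    dens_h1 = a * p ** (a - 1.0) / alpha ** a
    weighted_h0 = phi * dens_h0
    return weighted_h0 / (weighted_h0 + (1.0 - phi) * dens_h1)


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.04):
        print(p, round(prob_from_null(p, phi=0.5, a=0.3), 3))
```

With a right-skewed H₁ component, larger significant p values receive a higher probability of originating from H₀, which is the qualitative pattern the abstract reports.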
Hyperspectral techniques in analysis of oral dosage forms.
Hamilton, Sara J; Lowell, Amanda E; Lodder, Robert A
2002-10-01
Pharmaceutical oral dosage forms are used in this paper to test the sensitivity and spatial resolution of hyperspectral imaging instruments. The first experiment tested the hypothesis that a near-infrared (IR) tunable diode-based remote sensing system is capable of monitoring degradation of hard gelatin capsules at a relatively long distance (0.5 km). Spectra from the capsules were used to differentiate among capsules exposed to an atmosphere containing 150 ppb formaldehyde for 0, 2, 4, and 8 h. Robust median-based principal component regression with Bayesian inference was employed for outlier detection. The second experiment tested the hypothesis that near-IR imaging spectrometry of tablets permits the identification and composition of multiple individual tablets to be determined simultaneously. A near-IR camera was used to collect thousands of spectra simultaneously from a field of blister-packaged tablets. The number of tablets that a typical near-IR camera can currently analyze simultaneously was estimated to be approximately 1300. The bootstrap error-adjusted single-sample technique chemometric-imaging algorithm was used to draw probability-density contour plots that revealed tablet composition. The single-capsule analysis provides an indication of how far apart the sample and instrumentation can be and still maintain adequate signal-to-noise ratio (S/N), while the multiple-tablet imaging experiment gives an indication of how many samples can be analyzed simultaneously while maintaining an adequate S/N and pixel coverage on each sample.
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
No grammatical gender effect on affective ratings: evidence from Italian and German languages.
Montefinese, Maria; Ambrosini, Ettore; Roivainen, Eka
2018-06-06
In this study, we tested the linguistic relativity hypothesis by studying the effect of grammatical gender (feminine vs. masculine) on affective judgments of conceptual representation in Italian and German. In particular, we examined the within- and cross-language grammatical gender effect and its interaction with participants' demographic characteristics (such as the raters' age and sex) on semantic differential scales (affective ratings of valence, arousal and dominance) in Italian and German speakers. We selected the stimuli and the relative affective measures from Italian and German adaptations of the ANEW (Affective Norms for English Words). Bayesian and frequentist analyses yielded evidence for the absence of within- and cross-language effects of grammatical gender and sex- and age-dependent interactions. These results suggest that grammatical gender does not affect judgments of affective features of semantic representation in Italian and German speakers, since an overt coding of word grammar is not required. Although further research is recommended to refine the impact of grammatical gender on properties of semantic representation, these results have implications for any strong view of the linguistic relativity hypothesis.
Maximum saliency bias in binocular fusion
NASA Astrophysics Data System (ADS)
Lu, Yuhao; Stafford, Tom; Fox, Charles
2016-07-01
Subjective experience at any instant consists of a single ("unitary"), coherent interpretation of sense data rather than a "Bayesian blur" of alternatives. However, computation of Bayes-optimal actions has no role for unitary perception, instead being required to integrate over every possible action-percept pair to maximise expected utility. So what is the role of unitary coherent percepts, and how are they computed? Recent work provided objective evidence for non-Bayes-optimal, unitary coherent, perception and action in humans; and further suggested that the percept selected is not the maximum a posteriori percept but is instead affected by utility. The present study uses a binocular fusion task first to reproduce the same effect in a new domain, and second, to test multiple hypotheses about exactly how utility may affect the percept. After accounting for high experimental noise, it finds that both Bayes optimality (maximise expected utility) and the previously proposed maximum-utility hypothesis are outperformed in fitting the data by a modified maximum-salience hypothesis, using unsigned utility magnitudes in place of signed utilities in the bias function.
Back off! The effect of emotion on backward step initiation.
Bouman, Daniëlle; Stins, John F
2018-02-01
The distance regulation (DR) hypothesis states that actors are inclined to increase their distance from an unpleasant stimulus. The current study investigated the relation between emotion and its effect on the control of backward step initiation, which constitutes an avoidance-like behavior. Participants stepped backward on a force plate in response to neutral, high-arousing pleasant and high-arousing unpleasant visual emotional stimuli. Gait initiation parameters and the results of an exploratory analysis of postural sway were compared across the emotion categories using significance testing and Bayesian statistics. Evidence was found that gait initiation parameters were largely unaffected by emotional conditions. In contrast, the exploratory analysis of postural immobility showed a significant effect: highly arousing stimuli (pleasant and unpleasant) resulted in more postural sway immediately preceding gait initiation compared to neutral stimuli. This suggests that arousal, rather than valence, affects pre-step sway. These results contradict the DR hypothesis, since avoidance gait-initiation in response to unpleasant stimuli was no different compared to pleasant stimuli. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Cheung, Shao-Yong; Lee, Chieh-Han; Yu, Hwa-Lung
2017-04-01
Because hydrogeological observation data are limited and highly uncertain, parameter estimation for groundwater models is an important issue. Many parameter estimation methods exist. For example, the Kalman filter provides real-time calibration of parameters from measurements at groundwater monitoring wells, and related methods such as the Extended Kalman Filter and the Ensemble Kalman Filter are widely applied in groundwater research. However, the Kalman filter is limited to linear systems. This study proposes a novel method, Bayesian Maximum Entropy Filtering, which can take the uncertainty of the data into account in parameter estimation. With these two methods, parameters can be estimated from hard (certain) data and soft (uncertain) data at the same time. We use Python and QGIS with the groundwater model (MODFLOW) and develop the Extended Kalman Filter and Bayesian Maximum Entropy Filtering in Python for parameter estimation. This method may serve as a conventional filtering method while also considering the uncertainty of the data. The study was conducted as a numerical model experiment combining the Bayesian maximum entropy filter with a hypothesis for the architecture of the MODFLOW groundwater model in numerical estimation. Virtual observation wells were used to simulate and observe the groundwater model periodically. The results showed that, when the uncertainty of the data is considered, the Bayesian maximum entropy filter provides an ideal result for real-time parameter estimation.
Functional Multi-Locus QTL Mapping of Temporal Trends in Scots Pine Wood Traits
Li, Zitong; Hallingbäck, Henrik R.; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J.; García-Gil, M. Rosario
2014-01-01
Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. PMID:25305041
Functional multi-locus QTL mapping of temporal trends in Scots pine wood traits.
Li, Zitong; Hallingbäck, Henrik R; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J; García-Gil, M Rosario
2014-10-09
Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. Copyright © 2014 Li et al.
Molecular Phylogenies indicate a Paleo-Tibetan Origin of Himalayan Lazy Toads (Scutiger).
Hofmann, Sylvia; Stöck, Matthias; Zheng, Yuchi; Ficetola, Francesco G; Li, Jia-Tang; Scheidt, Ulrich; Schmidt, Joachim
2017-06-12
The Himalaya presents an outstanding geologically active orogen and biodiversity hotspot. However, our understanding of the historical biogeography of its fauna is far from comprehensive. Many taxa are commonly assumed to have originated from China-Indochina and dispersed westward along the Himalayan chain. Alternatively, the "Tibetan-origin hypothesis" suggests primary diversification of lineages in Paleo-Tibet, and secondary diversification along the slopes of the later uplifted Greater Himalaya. We test these hypotheses in high-mountain megophryid anurans (Scutiger). Extensive sampling from High Asia, and analyses of mitochondrial (2839 bp) and nuclear DNA (2208 bp), using Bayesian and Maximum likelihood phylogenetics, suggest that the Himalayan species form a distinct clade, possibly older than those from the eastern Himalaya-Tibet orogen. While immigration from China-Indochina cannot be excluded, our data may indicate that Himalayan Scutiger originated to the north of the Himalaya by colonization from Paleo-Tibet and then date back to the Oligocene. High intraspecific diversity of Scutiger implies limited migration across mountains and drainages along the Himalaya. While our study strengthens support for a "Tibetan-origin hypothesis", current sampling (10/22 species; 1 revalidated: S. occidentalis) remains insufficient to draw final conclusions on Scutiger but urges comparative phylogeographers to test alternative, geologically supported hypotheses for a true future understanding of Himalayan biogeography.
Badre, David
2012-01-01
Growing evidence suggests that the prefrontal cortex (PFC) is organized hierarchically, with more anterior regions having increasingly abstract representations. How does this organization support hierarchical cognitive control and the rapid discovery of abstract action rules? We present computational models at different levels of description. A neural circuit model simulates interacting corticostriatal circuits organized hierarchically. In each circuit, the basal ganglia gate frontal actions, with some striatal units gating the inputs to PFC and others gating the outputs to influence response selection. Learning at all of these levels is accomplished via dopaminergic reward prediction error signals in each corticostriatal circuit. This functionality allows the system to exhibit conditional if–then hypothesis testing and to learn rapidly in environments with hierarchical structure. We also develop a hybrid Bayesian-reinforcement learning mixture of experts (MoE) model, which can estimate the most likely hypothesis state of individual participants based on their observed sequence of choices and rewards. This model yields accurate probabilistic estimates about which hypotheses are attended by manipulating attentional states in the generative neural model and recovering them with the MoE model. This 2-pronged modeling approach leads to multiple quantitative predictions that are tested with functional magnetic resonance imaging in the companion paper. PMID:21693490
Clark, Cameron M; Lawlor-Savage, Linette; Goghari, Vina M
2017-01-01
Training of working memory as a method of increasing working memory capacity and fluid intelligence has received much attention in recent years. This burgeoning field remains highly controversial with empirically-backed disagreements at all levels of evidence, including individual studies, systematic reviews, and even meta-analyses. The current study investigated the effect of a randomized six week online working memory intervention on untrained cognitive abilities in a community-recruited sample of healthy young adults, in relation to both a processing speed training active control condition, as well as a no-contact control condition. Results of traditional null hypothesis significance testing, as well as Bayesian factor analyses, revealed support for the null hypothesis across all cognitive tests administered before and after training. Importantly, all three groups were similar at pre-training for a variety of individual variables purported to moderate transfer of training to fluid intelligence, including personality traits, motivation to train, and expectations of cognitive improvement from training. Because these results are consistent with experimental trials of equal or greater methodological rigor, we suggest that future research re-focus on: 1) other promising interventions known to increase memory performance in healthy young adults, and; 2) examining sub-populations or alternative populations in which working memory training may be efficacious.
NASA Astrophysics Data System (ADS)
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2015-12-01
Models in biogeoscience involve uncertainties in observation data, model inputs, model structure, model processes and modeling scenarios. To accommodate different sources of uncertainty, multimodal analyses such as model combination, model selection, model elimination or model discrimination are becoming more popular. To illustrate theoretical and practical challenges of multimodal analysis, we use an example about microbial soil respiration modeling. Global soil respiration releases more than ten times more carbon dioxide to the atmosphere than all anthropogenic emissions. Thus, improving our understanding of microbial soil respiration is essential for improving climate change models. This study focuses on a poorly understood phenomenon, namely the soil microbial respiration pulses in response to episodic rainfall pulses (the "Birch effect"). We hypothesize that the "Birch effect" is generated by the following three mechanisms. To test our hypothesis, we developed and assessed five evolving microbial-enzyme models against field measurements from a semiarid Savannah that is characterized by pulsed precipitation. These five models evolve step-wise such that the first model includes none of these three mechanisms, while the fifth model includes all three mechanisms. The basic component of Bayesian multimodal analysis is the estimation of marginal likelihood to rank the candidate models based on their overall likelihood with respect to observation data. The first part of the study focuses on using this Bayesian scheme to discriminate between these five candidate models. The second part discusses some theoretical and practical challenges, which are mainly the effect of likelihood function selection and the marginal likelihood estimation methods on both model ranking and Bayesian model averaging. The study shows that making valid inference from scientific data is not a trivial task, since we are not only uncertain about the candidate scientific models, but also about the statistical methods that are used to discriminate between these models.
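The ranking step described in the abstract above (marginal likelihoods turned into model weights for ranking and Bayesian model averaging) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `posterior_model_probabilities` and the five log marginal likelihood values are hypothetical, and equal prior model probabilities are assumed unless others are supplied.

```python
import math


def posterior_model_probabilities(log_marginal_likelihoods, prior_probs=None):
    """Convert log marginal likelihoods into posterior model probabilities.

    Uses the log-sum-exp shift so that very negative log evidences do not
    underflow. Equal prior model probabilities are assumed by default.
    """
    n = len(log_marginal_likelihoods)
    if prior_probs is None:
        prior_probs = [1.0 / n] * n
    log_weights = [
        lml + math.log(p) for lml, p in zip(log_marginal_likelihoods, prior_probs)
    ]
    shift = max(log_weights)
    unnormalised = [math.exp(w - shift) for w in log_weights]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical log marginal likelihoods for five candidate models.
log_evidence = [-520.3, -515.8, -512.1, -511.9, -509.4]
probs = posterior_model_probabilities(log_evidence)

# Rank models from most to least probable (indices into log_evidence).
ranking = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
```

The resulting probabilities depend on how the marginal likelihoods were estimated, which is the sensitivity the abstract highlights.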
Bayesian population receptive field modelling.
Zeidman, Peter; Silson, Edward Harry; Schwarzkopf, Dietrich Samuel; Baker, Chris Ian; Penny, Will
2017-09-08
We introduce a probabilistic (Bayesian) framework and associated software toolbox for mapping population receptive fields (pRFs) based on fMRI data. This generic approach is intended to work with stimuli of any dimension and is demonstrated and validated in the context of 2D retinotopic mapping. The framework enables the experimenter to specify generative (encoding) models of fMRI timeseries, in which experimental stimuli enter a pRF model of neural activity, which in turns drives a nonlinear model of neurovascular coupling and Blood Oxygenation Level Dependent (BOLD) response. The neuronal and haemodynamic parameters are estimated together on a voxel-by-voxel or region-of-interest basis using a Bayesian estimation algorithm (variational Laplace). This offers several novel contributions to receptive field modelling. The variance/covariance of parameters are estimated, enabling receptive fields to be plotted while properly representing uncertainty about pRF size and location. Variability in the haemodynamic response across the brain is accounted for. Furthermore, the framework introduces formal hypothesis testing to pRF analysis, enabling competing models to be evaluated based on their log model evidence (approximated by the variational free energy), which represents the optimal tradeoff between accuracy and complexity. Using simulations and empirical data, we found that parameters typically used to represent pRF size and neuronal scaling are strongly correlated, which is taken into account by the Bayesian methods we describe when making inferences. We used the framework to compare the evidence for six variants of pRF model using 7 T functional MRI data and we found a circular Difference of Gaussians (DoG) model to be the best explanation for our data overall. We hope this framework will prove useful for mapping stimulus spaces with any number of dimensions onto the anatomy of the brain. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm⁻¹ were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R² value of validation was obtained with Bayes B and Bayes A. Among the FA, C10:0 (% of each FA on total FA basis) had the highest R² (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield had the highest R² (0.82, achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Informative priors on fetal fraction increase power of the noninvasive prenatal screen.
Xu, Hanli; Wang, Shaowei; Ma, Lin-Lin; Huang, Shuai; Liang, Lin; Liu, Qian; Liu, Yang-Yang; Liu, Ke-Di; Tan, Ze-Min; Ban, Hao; Guan, Yongtao; Lu, Zuhong
2017-11-09
Purpose: Noninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls, where the information of fetal fraction is not fully utilized. Here we present a Bayesian method that leverages informative priors on the fetal fraction. Method: Our Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. The Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values. Results: Our Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives. Conclusion: Bayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values. Genetics in Medicine advance online publication, 9 November 2017; doi:10.1038/gim.2017.186.
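How a Bayes factor combines with prior odds to give an individual predictive value, as the abstract above describes, can be illustrated with a few lines of Python. This is only a sketch of the generic odds form of Bayes' rule; it does not reproduce the authors' likelihood or fetal-fraction prior, and the Bayes factor and prior risks used below are hypothetical.

```python
def posterior_probability(bayes_factor, prior_probability):
    """Posterior probability of the hypothesis given a Bayes factor.

    posterior odds = Bayes factor * prior odds, then odds -> probability.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must lie strictly between 0 and 1")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Same hypothetical Bayes factor, two hypothetical prior risks
# (for example, reflecting different nongenetic risk factors).
low_risk = posterior_probability(bayes_factor=200.0, prior_probability=1 / 500)
high_risk = posterior_probability(bayes_factor=200.0, prior_probability=1 / 50)
```

With identical data (the same Bayes factor), the individual with the higher prior risk ends up with the higher posterior probability, which is the sense in which prior odds carry the nongenetic information.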
Application of Bayesian Methods for Detecting Fraudulent Behavior on Tests
ERIC Educational Resources Information Center
Sinharay, Sandip
2018-01-01
Producers and consumers of test scores are increasingly concerned about fraudulent behavior before and during the test. There exist several statistical or psychometric methods for detecting fraudulent behavior on tests. This paper provides a review of the Bayesian approaches among them. Four hitherto-unpublished real data examples are provided to…
Wilkinson, Michael
2014-03-01
Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood.
Karabatsos, George
2018-06-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon previous methods because it provides an omnibus test of the entire hierarchy of cancellation axioms, beyond double cancellation. It does so while accounting for the posterior uncertainty that is inherent in the empirical orderings that are implied by these axioms, together. The new method is illustrated through a test of the cancellation axioms on a classic survey data set, and through the analysis of simulated data.
Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina
2013-01-01
In the last decade, different statistical techniques have been introduced to improve assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. Firstly, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Secondly, we tested the accuracy of our model in the differentiation between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Despite this, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients of the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(z_i | D), of examinees' effort levels. In conclusion, both high classification accuracy levels and Bayesian individual estimates of effort may be very useful for clinicians when assessing for effort in medico-legal settings.
Bayesian inference for disease prevalence using negative binomial group testing
Pritchard, Nicholas A.; Tebbs, Joshua M.
2011-01-01
Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308
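The setting in the abstract above can be made concrete with a numerical sketch. The authors derive closed-form posteriors; the code below instead uses a simple grid approximation, which is a swapped-in technique chosen for brevity. It assumes pools of equal size `k`, perfect tests, sampling that stops once `r` positive pools are seen (with `y` negative pools observed along the way), and a Beta(a, b) prior on the prevalence `p`. All names and numbers are hypothetical.

```python
import math


def prevalence_posterior_mean(r, y, k, a=1.0, b=1.0, grid_size=20000):
    """Grid approximation to the posterior mean of prevalence p.

    A pool of k individuals tests positive with probability
    theta = 1 - (1 - p)**k. Under inverse (negative binomial) sampling the
    likelihood is proportional to theta**r * (1 - theta)**y.
    """
    grid = [i / grid_size for i in range(1, grid_size)]
    log_post = []
    for p in grid:
        theta = 1.0 - (1.0 - p) ** k
        log_post.append(
            r * math.log(theta)
            + y * k * math.log(1.0 - p)        # (1 - theta)**y = (1 - p)**(k*y)
            + (a - 1.0) * math.log(p)          # Beta(a, b) prior kernel
            + (b - 1.0) * math.log(1.0 - p)
        )
    shift = max(log_post)                      # avoid underflow before exp
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return sum(p * w for p, w in zip(grid, weights)) / total


# Hypothetical data: stop after 5 positive pools of size 10.
few_negatives = prevalence_posterior_mean(r=5, y=20, k=10)
many_negatives = prevalence_posterior_mean(r=5, y=200, k=10)
```

With `k = 1` and a uniform prior the problem reduces to ordinary negative binomial sampling, where the posterior is Beta(r + 1, y + 1); this gives a convenient check on the grid approximation.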
Too good to be true: publication bias in two prominent studies from experimental psychology.
Francis, Gregory
2012-04-01
Empirical replication has long been considered the final arbiter of phenomena in science, but replication is undermined when there is evidence for publication bias. Evidence for publication bias in a set of experiments can be found when the observed number of rejections of the null hypothesis exceeds the expected number of rejections. Application of this test reveals evidence of publication bias in two prominent investigations from experimental psychology that have purported to reveal evidence of extrasensory perception and to indicate severe limitations of the scientific method. The presence of publication bias suggests that those investigations cannot be taken as proper scientific studies of such phenomena, because critical data are not available to the field. Publication bias could partly be avoided if experimental psychologists started using Bayesian data analysis techniques.
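The comparison of observed and expected rejections described in the abstract above can be sketched numerically. This is an illustrative sketch rather than the author's exact procedure: it assumes each experiment's power has already been estimated, treats experiments as independent, and uses hypothetical power values. The function name `prob_at_least` is also hypothetical.

```python
def prob_at_least(powers, observed_rejections):
    """Probability of at least `observed_rejections` rejections of the null.

    Each experiment rejects independently with probability equal to its
    estimated power (a Poisson-binomial distribution, built up one
    experiment at a time).
    """
    dist = [1.0]  # dist[j] = P(exactly j rejections so far)
    for power in powers:
        updated = [0.0] * (len(dist) + 1)
        for j, prob in enumerate(dist):
            updated[j] += prob * (1.0 - power)   # this experiment fails to reject
            updated[j + 1] += prob * power       # this experiment rejects
        dist = updated
    return sum(dist[observed_rejections:])


# Hypothetical set: ten experiments, each with estimated power 0.6,
# all ten reported as rejecting the null hypothesis.
powers = [0.6] * 10
expected_rejections = sum(powers)
p_all_reject = prob_at_least(powers, observed_rejections=10)
```

Here about six rejections are expected, so ten out of ten is improbable under these assumptions. A small probability of this kind (a cutoff such as 0.1 is a common convention, taken here as an assumption) would count as evidence that some non-significant results are missing.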
Donaldson, Theodore; Wollert, Richard
2008-06-01
Expert witnesses in sexually violent predator (SVP) cases often rely on actuarial instruments to make risk determinations. Many questions surround their use, however. Bayes's Theorem holds much promise for addressing these questions. Some experts nonetheless claim that Bayesian analyses are inadmissible in SVP cases because they are not accepted by the relevant scientific community. This position is illogical because Bayes's Theorem is simply a probabilistic restatement of the way that frequency data are combined to arrive at whatever recidivism rates are paired with each test score in an actuarial table. This article presents a mathematical proof and example validating this assertion. The advantages and implications of a logic model that combines Bayes's Theorem and the null hypothesis are also discussed.
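The claim in the abstract above, that Bayes's Theorem restates how frequency data yield the rate attached to each score in an actuarial table, can be checked on a toy table. The counts and score bands below are invented for illustration and do not come from any actuarial instrument.

```python
# Hypothetical counts: score band -> (recidivists, non-recidivists).
counts = {"low": (10, 190), "medium": (30, 120), "high": (40, 60)}


def rate_from_frequencies(table, score):
    """Recidivism rate read directly off the frequency table."""
    recid, non_recid = table[score]
    return recid / (recid + non_recid)


def rate_from_bayes(table, score):
    """The same rate computed via Bayes's Theorem.

    P(recid | score) = P(score | recid) * P(recid) / P(score)
    """
    total_recid = sum(r for r, _ in table.values())
    total_non = sum(n for _, n in table.values())
    base_rate = total_recid / (total_recid + total_non)
    recid, non_recid = table[score]
    p_score_given_recid = recid / total_recid
    p_score_given_non = non_recid / total_non
    numerator = p_score_given_recid * base_rate
    evidence = numerator + p_score_given_non * (1.0 - base_rate)
    return numerator / evidence


direct = {s: rate_from_frequencies(counts, s) for s in counts}
bayes = {s: rate_from_bayes(counts, s) for s in counts}
```

Both routes give the same rate for every score band, since the Bayesian calculation only regroups the same counts.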
NASA Astrophysics Data System (ADS)
Gomes, Guilherme J. C.; Vrugt, Jasper A.; Vargas, Eurípedes A.
2016-04-01
The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on fresh-bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extend the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model accurately predicts the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.
Modeling the Evolution of Beliefs Using an Attentional Focus Mechanism
Marković, Dimitrije; Gläscher, Jan; Bossaerts, Peter; O’Doherty, John; Kiebel, Stefan J.
2015-01-01
For making decisions in everyday life we often have first to infer the set of environmental features that are relevant for the current task. Here we investigated the computational mechanisms underlying the evolution of beliefs about the relevance of environmental features in a dynamical and noisy environment. For this purpose we designed a probabilistic Wisconsin card sorting task (WCST) with belief solicitation, in which subjects were presented with stimuli composed of multiple visual features. At each moment in time a particular feature was relevant for obtaining reward, and participants had to infer which feature was relevant and report their beliefs accordingly. To test the hypothesis that attentional focus modulates the belief update process, we derived and fitted several probabilistic and non-probabilistic behavioral models, which either incorporate a dynamical model of attentional focus, in the form of a hierarchical winner-take-all neuronal network, or a diffusive model, without attention-like features. We used Bayesian model selection to identify the most likely generative model of subjects’ behavior and found that attention-like features in the behavioral model are essential for explaining subjects’ responses. Furthermore, we demonstrate a method for integrating both connectionist and Bayesian models of decision making within a single framework that allowed us to infer hidden belief processes of human subjects. PMID:26495984
Combining statistical inference and decisions in ecology
Williams, Perry J.; Hooten, Mevin B.
2016-01-01
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation, and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem.
2018-01-01
The genus Liolaemus comprises more than 260 species and can be divided in two subgenera: Eulaemus and Liolaemus sensu stricto. In this paper, we present a phylogenetic analysis, divergence times, and ancestral distribution ranges of the Liolaemus alticolor-bibronii group (Liolaemus sensu stricto subgenus). We inferred a total evidence phylogeny combining molecular (Cytb and 12S genes) and morphological characters using Maximum Parsimony and Bayesian Inference. Divergence times were calculated using Bayesian MCMC with an uncorrelated lognormal distributed relaxed clock, calibrated with a fossil record. Ancestral ranges were estimated using the Dispersal-Extinction-Cladogenesis (DEC-Lagrange). Effects of some a priori parameters of DEC were also tested. Distribution ranged from central Perú to southern Argentina, including areas at sea level up to the high Andes. The L. alticolor-bibronii group was recovered as monophyletic, formed by two clades: L. walkeri and L. gracilis, the latter can be split in two groups. Additionally, many species candidates were recognized. We estimate that the L. alticolor-bibronii group diversified 14.5 Myr ago, during the Middle Miocene. Our results suggest that the ancestor of the Liolaemus alticolor-bibronii group was distributed in a wide area including Patagonia and Puna highlands. The speciation pattern follows the South-North Diversification Hypothesis, following the Andean uplift. PMID:29479502
From reading numbers to seeing ratios: a benefit of icons for risk comprehension.
Tubau, Elisabet; Rodríguez-Ferreiro, Javier; Barberia, Itxaso; Colomé, Àngels
2018-06-21
Promoting a better understanding of statistical data is becoming increasingly important for improving risk comprehension and decision-making. In this regard, previous studies on Bayesian problem solving have shown that iconic representations help infer frequencies in sets and subsets. Nevertheless, the mechanisms by which icons enhance performance remain unclear. Here, we tested the hypothesis that the benefit offered by icon arrays lies in a better alignment between presented and requested relationships, which should facilitate the comprehension of the requested ratio beyond the represented quantities. To this end, we analyzed individual risk estimates based on data presented either in standard verbal presentations (percentages and natural frequency formats) or as icon arrays. Compared to the other formats, icons led to estimates that were more accurate, and importantly, promoted the use of equivalent expressions for the requested probability. Furthermore, whereas the accuracy of the estimates based on verbal formats depended on their alignment with the text, all the estimates based on icons were equally accurate. Therefore, these results support the proposal that icons enhance the comprehension of the ratio and its mapping onto the requested probability and point to relational misalignment as potential interference for text-based Bayesian reasoning. The present findings also argue against an intrinsic difficulty with understanding single-event probabilities.
Genetic Structure of Bluefin Tuna in the Mediterranean Sea Correlates with Environmental Variables
Riccioni, Giulia; Stagioni, Marco; Landi, Monica; Ferrara, Giorgia; Barbujani, Guido; Tinti, Fausto
2013-01-01
Background: Atlantic Bluefin Tuna (ABFT) shows complex demography and ecological variation in the Mediterranean Sea. Genetic surveys have detected significant, although weak, signals of population structuring; catch series analyses and tagging programs identified complex ABFT spatial dynamics and migration patterns. Here, we tested the hypothesis that the genetic structure of the ABFT in the Mediterranean is correlated with mean surface temperature and salinity. Methodology: We used six samples collected from Western and Central Mediterranean integrated with a new sample collected from the recently identified easternmost reproductive area of Levantine Sea. To assess population structure in the Mediterranean we used a multidisciplinary framework combining classical population genetics, spatial and Bayesian clustering methods and a multivariate approach based on factor analysis. Conclusions: FST analysis and Bayesian clustering methods detected several subpopulations in the Mediterranean, a result also supported by multivariate analyses. In addition, we identified significant correlations of genetic diversity with mean salinity and surface temperature values revealing that ABFT is genetically structured along two environmental gradients. These results suggest that a preference for some spawning habitat conditions could contribute to shape ABFT genetic structuring in the Mediterranean. However, further studies should be performed to assess to what extent ABFT spawning behaviour in the Mediterranean Sea can be affected by environmental variation. PMID:24260341
Statistical Symbolic Execution with Informed Sampling
NASA Technical Reports Server (NTRS)
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that the informed sampling obtains more precise results and converges faster than a purely statistical analysis and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
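The Bayesian estimation step described in this abstract can be sketched, under simplifying assumptions, as a Beta-Bernoulli update: each sampled path either reaches the target event or does not. The Python below is only an illustration. The toy predicate `reaches_target`, the uniform input distribution, and the Beta(1, 1) prior are assumptions, it does not use the Symbolic PathFinder API, and it does not implement the pruning step of Informed Sampling.

```python
import random

def reaches_target(x):
    """Toy stand-in for one sampled program path (hypothetical target event)."""
    return x * x > 0.81

def estimate_reach_probability(n_samples, seed=0, prior=(1.0, 1.0)):
    """Monte Carlo sampling followed by a conjugate Beta-Bernoulli update.

    Returns the posterior mean and variance of the probability of
    reaching the target event.
    """
    rng = random.Random(seed)
    hits = sum(reaches_target(rng.uniform(-1.0, 1.0)) for _ in range(n_samples))
    alpha = prior[0] + hits
    beta = prior[1] + (n_samples - hits)
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, var

mean, var = estimate_reach_probability(10_000)
```

For this toy predicate the true probability is 0.1 (inputs with absolute value above 0.9), so the posterior mean should settle near that value as the number of samples grows, with the posterior variance shrinking accordingly.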
Gu, Hairong; Kim, Woojae; Hou, Fang; Lesmes, Luis Andres; Pitt, Mark A.; Lu, Zhong-Lin; Myung, Jay I.
2016-01-01
Measurement efficiency is of concern when a large number of observations are required to obtain reliable estimates for parametric models of vision. The standard entropy-based Bayesian adaptive testing procedures addressed the issue by selecting the most informative stimulus in sequential experimental trials. Noninformative, diffuse priors were commonly used in those tests. Hierarchical adaptive design optimization (HADO; Kim, Pitt, Lu, Steyvers, & Myung, 2014) further improves the efficiency of the standard Bayesian adaptive testing procedures by constructing an informative prior using data from observers who have already participated in the experiment. The present study represents an empirical validation of HADO in estimating the human contrast sensitivity function. The results show that HADO significantly improves the accuracy and precision of parameter estimates, and therefore requires many fewer observations to obtain reliable inference about contrast sensitivity, compared to the method of quick contrast sensitivity function (Lesmes, Lu, Baek, & Albright, 2010), which uses the standard Bayesian procedure. The improvement with HADO was maintained even when the prior was constructed from heterogeneous populations or a relatively small number of observers. These results of this case study support the conclusion that HADO can be used in Bayesian adaptive testing by replacing noninformative, diffuse priors with statistically justified informative priors without introducing unwanted bias. PMID:27105061
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
Exact Bayesian p-values for a test of independence in a 2 × 2 contingency table with missing data.
Lin, Yan; Lipsitz, Stuart R; Sinha, Debajyoti; Fitzmaurice, Garrett; Lipshultz, Steven
2017-01-01
Altham (Altham PME. Exact Bayesian analysis of a 2 × 2 contingency table, and Fisher's "exact" significance test. J R Stat Soc B 1969; 31: 261-269) showed that a one-sided p-value from Fisher's exact test of independence in a 2 × 2 contingency table is equal to the posterior probability of negative association in the 2 × 2 contingency table under a Bayesian analysis using an improper prior. We derive an extension of Fisher's exact test p-value in the presence of missing data, assuming the missing data mechanism is ignorable (i.e., missing at random or completely at random). Further, we propose Bayesian p-values for a test of independence in a 2 × 2 contingency table with missing data using alternative priors; we also present results from a simulation study exploring the Type I error rate and power of the proposed exact test p-values. An example, using data on the association between blood pressure and a cardiac enzyme, is presented to illustrate the methods.
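A small numerical check of the equivalence attributed to Altham can be sketched as follows. The table counts are made up, and the specific form of the improper prior is an assumption not stated in this abstract: Beta(0, 1) on the first row's success probability and Beta(1, 0) on the second. Under that assumption the posteriors are Beta(a, b + 1) and Beta(c + 1, d), and the Monte Carlo estimate of P(p1 < p2 | data) should agree with the one-sided Fisher p-value up to simulation error.

```python
import random
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null with all margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    return sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1)) / total

def posterior_negative_association(a, b, c, d, draws=200_000, seed=1):
    """Monte Carlo estimate of P(p1 < p2 | data).

    Assumption: priors Beta(0, 1) on p1 and Beta(1, 0) on p2, so the
    posteriors are Beta(a, b + 1) and Beta(c + 1, d). Requires a > 0
    and d > 0 so that both posteriors are proper.
    """
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b + 1) < rng.betavariate(c + 1, d)
               for _ in range(draws))
    return hits / draws

# Hypothetical complete-data table: rows (3, 1) and (1, 3).
p_fisher = fisher_one_sided(3, 1, 1, 3)                # exactly 17/70
p_bayes = posterior_negative_association(3, 1, 1, 3)   # close to 17/70
```

The sketch covers complete data only; the missing-data extension and the alternative priors proposed in the paper are not reproduced here.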
Using the Bayes Factors to Evaluate Person Fit in the Item Response Theory
ERIC Educational Resources Information Center
Pan, Tianshu; Yin, Yue
2017-01-01
In this article, we propose using the Bayes factors (BF) to evaluate person fit in item response theory models under the framework of Bayesian evaluation of an informative diagnostic hypothesis. We first discuss the theoretical foundation for this application and how to analyze person fit using BF. To demonstrate the feasibility of this approach,…
A Bayesian Perspective on the Reproducibility Project: Psychology
Etz, Alexander; Vandekerckhove, Joachim
2016-01-01
We revisit the results of the recent Reproducibility Project: Psychology by the Open Science Collaboration. We compute Bayes factors—a quantity that can be used to express comparative evidence for an hypothesis but also for the null hypothesis—for a large subset (N = 72) of the original papers and their corresponding replication attempts. In our computation, we take into account the likely scenario that publication bias had distorted the originally published results. Overall, 75% of studies gave qualitatively similar results in terms of the amount of evidence provided. However, the evidence was often weak (i.e., Bayes factor < 10). The majority of the studies (64%) did not provide strong evidence for either the null or the alternative hypothesis in either the original or the replication, and no replication attempts provided strong evidence in favor of the null. In all cases where the original paper provided strong evidence but the replication did not (15%), the sample size in the replication was smaller than the original. Where the replication provided strong evidence but the original did not (10%), the replication sample size was larger. We conclude that the apparent failure of the Reproducibility Project to replicate many target effects can be adequately explained by overestimation of effect sizes (or overestimation of evidence against the null hypothesis) due to small sample sizes and publication bias in the psychological literature. We further conclude that traditional sample sizes are insufficient and that a more widespread adoption of Bayesian methods is desirable. PMID:26919473
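To make the notion of a Bayes factor concrete, here is a deliberately simple sketch for a binomial outcome. It is not the computation used in the paper (which also models publication bias). The uniform prior under the alternative, the point null at 0.5, and the function names are assumptions chosen for illustration. Only the cutoff of 10 for "strong" evidence is taken from the abstract.

```python
from math import comb

def bf10_binomial(k, n, theta0=0.5):
    """Bayes factor for H1: theta ~ Uniform(0, 1) against H0: theta = theta0."""
    # Integrating the binomial likelihood over a uniform prior gives 1 / (n + 1).
    marginal_h1 = 1.0 / (n + 1)
    likelihood_h0 = comb(n, k) * theta0 ** k * (1 - theta0) ** (n - k)
    return marginal_h1 / likelihood_h0

def label(bf, threshold=10.0):
    """Classify evidence using the cutoff of 10 quoted in the abstract."""
    if bf >= threshold:
        return "strong for H1"
    if bf <= 1.0 / threshold:
        return "strong for H0"
    return "weak"

# 60 successes in 100 trials: the data barely discriminate between H0 and H1.
bf = bf10_binomial(k=60, n=100)
verdict = label(bf)
```

Because the same quantity can fall below 1/10 as well as above 10, a Bayes factor can express evidence for the null hypothesis, which is the property the abstract emphasizes.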
Using Bayesian Networks to Improve Knowledge Assessment
ERIC Educational Resources Information Center
Millan, Eva; Descalco, Luis; Castillo, Gladys; Oliveira, Paula; Diogo, Sandra
2013-01-01
In this paper, we describe the integration and evaluation of an existing generic Bayesian student model (GBSM) into an existing computerized testing system within the Mathematics Education Project (PmatE--Projecto Matematica Ensino) of the University of Aveiro. This generic Bayesian student model had been previously evaluated with simulated…
Using Alien Coins to Test Whether Simple Inference Is Bayesian
ERIC Educational Resources Information Center
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Smilanich, Angela M; Fincher, R Malia; Dyer, Lee A
2016-05-01
According to the plant-apparency hypothesis, apparent plants allocate resources to quantitative defenses that negatively affect generalist and specialist herbivores, while unapparent plants invest more in qualitative defenses that negatively affect nonadapted generalists. Although this hypothesis has provided a useful framework for understanding the evolution of plant chemical defense, there are many inconsistencies surrounding associated predictions, and it has been heavily criticized and deemed obsolete. We used a hierarchical Bayesian meta-analysis model to test whether defenses from apparent and unapparent plants differ in their effects on herbivores. We collected a total of 225 effect sizes from 158 published papers in which the effects of plant chemistry on herbivore performance were reported. As predicted by the plant-apparency hypothesis, we found a prevalence of quantitative defenses in woody plants and qualitative defenses in herbaceous plants. However, the detrimental impacts of qualitative defenses were more effective against specialists than generalists, and the effects of chemical defenses did not significantly differ between specialists and generalists for woody or herbaceous plants. A striking pattern that emerged from our data was a pervasiveness of beneficial effects of secondary metabolites on herbivore performance, especially generalists. This pattern provides evidence that herbivores are evolving effective counteradaptations to putative plant defenses. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Population Genetic Structure of the Tropical Two-Wing Flyingfish (Exocoetus volitans)
Lewallen, Eric A.; Bohonak, Andrew J.; Bonin, Carolina A.; van Wijnen, Andre J.; Pitman, Robert L.; Lovejoy, Nathan R.
2016-01-01
Delineating populations of pantropical marine fish is a difficult process, due to widespread geographic ranges and complex life history traits in most species. Exocoetus volitans, a species of two-winged flyingfish, is a good model for understanding large-scale patterns of epipelagic fish population structure because it has a circumtropical geographic range and completes its entire life cycle in the epipelagic zone. Buoyant pelagic eggs should dictate high local dispersal capacity in this species, although a brief larval phase, small body size, and short lifespan may limit the dispersal of individuals over large spatial scales. Based on these biological features, we hypothesized that E. volitans would exhibit statistically and biologically significant population structure defined by recognized oceanographic barriers. We tested this hypothesis by analyzing cytochrome b mtDNA sequence data (1106 bps) from specimens collected in the Pacific, Atlantic and Indian oceans (n = 266). AMOVA, Bayesian, and coalescent analytical approaches were used to assess and interpret population-level genetic variability. A parsimony-based haplotype network did not reveal population subdivision among ocean basins, but AMOVA revealed limited, statistically significant population structure between the Pacific and Atlantic Oceans (ΦST = 0.035, p<0.001). A spatially-unbiased Bayesian approach identified two circumtropical population clusters north and south of the Equator (ΦST = 0.026, p<0.001), a previously unknown dispersal barrier for an epipelagic fish. Bayesian demographic modeling suggested the effective population size of this species increased by at least an order of magnitude ~150,000 years ago, to more than 1 billion individuals currently. Thus, high levels of genetic similarity observed in E. volitans can be explained by high rates of gene flow, a dramatic and recent population expansion, as well as extensive and consistent dispersal throughout the geographic range of the species. 
PMID:27736863
Prowse, Thomas A A; Correll, Rachel A; Johnson, Christopher N; Prideaux, Gavin J; Brook, Barry W
2015-01-01
Life-history theory predicts the progressive dwarfing of animal populations that are subjected to chronic mortality stress, but the evolutionary impact of harvesting terrestrial herbivores has seldom been tested. In Australia, marsupials of the genus Macropus (kangaroos and wallabies) are subjected to size-selective commercial harvesting. Mathematical modelling suggests that harvest quotas (c. 10-20% of population estimates annually) could be driving body-size evolution in these species. We tested this hypothesis for three harvested macropod species with continental-scale distributions. To do so, we measured more than 2000 macropod skulls sourced from wildlife collections spanning the last 130 years. We analysed these data using spatial Bayesian models that controlled for the age and sex of specimens as well as environmental drivers and island effects. We found no evidence for the hypothesized decline in body size for any species; rather, models that fit trend terms supported minor body size increases over time. This apparently counterintuitive result is consistent with reduced mortality due to a depauperate predator guild and increased primary productivity of grassland vegetation following European settlement in Australia. Spatial patterns in macropod body size supported the heat dissipation limit and productivity hypotheses proposed to explain geographic body-size variation (i.e. skull size increased with decreasing summer maximum temperature and increasing rainfall, respectively). There is no empirical evidence that size-selective harvesting has driven the evolution of smaller body size in Australian macropods. Bayesian models are appropriate for investigating the long-term impact of human harvesting because they can impute missing data, fit nonlinear growth models and account for non-random spatial sampling inherent in wildlife collections. © 2014 The Authors. Journal of Animal Ecology © 2014 British Ecological Society.
Bayesian Learning and the Psychology of Rule Induction
ERIC Educational Resources Information Center
Endress, Ansgar D.
2013-01-01
In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to…
Lawlor-Savage, Linette; Goghari, Vina M.
2017-01-01
Training of working memory as a method of increasing working memory capacity and fluid intelligence has received much attention in recent years. This burgeoning field remains highly controversial with empirically-backed disagreements at all levels of evidence, including individual studies, systematic reviews, and even meta-analyses. The current study investigated the effect of a randomized six week online working memory intervention on untrained cognitive abilities in a community-recruited sample of healthy young adults, in relation to both a processing speed training active control condition, as well as a no-contact control condition. Results of traditional null hypothesis significance testing, as well as Bayesian factor analyses, revealed support for the null hypothesis across all cognitive tests administered before and after training. Importantly, all three groups were similar at pre-training for a variety of individual variables purported to moderate transfer of training to fluid intelligence, including personality traits, motivation to train, and expectations of cognitive improvement from training. Because these results are consistent with experimental trials of equal or greater methodological rigor, we suggest that future research re-focus on: 1) other promising interventions known to increase memory performance in healthy young adults, and; 2) examining sub-populations or alternative populations in which working memory training may be efficacious. PMID:28558000
Beyond statistical inference: a decision theory for science.
Killeen, Peter R
2006-08-01
Traditional null hypothesis significance testing does not yield the probability of the null or its alternative and, therefore, cannot logically ground scientific decisions. The decision theory proposed here calculates the expected utility of an effect on the basis of (1) the probability of replicating it and (2) a utility function on its size. It takes significance tests--which place all value on the replicability of an effect and none on its magnitude--as a special case, one in which the cost of a false positive is revealed to be an order of magnitude greater than the value of a true positive. More realistic utility functions credit both replicability and effect size, integrating them for a single index of merit. The analysis incorporates opportunity cost and is consistent with alternate measures of effect size, such as r² and information transmission, and with Bayesian model selection criteria. An alternate formulation is functionally equivalent to the formal theory, transparent, and easy to compute.
Acerbi, Enzo; Viganò, Elena; Poidinger, Michael; Mortellaro, Alessandra; Zelante, Teresa; Stella, Fabio
2016-01-01
T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian networks over a time-course gene expression dataset, we inferred the global regulatory network controlling TH17 differentiation. From the network, we identified the Prdm1 gene encoding the B lymphocyte-induced maturation protein 1 as a crucial negative regulator of human TH17 cell differentiation. The results have been validated by perturbing Prdm1 expression on freshly isolated CD4+ naïve T cells: reduction of Prdm1 expression leads to augmentation of IL-17 release. These data unravel a possible novel target to control TH17 polarization in inflammatory disorders. Furthermore, this study represents the first in vitro validation of continuous time Bayesian networks as gene network reconstruction method and as hypothesis generation tool for wet-lab biological experiments. PMID:26976045
Towards a framework for testing general relativity with extreme-mass-ratio-inspiral observations
NASA Astrophysics Data System (ADS)
Chua, A. J. K.; Hee, S.; Handley, W. J.; Higson, E.; Moore, C. J.; Gair, J. R.; Hobson, M. P.; Lasenby, A. N.
2018-07-01
Extreme-mass-ratio-inspiral observations from future space-based gravitational-wave detectors such as LISA will enable strong-field tests of general relativity with unprecedented precision, but at prohibitive computational cost if existing statistical techniques are used. In one such test that is currently employed for LIGO black hole binary mergers, generic deviations from relativity are represented by N deformation parameters in a generalized waveform model; the Bayesian evidence for each of its 2^N combinatorial submodels is then combined into a posterior odds ratio for modified gravity over relativity in a null-hypothesis test. We adapt and apply this test to a generalized model for extreme-mass-ratio inspirals constructed on deformed black hole spacetimes, and focus our investigation on how computational efficiency can be increased through an evidence-free method of model selection. This method is akin to the algorithm known as product-space Markov chain Monte Carlo, but uses nested sampling and improved error estimates from a rethreading technique. We perform benchmarking and robustness checks for the method, and find order-of-magnitude computational gains over regular nested sampling in the case of synthetic data generated from the null model.
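The evidence-based test that this abstract takes as its starting point can be sketched as follows. With N deformation parameters there is one submodel per subset of parameters allowed to be non-zero; the empty subset is relativity itself. The Python below combines per-submodel evidences into a single odds ratio, assuming prior odds of 1 between the two hypotheses and equal prior weight on every non-empty submodel. Those weighting choices, the function names, and the log-evidence values are all illustrative assumptions, not numbers or code from the paper.

```python
from itertools import combinations
from math import exp

def submodels(n_params):
    """All non-empty subsets of the N deformation parameters (2**N - 1 of them)."""
    idx = range(n_params)
    return [s for r in range(1, n_params + 1) for s in combinations(idx, r)]

def odds_modified_vs_gr(log_evidence, log_evidence_gr, n_params):
    """Posterior odds for 'some deformation is non-zero' against relativity.

    Assumptions: prior odds of 1 between the two hypotheses and equal
    prior weight on every non-empty submodel.
    """
    models = submodels(n_params)
    bayes_factors = [exp(log_evidence[m] - log_evidence_gr) for m in models]
    return sum(bayes_factors) / len(models)

# Hypothetical log-evidences for N = 2 (made-up numbers).
log_z_gr = -100.0
log_z = {(0,): -101.0, (1,): -100.5, (0, 1): -102.0}
odds = odds_modified_vs_gr(log_z, log_z_gr, n_params=2)
```

The cost concern raised in the abstract is visible in `submodels`: the number of evidences to compute grows as 2^N, which is what motivates an evidence-free alternative.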
Small sample mediation testing: misplaced confidence in bootstrapped confidence intervals.
Koopman, Joel; Howe, Michael; Hollenbeck, John R; Sin, Hock-Peng
2015-01-01
Bootstrapping is an analytical tool commonly used in psychology to test the statistical significance of the indirect effect in mediation models. Bootstrapping proponents have particularly advocated for its use for samples of 20-80 cases. This advocacy has been heeded, especially in the Journal of Applied Psychology, as researchers are increasingly utilizing bootstrapping to test mediation with samples in this range. We discuss reasons to be concerned with this escalation, and in a simulation study focused specifically on this range of sample sizes, we demonstrate not only that bootstrapping has insufficient statistical power to provide a rigorous hypothesis test in most conditions but also that bootstrapping has a tendency to exhibit an inflated Type I error rate. We then extend our simulations to investigate an alternative empirical resampling method as well as a Bayesian approach and demonstrate that they exhibit comparable statistical power to bootstrapping in small samples without the associated inflated Type I error. Implications for researchers testing mediation hypotheses in small samples are presented. For researchers wishing to use these methods in their own research, we have provided R syntax in the online supplemental materials. (c) 2015 APA, all rights reserved.
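For readers unfamiliar with the procedure under discussion, a percentile bootstrap of the indirect effect in a simple mediation model (X to M to Y) can be sketched as below. This is not the authors' supplemental R syntax. It is a plain Python illustration in which the simulated data, path values, sample size, and function names are assumptions. The indirect effect is the product a*b, with a from regressing M on X and b the coefficient of M when Y is regressed on M and X.

```python
import random

def _centered_sums(u, v):
    """Sum of cross-products of u and v around their means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def indirect_effect(x, m, y):
    """a*b: a from regressing M on X, b from regressing Y on M and X."""
    sxx, smm = _centered_sums(x, x), _centered_sums(m, m)
    sxm, sxy, smy = _centered_sums(x, m), _centered_sums(x, y), _centered_sums(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        # Resample cases with replacement, keeping (x, m, y) triples together.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    stats.sort()
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Simulated small sample (n = 50) with true paths a = b = 0.5; illustrative only.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(50)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + rng.gauss(0, 1) for mi in m]
lo, hi = percentile_bootstrap_ci(x, m, y)
```

The mediation hypothesis is usually judged by whether the interval excludes zero. The abstract's point is that, at sample sizes like this one, that decision rule can be both underpowered and prone to false positives.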
Multi-Sensor Information Integration and Automatic Understanding
2008-11-01
also produced a real-time implementation of the tracking and anomalous behavior detection system that runs on real-world data – either using real-time...surveillance and airborne IED detection. 15. SUBJECT TERMS Multi-hypothesis tracking, particle filters, anomalous behavior detection, Bayesian...analyst to support decision making with large data sets. A key feature of the real-time tracking and behavior detection system developed is that the
2015-07-01
undergraduate student coauthors Aashish Jindia, Parag Srivastava, and Jay Jin for help with the research. In addition, thank you to the numerous...103 A.1.1 Sacramento Data Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 A.1.2 RadMap and SUNS Data Sets...parameters in a joint hypothesis space. We develop scalable branch and bound and pruning mechanisms for searching (at multiple resolutions) over source
Computational Nosology and Precision Psychiatry
Redish, A. David; Gordon, Joshua A.
2017-01-01
This article provides an illustrative treatment of psychiatric morbidity that offers an alternative to the standard nosological model in psychiatry. It considers what would happen if we treated diagnostic categories not as causes of signs and symptoms, but as diagnostic consequences of psychopathology and pathophysiology. This reformulation (of the standard nosological model) opens the door to a more natural description of how patients present—and of their likely responses to therapeutic interventions. In brief, we describe a model that generates symptoms, signs, and diagnostic outcomes from latent psychopathological states. In turn, psychopathology is caused by pathophysiological processes that are perturbed by (etiological) causes such as predisposing factors, life events, and therapeutic interventions. The key advantages of this nosological formulation include (i) the formal integration of diagnostic (e.g., DSM) categories and latent psychopathological constructs (e.g., the dimensions of the Research Domain Criteria); (ii) the provision of a hypothesis or model space that accommodates formal, evidence-based hypothesis testing (using Bayesian model comparison); and (iii) the ability to predict therapeutic responses (using a posterior predictive density), as in precision medicine. These and other advantages are largely promissory at present: The purpose of this article is to show what might be possible, through the use of idealized simulations. PMID:29400354
Mitochondrial genomes of two Australian fishflies with an evolutionary timescale of Chauliodinae.
Yang, Fan; Jiang, Yunlan; Yang, Ding; Liu, Xingyue
2017-06-30
Fishflies (Corydalidae: Chauliodinae) with a total of ca. 130 extant species are one of the major groups of the holometabolous insect order Megaloptera. As a group which originated during the Mesozoic, the phylogeny and historical biogeography of fishflies are of high interest. The previous hypothesis on the evolutionary history of fishflies was based primarily on morphological data. To further test the existing phylogenetic relationships and to understand the divergence pattern of fishflies, we conducted a molecule-based study. We determined the complete mitochondrial (mt) genomes of two Australian fishfly species, Archichauliodes deceptor Kimmins, 1954 and Protochauliodes biconicus Kimmins, 1954, both members of a major subgroup of Chauliodinae with high phylogenetic significance. A phylogenomic analysis was carried out based on 13 mt protein-coding genes (PCGs) and two rRNA genes from the megalopteran species with determined mt genomes. Both maximum likelihood and Bayesian inference analyses recovered the Dysmicohermes clade as the sister group of the Archichauliodes clade + the Protochauliodes clade, which is consistent with the previous morphology-based hypothesis. The divergence time estimation suggested that the divergence among the three major subgroups of fishflies occurred during the Late Jurassic and Early Cretaceous when the supercontinent Pangaea was undergoing sequential breakup.
Gompert, Zachariah; Lucas, Lauren K; Nice, Chris C; Fordyce, James A; Forister, Matthew L; Buerkle, C Alex
2012-07-01
Speciation is the process by which reproductively isolated lineages arise, and is one of the fundamental means by which the diversity of life increases. Whereas numerous studies have documented an association between ecological divergence and reproductive isolation, relatively little is known about the role of natural selection in genome divergence during the process of speciation. Here, we use genome-wide DNA sequences and Bayesian models to test the hypothesis that loci under divergent selection between two butterfly species (Lycaeides idas and L. melissa) also affect fitness in an admixed population. Locus-specific measures of genetic differentiation between L. idas and L. melissa and genomic introgression in hybrids varied across the genome. The most differentiated genetic regions were characterized by elevated L. idas ancestry in the admixed population, which occurs in L. idas-like habitat, consistent with the hypothesis that local adaptation contributes to speciation. Moreover, locus-specific measures of genetic differentiation (a metric of divergent selection) were positively associated with extreme genomic introgression (a metric of hybrid fitness). Interestingly, concordance of differentiation and introgression was only partial. We discuss multiple, complementary explanations for this partial concordance. © 2012 The Author(s).
Bayesian Inference in the Modern Design of Experiments
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
Bayesian Estimation Supersedes the "t" Test
ERIC Educational Resources Information Center
Kruschke, John K.
2013-01-01
Bayesian estimation for 2 groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data. The method handles outliers. The decision rule can accept the null value (unlike traditional "t" tests) when certainty in the estimate is…
Hozo, Iztok; Schell, Michael J; Djulbegovic, Benjamin
2008-07-01
The absolute truth in research is unobtainable, as no evidence or research hypothesis is ever 100% conclusive. Therefore, all data and inferences can in principle be considered as "inconclusive." Scientific inference and decision-making need to take into account errors, which are unavoidable in the research enterprise. The errors can occur at the level of conclusions that aim to discern the truthfulness of research hypothesis based on the accuracy of research evidence and hypothesis, and decisions, the goal of which is to enable optimal decision-making under present and specific circumstances. To optimize the chance of both correct conclusions and correct decisions, the synthesis of all major statistical approaches to clinical research is needed. The integration of these approaches (frequentist, Bayesian, and decision-analytic) can be accomplished through formal risk:benefit (R:B) analysis. This chapter illustrates the rational choice of a research hypothesis using R:B analysis based on decision-theoretic expected utility theory framework and the concept of "acceptable regret" to calculate the threshold probability of the "truth" above which the benefit of accepting a research hypothesis outweighs its risks.
Causal learning with local computations.
Fernbach, Philip M; Sloman, Steven A
2009-05-01
The authors proposed and tested a psychological theory of causal structure learning based on local computations. Local computations simplify complex learning problems via cues available on individual trials to update a single causal structure hypothesis. Structural inferences from local computations make minimal demands on memory, require relatively small amounts of data, and need not respect normative prescriptions as inferences that are principled locally may violate those principles when combined. Over a series of 3 experiments, the authors found (a) systematic inferences from small amounts of data; (b) systematic inference of extraneous causal links; (c) influence of data presentation order on inferences; and (d) error reduction through pretraining. Without pretraining, a model based on local computations fitted data better than a Bayesian structural inference model. The data suggest that local computations serve as a heuristic for learning causal structure. Copyright 2009 APA, all rights reserved.
Revised neutrino-gallium cross section and prospects of BEST in resolving the gallium anomaly
NASA Astrophysics Data System (ADS)
Barinov, Vladislav; Cleveland, Bruce; Gavrin, Vladimir; Gorbunov, Dmitry; Ibragimova, Tatiana
2018-04-01
An O(1) eV sterile neutrino can be responsible for a number of anomalous results of neutrino oscillation experiments. This hypothesis may be tested at short-baseline neutrino oscillation experiments, several of which are either ongoing or under construction. Here, we concentrate on the so-called gallium anomaly, found by the SAGE and GALLEX experiments, and its foreseeable future tests with the BEST experiment at the Baksan Neutrino Observatory. We start with a revision of the neutrino-gallium cross section that is performed by utilizing the recent measurements of the nuclear final state spectra. We accordingly correct the parameters of the gallium anomaly and refine the BEST prospects in testing it and searching for sterile neutrinos. We further develop the previously proposed idea to investigate the anomaly with a 65Zn artificial neutrino source as a next option available at BEST and estimate its sensitivity to the sterile neutrino model parameters following the Bayesian approach. We show that after the two stages of operation BEST will make a 5σ discovery of the sterile neutrinos, if they are behind the gallium anomaly.
Lim, Cherry; Wannapinij, Prapass; White, Lisa; Day, Nicholas P J; Cooper, Ben S; Peacock, Sharon J; Limmathurotsakul, Direk
2013-01-01
Estimates of the sensitivity and specificity for new diagnostic tests based on evaluation against a known gold standard are imprecise when the accuracy of the gold standard is imperfect. Bayesian latent class models (LCMs) can be helpful under these circumstances, but the necessary analysis requires expertise in computational programming. Here, we describe open-access web-based applications that allow non-experts to apply Bayesian LCMs to their own data sets via a user-friendly interface. Applications for Bayesian LCMs were constructed on a web server using R and WinBUGS programs. The models provided (http://mice.tropmedres.ac) include two Bayesian LCMs: the two-tests in two-population model (Hui and Walter model) and the three-tests in one-population model (Walter and Irwig model). Both models are available with simplified and advanced interfaces. In the former, all settings for Bayesian statistics are fixed as defaults. Users input their data set into a table provided on the webpage. Disease prevalence and accuracy of diagnostic tests are then estimated using the Bayesian LCM, and provided on the web page within a few minutes. With the advanced interfaces, experienced researchers can modify all settings in the models as needed. These settings include correlation among diagnostic test results and prior distributions for all unknown parameters. The web pages provide worked examples with both models using the original data sets presented by Hui and Walter in 1980, and by Walter and Irwig in 1988. We also illustrate the utility of the advanced interface using the Walter and Irwig model on a data set from a recent melioidosis study. The results obtained from the web-based applications were comparable to those published previously. The newly developed web-based applications are open-access and provide an important new resource for researchers worldwide to evaluate new diagnostic tests.
Ide, Kazuki; Kawasaki, Yohei; Akutagawa, Maiko; Yamada, Hiroshi
2017-02-01
The aim of this study is to analyze the data obtained from a randomized trial on the prevention of influenza by gargling with green tea, which gave nonsignificant results based on frequentist approaches, by using Bayesian approaches. The posterior proportion, with 95% credible interval (CrI), of influenza in each group was calculated. The Bayesian index θ is the probability that a hypothesis is true. In this case, θ is the probability that the hypothesis that green tea gargling reduced influenza compared with water gargling is true. Univariate and multivariate logistic regression analyses were also performed by using the Markov chain Monte Carlo method. The full analysis set included 747 participants. During the study period, influenza occurred in 44 participants (5.9%). The difference between the two independent binomial proportions was -0.019 (95% CrI, -0.054 to 0.015; θ = 0.87). The partial regression coefficients in the univariate analysis were -0.35 (95% CrI, -1.00 to 0.24) with use of a uniform prior and -0.34 (95% CrI, -0.96 to 0.27) with use of a Jeffreys prior. In the multivariate analysis, the values were -0.37 (95% CrI, -0.96 to 0.30) and -0.36 (95% CrI, -1.03 to 0.21), respectively. The difference between the two independent binomial proportions was less than 0, and θ was greater than 0.85. Therefore, green tea gargling may slightly reduce influenza compared with water gargling. This analysis suggests that green tea gargling can be an additional preventive measure for use with other pharmaceutical and nonpharmaceutical measures and indicates the need for additional studies to confirm the effect of green tea gargling.
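The kind of calculation this abstract describes (a posterior difference between two binomial proportions, its 95% credible interval, and the index θ) can be sketched with conjugate Beta posteriors. The sketch below is illustrative only and is not the authors' code: the per-group counts (19 of 384 and 25 of 363) are a hypothetical split chosen merely to add up to the 44 cases among 747 participants reported above, and the uniform Beta(1, 1) prior, the function name, and the number of draws are all assumptions.

```python
import random


def posterior_difference(x1, n1, x2, n2, a=1.0, b=1.0, draws=20000, seed=0):
    """Sample the posterior of p1 - p2 under independent Beta(a, b) priors.

    Returns the posterior mean difference, a 95% equal-tailed credible
    interval, and theta = P(p1 < p2 | data).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(draws):
        # Beta prior + binomial data gives a Beta posterior (conjugacy).
        p1 = rng.betavariate(a + x1, b + n1 - x1)
        p2 = rng.betavariate(a + x2, b + n2 - x2)
        diffs.append(p1 - p2)
    diffs.sort()
    mean = sum(diffs) / draws
    lower = diffs[int(0.025 * draws)]
    upper = diffs[int(0.975 * draws) - 1]
    theta = sum(d < 0 for d in diffs) / draws
    return mean, (lower, upper), theta


# Hypothetical group counts (assumption): treatment 19/384, control 25/363.
mean, (lower, upper), theta = posterior_difference(19, 384, 25, 363)
print(f"difference = {mean:.3f}, 95% CrI = ({lower:.3f}, {upper:.3f}), theta = {theta:.2f}")
```

Swapping `a=0.5, b=0.5` into the call would give the Jeffreys prior for a binomial proportion, which is one way to check how sensitive θ is to the prior.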
Natural selection promotes antigenic evolvability.
Graves, Christopher J; Ros, Vera I D; Stevenson, Brian; Sniegowski, Paul D; Brisson, Dustin
2013-01-01
The hypothesis that evolvability - the capacity to evolve by natural selection - is itself the object of natural selection is highly intriguing but remains controversial due in large part to a paucity of direct experimental evidence. The antigenic variation mechanisms of microbial pathogens provide an experimentally tractable system to test whether natural selection has favored mechanisms that increase evolvability. Many antigenic variation systems consist of paralogous unexpressed 'cassettes' that recombine into an expression site to rapidly alter the expressed protein. Importantly, the magnitude of antigenic change is a function of the genetic diversity among the unexpressed cassettes. Thus, evidence that selection favors among-cassette diversity is direct evidence that natural selection promotes antigenic evolvability. We used the Lyme disease bacterium, Borrelia burgdorferi, as a model to test the prediction that natural selection favors amino acid diversity among unexpressed vls cassettes and thereby promotes evolvability in a primary surface antigen, VlsE. The hypothesis that diversity among vls cassettes is favored by natural selection was supported in each B. burgdorferi strain analyzed using both classical (dN/dS ratios) and Bayesian population genetic analyses of genetic sequence data. This hypothesis was also supported by the conservation of highly mutable tandem-repeat structures across B. burgdorferi strains despite a near complete absence of sequence conservation. Diversification among vls cassettes due to natural selection and mutable repeat structures promotes long-term antigenic evolvability of VlsE. These findings provide a direct demonstration that molecular mechanisms that enhance evolvability of surface antigens are an evolutionary adaptation. The molecular evolutionary processes identified here can serve as a model for the evolution of antigenic evolvability in many pathogens which utilize similar strategies to establish chronic infections.
Natural Selection Promotes Antigenic Evolvability
Graves, Christopher J.; Ros, Vera I. D.; Stevenson, Brian; Sniegowski, Paul D.; Brisson, Dustin
2013-01-01
The hypothesis that evolvability - the capacity to evolve by natural selection - is itself the object of natural selection is highly intriguing but remains controversial due in large part to a paucity of direct experimental evidence. The antigenic variation mechanisms of microbial pathogens provide an experimentally tractable system to test whether natural selection has favored mechanisms that increase evolvability. Many antigenic variation systems consist of paralogous unexpressed ‘cassettes’ that recombine into an expression site to rapidly alter the expressed protein. Importantly, the magnitude of antigenic change is a function of the genetic diversity among the unexpressed cassettes. Thus, evidence that selection favors among-cassette diversity is direct evidence that natural selection promotes antigenic evolvability. We used the Lyme disease bacterium, Borrelia burgdorferi, as a model to test the prediction that natural selection favors amino acid diversity among unexpressed vls cassettes and thereby promotes evolvability in a primary surface antigen, VlsE. The hypothesis that diversity among vls cassettes is favored by natural selection was supported in each B. burgdorferi strain analyzed using both classical (dN/dS ratios) and Bayesian population genetic analyses of genetic sequence data. This hypothesis was also supported by the conservation of highly mutable tandem-repeat structures across B. burgdorferi strains despite a near complete absence of sequence conservation. Diversification among vls cassettes due to natural selection and mutable repeat structures promotes long-term antigenic evolvability of VlsE. These findings provide a direct demonstration that molecular mechanisms that enhance evolvability of surface antigens are an evolutionary adaptation. The molecular evolutionary processes identified here can serve as a model for the evolution of antigenic evolvability in many pathogens which utilize similar strategies to establish chronic infections. 
PMID:24244173
Estimation of Post-Test Probabilities by Residents: Bayesian Reasoning versus Heuristics?
ERIC Educational Resources Information Center
Hall, Stacey; Phang, Sen Han; Schaefer, Jeffrey P.; Ghali, William; Wright, Bruce; McLaughlin, Kevin
2014-01-01
Although the process of diagnosing invariably begins with a heuristic, we encourage our learners to support their diagnoses by analytical cognitive processes, such as Bayesian reasoning, in an attempt to mitigate the effects of heuristics on diagnosing. There are, however, limited data on the use ± impact of Bayesian reasoning on the accuracy of…
Is Bayesian Estimation Proper for Estimating the Individual's Ability? Research Report 80-3.
ERIC Educational Resources Information Center
Samejima, Fumiko
The effect of prior information in Bayesian estimation is considered, mainly from the standpoint of objective testing. In the estimation of a parameter belonging to an individual, the prior information is, in most cases, the density function of the population to which the individual belongs. Bayesian estimation was compared with maximum likelihood…
ERIC Educational Resources Information Center
Wu, Haiyan
2013-01-01
General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…
NASA Astrophysics Data System (ADS)
Verma, Sneha K.; Chun, Sophia; Liu, Brent J.
2014-03-01
Pain is a common complication after spinal cord injury, with prevalence estimates ranging from 77% to 81%, and it strongly affects a patient's lifestyle and well-being. In the current clinical setting, paper-based forms are used to classify pain; however, the accuracy of diagnosis and the optimal management of pain depend largely on an expert reviewer, who is often unavailable because there are very few experts in this field. The need for a clinical decision support system that can be used by expert and non-expert clinicians has been cited in the literature, but such a system has not been developed. We have designed and developed a stand-alone tool for classifying pain type in spinal cord injury (SCI) patients using Bayesian decision theory. Various machine learning simulation methods are used to verify the algorithm on a pilot study data set of 48 patients. The data set consists of the paper-based forms collected at the Long Beach VA clinic, with pain classification done by an expert in the field. Using WEKA as the machine learning tool, we tested on the 48-patient data set the hypothesis that the attributes collected on the forms and the pain location marked by patients have a very significant impact on pain type classification. This tool will be integrated with an imaging informatics system to support a clinical study that will test the effectiveness of using Proton Beam radiotherapy for treating spinal cord injury (SCI) related neuropathic pain as an alternative to invasive surgical lesioning.
Estimation of post-test probabilities by residents: Bayesian reasoning versus heuristics?
Hall, Stacey; Phang, Sen Han; Schaefer, Jeffrey P; Ghali, William; Wright, Bruce; McLaughlin, Kevin
2014-08-01
Although the process of diagnosing invariably begins with a heuristic, we encourage our learners to support their diagnoses by analytical cognitive processes, such as Bayesian reasoning, in an attempt to mitigate the effects of heuristics on diagnosing. There are, however, limited data on the use ± impact of Bayesian reasoning on the accuracy of disease probability estimates. In this study our objective was to explore whether Internal Medicine residents use a Bayesian process to estimate disease probabilities by comparing their disease probability estimates to literature-derived Bayesian post-test probabilities. We gave 35 Internal Medicine residents four clinical vignettes in the form of a referral letter and asked them to estimate the post-test probability of the target condition in each case. We then compared these to literature-derived probabilities. For each vignette the estimated probability was significantly different from the literature-derived probability. For the two cases with low literature-derived probability our participants significantly overestimated the probability of these target conditions being the correct diagnosis, whereas for the two cases with high literature-derived probability the estimated probability was significantly lower than the calculated value. Our results suggest that residents generate inaccurate post-test probability estimates. Possible explanations for this include ineffective application of Bayesian reasoning, attribute substitution whereby a complex cognitive task is replaced by an easier one (e.g., a heuristic), or systematic rater bias, such as central tendency bias. Further studies are needed to identify the reasons for inaccuracy of disease probability estimates and to explore ways of improving accuracy.
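For readers unfamiliar with the Bayesian post-test probability that the residents' estimates were compared against, the standard odds form of Bayes' theorem (post-test odds = pre-test odds × likelihood ratio) can be sketched as follows. The function name and the example numbers are hypothetical and are not taken from the study's vignettes.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity, positive=True):
    """Update a pre-test disease probability with a test result.

    Uses post-test odds = pre-test odds * likelihood ratio, where
    LR+ = sensitivity / (1 - specificity) for a positive result and
    LR- = (1 - sensitivity) / specificity for a negative result.
    """
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    if not (0.0 < sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in (0, 1] and specificity in (0, 1)")
    if positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical example: 10% pre-test probability, 90% sensitivity, 80% specificity.
print(round(post_test_probability(0.10, 0.90, 0.80, positive=True), 3))   # 0.333
print(round(post_test_probability(0.10, 0.90, 0.80, positive=False), 3))  # 0.014
```

With these illustrative inputs, a positive result raises a 10% pre-test probability to only about 33%, which is the kind of result that intuitive estimates tend to miss.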
Bayesian Model Selection in Geophysics: The evidence
NASA Astrophysics Data System (ADS)
Vrugt, J. A.
2016-12-01
Bayesian inference has found widespread application and use in science and engineering to reconcile Earth system models with data, including prediction in space (interpolation), prediction in time (forecasting), assimilation of observations and deterministic/stochastic model output, and inference of the model parameters. Per Bayes' theorem, the posterior probability, P(H|D), of a hypothesis, H, given the data, D, is equal to the product of its prior probability, P(H), and likelihood, L(H|D), divided by a normalization constant, P(D). In geophysics, the hypothesis, H, often constitutes a description (parameterization) of the subsurface for some entity of interest (e.g. porosity, moisture content). The normalization constant, P(D), is not required for inference of the subsurface structure, yet is of great value for model selection. Unfortunately, it is not particularly easy to estimate P(D) in practice. Here, I will introduce the various building blocks of a general-purpose method which provides robust and unbiased estimates of the evidence, P(D). This method uses multi-dimensional numerical integration of the posterior (parameter) distribution. I will then illustrate this new estimator by application to three competing subsurface models (hypotheses) using GPR travel time data from the South Oyster Bacterial Transport Site, in Virginia, USA. The three subsurface models differ in their treatment of the porosity distribution and use (a) horizontal layering with fixed layer thicknesses, (b) vertical layering with fixed layer thicknesses and (c) a multi-Gaussian field. The results of the new estimator are compared against the brute force Monte Carlo method, and the Laplace-Metropolis method.
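To make the role of the evidence P(D) concrete, the sketch below estimates it for a deliberately simple one-parameter toy problem in two ways: numerical integration on a grid and brute-force Monte Carlo averaging of the likelihood over prior draws. This is not the estimator introduced in the abstract; the binomial likelihood, the uniform prior, and all function names are assumptions chosen because the exact answer, 1/(n + 1), is known.

```python
import math
import random


def likelihood(p, k, n):
    """Binomial likelihood of k successes in n trials given success rate p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def evidence_grid(k, n, cells=10000):
    """Midpoint-rule integral of likelihood * prior, with a uniform prior on [0, 1]."""
    width = 1.0 / cells
    return sum(likelihood((i + 0.5) * width, k, n) for i in range(cells)) * width


def evidence_monte_carlo(k, n, draws=50000, seed=1):
    """Brute-force Monte Carlo: average the likelihood over draws from the prior."""
    rng = random.Random(seed)
    return sum(likelihood(rng.random(), k, n) for _ in range(draws)) / draws


k, n = 7, 10  # toy data (assumption)
print("exact      :", 1.0 / (n + 1))
print("grid       :", evidence_grid(k, n))
print("monte carlo:", evidence_monte_carlo(k, n))
```

In one dimension both approaches are cheap; the difficulty the abstract refers to arises because grid integration scales exponentially with the number of parameters, while prior sampling becomes inefficient when the posterior occupies a tiny fraction of the prior volume.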
On the use of Bayesian Monte-Carlo in evaluation of nuclear data
NASA Astrophysics Data System (ADS)
De Saint Jean, Cyrille; Archier, Pascal; Privas, Edwin; Noguere, Gilles
2017-09-01
As model parameters, necessary ingredients of theoretical models, are not always predicted by theory, a formal mathematical framework associated with the evaluation work is needed to obtain the best set of parameters (resonance parameters, optical models, fission barrier, average width, multigroup cross sections) with Bayesian statistical inference by comparing theory to experiment. The formal rule related to this methodology is to estimate the posterior probability density function of a set of parameters by solving an equation of the following type: pdf(posterior) ∝ pdf(prior) × a likelihood function. A fitting procedure can be seen as an estimation of the posterior probability density of a set of parameters (referred to as the vector x) knowing prior information on these parameters and a likelihood which gives the probability density function of observing a data set knowing x. To solve this problem, two major paths could be taken: add approximations and hypotheses and obtain an equation to be solved numerically (minimum of a cost function or generalized least squares method, referred to as GLS) or use Monte-Carlo sampling of all prior distributions and estimate the final posterior distribution. Monte Carlo methods are a natural solution for Bayesian inference problems. They avoid approximations (existing in traditional adjustment procedures based on chi-square minimization) and offer alternatives in the choice of probability density distribution for priors and likelihoods. This paper will propose the use of what we are calling Bayesian Monte Carlo (referred to as BMC in the rest of the manuscript) in the whole energy range from thermal, resonance and continuum range for all nuclear reaction models at these energies. Algorithms will be presented based on Monte-Carlo sampling and Markov chain. The objectives of BMC are to propose a reference calculation for validating the GLS calculations and approximations, to test the effects of the probability density distributions and to provide a framework for finding the global minimum if several local minima exist. Application to resolved resonance, unresolved resonance and continuum evaluation as well as multigroup cross section data assimilation will be presented.
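The idea of sampling the prior and weighting each sample by its likelihood, then checking the result against a least-squares style analytical update, can be sketched on a one-parameter toy problem. This is only an illustration of the general principle and not the authors' algorithm: the linear model y = x·t, the Gaussian prior and noise, the data values, and the function names are all assumptions. In this linear Gaussian case the analytical update is exact, so the two estimates should agree up to Monte Carlo error.

```python
import math
import random


def bmc_posterior_mean(t, y, sigma, prior_mean, prior_sd, draws=20000, seed=2):
    """Prior sampling with likelihood weights for the toy model y_i = x * t_i + noise."""
    rng = random.Random(seed)
    samples, log_weights = [], []
    for _ in range(draws):
        x = rng.gauss(prior_mean, prior_sd)  # draw the parameter from its prior
        chi2 = sum(((yi - x * ti) / sigma) ** 2 for ti, yi in zip(t, y))
        samples.append(x)
        log_weights.append(-0.5 * chi2)  # Gaussian log-likelihood up to a constant
    shift = max(log_weights)  # subtract the maximum to avoid underflow in exp
    weights = [math.exp(lw - shift) for lw in log_weights]
    return sum(w * x for w, x in zip(weights, samples)) / sum(weights)


def analytic_posterior_mean(t, y, sigma, prior_mean, prior_sd):
    """Closed-form Gaussian update (the least-squares answer for this linear model)."""
    precision = 1.0 / prior_sd**2 + sum(ti**2 for ti in t) / sigma**2
    numerator = prior_mean / prior_sd**2 + sum(ti * yi for ti, yi in zip(t, y)) / sigma**2
    return numerator / precision


# Toy data (assumption): three measurements with a common uncertainty of 0.5.
t, y = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
print("BMC     :", bmc_posterior_mean(t, y, 0.5, prior_mean=1.5, prior_sd=1.0))
print("analytic:", analytic_posterior_mean(t, y, 0.5, prior_mean=1.5, prior_sd=1.0))
```

For nonlinear models or non-Gaussian priors the closed form no longer holds, which is where a sampling-based reference calculation of this kind becomes useful for judging the approximations.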
Drummond, Christopher S; Eastwood, Ruth J; Miotto, Silvia T S; Hughes, Colin E
2012-05-01
Replicate radiations provide powerful comparative systems to address questions about the interplay between opportunity and innovation in driving episodes of diversification and the factors limiting their subsequent progression. However, such systems have been rarely documented at intercontinental scales. Here, we evaluate the hypothesis of multiple radiations in the genus Lupinus (Leguminosae), which exhibits some of the highest known rates of net diversification in plants. Given that incomplete taxon sampling, background extinction, and lineage-specific variation in diversification rates can confound macroevolutionary inferences regarding the timing and mechanisms of cladogenesis, we used Bayesian relaxed clock phylogenetic analyses as well as MEDUSA and BiSSE birth-death likelihood models of diversification, to evaluate the evolutionary patterns of lineage accumulation in Lupinus. We identified 3 significant shifts to increased rates of net diversification (r) relative to background levels in the genus (r = 0.18-0.48 lineages/myr). The primary shift occurred approximately 4.6 Ma (r = 0.48-1.76) in the montane regions of western North America, followed by a secondary shift approximately 2.7 Ma (r = 0.89-3.33) associated with range expansion and diversification of allopatrically distributed sister clades in the Mexican highlands and Andes. We also recovered evidence for a third independent shift approximately 6.5 Ma at the base of a lower elevation eastern South American grassland and campo rupestre clade (r = 0.36-1.33). Bayesian ancestral state reconstructions and BiSSE likelihood analyses of correlated diversification indicated that increased rates of speciation are strongly associated with the derived evolution of perennial life history and invasion of montane ecosystems. 
Although we currently lack hard evidence for "replicate adaptive radiations" in the sense of convergent morphological and ecological trajectories among species in different clades, these results are consistent with the hypothesis that iteroparity functioned as an adaptive key innovation, providing a mechanism for range expansion and rapid divergence in upper elevation regions across much of the New World.
Children Can Solve Bayesian Problems: The Role of Representation in Mental Computation
ERIC Educational Resources Information Center
Zhu, Liqi; Gigerenzer, Gerd
2006-01-01
Can children reason the Bayesian way? We argue that the answer to this question depends on how numbers are represented, because a representation can do part of the computation. We test, for the first time, whether Bayesian reasoning can be elicited in children by means of natural frequencies. We show that when information was presented to fourth,…
ERIC Educational Resources Information Center
Jenkins, Gavin W.; Samuelson, Larissa K.; Smith, Jodi R.; Spencer, John P.
2015-01-01
It is unclear how children learn labels for multiple overlapping categories such as "Labrador," "dog," and "animal." Xu and Tenenbaum (2007a) suggested that learners infer correct meanings with the help of Bayesian inference. They instantiated these claims in a Bayesian model, which they tested with preschoolers and…
With or without you: predictive coding and Bayesian inference in the brain
Aitchison, Laurence; Lengyel, Máté
2018-01-01
Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic / representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly. PMID:28942084
Sa-Ngamuang, Chaitawat; Haddawy, Peter; Luvira, Viravarn; Piyaphanee, Watcharapong; Iamsirithaworn, Sopon; Lawpoolsri, Saranath
2018-06-18
Differentiating dengue patients from other acute febrile illness patients is a great challenge among physicians. Several dengue diagnosis methods are recommended by WHO. The application of specific laboratory tests is still limited due to high cost, lack of equipment, and uncertain validity. Therefore, clinical diagnosis remains a common practice especially in resource limited settings. Bayesian networks have been shown to be a useful tool for diagnostic decision support. This study aimed to construct Bayesian network models using basic demographic, clinical, and laboratory profiles of acute febrile illness patients to diagnose dengue. Data of 397 acute undifferentiated febrile illness patients who visited the fever clinic of the Bangkok Hospital for Tropical Diseases, Thailand, were used for model construction and validation. The two best final models were selected: one with and one without NS1 rapid test result. The diagnostic accuracy of the models was compared with that of physicians on the same set of patients. The Bayesian network models provided good diagnostic accuracy of dengue infection, with ROC AUC of 0.80 and 0.75 for models with and without NS1 rapid test result, respectively. The models had approximately 80% specificity and 70% sensitivity, similar to the diagnostic accuracy of the hospital's fellows in infectious disease. Including information on NS1 rapid test improved the specificity, but reduced the sensitivity, both in model and physician diagnoses. The Bayesian network model developed in this study could be useful to assist physicians in diagnosing dengue, particularly in regions where experienced physicians and laboratory confirmation tests are limited.
Sequential Probability Ratio Test for Collision Avoidance Maneuver Decisions
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis
2010-01-01
When facing a conjunction between space objects, decision makers must choose whether to maneuver for collision avoidance or not. We apply a well-known decision procedure, the sequential probability ratio test, to this problem. We propose two approaches to the problem solution, one based on a frequentist method, and the other on a Bayesian method. The frequentist method does not require any prior knowledge concerning the conjunction, while the Bayesian method assumes knowledge of prior probability densities. Our results show that both methods achieve desired missed detection rates, but the frequentist method's false alarm performance is inferior to the Bayesian method's.
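As a rough illustration of the sequential probability ratio test named in this abstract, the sketch below implements Wald's classical (frequentist) SPRT for a Gaussian mean with known variance. The function name `sprt_gaussian`, the two hypotheses, and the error rates are assumptions chosen for illustration; they are not taken from the report, which applies the test to conjunction data.

```python
import math

def sprt_gaussian(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Returns (decision, n_used), where decision is "accept_H1", "accept_H0",
    or "continue" if the data ran out before either threshold was crossed.
    alpha is the target false alarm rate, beta the target missed detection rate.
    """
    upper = math.log((1 - beta) / alpha)   # crossing above favours H1
    lower = math.log(beta / (1 - alpha))   # crossing below favours H0
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio contribution of one Gaussian observation.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        if llr >= upper:
            return "accept_H1", n
        if llr <= lower:
            return "accept_H0", n
    return "continue", n
```

A Bayesian variant would start the running statistic from the log prior odds rather than zero, which is one way the prior probability densities mentioned above can enter the decision.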
NASA Astrophysics Data System (ADS)
Harken, B.; Geiges, A.; Rubin, Y.
2013-12-01
There are several stages in any hydrological modeling campaign, including: formulation and analysis of a priori information, data acquisition through field campaigns, inverse modeling, and forward modeling and prediction of some environmental performance metric (EPM). The EPM being predicted could be, for example, contaminant concentration, plume travel time, or aquifer recharge rate. These predictions often have significant bearing on some decision that must be made. Examples include: how to allocate limited remediation resources between multiple contaminated groundwater sites, where to place a waste repository site, and what extraction rates can be considered sustainable in an aquifer. Providing an answer to these questions depends on predictions of EPMs using forward models as well as levels of uncertainty related to these predictions. Uncertainty in model parameters, such as hydraulic conductivity, leads to uncertainty in EPM predictions. Often, field campaigns and inverse modeling efforts are planned and undertaken with reduction of parametric uncertainty as the objective. The tool of hypothesis testing allows this to be taken one step further by considering uncertainty reduction in the ultimate prediction of the EPM as the objective and gives a rational basis for weighing costs and benefits at each stage. When using the tool of statistical hypothesis testing, the EPM is cast into a binary outcome. This is formulated as null and alternative hypotheses, which can be accepted and rejected with statistical formality. When accounting for all sources of uncertainty at each stage, the level of significance of this test provides a rational basis for planning, optimization, and evaluation of the entire campaign. Case-specific information, such as consequences of prediction error and site-specific costs, can be used in establishing selection criteria based on what level of risk is deemed acceptable.
This framework is demonstrated and discussed using various synthetic case studies. The case studies involve contaminated aquifers where a decision must be made based on prediction of when a contaminant will arrive at a given location. The EPM, in this case contaminant travel time, is cast into the hypothesis testing framework. The null hypothesis states that the contaminant plume will arrive at the specified location before a critical value of time passes, and the alternative hypothesis states that the plume will arrive after the critical time passes. Different field campaigns are analyzed based on effectiveness in reducing the probability of selecting the wrong hypothesis, which in this case corresponds to reducing uncertainty in the prediction of plume arrival time. To examine the role of inverse modeling in this framework, case studies involving both Maximum Likelihood parameter estimation and Bayesian inversion are used.
Sequential structural damage diagnosis algorithm using a change point detection method
NASA Astrophysics Data System (ADS)
Noh, H.; Rajagopal, R.; Kiremidjian, A. S.
2013-11-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method. The general change point detection method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori, unless we are looking for a known specific type of damage. Therefore, we introduce an additional algorithm that estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using a set of experimental data collected from a four-story steel special moment-resisting frame and multiple sets of simulated data. Various features of different dimensions have been explored, and the algorithm was able to identify damage, particularly when it uses multidimensional damage sensitive features and lower false alarm rates, with a known post-damage feature distribution. For unknown feature distribution cases, the post-damage distribution was consistently estimated and the detection delays were only a few time steps longer than the delays from the general method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
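To make the idea of a sequential change point test with known pre- and post-damage distributions more concrete, here is a minimal sketch of a CUSUM-style detector for a shift in a Gaussian mean. CUSUM is a standard scheme swapped in for illustration; the abstract does not say which statistic the authors use, and the function name `cusum_detect`, the scalar Gaussian feature model, and the threshold are all assumptions.

```python
def cusum_detect(samples, mu_pre, mu_post, sigma, threshold):
    """Return the 1-based sample index at which a change is declared, or None.

    Assumes a scalar feature that is Gaussian with mean mu_pre before damage
    and mean mu_post after damage, with common known standard deviation sigma.
    """
    stat = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio of post-damage versus pre-damage for this sample.
        llr = (mu_post - mu_pre) * (x - (mu_pre + mu_post) / 2) / sigma**2
        # Clamp at zero so evidence against a change does not accumulate
        # and delay detection once damage actually occurs.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return n
    return None

# Hypothetical feature sequence: five pre-damage samples, then a shift.
alarm_index = cusum_detect([0.0] * 5 + [1.0] * 10, 0.0, 1.0, 1.0, threshold=2.0)
```

Raising `threshold` lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the abstract refers to.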
NASA Astrophysics Data System (ADS)
Noh, Hae Young; Rajagopal, Ram; Kiremidjian, Anne S.
2012-04-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method for the cases where the post-damage feature distribution is unknown a priori. This algorithm extracts features from structural vibration data using time-series analysis and then declares damage using the change point detection method. The change point detection method asymptotically minimizes detection delay for a given false alarm rate. The conventional method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori. Therefore, our algorithm estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using multiple sets of simulated data and a set of experimental data collected from a four-story steel special moment-resisting frame. Our algorithm was able to estimate the post-damage distribution consistently and resulted in detection delays only a few seconds longer than the delays from the conventional method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
Paleogene Radiation of a Plant Pathogenic Mushroom
Coetzee, Martin P. A.; Bloomer, Paulette; Wingfield, Michael J.; Wingfield, Brenda D.
2011-01-01
Background The global movement and speciation of fungal plant pathogens is important, especially because of the economic losses they cause and the ease with which they are able to spread across large areas. Understanding the biogeography and origin of these plant pathogens can provide insights regarding their dispersal and current day distribution. We tested the hypothesis of a Gondwanan origin of the plant pathogenic mushroom genus Armillaria and the currently accepted premise that vicariance accounts for the extant distribution of the species. Methods The phylogeny of a selection of Armillaria species was reconstructed based on Maximum Parsimony (MP), Maximum Likelihood (ML) and Bayesian Inference (BI). A timeline was then placed on the divergence of lineages using a Bayesian relaxed molecular clock approach. Results Phylogenetic analyses of sequenced data for three combined nuclear regions provided strong support for three major geographically defined clades: Holarctic, South American-Australasian and African. Molecular dating placed the initial radiation of the genus at 54 million years ago within the Early Paleogene, postdating the tectonic break-up of Gondwana. Conclusions The distribution of extant Armillaria species is the result of ancient long-distance dispersal rather than vicariance due to continental drift. As these findings are contrary to most prior vicariance hypotheses for fungi, our results highlight the important role of long-distance dispersal in the radiation of fungal pathogens from the Southern Hemisphere. PMID:22216099
Combining statistical inference and decisions in ecology.
Williams, Perry J; Hooten, Mevin B
2016-09-01
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods, including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem. © 2016 by the Ecological Society of America.
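The role of the loss function in Bayesian point estimation can be sketched numerically: the estimate is whichever candidate minimizes posterior expected loss, so different losses give different answers from the same posterior. The code below is a minimal illustration only; the posterior draws, the candidate grid, and the helper names `expected_loss` and `bayes_estimate` are hypothetical and do not come from the paper.

```python
def expected_loss(estimate, posterior_draws, loss):
    """Monte Carlo estimate of the posterior expected loss of one candidate."""
    return sum(loss(estimate, theta) for theta in posterior_draws) / len(posterior_draws)

def bayes_estimate(candidates, posterior_draws, loss):
    """Return the candidate that minimizes posterior expected loss."""
    return min(candidates, key=lambda a: expected_loss(a, posterior_draws, loss))

def squared_error(a, theta):
    return (a - theta) ** 2

def absolute_error(a, theta):
    return abs(a - theta)

# Hypothetical posterior draws (deliberately skewed) and a grid of candidates.
draws = [1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 15.0]
grid = [i / 10 for i in range(0, 201)]

best_sq = bayes_estimate(grid, draws, squared_error)    # lands on the posterior mean
best_abs = bayes_estimate(grid, draws, absolute_error)  # lands on the posterior median
```

With these draws the squared-error estimate is pulled toward the large value 15.0, while the absolute-error estimate is not, which shows why the choice of loss should reflect the real consequences of over- and under-estimation.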
Bayesian networks for satellite payload testing
NASA Astrophysics Data System (ADS)
Przytula, Krzysztof W.; Hagen, Frank; Yung, Kar
1999-11-01
Satellite payloads are fast increasing in complexity, resulting in commensurate growth in cost of manufacturing and operation. A need exists for a software tool that would assist engineers in production and operation of satellite systems. We have designed and implemented a software tool that performs part of this task. The tool aids a test engineer in debugging satellite payloads during system testing. At this stage of satellite integration and testing both the tested payload and the testing equipment represent complicated systems consisting of a very large number of components and devices. When an error is detected during execution of a test procedure, the tool presents to the engineer a ranked list of potential sources of the error and a list of recommended further tests. On this basis the engineer decides whether to perform some of the recommended additional tests or replace the suspect component. The tool has been installed in a payload testing facility. The tool is based on Bayesian networks, a graphical method of representing uncertainty in terms of probabilistic influences. The Bayesian network was configured using detailed flow diagrams of testing procedures and block diagrams of the payload and testing hardware. The conditional and prior probability values were initially obtained from experts and refined in later stages of design. The Bayesian network provided a very informative model of the payload and testing equipment and inspired many new ideas regarding the future test procedures and testing equipment configurations. The tool is the first step in developing a family of tools for various phases of satellite integration and operation.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2016-10-04
A novel method for compound identification in liquid chromatography-high resolution mass spectrometry (LC-HRMS) is proposed. The method, based on Bayesian statistics, accommodates all possible uncertainties involved, from instrumentation up to data analysis, in a single model yielding the probability of the compound of interest being present/absent in the sample. This approach differs from the classical methods in two ways. First, it is probabilistic (instead of deterministic); hence, it computes the probability that the compound is (or is not) present in a sample. Second, it addresses the hypothesis "the compound is present", as opposed to answering the question "the compound feature is present". This second difference implies a shift in the way data analysis is tackled, since the probability of interfering compounds (i.e., isomers and isobaric compounds) is also taken into account.
A Bayesian Method for Evaluating Passing Scores: The PPoP Curve
ERIC Educational Resources Information Center
Wainer, Howard; Wang, X. A.; Skorupski, William P.; Bradlow, Eric T.
2005-01-01
In this note, we demonstrate an interesting use of the posterior distributions (and corresponding posterior samples of proficiency) that are yielded by fitting a fully Bayesian test scoring model to a complex assessment. Specifically, we examine the efficacy of the test in combination with the specific passing score that was chosen through expert…
Bayesian Ideal Types: Integration of Psychometric Data for Visually Impaired Persons.
ERIC Educational Resources Information Center
Jones, W. P.
1991-01-01
A model is proposed for the clinical synthesis of data from psychological tests of persons with visual impairments. The model integrates the concepts of the ideal type and Bayesian probability and compares actual test scores with ideal scores through use of a pattern similarity coefficient. A pilot study with Business Enterprise Program operators…
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Tenenbaum, Joshua B.
2011-01-01
Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…
A close examination of double filtering with fold change and t test in microarray analysis
2009-01-01
Background Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results This paper, for the first time to our knowledge, provides theoretical insight on the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while t statistic assumes gene-specific variances. The two statistics are based on contradicting assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion We demonstrate through hypothesis testing theory, simulation studies and real data examples, that well constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure. PMID:19995439
A Rapid Item-Search Procedure for Bayesian Adaptive Testing.
1977-05-01
properties of the procedure, they might well introduce undesirable psychological effects on test scores (e.g., Betz & Weiss, 1976a, 1976b... ...ge of results and adaptive ability test... (Research Rep. 76-4). Minneapolis: University of Minnesota, Department of Psychology, Psychometric... A RAPID ITEM-SEARCH PROCEDURE FOR BAYESIAN ADAPTIVE TESTING. C. David Vale and David J. Weiss. RESEARCH REPORT 77-n
Chapinal, Núria; Schumaker, Brant A; Joly, Damien O; Elkin, Brett T; Stephen, Craig
2015-07-01
We estimated the sensitivity and specificity of the caudal-fold skin test (CFT), the fluorescent polarization assay (FPA), and the rapid lateral-flow test (RT) for the detection of Mycobacterium bovis in free-ranging wild wood bison (Bison bison athabascae), in the absence of a gold standard, by using Bayesian analysis, and then used those estimates to forecast the performance of a pairwise combination of tests in parallel. In 1998-99, 212 wood bison from Wood Buffalo National Park (Canada) were tested for M. bovis infection using CFT and two serologic tests (FPA and RT). The sensitivity and specificity of each test were estimated using a three-test, one-population, Bayesian model allowing for conditional dependence between FPA and RT. The sensitivity and specificity of the combination of CFT and each serologic test in parallel were calculated assuming conditional independence. The test performance estimates were influenced by the prior values chosen. However, the rank of tests and combinations of tests based on those estimates remained constant. The CFT was the most sensitive test and the FPA was the least sensitive, whereas RT was the most specific test and CFT was the least specific. In conclusion, given the fact that gold standards for the detection of M. bovis are imperfect and difficult to obtain in the field, Bayesian analysis holds promise as a tool to rank tests and combinations of tests based on their performance. Combining a skin test with an animal-side serologic test, such as RT, increases sensitivity in the detection of M. bovis and is a good approach to enhance disease eradication or control in wild bison.
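The arithmetic behind combining two tests under conditional independence can be sketched as follows. In a parallel reading an animal is called positive if either test is positive, which raises sensitivity and lowers specificity; a series reading (positive only if both are positive) does the opposite. The function names and the accuracy values below are hypothetical and are not the estimates reported in the study.

```python
def parallel_combination(se1, sp1, se2, sp2):
    """Two tests read in parallel: positive if either test is positive.

    Assumes the tests are conditionally independent given true infection status.
    """
    sensitivity = 1 - (1 - se1) * (1 - se2)  # missed only if both tests miss
    specificity = sp1 * sp2                  # negative only if both are negative
    return sensitivity, specificity

def series_combination(se1, sp1, se2, sp2):
    """Two tests read in series: positive only if both tests are positive."""
    sensitivity = se1 * se2
    specificity = 1 - (1 - sp1) * (1 - sp2)
    return sensitivity, specificity

# Hypothetical accuracies for a skin test and a serologic test.
se_par, sp_par = parallel_combination(0.80, 0.90, 0.70, 0.95)
se_ser, sp_ser = series_combination(0.80, 0.90, 0.70, 0.95)
```

If the two tests are in fact conditionally dependent, as the study allows for FPA and RT, these formulas overstate the gain from combining them, so they should be read as an upper-bound style approximation.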
Stepwise and stagewise approaches for spatial cluster detection
Xu, Jiale
2016-01-01
Spatial cluster detection is an important tool in many areas such as sociology, botany and public health. Previous work has mostly taken either a hypothesis testing framework or a Bayesian framework. In this paper, we propose a few approaches under a frequentist variable selection framework for spatial cluster detection. The forward stepwise methods search for multiple clusters by iteratively adding the currently most likely cluster while adjusting for the effects of previously identified clusters. The stagewise methods also consist of a series of steps, but with a tiny step size in each iteration. We study the features and performances of our proposed methods using simulations on idealized grids or real geographic areas. From the simulations, we compare the performance of the proposed methods in terms of estimation accuracy and power of detection. These methods are applied to the well-known New York leukemia data as well as Indiana poverty data. PMID:27246273
Stepwise and stagewise approaches for spatial cluster detection.
Xu, Jiale; Gangnon, Ronald E
2016-05-01
Spatial cluster detection is an important tool in many areas such as sociology, botany and public health. Previous work has mostly taken either a hypothesis testing framework or a Bayesian framework. In this paper, we propose a few approaches under a frequentist variable selection framework for spatial cluster detection. The forward stepwise methods search for multiple clusters by iteratively adding the currently most likely cluster while adjusting for the effects of previously identified clusters. The stagewise methods also consist of a series of steps, but with a tiny step size in each iteration. We study the features and performances of our proposed methods using simulations on idealized grids or real geographic areas. From the simulations, we compare the performance of the proposed methods in terms of estimation accuracy and power. These methods are applied to the well-known New York leukemia data as well as Indiana poverty data. Copyright © 2016 Elsevier Ltd. All rights reserved.
Things we still haven't learned (so far).
Ivarsson, Andreas; Andersen, Mark B; Stenling, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Null hypothesis significance testing (NHST) is like an immortal horse that some researchers have been trying to beat to death for over 50 years but without any success. In this article we discuss the flaws in NHST, the historical background in relation to both Fisher's and Neyman and Pearson's statistical ideas, the common misunderstandings of what p < .05 actually means, and the 2010 APA publication manual's clear, but most often ignored, instructions to report effect sizes and to interpret what they all mean in the real world. In addition, we discuss how Bayesian statistics can be used to overcome some of the problems with NHST. We then analyze quantitative articles published over the past three years (2012-2014) in two top-rated sport and exercise psychology journals to determine whether we have learned what we should have learned decades ago about our use and meaningful interpretations of statistics.
A Modular Mind? A Test Using Individual Data from Seven Primate Species
Amici, Federica; Barney, Bradley; Johnson, Valen E.; Call, Josep; Aureli, Filippo
2012-01-01
It has long been debated whether the mind consists of specialized and independently evolving modules, or whether and to what extent a general factor accounts for the variance in performance across different cognitive domains. In this study, we used a hierarchical Bayesian model to re-analyse individual level data collected on seven primate species (chimpanzees, bonobos, orangutans, gorillas, spider monkeys, brown capuchin monkeys and long-tailed macaques) across 17 tasks within four domains (inhibition, memory, transposition and support). Our modelling approach evidenced the existence of both a domain-specific factor and a species factor, each accounting for the same amount (17%) of the observed variance. In contrast, inter-individual differences played a minimal role. These results support the hypothesis that the mind of primates is (at least partially) modular, with domain-specific cognitive skills undergoing different evolutionary pressures in different species in response to specific ecological and social demands. PMID:23284816
Inference in the age of big data: Future perspectives on neuroscience.
Bzdok, Danilo; Yeo, B T Thomas
2017-07-15
Neuroscience is undergoing faster changes than ever before. Over 100 years our field qualitatively described and invasively manipulated single or few organisms to gain anatomical, physiological, and pharmacological insights. In the last 10 years neuroscience spawned quantitative datasets of unprecedented breadth (e.g., microanatomy, synaptic connections, and optogenetic brain-behavior assays) and size (e.g., cognition, brain imaging, and genetics). While growing data availability and information granularity have been amply discussed, we direct attention to a less explored question: How will the unprecedented data richness shape data analysis practices? Statistical reasoning is becoming more important to distill neurobiological knowledge from healthy and pathological brain measurements. We argue that large-scale data analysis will use more statistical models that are non-parametric, generative, and mixing frequentist and Bayesian aspects, while supplementing classical hypothesis testing with out-of-sample predictions. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
The idiosyncratic nature of confidence
Navajas, Joaquin; Hindocha, Chandni; Foda, Hebah; Keramati, Mehdi; Latham, Peter E; Bahrami, Bahador
2017-01-01
Confidence is the ‘feeling of knowing’ that accompanies decision making. Bayesian theory proposes that confidence is a function solely of the perceived probability of being correct. Empirical research has suggested, however, that different individuals may perform different computations to estimate confidence from uncertain evidence. To test this hypothesis, we collected confidence reports in a task where subjects made categorical decisions about the mean of a sequence. We found that for most individuals, confidence did indeed reflect the perceived probability of being correct. However, in approximately half of them, confidence also reflected a different probabilistic quantity: the perceived uncertainty in the estimated variable. We found that the contribution of both quantities was stable over weeks. We also observed that the influence of the perceived probability of being correct was stable across two tasks, one perceptual and one cognitive. Overall, our findings provide a computational interpretation of individual differences in human confidence. PMID:29152591
OTD Observations of Continental US Ground and Cloud Flashes
NASA Technical Reports Server (NTRS)
Koshak, William
2007-01-01
Lightning optical flash parameters (e.g., radiance, area, duration, number of optical groups, and number of optical events) derived from almost five years of Optical Transient Detector (OTD) data are analyzed. Hundreds of thousands of OTD flashes occurring over the continental US are categorized according to flash type (ground or cloud flash) using US National Lightning Detection Network™ (NLDN) data. The statistics of the optical characteristics of the ground and cloud flashes are inter-compared on an overall basis, and as a function of ground flash polarity. A standard two-distribution hypothesis test is used to inter-compare the population means of a given lightning parameter for the two flash types. Given the differences in the statistics of the optical characteristics, it is suggested that statistical analyses (e.g., Bayesian Inference) of the space-based optical measurements might make it possible to successfully discriminate ground and cloud flashes a reasonable percentage of the time.
Navarrete, Gorka; Correia, Rut; Sirota, Miroslav; Juanchich, Marie; Huepe, David
2015-01-01
Most of the research on Bayesian reasoning aims to answer theoretical questions about the extent to which people are able to update their beliefs according to Bayes' Theorem, about the evolutionary nature of Bayesian inference, or about the role of cognitive abilities in Bayesian inference. Few studies aim to answer practical, mainly health-related questions, such as, “What does it mean to have a positive test in a context of cancer screening?” or “What is the best way to communicate a medical test result so a patient will understand it?”. This type of research aims to translate empirical findings into effective ways of providing risk information. In addition, the applied research often adopts the paradigms and methods of the theoretically-motivated research. But sometimes it works the other way around, and the theoretical research borrows the importance of the practical question in the medical context. The study of Bayesian reasoning is relevant to risk communication in that, to be as useful as possible, applied research should employ specifically tailored methods and contexts specific to the recipients of the risk information. In this paper, we concentrate on the communication of the result of medical tests and outline the epidemiological and test parameters that affect the predictive power of a test—whether it is correct or not. Building on this, we draw up recommendations for better practice to convey the results of medical tests that could inform health policy makers (What are the drawbacks of mass screenings?), be used by health practitioners and, in turn, help patients to make better and more informed decisions. PMID:26441711
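How prevalence, sensitivity, and specificity together determine the predictive power of a test follows directly from Bayes' theorem, and a short worked example may help. The function name and the screening numbers below are hypothetical and are not taken from the paper.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a binary test via Bayes' theorem.

    PPV is P(disease | positive test); NPV is P(no disease | negative test).
    Degenerate inputs that make a denominator zero are not handled here.
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    false_neg = prevalence * (1 - sensitivity)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical screening scenario: 1% prevalence, 90% sensitivity, 91% specificity.
ppv, npv = predictive_values(prevalence=0.01, sensitivity=0.90, specificity=0.91)
```

In this scenario the PPV is only about 9%: of every 1,000 people screened, roughly 9 of the 10 with the disease test positive, but so do about 89 of the 990 without it. Stating the result in such natural frequencies is one way to communicate what a positive test means.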
Numerical study on the sequential Bayesian approach for radioactive materials detection
NASA Astrophysics Data System (ADS)
Qingpei, Xiang; Dongfeng, Tian; Jianyu, Zhu; Fanhua, Hao; Ge, Ding; Jun, Zeng
2013-01-01
A new detection method, based on the sequential Bayesian approach proposed by Candy et al., offers new horizons for research on radioactive materials detection. Compared with the commonly adopted detection methods based on statistical theory, the sequential Bayesian approach offers the advantage of shorter verification time when analyzing spectra that contain low total counts, especially for complex radionuclide compositions. In this paper, a simulation experiment platform implementing the sequential Bayesian approach was developed. Event sequences of γ-rays associated with the true parameters of a LaBr3(Ce) detector were obtained from an event sequence generator based on Monte Carlo sampling, in order to study the performance of the sequential Bayesian approach. The numerical experimental results are in accordance with those of Candy. Moreover, the relationship between the detection model and the event generator, respectively represented by the expected detection rate (Am) and the tested detection rate (Gm) parameters, is investigated. To achieve an optimal performance for this processor, the interval of the tested detection rate as a function of the expected detection rate is also presented.
Bayesian Estimation of Combined Accuracy for Tests with Verification Bias
Broemeling, Lyle D.
2011-01-01
This presentation will emphasize the estimation of the combined accuracy of two or more tests when verification bias is present. Verification bias occurs when some of the subjects are not subject to the gold standard. The approach is Bayesian where the estimation of test accuracy is based on the posterior distribution of the relevant parameter. Accuracy of two combined binary tests is estimated employing either “believe the positive” or “believe the negative” rule, then the true and false positive fractions for each rule are computed for two tests. In order to perform the analysis, the missing at random assumption is imposed, and an interesting example is provided by estimating the combined accuracy of CT and MRI to diagnose lung cancer. The Bayesian approach is extended to two ordinal tests when verification bias is present, and the accuracy of the combined tests is based on the ROC area of the risk function. An example involving mammography with two readers with extreme verification bias illustrates the estimation of the combined test accuracy for ordinal tests. PMID:26859487
Bayesian model checking: A comparison of tests
NASA Astrophysics Data System (ADS)
Lucy, L. B.
2018-06-01
Two procedures for checking Bayesian models are compared using a simple test problem based on the local Hubble expansion. Over four orders of magnitude, p-values derived from a global goodness-of-fit criterion for posterior probability density functions agree closely with posterior predictive p-values. The former can therefore serve as an effective proxy for the difficult-to-calculate posterior predictive p-values.
Classical and Bayesian Seismic Yield Estimation: The 1998 Indian and Pakistani Tests
NASA Astrophysics Data System (ADS)
Shumway, R. H.
2001-10-01
The nuclear tests in May, 1998, in India and Pakistan have stimulated a renewed interest in yield estimation, based on limited data from uncalibrated test sites. We study here the problem of estimating yields using classical and Bayesian methods developed by Shumway (1992), utilizing calibration data from the Semipalatinsk test site and measured magnitudes for the 1998 Indian and Pakistani tests given by Murphy (1998). Calibration is done using multivariate classical or Bayesian linear regression, depending on the availability of measured magnitude-yield data and prior information. Confidence intervals for the classical approach are derived applying an extension of Fieller's method suggested by Brown (1982). In the case where prior information is available, the posterior predictive magnitude densities are inverted to give posterior intervals for yield. Intervals obtained using the joint distribution of magnitudes are comparable to the single-magnitude estimates produced by Murphy (1998) and reinforce the conclusion that the announced yields of the Indian and Pakistani tests were too high.
Probabilistic objective functions for sensor management
NASA Astrophysics Data System (ADS)
Mahler, Ronald P. S.; Zajic, Tim R.
2004-08-01
This paper continues the investigation of a foundational and yet potentially practical basis for control-theoretic sensor management, using a comprehensive, intuitive, system-level Bayesian paradigm based on finite-set statistics (FISST). In this paper we report our most recent progress, focusing on multistep look-ahead -- i.e., allocation of sensor resources throughout an entire future time-window. We determine future sensor states in the time-window using a "probabilistically natural" sensor management objective function, the posterior expected number of targets (PENT). This objective function is constructed using a new "maxi-PIMS" optimization strategy that hedges against unknowable future observation-collections. PENT is used in conjunction with approximate multitarget filters: the probability hypothesis density (PHD) filter or the multi-hypothesis correlator (MHC) filter.
Association with humans and seasonality interact to reverse predictions for animal space use.
Laver, Peter N; Alexander, Kathleen A
2018-01-01
Variation in animal space use reflects fitness trade-offs associated with ecological constraints. Associated theories such as the metabolic theory of ecology and the resource dispersion hypothesis generate predictions about what drives variation in animal space use. However, metabolic theory is usually tested in macro-ecological studies and is seldom invoked explicitly in within-species studies. Full evaluation of the resource dispersion hypothesis requires testing in more species. Neither has been evaluated in the context of anthropogenic landscape change. In this study, we used data for banded mongooses (Mungos mungo) in northeastern Botswana, along a gradient of association with humans, to test for effects of space use drivers predicted by these theories. We used Bayesian parameter estimation and inference from linear models to test for seasonal differences in space use metrics and to model seasonal effects of space use drivers. Results suggest that space use is strongly associated with variation in the level of overlap that mongoose groups have with humans. Seasonality influences this association, reversing seasonal space use predictions historically accepted by ecologists. We found support for predictions of the metabolic theory when moderated by seasonality, by association with humans and by their interaction. Space use of mongooses living in association with humans was more concentrated in the dry season than the wet season, when historically accepted ecological theory predicted more dispersed space use. Resource richness factors such as building density were associated with space use only during the dry season. We found negligible support for predictions of the resource dispersion hypothesis in general or for metabolic theory where seasonality and association with humans were not included. For mongooses living in association with humans, space use was not associated with patch dispersion or group size over both seasons.
In our study, living in association with humans influenced space use patterns that diverged from historically-accepted predictions. There is growing need to explicitly incorporate human-animal interactions into ecological theory and research. Our results and methodology may contribute to understanding effects of anthropogenic landscape change on wildlife populations.
Model-Selection Theory: The Need for a More Nuanced Picture of Use-Novelty and Double-Counting
Steele, Katie; Werndl, Charlotte
2018-01-01
This article argues that common intuitions regarding (a) the specialness of ‘use-novel’ data for confirmation and (b) that this specialness implies the ‘no-double-counting rule’, which says that data used in ‘constructing’ (calibrating) a model cannot also play a role in confirming the model’s predictions, are too crude. The intuitions in question are pertinent in all the sciences, but we appeal to a climate science case study to illustrate what is at stake. Our strategy is to analyse the intuitive claims in light of prominent accounts of confirmation of model predictions. We show that on the Bayesian account of confirmation, and also on the standard classical hypothesis-testing account, claims (a) and (b) are not generally true; but for some select cases, it is possible to distinguish data used for calibration from use-novel data, where only the latter confirm. The more specialized classical model-selection methods, on the other hand, uphold a nuanced version of claim (a), but this comes apart from (b), which must be rejected in favour of a more refined account of the relationship between calibration and confirmation. Thus, depending on the framework of confirmation, either the scope or the simplicity of the intuitive position must be revised. 1 Introduction 2 A Climate Case Study 3 The Bayesian Method vis-à-vis Intuitions 4 Classical Tests vis-à-vis Intuitions 5 Classical Model-Selection Methods vis-à-vis Intuitions 5.1 Introducing classical model-selection methods 5.2 Two cases 6 Re-examining Our Case Study 7 Conclusion PMID:29780170
2012-01-01
Background: The majority of Haemosporida species infect birds or reptiles, but many important genera, including Plasmodium, infect mammals. Dipteran vectors shared by avian, reptilian and mammalian Haemosporida suggest multiple invasions of Mammalia during haemosporidian evolution; yet, phylogenetic analyses have detected only a single invasion event. Until now, several important mammal-infecting genera have been absent in these analyses. This study focuses on the evolutionary origin of Polychromophilus, a unique malaria genus that only infects bats (Microchiroptera) and is transmitted by bat flies (Nycteribiidae). Methods: Two species of Polychromophilus were obtained from wild bats caught in Switzerland. These were molecularly characterized using four genes (asl, clpc, coI, cytb) from the three different genomes (nucleus, apicoplast, mitochondrion). These data were then combined with data of 60 taxa of Haemosporida available in GenBank. Bayesian inference, maximum likelihood and a range of rooting methods were used to test specific hypotheses concerning the phylogenetic relationships between Polychromophilus and the other haemosporidian genera. Results: The Polychromophilus melanipherus and Polychromophilus murinus samples show genetically distinct patterns and group according to species. The Bayesian tree topology suggests that the monophyletic clade of Polychromophilus falls within the avian/saurian clade of Plasmodium and directed hypothesis testing confirms the Plasmodium origin. Conclusion: Polychromophilus' ancestor was most likely a bird- or reptile-infecting Plasmodium before it switched to bats. The invasion of mammals as hosts has, therefore, not been a unique event in the evolutionary history of Haemosporida, despite the suspected costs of adapting to a new host. This was, moreover, accompanied by a switch in dipteran host. PMID:22356874
Model-Selection Theory: The Need for a More Nuanced Picture of Use-Novelty and Double-Counting.
Steele, Katie; Werndl, Charlotte
2018-06-01
This article argues that common intuitions regarding (a) the specialness of 'use-novel' data for confirmation and (b) that this specialness implies the 'no-double-counting rule', which says that data used in 'constructing' (calibrating) a model cannot also play a role in confirming the model's predictions, are too crude. The intuitions in question are pertinent in all the sciences, but we appeal to a climate science case study to illustrate what is at stake. Our strategy is to analyse the intuitive claims in light of prominent accounts of confirmation of model predictions. We show that on the Bayesian account of confirmation, and also on the standard classical hypothesis-testing account, claims (a) and (b) are not generally true; but for some select cases, it is possible to distinguish data used for calibration from use-novel data, where only the latter confirm. The more specialized classical model-selection methods, on the other hand, uphold a nuanced version of claim (a), but this comes apart from (b), which must be rejected in favour of a more refined account of the relationship between calibration and confirmation. Thus, depending on the framework of confirmation, either the scope or the simplicity of the intuitive position must be revised. 1 Introduction 2 A Climate Case Study 3 The Bayesian Method vis-à-vis Intuitions 4 Classical Tests vis-à-vis Intuitions 5 Classical Model-Selection Methods vis-à-vis Intuitions 5.1 Introducing classical model-selection methods 5.2 Two cases 6 Re-examining Our Case Study 7 Conclusion.
Assessing Mediational Models: Testing and Interval Estimation for Indirect Effects.
Biesanz, Jeremy C; Falk, Carl F; Savalei, Victoria
2010-08-06
Theoretical models specifying indirect or mediated effects are common in the social sciences. An indirect effect exists when an independent variable's influence on the dependent variable is mediated through an intervening variable. Classic approaches to assessing such mediational hypotheses (Baron & Kenny, 1986; Sobel, 1982) have in recent years been supplemented by computationally intensive methods such as bootstrapping, the distribution of the product methods, and hierarchical Bayesian Markov chain Monte Carlo (MCMC) methods. These different approaches for assessing mediation are illustrated using data from Dunn, Biesanz, Human, and Finn (2007). However, little is known about how these methods perform relative to each other, particularly in more challenging situations, such as with data that are incomplete and/or nonnormal. This article presents an extensive Monte Carlo simulation evaluating a host of approaches for assessing mediation. We examine Type I error rates, power, and coverage. We study normal and nonnormal data as well as complete and incomplete data. In addition, we adapt a method, recently proposed in statistical literature, that does not rely on confidence intervals (CIs) to test the null hypothesis of no indirect effect. The results suggest that the new inferential method, the partial posterior p value, slightly outperforms existing ones in terms of maintaining Type I error rates while maximizing power, especially with incomplete data. Among confidence interval approaches, the bias-corrected accelerated (BCa) bootstrapping approach often has inflated Type I error rates and inconsistent coverage and is not recommended. In contrast, the bootstrapped percentile confidence interval and the hierarchical Bayesian MCMC method perform best overall, maintaining Type I error rates, exhibiting reasonable power, and producing stable and accurate coverage rates.
Salas-Leiva, Dayana E; Meerow, Alan W; Calonje, Michael; Griffith, M Patrick; Francisco-Ortega, Javier; Nakamura, Kyoko; Stevenson, Dennis W; Lewis, Carl E; Namoff, Sandra
2013-11-01
Despite a recent new classification, a stable phylogeny for the cycads has been elusive, particularly regarding resolution of Bowenia, Stangeria and Dioon. In this study, five single-copy nuclear genes (SCNGs) are applied to the phylogeny of the order Cycadales. The specific aim is to evaluate several gene tree-species tree reconciliation approaches for developing an accurate phylogeny of the order, to contrast them with concatenated parsimony analysis and to resolve the erstwhile problematic phylogenetic position of these three genera. DNA sequences of five SCNGs were obtained for 20 cycad species representing all ten genera of Cycadales. These were analysed with parsimony, maximum likelihood (ML) and three Bayesian methods of gene tree-species tree reconciliation, using Cycas as the outgroup. A calibrated date estimation was developed with Bayesian methods, and biogeographic analysis was also conducted. Concatenated parsimony, ML and three species tree inference methods resolve exactly the same tree topology with high support at most nodes. Dioon and Bowenia are the first and second branches of Cycadales after Cycas, respectively, followed by an encephalartoid clade (Macrozamia-Lepidozamia-Encephalartos), which is sister to a zamioid clade, of which Ceratozamia is the first branch, and in which Stangeria is sister to Microcycas and Zamia. A single, well-supported phylogenetic hypothesis of the generic relationships of the Cycadales is presented. However, massive extinction events inferred from the fossil record that eliminated broader ancestral distributions within Zamiaceae compromise accurate optimization of ancestral biogeographical areas for that hypothesis. While major lineages of Cycadales are ancient, crown ages of all modern genera are no older than 12 million years, supporting a recent hypothesis of mostly Miocene radiations. This phylogeny can contribute to an accurate infrafamilial classification of Zamiaceae.
Are humans the initial source of canine mange?
Andriantsoanirina, Valérie; Fang, Fang; Ariey, Frédéric; Izri, Arezki; Foulet, Françoise; Botterel, Françoise; Bernigaud, Charlotte; Chosidow, Olivier; Huang, Weiyi; Guillot, Jacques; Durand, Rémy
2016-03-25
Scabies, or mange as it is called in animals, is an ectoparasitic contagious infestation caused by the mite Sarcoptes scabiei. Sarcoptic mange is an important veterinary disease leading to significant morbidity and mortality in wild and domestic animals. A widely accepted hypothesis, though never substantiated by factual data, suggests that humans were the initial source of the animal contamination. In this study we performed phylogenetic analyses of populations of S. scabiei from humans and from canids to test the hypothesis of a human origin of the mites infecting domestic dogs. Mites from dogs and foxes were obtained from three French sites and from other countries. A part of the cytochrome c oxidase subunit 1 (cox1) gene was amplified and directly sequenced. Other sequences corresponding to mites from humans, raccoon dogs, foxes, jackal and dogs from various geographical areas were retrieved from GenBank. Phylogenetic analyses were performed using the Otodectes cynotis cox1 sequence as outgroup. Maximum Likelihood and Bayesian Inference analysis approaches were used. To visualize the relationship between the haplotypes, a median joining haplotype network was constructed using Network v4.6 according to host. Twenty-one haplotypes were observed among mites collected from five different host species, including humans and canids from nine geographical areas. The phylogenetic trees based on Maximum Likelihood and Bayesian Inference analyses showed similar topologies with few differences in node support values. The results were not consistent with a human origin of S. scabiei mites in dogs and, on the contrary, did not exclude the opposite hypothesis of a host switch from dogs to humans. Phylogenetic relatedness may have an impact in terms of epidemiological control strategy. Our results and other recent studies suggest that the level of transmission between domestic dogs and humans should be re-evaluated.
NASA Astrophysics Data System (ADS)
Mustać, Marija; Tkalčić, Hrvoje; Burky, Alexander L.
2018-01-01
Moment tensor (MT) inversion studies of events in The Geysers geothermal field mostly focused on microseismicity and found a large number of earthquakes with significant non-double-couple (non-DC) seismic radiation. Here we concentrate on the largest events in the area in recent years using a hierarchical Bayesian MT inversion. Initially, we show that the non-DC components of the MT can be reliably retrieved using regional waveform data from a small number of stations. Subsequently, we present results for a number of events and show that accounting for noise correlations can lead to retrieval of a lower isotropic (ISO) component and significantly different focal mechanisms. We compute the Bayesian evidence to compare solutions obtained with different assumptions of the noise covariance matrix. Although a diagonal covariance matrix produces a better waveform fit, inversions that account for noise correlations via an empirically estimated noise covariance matrix account for interdependences of data errors and are preferred from a Bayesian point of view. This implies that improper treatment of data noise in waveform inversions can result in fitting the noise and misinterpreting the non-DC components. Finally, one of the analyzed events is characterized as predominantly DC, while the others still have significant non-DC components, probably as a result of crack opening, which is a reasonable hypothesis for The Geysers geothermal field geological setting.
Bowden, Vanessa K; Loft, Shayne
2016-06-01
In 2 experiments we examined the impact of memory for prior events on conflict detection in simulated air traffic control under conditions where individuals proactively controlled aircraft and completed concurrent tasks. Individuals were faster to detect conflicts that had repeatedly been presented during training (positive transfer). Bayesian statistics indicated strong evidence for the null hypothesis that conflict detection was not impaired for events that resembled an aircraft pair that had repeatedly come close to conflicting during training. This is likely because aircraft altitude (the feature manipulated between training and test) was attended to by participants when proactively controlling aircraft. In contrast, a minor change to the relative position of a repeated nonconflicting aircraft pair moderately impaired conflict detection (negative transfer). There was strong evidence for the null hypothesis that positive transfer was not impacted by dividing participant attention, which suggests that part of the information retrieved regarding prior aircraft events was perceptual (the new aircraft pair "looked" like a conflict based on familiarity). These findings extend the effects previously reported by Loft, Humphreys, and Neal (2004), answering the recent strong and unanimous calls across the psychological science discipline to formally establish the robustness and generality of previously published effects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Ursenbacher, Sylvain; Guillon, Michaël; Cubizolle, Hervé; Dupoué, Andréaz; Blouin-Demers, Gabriel; Lourdais, Olivier
2015-07-01
Understanding the impact of postglacial recolonization on genetic diversity is essential in explaining current patterns of genetic variation. The central-marginal hypothesis (CMH) predicts a reduction in genetic diversity from the core of the distribution to peripheral populations, as well as reduced connectivity between peripheral populations. While the CMH has received considerable empirical support, its broad applicability is still debated and alternative hypotheses predict different spatial patterns of genetic diversity. Using microsatellite markers, we analysed the genetic diversity of the adder (Vipera berus) in western Europe to reconstruct postglacial recolonization. Approximate Bayesian Computation (ABC) analyses suggested a postglacial recolonization from two routes: a western route from the Atlantic Coast up to Belgium and a central route from the Massif Central to the Alps. This cold-adapted species likely used two isolated glacial refugia in southern France, in permafrost-free areas during the last glacial maximum. Adder populations further from putative glacial refugia had lower genetic diversity and reduced connectivity; therefore, our results support the predictions of the CMH. Our study also illustrates the utility of highly variable nuclear markers, such as microsatellites, and ABC to test competing recolonization hypotheses. © 2015 John Wiley & Sons Ltd.
Intuitive Logic Revisited: New Data and a Bayesian Mixed Model Meta-Analysis
Singmann, Henrik; Klauer, Karl Christoph; Kellen, David
2014-01-01
Recent research on syllogistic reasoning suggests that the logical status (valid vs. invalid) of even difficult syllogisms can be intuitively detected via differences in conceptual fluency between logically valid and invalid syllogisms when participants are asked to rate how much they like a conclusion following from a syllogism (Morsanyi & Handley, 2012). These claims of an intuitive logic are at odds with most theories on syllogistic reasoning which posit that detecting the logical status of difficult syllogisms requires effortful and deliberate cognitive processes. We present new data replicating the effects reported by Morsanyi and Handley, but show that this effect is eliminated when controlling for a possible confound in terms of conclusion content. Additionally, we reanalyze three studies without this confound with a Bayesian mixed model meta-analysis (i.e., controlling for participant and item effects) which provides evidence for the null-hypothesis and against Morsanyi and Handley's claim. PMID:24755777
Dougherty, Michael R; Hamovitz, Toby; Tidwell, Joe W
2016-02-01
A recent meta-analysis by Au et al. (Psychonomic Bulletin & Review, 22, 366-377, 2015) reviewed the n-back training paradigm for working memory (WM) and evaluated whether (when aggregating across existing studies) there was evidence that gains obtained for training tasks transferred to gains in fluid intelligence (Gf). Their results revealed an overall effect size of g = 0.24 for the effect of n-back training on Gf. We reexamine the data through a Bayesian lens, to evaluate the relative strength of the evidence for the alternative versus null hypotheses, contingent on the type of control condition used. We find that studies using a noncontact (passive) control group strongly favor the alternative hypothesis that training leads to transfer but that studies using active-control groups show modest evidence in favor of the null. We discuss these findings in the context of placebo effects.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sigeti, David E.; Pelak, Robert A.
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used as a kind of 'plan B metric' in the case that the analysis shows that θ is close to 1/2, and argue that such a plan B should generally be part of hypothesis testing.
All the analysis presented in the paper is done with a general beta-function prior for θ, enabling sequential analysis in which a small number of new simulations may be done and the resulting posterior for θ used as a prior to inform the next stage of power analysis.
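The binomial model with a beta prior described in this abstract lends itself to a short worked sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it restricts the beta parameters to positive integers so that the posterior probability that θ exceeds 1/2 can be computed exactly with the standard library, and the function names and example counts are hypothetical.

```python
from math import comb, sqrt


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, for positive INTEGER a and b only.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j) for j in range(a, n + 1))


def improvement_posterior(wins, trials, prior_a=1, prior_b=1):
    """Update a Beta(prior_a, prior_b) prior with `wins` successes in `trials`.

    A "win" means the new code predicted an experiment better than the old one.
    Returns the posterior parameters, P(theta > 1/2 | data), and the posterior
    standard deviation.
    """
    a = prior_a + wins
    b = prior_b + (trials - wins)
    prob_better = 1.0 - beta_cdf_int(0.5, a, b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return a, b, prob_better, sd


if __name__ == "__main__":
    # Stage 1: hypothetical result, the new code wins 7 of 10 experiments.
    a, b, prob, sd = improvement_posterior(7, 10)
    print(a, b, prob, sd)
    # Stage 2: the stage-1 posterior serves as the prior for 5 more experiments.
    print(improvement_posterior(4, 5, prior_a=a, prior_b=b))
```

Because the beta prior is conjugate to the binomial likelihood, the staged update gives the same posterior as pooling all experiments, which is what makes the sequential use of the posterior as the next prior straightforward.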
NASA Astrophysics Data System (ADS)
Pleban, J. R.; Mackay, D. S.; Aston, T.; Ewers, B. E.; Wienig, C.
2013-12-01
Quantifying the drought tolerance of crop species and genotypes is essential in order to predict how water stress may impact agricultural productivity. As climate models predict an increase in both frequency and severity of drought, corresponding plant hydraulic and biochemical models are needed to accurately predict crop drought tolerance. Drought can result in cavitation of xylem conduits and related loss of plant hydraulic conductivity. This study tested the hypothesis that a model incorporating a plant's vulnerability to cavitation would best assess drought tolerance in Brassica rapa. Four Brassica genotypes were subjected to drought conditions at a field site in Laramie, WY. Concurrent leaf gas exchange, volumetric soil moisture content and xylem pressure measurements were made during the drought period. Three models were used to assess genotype-specific drought tolerance. All three models rely on the Farquhar biochemical/biophysical model of leaf-level photosynthesis, which is integrated into the Terrestrial Regional Ecosystem Exchange Simulator (TREES). The models differ in how TREES applies the environmental driving data and plant physiological mechanisms; specifically, how water availability at the site of photosynthesis is derived. Model 1 established leaf water availability from a modeled soil moisture content; Model 2 input soil moisture measurements directly to establish leaf water availability; Model 3 incorporated the Sperry soil-plant transport model, which calculates flows and pressure along the soil-plant water transport pathway to establish leaf water availability. This third model incorporated measured xylem pressures, thus constraining leaf water availability via genotype-specific vulnerability curves. A multi-model intercomparison was made using a Bayesian approach, which assessed the interaction between uncertainty in model results and data.
The three models were further evaluated by assessing model accuracy and complexity via deviance information criteria (DIC). Results suggest that model 1 was unable to model soil moisture accurately and thus did not effectively characterize drought tolerance. Models 2 and 3 were both effective at characterizing drought tolerance; model 3 performed best in genotypes with the highest vulnerability to cavitation. By identifying, through both Bayesian and DIC analyses, the models that best characterize drought tolerance, future investigations into the interaction between crop productivity and water use can be informed by hypothesis testing using models prior to experimentation.
Bayesian analyses of time-interval data for environmental radiation monitoring.
Luo, Peng; Sharp, Julia L; DeVol, Timothy A
2013-01-01
Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and a conventional frequentist analysis of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performances of the three methods were compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode. Simulated data were obtained using Monte Carlo techniques to obtain a random sampling of the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a detection probability similar to that of Bayesian analysis of count information, but the authors were able to make a decision with fewer pulses at relatively higher radiation levels. In addition, for cases with a very short presence of the source (< count time), time-interval information is more sensitive in detecting a change than count information, since the source data are averaged with the background data over the entire count time. The relationships of the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
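To make the idea of Bayesian inference on pulse time intervals concrete, here is a minimal conjugate sketch. It is not the authors' algorithm (which the abstract says was written in R): it assumes exponentially distributed intervals with an unknown count rate, a Gamma prior with integer shape on that rate, and a hypothetical decision quantity, the posterior probability that the rate exceeds a chosen threshold. All names and numbers are illustrative.

```python
from math import exp


def prob_rate_exceeds(intervals, threshold_rate, prior_shape=1, prior_rate=1.0):
    """P(count rate > threshold_rate | observed time intervals).

    Model (an assumption of this sketch): intervals ~ Exponential(rate) and
    rate ~ Gamma(prior_shape, prior_rate), with prior_shape a positive integer.
    The posterior is Gamma(shape, rate_post), and for integer shape its upper
    tail equals a Poisson sum:
        P(rate > r) = P(Poisson(rate_post * r) < shape).
    """
    shape = prior_shape + len(intervals)   # one unit of shape per observed pulse gap
    rate_post = prior_rate + sum(intervals)  # total observed time plus prior "time"
    mu = rate_post * threshold_rate
    term = exp(-mu)                         # Poisson pmf at 0
    total = 0.0
    for i in range(shape):
        total += term
        term *= mu / (i + 1)                # advance the Poisson pmf to i + 1
    return total


if __name__ == "__main__":
    # Hypothetical data: 20 gaps of 0.1 s (about 10 counts/s) versus a 2 counts/s threshold.
    print(prob_rate_exceeds([0.1] * 20, threshold_rate=2.0))
    # Hypothetical background-like data: 20 gaps of 1 s (about 1 count/s).
    print(prob_rate_exceeds([1.0] * 20, threshold_rate=2.0))
```

Each new pulse updates the posterior immediately, so a decision can in principle be revisited after every interval instead of waiting for a fixed count time to elapse.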
Bayes factor design analysis: Planning for compelling evidence.
Schönbrodt, Felix D; Wagenmakers, Eric-Jan
2018-02-01
A sizeable literature exists on the use of frequentist power analysis in the null-hypothesis significance testing (NHST) paradigm to facilitate the design of informative experiments. In contrast, there is almost no literature that discusses the design of experiments when Bayes factors (BFs) are used as a measure of evidence. Here we explore Bayes Factor Design Analysis (BFDA) as a useful tool to design studies for maximum efficiency and informativeness. We elaborate on three possible BF designs, (a) a fixed-n design, (b) an open-ended Sequential Bayes Factor (SBF) design, where researchers can test after each participant and can stop data collection whenever there is strong evidence for either the alternative hypothesis (H1) or the null hypothesis (H0), and (c) a modified SBF design that defines a maximal sample size where data collection is stopped regardless of the current state of evidence. We demonstrate how the properties of each design (i.e., expected strength of evidence, expected sample size, expected probability of misleading evidence, expected probability of weak evidence) can be evaluated using Monte Carlo simulations and equip researchers with the necessary information to compute their own Bayesian design analyses.
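A Monte Carlo evaluation of design (c), the SBF design with a maximal sample size, can be sketched in a few lines. To stay self-contained, this sketch swaps in a simple binomial test (H0: θ = 0.5 versus H1: θ uniform on (0, 1)), for which the Bayes factor has a closed form; that choice, the evidence threshold of 10, and the function names are assumptions of the example and are not taken from the article.

```python
import random
from math import comb


def bf10_binomial(successes, n):
    """Bayes factor for H1: theta ~ Uniform(0, 1) against H0: theta = 0.5.

    Marginal likelihood under H1 is 1 / ((n + 1) * C(n, k)) per sequence and
    0.5**n under H0. Intended for n below roughly 1000 (float range).
    """
    return 2.0 ** n / ((n + 1) * comb(n, successes))


def simulate_sbf(true_theta, n_max, threshold=10.0, n_sims=2000, seed=1):
    """Simulate a sequential design that stops at BF10 >= threshold,
    BF10 <= 1/threshold, or n_max observations, whichever comes first.

    Returns (outcome proportions, mean sample size at stopping).
    """
    rng = random.Random(seed)
    counts = {"H1": 0, "H0": 0, "inconclusive": 0}
    total_n = 0
    for _ in range(n_sims):
        successes = 0
        outcome, n_used = "inconclusive", n_max
        for n in range(1, n_max + 1):
            if rng.random() < true_theta:
                successes += 1
            bf = bf10_binomial(successes, n)
            if bf >= threshold:
                outcome, n_used = "H1", n
                break
            if bf <= 1.0 / threshold:
                outcome, n_used = "H0", n
                break
        counts[outcome] += 1
        total_n += n_used
    rates = {name: count / n_sims for name, count in counts.items()}
    return rates, total_n / n_sims


if __name__ == "__main__":
    # When H0 is true, the "H1" rate estimates the probability of misleading evidence.
    print(simulate_sbf(true_theta=0.5, n_max=300))
    # When an effect exists, the "H0" rate plays that role.
    print(simulate_sbf(true_theta=0.7, n_max=300))
```

The returned proportions correspond to the design properties listed in the abstract: the rate of stopping at the wrong boundary estimates misleading evidence, the "inconclusive" rate estimates weak evidence at the maximal sample size, and the mean stopping point estimates the expected sample size.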
Editorial: Bayesian benefits for child psychology and psychiatry researchers.
Oldehinkel, Albertine J
2016-09-01
For many scientists, performing statistical tests has become an almost automated routine. However, p-values are frequently used and interpreted incorrectly; and even when used appropriately, p-values tend to provide answers that do not match researchers' questions and hypotheses well. Bayesian statistics present an elegant and often more suitable alternative. The Bayesian approach has rarely been applied in child psychology and psychiatry research so far, but the development of user-friendly software packages and tutorials has placed it well within reach now. Because Bayesian analyses require a more refined definition of hypothesized probabilities of possible outcomes than the classical approach, going Bayesian may offer the additional benefit of sparking the development and refinement of theoretical models in our field. © 2016 Association for Child and Adolescent Mental Health.
NASA Astrophysics Data System (ADS)
Alehosseini, Ali; A. Hejazi, Maryam; Mokhtari, Ghassem; B. Gharehpetian, Gevork; Mohammadi, Mohammad
2015-06-01
In this paper, the Bayesian classifier is used to detect and classify the radial deformation and axial displacement of transformer windings. The proposed method is tested on a model of transformer for different volumes of radial deformation and axial displacement. In this method, ultra-wideband (UWB) signal is sent to the simplified model of the transformer winding. The received signal from the winding model is recorded and used for training and testing of Bayesian classifier in different axial displacement and radial deformation states of the winding. It is shown that the proposed method has a good accuracy to detect and classify the axial displacement and radial deformation of the winding.
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
Woodbury, Allan D.; Rubin, Yoram
2000-01-01
A method for inverting the travel time moments of solutes in heterogeneous aquifers is presented and is based on peak concentration arrival times as measured at various samplers in an aquifer. The approach combines a Lagrangian [Rubin and Dagan, 1992] solute transport framework with full‐Bayesian hydrogeological parameter inference. In the full‐Bayesian approach the noise values in the observed data are treated as hyperparameters, and their effects are removed by marginalization. The prior probability density functions (pdfs) for the model parameters (horizontal integral scale, velocity, and log K variance) and noise values are represented by prior pdfs developed from minimum relative entropy considerations. Analysis of the Cape Cod (Massachusetts) field experiment is presented. Inverse results for the hydraulic parameters indicate an expected value for the velocity, variance of log hydraulic conductivity, and horizontal integral scale of 0.42 m/d, 0.26, and 3.0 m, respectively. While these results are consistent with various direct‐field determinations, the importance of the findings is in the reduction of confidence range about the various expected values. On selected control planes we compare observed travel time frequency histograms with the theoretical pdf, conditioned on the observed travel time moments. We observe a positive skew in the travel time pdf which tends to decrease as the travel time distance grows. We also test the hypothesis that there is no scale dependence of the integral scale λ with the scale of the experiment at Cape Cod. We adopt two strategies. The first strategy is to use subsets of the full data set and then to see if the resulting parameter fits are different as we use different data from control planes at expanding distances from the source. The second approach is from the viewpoint of entropy concentration. No increase in integral scale with distance is inferred from either approach over the range of the Cape Cod tracer experiment.
Bayesian parameter estimation of a k-ε model for accurate jet-in-crossflow simulations
Ray, Jaideep; Lefantzi, Sophia; Arunajatesan, Srinivasan; ...
2016-05-31
Reynolds-averaged Navier–Stokes models are not very accurate for high-Reynolds-number compressible jet-in-crossflow interactions. The inaccuracy arises from the use of inappropriate model parameters and model-form errors in the Reynolds-averaged Navier–Stokes model. In this study, the hypothesis is pursued that Reynolds-averaged Navier–Stokes predictions can be significantly improved by using parameters inferred from experimental measurements of a supersonic jet interacting with a transonic crossflow.
Gustafsson, Mats G; Wallman, Mikael; Wickenberg Bolin, Ulrika; Göransson, Hanna; Fryknäs, M; Andersson, Claes R; Isaksson, Anders
2010-06-01
Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Recently many promising approaches for determination of an upper bound for the error rate of a single classifier have been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real world applications where the test set sizes are less than a few hundred. The source of this problem is the fact that the CI is determined exclusively by the result on the test examples. In other words, there is no information at all provided by the uniform prior density distribution employed, which reflects complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to study a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice. It is demonstrated how a refined non-uniform prior density distribution can be obtained by means of the ME principle using empirical results from a few designs and tests using non-overlapping sets of examples. Experimental results show that ME based priors improve the CIs when applied to four quite different simulated and two real world data sets. An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier. Copyright 2010 Elsevier B.V. All rights reserved.
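The conventional holdout interval discussed above can be sketched in a few lines. With a uniform prior and a given number of errors on the test examples, the posterior for the error rate is a Beta distribution, and its upper credible bound follows from the Beta CDF. The sketch below covers only this uniform-prior baseline, not the maximum entropy prior proposed in the study; the function names are hypothetical, and the CDF is computed through the standard binomial-tail identity for integer Beta parameters so that only the Python standard library is needed.

```python
import math


def posterior_cdf(x, errors, n_test):
    """CDF at x of Beta(errors + 1, n_test - errors + 1), the posterior for
    the error rate under a uniform prior. For integer parameters this equals
    P(Binomial(n_test + 1, x) >= errors + 1)."""
    m = n_test + 1
    return sum(
        math.comb(m, j) * x ** j * (1 - x) ** (m - j)
        for j in range(errors + 1, m + 1)
    )


def upper_bound(errors, n_test, level=0.95, tol=1e-10):
    """One-sided upper credible bound for the error rate, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if posterior_cdf(mid, errors, n_test) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Comparing `upper_bound(5, 50)` with `upper_bound(50, 500)` shows the effect described in the abstract: at the same observed error rate, the bound is noticeably looser for the smaller test set, because the uniform prior contributes no information.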
Divergence and diversification in North American Psoraleeae (Fabaceae) due to climate change
Egan, Ashley N; Crandall, Keith A
2008-01-01
Background: Past studies in the legume family (Fabaceae) have uncovered several evolutionary trends including differential mutation and diversification rates across varying taxonomic levels. The legume tribe Psoraleeae is shown herein to exemplify these trends at the generic and species levels. This group includes a sizable diversification within North America dated at approximately 6.3 million years ago with skewed species distribution to the most recently derived genus, Pediomelum, suggesting a diversification rate shift. We estimate divergence dates of North American (NAm) Psoraleeae using Bayesian MCMC sampling in BEAST based on eight DNA regions (ITS, waxy, matK, trnD-trnT, trnL-trnF, trnK, trnS-trnG, and rpoB-trnC). We also test the hypothesis of a diversification rate shift within NAm Psoraleeae using topological and temporal methods. We investigate the impact of climate change on diversification in this group by (1) testing the hypothesis that a shift from mesic to xeric habitats acted as a key innovation and (2) investigating diversification rate shifts along geologic time, discussing the impact of Quaternary climate oscillations on diversification. Results: NAm Psoraleeae represents a recent, rapid radiation with several genera originating during the Pleistocene, 1 to 2 million years ago. A shift in diversification rate is supported by both methods with a 2.67-fold increase suggested around 2 million years ago followed by an 8.73-fold decrease 440,000 years ago. The hypothesis that a climate regime shift from mesic to xeric habitats drove increased diversification in affected taxa was not supported. Timing of the diversification rate increase supports the hypothesis that glaciation-induced climate changes during the Quaternary influenced diversification of the group. Nonrandom spatial diversification also exists, with greater species richness in the American Southwest.
Conclusion: This study outlines NAm Psoraleeae as a model example of a recent, rapid radiation. Diversification rate shifts in NAm Psoraleeae are not due to current climate regimes as represented by habitat, but instead to past global climate change resulting from Quaternary glaciations. NAm Psoraleeae diversification is a good example of how earthly dynamics including global climate change and topography work together to shape biodiversity. PMID:19091055
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics
Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.
2015-01-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564
Application of Poisson random effect models for highway network screening.
Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer
2014-02-01
In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data have become popular in traffic safety research. This study employs random effect Poisson Log-Normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. Potential for Safety Improvement (PSI) was adopted as a measure of the crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson Log-Normal model. A series of method examination tests were conducted to evaluate the performance of different approaches. These tests include the previously developed site consistence test, method consistence test, total rank difference test, and the modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only a temporal random effect, and both are superior to the conventional Poisson Log-Normal model (PLN) and the EB model in the fitting of crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification. Copyright © 2013 Elsevier Ltd. All rights reserved.
The PMHT: solutions for some of its problems
NASA Astrophysics Data System (ADS)
Wieneke, Monika; Koch, Wolfgang
2007-09-01
Tracking multiple targets in a cluttered environment is a challenging task. Probabilistic Multiple Hypothesis Tracking (PMHT) is an efficient approach for dealing with it. Essentially PMHT is based on the method of Expectation-Maximization for handling association conflicts. Linearity in the number of targets and measurements is the main motivation for a further development and extension of this methodology. Unfortunately, compared with the Probabilistic Data Association Filter (PDAF), PMHT has not yet shown its superiority in terms of track-lost statistics. Furthermore, the problem of track extraction and deletion is apparently not yet satisfactorily solved within this framework. Four properties of PMHT are responsible for its problems in track maintenance: Non-Adaptivity, Hospitality, Narcissism and Local Maxima [1, 2]. In this work we present a solution for each of them and derive an improved PMHT by integrating the solutions into the PMHT formalism. The new PMHT is evaluated by Monte-Carlo simulations. A sequential Likelihood-Ratio (LR) test for track extraction has been developed and already integrated into the framework of traditional Bayesian Multiple Hypothesis Tracking [3]. As a multi-scan approach, the PMHT methodology also has the potential for track extraction. In this paper an analogous integration of a sequential LR test into the PMHT framework is proposed. We present an LR formula for track extraction and deletion using the PMHT update formulae. As PMHT provides all required ingredients for a sequential LR calculation, the LR is thus a by-product of the PMHT iteration process. Therefore the resulting update formula for the sequential LR test affords the development of Track-Before-Detect algorithms for PMHT. The approach is illustrated by a simple example.
Pannacciulli, Federica G; Maltagliati, Ferruccio; de Guttry, Christian; Achituv, Yair
2017-01-01
The model marine broadcast-spawner barnacle Chthamalus montagui was investigated to understand its genetic structure and quantify levels of population divergence, and to make inference on historical demography in terms of time of divergence and changes in population size. We collected specimens from rocky shores of the north-east Atlantic Ocean (4 locations), Mediterranean Sea (8) and Black Sea (1). The 312 sequences (537 bp) of the mitochondrial cytochrome c oxidase I allowed the detection of 130 haplotypes. High within-location genetic variability was recorded, with haplotype diversity ranging between h = 0.750 and 0.967. Parameters of genetic divergence, haplotype network and Bayesian assignment analysis were consistent in rejecting the hypothesis of panmixia. C. montagui is genetically structured in three geographically discrete populations, which corresponded to north-eastern Atlantic Ocean, western-central Mediterranean Sea, and Aegean Sea-Black Sea. These populations are separated by two main effective barriers to gene flow located at the Almeria-Oran Front and in correspondence of the Cyclades Islands. According to the 'isolation with migration' model, adjacent population pairs diverged during the early to middle Pleistocene transition, a period in which geological events provoked significant changes in the structure and composition of palaeocommunities. Mismatch distributions, neutrality tests and Bayesian skyline plots showed past population expansions, which started approximately in the Mindel-Riss interglacial, in which ecological conditions were favourable for temperate species and calcium-uptaking marine organisms.
Efficient Bayesian inference for natural time series using ARFIMA processes
NASA Astrophysics Data System (ADS)
Graves, Timothy; Gramacy, Robert; Franzke, Christian; Watkins, Nicholas
2016-04-01
Many geophysical quantities, such as atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long memory (LM). LM implies that these quantities experience non-trivial temporal memory, which potentially not only enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LM. We present a modern and systematic approach to the inference of LM. We use the flexible autoregressive fractional integrated moving average (ARFIMA) model, which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LM, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g., short-memory effects) can be integrated over in order to focus on long-memory parameters and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data and the central England temperature (CET) time series, with favorable comparison to the standard estimators [1]. In addition we show how the method can be used to perform joint inference of the stability exponent and the memory parameter when ARFIMA is extended to allow for alpha-stable innovations. Such models can be used to study systems where heavy tails and long range memory coexist. [1] Graves et al, Nonlin. Processes Geophys., 22, 679-700, 2015; doi:10.5194/npg-22-679-2015.
Pannacciulli, Federica G.; de Guttry, Christian; Achituv, Yair
2017-01-01
The model marine broadcast-spawner barnacle Chthamalus montagui was investigated to understand its genetic structure and quantify levels of population divergence, and to make inference on historical demography in terms of time of divergence and changes in population size. We collected specimens from rocky shores of the north-east Atlantic Ocean (4 locations), Mediterranean Sea (8) and Black Sea (1). The 312 sequences (537 bp) of the mitochondrial cytochrome c oxidase I allowed the detection of 130 haplotypes. High within-location genetic variability was recorded, with haplotype diversity ranging between h = 0.750 and 0.967. Parameters of genetic divergence, haplotype network and Bayesian assignment analysis were consistent in rejecting the hypothesis of panmixia. C. montagui is genetically structured in three geographically discrete populations, which corresponded to north-eastern Atlantic Ocean, western-central Mediterranean Sea, and Aegean Sea-Black Sea. These populations are separated by two main effective barriers to gene flow located at the Almeria-Oran Front and in correspondence of the Cyclades Islands. According to the ‘isolation with migration’ model, adjacent population pairs diverged during the early to middle Pleistocene transition, a period in which geological events provoked significant changes in the structure and composition of palaeocommunities. Mismatch distributions, neutrality tests and Bayesian skyline plots showed past population expansions, which started approximately in the Mindel-Riss interglacial, in which ecological conditions were favourable for temperate species and calcium-uptaking marine organisms. PMID:28594840
Li, Jun; Fu, Cuizhang; Lei, Guangchun
2011-01-01
Few studies have explored the role of Cenozoic tectonic evolution in shaping patterns and processes of extant animal distributions within East Asian margins. We select Hynobius salamanders (Amphibia: Hynobiidae) as a model to examine biogeographical consequences of Cenozoic tectonic events within East Asian margins. First, we use GenBank molecular data to reconstruct phylogenetic interrelationships of Hynobius by Bayesian and maximum likelihood analyses. Second, we estimate the divergence time using the Bayesian relaxed clock approach and infer dispersal/vicariance histories under the ‘dispersal–extinction–cladogenesis’ model. Finally, we test whether the evolutionary history and biogeographical processes of Hynobius coincide with the predictions of two major hypotheses (the ‘vicariance’/‘out of southwestern Japan’ hypothesis). The resulting phylogeny confirmed Hynobius as a monophyletic group, which could be divided into nine major clades associated with six geographical areas. Our results show that: (1) the most recent common ancestor of Hynobius was distributed in southwestern Japan and Hokkaido Island, (2) a sister taxon relationship between Hynobius retardatus and all remaining species was the result of a vicariance event between Hokkaido Island and southwestern Japan in the Middle Eocene, (3) ancestral Hynobius in southwestern Japan dispersed into Taiwan Island, central China, ‘Korean Peninsula and northeastern China’ as well as northeastern Honshu during the Late Eocene–Late Miocene. Our findings suggest that Cenozoic tectonic evolution plays an important role in shaping disjunctive distributions of extant Hynobius within East Asian margins. PMID:21738684
Zhao, Lei; Annie, Ang Shi Hui; Amrita, Srivathsan; Yi, Su Kathy Feng; Rudolf, Meier
2013-10-01
We here present a phylogenetic hypothesis for Sepsidae (Diptera: Cyclorrhapha), a group of schizophoran flies with ca. 320 described species that is widely used in sexual selection research. The hypothesis is based on five nuclear and five mitochondrial markers totaling 8813 bp for ca. 30% of the diversity (105 sepsid taxa) and - depending on analysis - six or nine outgroup species. Maximum parsimony (MP), maximum likelihood (ML), and Bayesian inferences (BI) yield overall congruent, well-resolved, and supported trees that are largely unaffected by three different ways to partition the data in BI and ML analyses. However, there are also five areas of uncertainty that affect suprageneric relationships where different analyses yield alternate topologies and MP and ML trees have significant conflict according to Shimodaira-Hasegawa tests. Two of these were already affected by conflict in a previous analysis that was based on the same genes and a subset of 69 species. The remaining three involve newly added taxa or genera whose relationships were previously resolved with low support. We thus find that the denser taxon sample in the present analysis does not reduce the topological conflict that had been identified previously. The present study nevertheless presents a significant contribution to the understanding of sepsid relationships in that 50 additional taxa from 18 genera are added to the Tree-of-Life of Sepsidae and that the placement of most taxa is well supported and robust to different tree reconstruction techniques. Copyright © 2013 Elsevier Inc. All rights reserved.
Bayesian analysis of multimethod ego-depletion studies favours the null hypothesis.
Etherton, Joseph L; Osborne, Randall; Stephenson, Katelyn; Grace, Morgan; Jones, Chas; De Nadai, Alessandro S
2018-04-01
Ego-depletion refers to the purported decrease in performance on a task requiring self-control after engaging in a previous task involving self-control, with self-control proposed to be a limited resource. Despite many published studies consistent with this hypothesis, recurrent null findings within our laboratory and indications of publication bias have called into question the validity of the depletion effect. This project used three depletion protocols involving three different depleting initial tasks followed by three different self-control tasks as dependent measures (total n = 840). For each method, effect sizes were not significantly different from zero. When data were aggregated across the three different methods and examined meta-analytically, the pooled effect size was not significantly different from zero (for all priors evaluated, Hedges' g = 0.10 with 95% credibility interval of [-0.05, 0.24]) and Bayes factors reflected strong support for the null hypothesis (Bayes factor > 25 for all priors evaluated). © 2018 The British Psychological Society.
Dor, Roi; Carling, Matthew D; Lovette, Irby J; Sheldon, Frederick H; Winkler, David W
2012-10-01
The New World swallow genus Tachycineta comprises nine species that collectively have a wide geographic distribution and remarkable variation both within- and among-species in ecologically important traits. Existing phylogenetic hypotheses for Tachycineta are based on mitochondrial DNA sequences, thus they provide estimates of a single gene tree. In this study we sequenced multiple individuals from each species at 16 nuclear intron loci. We used gene concatenated approaches (Bayesian and maximum likelihood) as well as coalescent-based species tree inference to reconstruct phylogenetic relationships of the genus. We examined the concordance and conflict between the nuclear and mitochondrial trees and between concatenated and coalescent-based inferences. Our results provide an alternative phylogenetic hypothesis to the existing mitochondrial DNA estimate of phylogeny. This new hypothesis provides a more accurate framework in which to explore trait evolution and examine the evolution of the mitochondrial genome in this group. Copyright © 2012 Elsevier Inc. All rights reserved.
Win-Stay, Lose-Sample: a simple sequential algorithm for approximating Bayesian inference.
Bonawitz, Elizabeth; Denison, Stephanie; Gopnik, Alison; Griffiths, Thomas L
2014-11-01
People can behave in a way that is consistent with Bayesian models of cognition, despite the fact that performing exact Bayesian inference is computationally challenging. What algorithms could people be using to make this possible? We show that a simple sequential algorithm "Win-Stay, Lose-Sample", inspired by the Win-Stay, Lose-Shift (WSLS) principle, can be used to approximate Bayesian inference. We investigate the behavior of adults and preschoolers on two causal learning tasks to test whether people might use a similar algorithm. These studies use a "mini-microgenetic method", investigating how people sequentially update their beliefs as they encounter new evidence. Experiment 1 investigates a deterministic causal learning scenario and Experiments 2 and 3 examine how people make inferences in a stochastic scenario. The behavior of adults and preschoolers in these experiments is consistent with our Bayesian version of the WSLS principle. This algorithm provides both a practical method for performing Bayesian inference and a new way to understand people's judgments. Copyright © 2014 Elsevier Inc. All rights reserved.
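A minimal sketch of a Win-Stay, Lose-Sample style learner is given below. The details are an assumption rather than a transcription of the authors' procedure: the learner holds a single hypothesis, keeps it with probability equal to the likelihood of the newest observation under that hypothesis, and otherwise draws a fresh hypothesis from the posterior given all data seen so far. The hypothesis space (three candidate success probabilities) and all names are hypothetical.

```python
import random

HYPOTHESES = [0.2, 0.5, 0.8]  # hypothetical candidate success probabilities


def likelihood(h, outcome):
    """Probability of a binary outcome (1 = success) under hypothesis h."""
    return h if outcome else 1.0 - h


def posterior(data):
    """Exact posterior over HYPOTHESES under a uniform prior."""
    weights = [1.0] * len(HYPOTHESES)
    for outcome in data:
        weights = [w * likelihood(h, outcome) for w, h in zip(weights, HYPOTHESES)]
    total = sum(weights)
    return [w / total for w in weights]


def wsls_learner(data, rng):
    """Process observations one at a time, holding a single hypothesis."""
    seen = []
    current = rng.choices(HYPOTHESES, weights=posterior(seen))[0]
    for outcome in data:
        seen.append(outcome)
        # "Win": stay with probability equal to the likelihood of the new datum.
        # "Lose": otherwise resample from the posterior given all data so far.
        if rng.random() >= likelihood(current, outcome):
            current = rng.choices(HYPOTHESES, weights=posterior(seen))[0]
    return current
```

Under this rule, each individual learner tracks only one hypothesis at a time, yet across many simulated learners the proportion holding each hypothesis should approximate the exact posterior, which is the sense in which such an algorithm approximates Bayesian inference.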
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
NASA Astrophysics Data System (ADS)
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Vilar, M J; Ranta, J; Virtanen, S; Korkeala, H
2015-01-01
Bayesian analysis was used to estimate the pig's and herd's true prevalence of enteropathogenic Yersinia in serum samples collected from Finnish pig farms. The sensitivity and specificity of the diagnostic test were also estimated for the commercially available ELISA which is used for antibody detection against enteropathogenic Yersinia. The Bayesian analysis was performed in two steps; the first step estimated the prior true prevalence of enteropathogenic Yersinia with data obtained from a systematic review of the literature. In the second step, data of the apparent prevalence (cross-sectional study data), prior true prevalence (first step), and estimated sensitivity and specificity of the diagnostic methods were used for building the Bayesian model. The true prevalence of Yersinia in slaughter-age pigs was 67.5% (95% PI 63.2-70.9). The true prevalence of Yersinia in sows was 74.0% (95% PI 57.3-82.4). The estimates of sensitivity and specificity values of the ELISA were 79.5% and 96.9%.
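The relationship between apparent prevalence, true prevalence, and test accuracy that underlies this kind of analysis can be illustrated with the classical Rogan-Gladen point correction. This is a simpler, non-Bayesian stand-in and not the two-step Bayesian model of the study: it yields a single point estimate and no probability interval. The function names are hypothetical; the sensitivity and specificity in the usage lines are the ELISA estimates quoted above, and the 67.5% input is the reported true prevalence in slaughter-age pigs, used here only to show the forward and inverse relationship.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Expected fraction of positive tests: true positives plus false positives."""
    return sensitivity * true_prev + (1.0 - specificity) * (1.0 - true_prev)


def rogan_gladen(apparent, sensitivity, specificity):
    """Point estimate of true prevalence from apparent prevalence,
    clipped to the valid range [0, 1]."""
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")
    estimate = (apparent + specificity - 1.0) / youden
    return min(1.0, max(0.0, estimate))


# Forward step: apparent prevalence implied by a true prevalence of 67.5%.
implied_apparent = apparent_prevalence(0.675, 0.795, 0.969)
# Inverse step: the correction recovers the true prevalence.
recovered = rogan_gladen(implied_apparent, 0.795, 0.969)
```

The gap between `implied_apparent` and the true prevalence shows why an imperfect test with sensitivity well below 100% understates prevalence when positives are common.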
Multiple model cardinalized probability hypothesis density filter
NASA Astrophysics Data System (ADS)
Georgescu, Ramona; Willett, Peter
2011-09-01
The Probability Hypothesis Density (PHD) filter propagates the first-moment approximation to the multi-target Bayesian posterior distribution while the Cardinalized PHD (CPHD) filter propagates both the posterior likelihood of (an unlabeled) target state and the posterior probability mass function of the number of targets. Extensions of the PHD filter to the multiple model (MM) framework have been published and were implemented either with a Sequential Monte Carlo or a Gaussian Mixture approach. In this work, we introduce the multiple model version of the more elaborate CPHD filter. We present the derivation of the prediction and update steps of the MMCPHD particularized for the case of two target motion models and proceed to show that in the case of a single model, the new MMCPHD equations reduce to the original CPHD equations.
Moderate Levels of Activation Lead to Forgetting In the Think/No-Think Paradigm
Detre, Greg J.; Natarajan, Annamalai; Gershman, Samuel J.; Norman, Kenneth A.
2013-01-01
Using the think/no-think paradigm (Anderson & Green, 2001), researchers have found that suppressing retrieval of a memory (in the presence of a strong retrieval cue) can make it harder to retrieve that memory on a subsequent test. This effect has been replicated numerous times, but the size of the effect is highly variable. Also, it is unclear from a neural mechanistic standpoint why preventing recall of a memory now should impair your ability to recall that memory later. Here, we address both of these puzzles using the idea, derived from computational modeling and studies of synaptic plasticity, that the function relating memory activation to learning is U-shaped, such that moderate levels of memory activation lead to weakening of the memory and higher levels of activation lead to strengthening. According to this view, forgetting effects in the think/no-think paradigm occur when the suppressed item activates moderately during the suppression attempt, leading to weakening; the effect is variable because sometimes the suppressed item activates strongly (leading to strengthening) and sometimes it does not activate at all (in which case no learning takes place). To test this hypothesis, we ran a think/no-think experiment where participants learned word-picture pairs; we used pattern classifiers, applied to fMRI data, to measure how strongly the picture associates were activating when participants were trying not to retrieve these associates, and we used a novel Bayesian curve-fitting procedure to relate this covert neural measure of retrieval to performance on a later memory test. In keeping with our hypothesis, the curve-fitting procedure revealed a nonmonotonic relationship between memory activation (as measured by the classifier) and subsequent memory, whereby moderate levels of activation of the to-be-suppressed item led to diminished performance on the final memory test, and higher levels of activation led to enhanced performance on the final test. PMID:23499722
Ducrot, Virginie; Billoir, Elise; Péry, Alexandre R R; Garric, Jeanne; Charles, Sandrine
2010-05-01
Effects of zinc were studied in the freshwater worm Branchiura sowerbyi using partial and full life-cycle tests. Only newborn and juveniles were sensitive to zinc, displaying effects on survival, growth, and age at first brood at environmentally relevant concentrations. Threshold effect models were proposed to assess toxic effects on individuals. They were fitted to life-cycle test data using Bayesian inference and adequately described life-history trait data in exposed organisms. The daily asymptotic growth rate of theoretical populations was then simulated with a matrix population model, based upon individual-level outputs. Population-level outputs were in accordance with existing literature for controls. Working in a Bayesian framework allowed incorporating parameter uncertainty in the simulation of the population-level response to zinc exposure, thus increasing the relevance of test results in the context of ecological risk assessment.
Assessing noninferiority in a three-arm trial using the Bayesian approach.
Ghosh, Pulak; Nathoo, Farouk; Gönen, Mithat; Tiwari, Ram C
2011-07-10
Non-inferiority trials, which aim to demonstrate that a test product is not worse than a competitor by more than a pre-specified small amount, are of great importance to the pharmaceutical community. As a result, methodology for designing and analyzing such trials is required, and developing new methods for such analysis is an important area of statistical research. The three-arm trial consists of a placebo, a reference and an experimental treatment, and simultaneously tests the superiority of the reference over the placebo along with comparing this reference to an experimental treatment. In this paper, we consider the analysis of non-inferiority trials using Bayesian methods which incorporate both parametric as well as semi-parametric models. The resulting testing approach is both flexible and robust. The benefit of the proposed Bayesian methods is assessed via simulation, based on a study examining home-based blood pressure interventions. Copyright © 2011 John Wiley & Sons, Ltd.
Bayesian median regression for temporal gene expression data
NASA Astrophysics Data System (ADS)
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T2-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
NASA Astrophysics Data System (ADS)
Yin, Ping; Mu, Lan; Madden, Marguerite; Vena, John E.
2014-10-01
Lung cancer is the second most commonly diagnosed cancer in both men and women in Georgia, USA. However, the spatio-temporal patterns of lung cancer risk in Georgia have not been fully studied. Hierarchical Bayesian models are used here to explore the spatio-temporal patterns of lung cancer incidence risk by race and gender in Georgia for the period of 2000-2007. With the census tract level as the spatial scale and the 2-year period aggregation as the temporal scale, we compare a total of seven Bayesian spatio-temporal models including two under a separate modeling framework and five under a joint modeling framework. One joint model outperforms others based on the deviance information criterion. Results show that the northwest region of Georgia has consistently high lung cancer incidence risk for all population groups during the study period. In addition, there are inverse relationships between the socioeconomic status and the lung cancer incidence risk among all Georgian population groups, and the relationships in males are stronger than those in females. By mapping more reliable variations in lung cancer incidence risk at a relatively fine spatio-temporal scale for different Georgian population groups, our study aims to better support healthcare performance assessment, etiological hypothesis generation, and health policy making.
Winterton, Shaun L; Wiegmann, Brian M; Schlinger, Evert I
2007-06-01
The first formal analysis of phylogenetic relationships among small-headed flies (Acroceridae) is presented based on DNA sequence data from two ribosomal (16S and 28S) and two protein-encoding genes: carbamoylphosphate synthase (CPS) domain of CAD (i.e., rudimentary locus) and cytochrome oxidase I (COI). DNA sequences from 40 species in 22 genera of Acroceridae (representing all three subfamilies) were compared with outgroup exemplars from Nemestrinidae, Stratiomyidae, Tabanidae, and Xylophagidae. Parsimony and Bayesian simultaneous analyses of the full data set recover a well-resolved and strongly supported hypothesis of phylogenetic relationships for major lineages within the family. Molecular evidence supports the monophyly of traditionally recognised subfamilies Philopotinae and Panopinae, but Acrocerinae are polyphyletic. Panopinae, sometimes considered "primitive" based on morphology and host-use, are always placed in a more derived position in the current study. Furthermore, these data support emerging morphological evidence that the type genus Acrocera Meigen, and its sister genus Sphaerops, are atypical acrocerids, comprising a sister lineage to all other Acroceridae. Based on the phylogeny generated in the simultaneous analysis, historical divergence times were estimated using Bayesian methodology constrained with fossil data. These estimates indicate Acroceridae likely evolved during the late Triassic but did not diversify greatly until the Cretaceous.
Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murakami, Haruko; Chen, X.; Hahn, Melanie S.
2010-10-21
This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are its ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using the Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using the actual data that was not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.
Incorporating Biological Knowledge into Evaluation of Causal Regulatory Hypothesis
NASA Technical Reports Server (NTRS)
Chrisman, Lonnie; Langley, Pat; Bay, Stephen; Pohorille, Andrew; DeVincenzi, D. (Technical Monitor)
2002-01-01
Biological data can be scarce and costly to obtain. The small number of samples available typically limits statistical power and makes reliable inference of causal relations extremely difficult. However, we argue that statistical power can be increased substantially by incorporating prior knowledge and data from diverse sources. We present a Bayesian framework that combines information from different sources and we show empirically that this lets one make correct causal inferences with small sample sizes that otherwise would be impossible.
Context Relevant Prediction Model for COPD Domain Using Bayesian Belief Network
Saleh, Lokman; Ajami, Hicham; Mili, Hafedh
2017-01-01
In the last three decades, researchers have examined extensively how context-aware systems can assist people, specifically those suffering from incurable diseases, to help them cope with their medical illness. Over the years, a huge number of studies on Chronic Obstructive Pulmonary Disease (COPD) have been published. However, deriving relevant attributes and detecting COPD exacerbations early remain a challenge. In this research work, we use an efficient algorithm to select relevant attributes in a domain where no proper approach exists. Such an algorithm predicts exacerbations with high accuracy by adding a discretization process, and organizes the pertinent attributes in priority order based on their impact to facilitate emergency medical treatment. In this paper, we propose an extension of our existing Helper Context-Aware Engine System (HCES) for COPD. This project uses a Bayesian network algorithm to depict the dependency between the COPD symptoms (attributes) in order to overcome the insufficiency and the independence hypothesis of naïve Bayes. In addition, the dependency in the Bayesian network is realized using the TAN algorithm rather than by consulting pneumologists. All these combined algorithms (discretization, selection, dependency, and the ordering of the relevant attributes) constitute an effective prediction model, compared with other effective ones. Moreover, different scenarios of these algorithms are investigated and compared to verify which sequence of steps of the prediction model gives more accurate results. Finally, we designed and validated a computer-aided support application to integrate the different steps of this model. Our HCES system has shown promising results, as measured by the area under the receiver operating characteristic curve (AUC = 81.5%). PMID:28644419
A Bayesian Hybrid Adaptive Randomisation Design for Clinical Trials with Survival Outcomes.
Moatti, M; Chevret, S; Zohar, S; Rosenberger, W F
2016-01-01
Response-adaptive randomisation designs have been proposed to improve the efficiency of phase III randomised clinical trials and improve the outcomes of the clinical trial population. In the setting of failure time outcomes, Zhang and Rosenberger (2007) developed a response-adaptive randomisation approach that targets an optimal allocation, based on a fixed sample size. The aim of this research is to propose a response-adaptive randomisation procedure for survival trials with an interim monitoring plan, based on the following optimal criterion: for fixed variance of the estimated log hazard ratio, what allocation minimizes the expected hazard of failure? We demonstrate the utility of the design by redesigning a clinical trial on multiple myeloma. To handle continuous monitoring of data, we propose a Bayesian response-adaptive randomisation procedure, where the log hazard ratio is the effect measure of interest. Combining the prior with the normal likelihood, the mean posterior estimate of the log hazard ratio allows derivation of the optimal target allocation. We perform a simulation study to assess and compare the performance of this proposed Bayesian hybrid adaptive design to those of fixed, sequential or adaptive (either frequentist or fully Bayesian) designs. Noninformative normal priors for the log hazard ratio were used, as well as mixtures of enthusiastic and skeptical priors. Stopping rules based on the posterior distribution of the log hazard ratio were computed. The method is then illustrated by redesigning a phase III randomised clinical trial of chemotherapy in patients with multiple myeloma, with a mixture of normal priors elicited from experts. As expected, there was a reduction in the proportion of observed deaths in the adaptive vs. non-adaptive designs; this reduction was maximized using a Bayes mixture prior, with no clear-cut improvement by using a fully Bayesian procedure.
The use of stopping rules allows a slight decrease in the observed proportion of deaths under the alternative hypothesis compared with the adaptive designs with no stopping rules. Such Bayesian hybrid adaptive survival trials may be promising alternatives to traditional designs, reducing the duration of survival trials, as well as better addressing the ethical concerns for patients enrolled in the trial.
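The step of combining a normal prior with a normal likelihood for the log hazard ratio can be sketched as a conjugate update, followed by a posterior probability that could feed a stopping rule. This is a minimal sketch under stated assumptions, not the authors' procedure: the prior, the interim estimate, the event count, and the variance approximation 4/d (d events, roughly equal allocation) are all illustrative choices.

```python
import math

def posterior_log_hr(prior_mean, prior_var, estimate, estimate_var):
    """Conjugate normal update for the log hazard ratio.

    The posterior mean is the precision-weighted average of the prior
    mean and the observed estimate.
    """
    post_prec = 1.0 / prior_var + 1.0 / estimate_var
    post_var = 1.0 / post_prec
    post_mean = post_var * (prior_mean / prior_var + estimate / estimate_var)
    return post_mean, post_var

def prob_benefit(post_mean, post_var):
    """P(log HR < 0 | data) under the normal posterior."""
    z = (0.0 - post_mean) / math.sqrt(post_var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical interim look: 100 deaths, observed log HR of -0.3.
events = 100
estimate, estimate_var = -0.3, 4.0 / events   # var(log HR) ~ 4/d (assumption)
prior_mean, prior_var = 0.0, 1.0              # vague prior centred on no effect

mean, var = posterior_log_hr(prior_mean, prior_var, estimate, estimate_var)
p = prob_benefit(mean, var)
print(mean, var, p)
print("stop for efficacy" if p > 0.95 else "continue")  # 0.95 is a hypothetical threshold
```

A mixture prior, as described in the abstract, would replace the single normal prior with a weighted sum of such updates, one per mixture component.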
Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages
Jadoul, Yannick; Ravignani, Andrea; Thompson, Bill; Filippi, Piera; de Boer, Bart
2016-01-01
Temporal regularities in speech, such as interdependencies in the timing of speech events, are thought to scaffold early acquisition of the building blocks in speech. By providing on-line clues to the location and duration of upcoming syllables, temporal structure may aid segmentation and clustering of continuous speech into separable units. This hypothesis tacitly assumes that learners exploit predictability in the temporal structure of speech. Existing measures of speech timing tend to focus on first-order regularities among adjacent units, and are overly sensitive to idiosyncrasies in the data they describe. Here, we compare several statistical methods on a sample of 18 languages, testing whether syllable occurrence is predictable over time. Rather than looking for differences between languages, we aim to find across languages (using clearly defined acoustic, rather than orthographic, measures), temporal predictability in the speech signal which could be exploited by a language learner. First, we analyse distributional regularities using two novel techniques: a Bayesian ideal learner analysis, and a simple distributional measure. Second, we model higher-order temporal structure—regularities arising in an ordered series of syllable timings—testing the hypothesis that non-adjacent temporal structures may explain the gap between subjectively-perceived temporal regularities, and the absence of universally-accepted lower-order objective measures. Together, our analyses provide limited evidence for predictability at different time scales, though higher-order predictability is difficult to reliably infer. We conclude that temporal predictability in speech may well arise from a combination of individually weak perceptual cues at multiple structural levels, but is challenging to pinpoint. PMID:27994544
Riser, James P; Cardinal-McTeague, Warren M; Hall, Jocelyn C; Hahn, William J; Sytsma, Kenneth J; Roalson, Eric H
2013-10-01
A monophyletic group composed of five genera of the Cleomaceae represents an intriguing lineage with outstanding taxonomic and evolutionary questions. Generic boundaries are poorly defined, and historical hypotheses regarding the evolution of fruit type and phylogenetic relationships provide testable questions. This is the first detailed phylogenetic investigation of all 22 species in this group. We use this phylogenetic framework to assess generic monophyly and test Iltis's evolutionary "reduction series" hypothesis regarding phylogeny and fruit type/seed number. Maximum likelihood and Bayesian analyses of four plastid intergenic spacer region sequences (rpl32-trnL, trnQ-rps16, ycf1-rps15, and psbA-trnH) and one nuclear (ITS) region were used to reconstruct phylogenetic relationships among the NA cleomoid species. Stochastic mapping and ancestral-state reconstruction were used to study the evolution of fruit type. Both analyses recovered nearly identical phylogenies. Three of the currently recognized genera (Wislizenia, Carsonia, and Oxystylis) are monophyletic while two (Cleomella and Peritoma) are para- or polyphyletic. There was a single origin of the two-seeded schizocarp in the ancestor of the Oxystylis-Wislizenia clade and a secondary derivation of elongated capsule-type fruits in Peritoma from a truncated capsule state in Cleomella. Our well-resolved phylogeny supports most of the current species circumscriptions but not current generic circumscriptions. Additionally, our results are inconsistent with Iltis's hypothesis of species with elongated many-seed fruits giving rise to species with truncated few-seeded fruits. Instead, we find support for the reversion to elongated multiseeded fruits from a truncate few-seeded ancestor in Peritoma.
Substantial advantage of a combined Bayesian and genotyping approach in testosterone doping tests.
Schulze, Jenny Jakobsson; Lundmark, Jonas; Garle, Mats; Ekström, Lena; Sottas, Pierre-Edouard; Rane, Anders
2009-03-01
Testosterone abuse is conventionally assessed by the urinary testosterone/epitestosterone (T/E) ratio, levels above 4.0 being considered suspicious. A deletion polymorphism in the gene coding for UGT2B17 is strongly associated with reduced testosterone glucuronide (TG) levels in urine. Many of the individuals devoid of the gene would not reach a T/E ratio of 4.0 after testosterone intake. Future test programs will most likely shift from population-based to individual-based T/E cut-off ratios using Bayesian inference. A longitudinal analysis is dependent on an individual's true negative baseline T/E ratio. The aim was to investigate whether it is possible to increase the sensitivity and specificity of the T/E test by addition of UGT2B17 genotype information in a Bayesian framework. A single intramuscular dose of 500 mg testosterone enanthate was given to 55 healthy male volunteers with two, one, or no alleles (ins/ins, ins/del or del/del) of the UGT2B17 gene. Urinary excretion of TG and the T/E ratio were measured over 15 days. The Bayesian analysis was conducted to calculate the individual T/E cut-off ratio. When adding the genotype information, the program returned lower individual cut-off ratios in all del/del subjects, increasing the sensitivity of the test considerably. It will be difficult, if not impossible, to discriminate between a true negative baseline T/E value and a false negative one without knowledge of the UGT2B17 genotype. UGT2B17 genotype information is crucial, both to decide which initial cut-off ratio to use for an individual, and for increasing the sensitivity of the Bayesian analysis.
Bayesian methods including nonrandomized study data increased the efficiency of postlaunch RCTs.
Schmidt, Amand F; Klugkist, Irene; Klungel, Olaf H; Nielen, Mirjam; de Boer, Anthonius; Hoes, Arno W; Groenwold, Rolf H H
2015-04-01
Findings from nonrandomized studies on safety or efficacy of treatment in patient subgroups may trigger postlaunch randomized clinical trials (RCTs). In the analysis of such RCTs, results from nonrandomized studies are typically ignored. This study explores the trade-off between bias and power of Bayesian RCT analysis incorporating information from nonrandomized studies. A simulation study was conducted to compare frequentist with Bayesian analyses using noninformative and informative priors in their ability to detect interaction effects. In simulated subgroups, the effect of a hypothetical treatment differed between subgroups (odds ratio 1.00 vs. 2.33). Simulations varied in sample size, proportions of the subgroups, and specification of the priors. As expected, the results for the informative Bayesian analyses were more biased than those from the noninformative Bayesian analysis or frequentist analysis. However, because of a reduction in posterior variance, informative Bayesian analyses were generally more powerful to detect an effect. In scenarios where the informative priors were in the opposite direction of the RCT data, type 1 error rates could be 100% and power 0%. Bayesian methods incorporating data from nonrandomized studies can meaningfully increase power of interaction tests in postlaunch RCTs. Copyright © 2015 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.
2012-06-01
Tracer testing under natural or forced gradient flow holds the potential to provide useful information for characterizing subsurface properties, through monitoring, modeling and interpretation of the tracer plume migration in an aquifer. Non-reactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter (EBF) profiling. A Bayesian data assimilation technique, the method of anchored distributions (MAD) [Rubin et al., 2010], was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using the constant-rate injection tests and the EBF data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively-parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field data shows that the hydrogeological model, when conditioned on the tracer test data, can reproduce the tracer transport behavior better than the field characterized without the tracer test data. This study successfully demonstrates that MAD can sequentially assimilate multi-scale multi-type field data through a consistent Bayesian framework.
ERIC Educational Resources Information Center
Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S.
2013-01-01
This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Hierarchical Bayesian Models of Subtask Learning
ERIC Educational Resources Information Center
Anglim, Jeromy; Wynton, Sarah K. A.
2015-01-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
Incremental Bayesian Category Learning from Natural Language
ERIC Educational Resources Information Center
Frermann, Lea; Lapata, Mirella
2016-01-01
Models of category learning have been extensively studied in cognitive science and primarily tested on perceptual abstractions or artificial stimuli. In this paper, we focus on categories acquired from natural language stimuli, that is, words (e.g., "chair" is a member of the furniture category). We present a Bayesian model that, unlike…
Drummond, Christopher S.; Eastwood, Ruth J.; Miotto, Silvia T. S.; Hughes, Colin E.
2012-01-01
Replicate radiations provide powerful comparative systems to address questions about the interplay between opportunity and innovation in driving episodes of diversification and the factors limiting their subsequent progression. However, such systems have been rarely documented at intercontinental scales. Here, we evaluate the hypothesis of multiple radiations in the genus Lupinus (Leguminosae), which exhibits some of the highest known rates of net diversification in plants. Given that incomplete taxon sampling, background extinction, and lineage-specific variation in diversification rates can confound macroevolutionary inferences regarding the timing and mechanisms of cladogenesis, we used Bayesian relaxed clock phylogenetic analyses as well as MEDUSA and BiSSE birth–death likelihood models of diversification, to evaluate the evolutionary patterns of lineage accumulation in Lupinus. We identified 3 significant shifts to increased rates of net diversification (r) relative to background levels in the genus (r = 0.18–0.48 lineages/myr). The primary shift occurred approximately 4.6 Ma (r = 0.48–1.76) in the montane regions of western North America, followed by a secondary shift approximately 2.7 Ma (r = 0.89–3.33) associated with range expansion and diversification of allopatrically distributed sister clades in the Mexican highlands and Andes. We also recovered evidence for a third independent shift approximately 6.5 Ma at the base of a lower elevation eastern South American grassland and campo rupestre clade (r = 0.36–1.33). Bayesian ancestral state reconstructions and BiSSE likelihood analyses of correlated diversification indicated that increased rates of speciation are strongly associated with the derived evolution of perennial life history and invasion of montane ecosystems. 
Although we currently lack hard evidence for “replicate adaptive radiations” in the sense of convergent morphological and ecological trajectories among species in different clades, these results are consistent with the hypothesis that iteroparity functioned as an adaptive key innovation, providing a mechanism for range expansion and rapid divergence in upper elevation regions across much of the New World. PMID:22228799
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.
Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J
2015-07-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.
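As a rough illustration of how a Bayes factor can be obtained from a marginal test statistic alone, the sketch below handles the simplest case of a single causal variant. It is not the CAVIARBF implementation, which treats multiple correlated variants jointly. It assumes z ~ N(0, 1) under the null and z ~ N(0, 1 + W) under the alternative, where W (`prior_var`) is a hypothetical prior variance on the noncentrality parameter; the function names and z-scores are made up for illustration.

```python
import math

def log_bf_from_z(z, prior_var):
    """Log Bayes factor (alternative vs. null) for one marginal z statistic.

    Assumes z ~ N(0, 1) under H0 and z ~ N(0, 1 + prior_var) under H1.
    """
    shrink = prior_var / (1.0 + prior_var)
    return -0.5 * math.log(1.0 + prior_var) + 0.5 * z * z * shrink

def posterior_inclusion(z_scores, prior_var):
    """Posterior probability that each variant is the causal one,
    assuming exactly one causal variant and equal prior weights."""
    log_bfs = [log_bf_from_z(z, prior_var) for z in z_scores]
    top = max(log_bfs)  # subtract the maximum for numerical stability
    weights = [math.exp(lb - top) for lb in log_bfs]
    total = sum(weights)
    return [w / total for w in weights]

z_scores = [0.4, 5.1, 4.8, -1.2]  # made-up marginal statistics
probs = posterior_inclusion(z_scores, prior_var=4.0)
for z, p in zip(z_scores, probs):
    print(f"z = {z:+.1f}  posterior probability = {p:.3f}")
```

With more than one causal variant, the correlation among SNPs has to enter the likelihood, which is where the multivariate normal model mentioned in the abstract comes in.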
Ziehl-Quirós, E Carolina; García-Aguilar, María C; Mellink, Eric
2017-01-24
The relatively small population size and restricted distribution of the Guadalupe fur seal Arctocephalus townsendi could make it highly vulnerable to infectious diseases. We performed a colony-level assessment in this species of the prevalence and presence of Brucella spp. and Leptospira spp., pathogenic bacteria that have been reported in several pinniped species worldwide. Forty-six serum samples were collected in 2014 from pups at Isla Guadalupe, the only place where the species effectively reproduces. Samples were tested for Brucella using 3 consecutive serological tests, and for Leptospira using the microscopic agglutination test. For each bacterium, a Bayesian approach was used to estimate the prevalence of exposure, and an epidemiological model was used to test the null hypothesis that the bacterium was present in the colony. No serum sample tested positive for Brucella, and the statistical analyses concluded that the colony was bacterium-free with a 96.3% confidence level. Nevertheless, a Brucella surveillance program is strongly recommended. Twelve samples were positive (titers 1:50) to 1 or more serovars of Leptospira. The prevalence was calculated at 27.1% (95% credible interval: 15.6-40.3%), and the posterior analyses indicated that the colony was not Leptospira-free with a 100% confidence level. Serovars Icterohaemorrhagiae, Canicola, and Bratislava were detected, but only further research can reveal whether they affect the fur seal population.
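A Bayesian prevalence estimate of this kind can be sketched with a conjugate Beta-Binomial update. The example below is illustrative only: it assumes a uniform Beta(1, 1) prior and a perfectly accurate test, neither of which is stated in the abstract, and the helper names are hypothetical. Under those assumptions, 12 positives out of 46 samples give a posterior mean of 13/48, or about 0.271, which coincides with the reported 27.1%.

```python
import random

def beta_posterior(positives, n, a_prior=1.0, b_prior=1.0):
    """Return (a, b) of the Beta posterior for a binomial proportion."""
    return a_prior + positives, b_prior + (n - positives)

def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

def credible_interval(a, b, level=0.95, draws=100_000, seed=1):
    """Equal-tailed credible interval estimated by Monte Carlo sampling."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi

# Counts taken from the abstract: 12 Leptospira-positive sera out of 46.
a, b = beta_posterior(positives=12, n=46)
mean = posterior_mean(a, b)
low, high = credible_interval(a, b)
print(f"posterior mean = {mean:.3f}")
print(f"95% interval   = ({low:.3f}, {high:.3f})")
```

The same update with 0 positives out of 46 gives a posterior concentrated near zero, which is the intuition behind a freedom-from-disease statement, although the study's epidemiological model is more elaborate than this sketch.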
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Harrison, Jay M; Breeze, Matthew L; Harrigan, George G
2011-08-01
Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.
True versus Apparent Malaria Infection Prevalence: The Contribution of a Bayesian Approach
Claes, Filip; Van Hong, Nguyen; Torres, Kathy; Mao, Sokny; Van den Eede, Peter; Thi Thinh, Ta; Gamboa, Dioni; Sochantha, Tho; Thang, Ngo Duc; Coosemans, Marc; Büscher, Philippe; D'Alessandro, Umberto; Berkvens, Dirk; Erhart, Annette
2011-01-01
Aims: To present a new approach for estimating the “true prevalence” of malaria and apply it to datasets from Peru, Vietnam, and Cambodia. Methods: Bayesian models were developed for estimating both the malaria prevalence using different diagnostic tests (microscopy, PCR & ELISA), without the need of a gold standard, and the tests' characteristics. Several sources of information, i.e. data, expert opinions and other sources of knowledge, can be integrated into the model. This approach, which results in an optimal and harmonized estimate of malaria infection prevalence with no conflict between the different sources of information, was tested on data from Peru, Vietnam and Cambodia. Results: Malaria sero-prevalence was relatively low in all sites, with ELISA showing the highest estimates. The sensitivities of microscopy and ELISA were statistically lower in Vietnam than in the other sites. Similarly, the specificities of microscopy, ELISA and PCR were significantly lower in Vietnam than in the other sites. In Vietnam and Peru, microscopy was closer to the “true” estimate than the other 2 tests while, as expected, ELISA, with its lower specificity, usually overestimated the prevalence. Conclusions: Bayesian methods are useful for analyzing prevalence results when no gold standard diagnostic test is available. Though some results are expected, e.g. PCR being more sensitive than microscopy, a standardized and context-independent quantification of the diagnostic tests' characteristics (sensitivity and specificity) and the underlying malaria prevalence may be useful for comparing different sites. Indeed, the use of a single diagnostic technique could strongly bias the prevalence estimation. This limitation can be circumvented by using a Bayesian framework taking into account the imperfect characteristics of the currently available diagnostic tests. As discussed in the paper, this approach may further support global malaria burden estimation initiatives. PMID:21364745
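The gap between apparent and true prevalence can be illustrated with the classical Rogan-Gladen correction, which assumes the sensitivity and specificity of a single test are known. This is not the multi-test Bayesian model of the paper, which estimates those characteristics jointly with the prevalence; the numbers below are hypothetical and serve only to show why a test with imperfect specificity tends to overestimate a low prevalence.

```python
def rogan_gladen(apparent, sensitivity, specificity):
    """Correct an apparent prevalence for imperfect test accuracy.

    true = (apparent + specificity - 1) / (sensitivity + specificity - 1),
    clipped to the interval [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance (se + sp > 1)")
    true_prev = (apparent + specificity - 1.0) / denom
    return min(1.0, max(0.0, true_prev))

# Hypothetical values: 10% of samples test positive with a test that has
# 95% sensitivity and 92% specificity.
apparent = 0.10
corrected = rogan_gladen(apparent, sensitivity=0.95, specificity=0.92)
print(f"apparent = {apparent:.3f}, corrected = {corrected:.3f}")
```

In this made-up case most of the observed positives are attributable to false positives, so the corrected prevalence is much lower than the apparent one.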
Guillon, Myrtille; Mace, Ruth
2016-01-01
The classification of kin into structured groups is a diverse phenomenon which is ubiquitous in human culture. For populations which are organized into large agropastoral groupings of sedentary residence but not governed within the context of a centralised state, such as our study sample of 83 historical Bantu-speaking groups of sub-Saharan Africa, cultural kinship norms guide all aspects of everyday life and social organization. Such rules operate in part through the use of differing terminological referential systems of familial organization. Although the cross-cultural study of kinship terminology was foundational in Anthropology, few modern studies have made use of statistical advances to further our sparse understanding of the structuring and diversification of terminological systems of kinship over time. In this study we use Bayesian Markov Chain Monte Carlo methods of phylogenetic comparison to investigate the evolution of Bantu kinship terminology and reconstruct the ancestral state and diversification of cousin terminology in this family of sub-Saharan ethnolinguistic groups. Using a phylogenetic tree of Bantu languages, we then test the prominent hypothesis that structured variation in systems of cousin terminology has co-evolved alongside adaptive change in patterns of descent organization, as well as rules of residence. We find limited support for this hypothesis, and argue that the shaping of systems of kinship terminology is a multifactorial process, concluding with possible avenues of future research. PMID:27008364
Johnson, Marc T J; Fitzjohn, Richard G; Smith, Stacey D; Rausher, Mark D; Otto, Sarah P
2011-11-01
The loss of sexual recombination and segregation in asexual organisms has been portrayed as an irreversible process that commits asexually reproducing lineages to reduced diversification. We test this hypothesis by estimating rates of speciation, extinction, and transition between sexuality and functional asexuality in the evening primroses. Specifically, we estimate these rates using the recently developed BiSSE (Binary State Speciation and Extinction) phylogenetic comparative method, which employs maximum likelihood and Bayesian techniques. We infer that net diversification rates (speciation minus extinction) in functionally asexual evening primrose lineages are roughly eight times faster than diversification rates in sexual lineages, largely due to higher speciation rates in asexual lineages. We further reject the hypothesis that a loss of recombination and segregation is irreversible because the transition rate from functional asexuality to sexuality is significantly greater than zero and in fact exceeded the reverse rate. These results provide the first empirical evidence in support of the alternative theoretical prediction that asexual populations should instead diversify more rapidly than sexual populations because they are free from the homogenizing effects of sexual recombination and segregation. Although asexual reproduction may often constrain adaptive evolution, our results show that the loss of recombination and segregation need not be an evolutionary dead end in terms of diversification of lineages. © 2011 The Author(s). Evolution© 2011 The Society for the Study of Evolution.
Robustly Aligning a Shape Model and Its Application to Car Alignment of Unknown Pose.
Li, Yan; Gu, Leon; Kanade, Takeo
2011-09-01
Precisely localizing in an image a set of feature points that form a shape of an object, such as car or face, is called alignment. Previous shape alignment methods attempted to fit a whole shape model to the observed data, based on the assumption of Gaussian observation noise and the associated regularization process. However, such an approach, though able to deal with Gaussian noise in feature detection, turns out not to be robust or precise because it is vulnerable to gross feature detection errors or outliers resulting from partial occlusions or spurious features from the background or neighboring objects. We address this problem by adopting a randomized hypothesis-and-test approach. First, a Bayesian inference algorithm is developed to generate a shape-and-pose hypothesis of the object from a partial shape or a subset of feature points. For alignment, a large number of hypotheses are generated by randomly sampling subsets of feature points, and then evaluated to find the one that minimizes the shape prediction error. This method of randomized subset-based matching can effectively handle outliers and recover the correct object shape. We apply this approach on a challenging data set of over 5,000 different-posed car images, spanning a wide variety of car types, lighting, background scenes, and partial occlusions. Experimental results demonstrate favorable improvements over previous methods on both accuracy and robustness.
Kondo, Toshiaki; Crisp, Michael D.; Linde, Celeste; Bowman, David M. J. S.; Kawamura, Kensuke; Kaneko, Shingo; Isagi, Yuji
2012-01-01
Livistona mariae is an endemic palm localized in arid central Australia. This species is separated by about 1000 km from its congener L. rigida, which grows distantly in the Roper River and Nicholson–Gregory River catchments in northern Australia. Such an isolated distribution of L. mariae has been assumed to have resulted from contraction of ancestral populations as Australia aridified from the Mid-Miocene (ca 15 Ma). To test this hypothesis at the population level, we examined the genetic relationships among 14 populations of L. mariae and L. rigida using eight nuclear microsatellite loci. Our population tree and Bayesian clustering revealed that these populations comprised two genetically distinct groups that did not correspond to the current classification at species rank, and L. mariae showed closest affinity with L. rigida from Roper River. Furthermore, coalescent divergence-time estimations suggested that the disjunction between the northern populations (within L. rigida) could have originated by intermittent colonization along an ancient river that has been drowned repeatedly by marine transgression. During that time, L. mariae populations could have been established by opportunistic immigrants from Roper River about 15 000 years ago, concurrently with the settlement of indigenous Australians in central Australia, who are thus plausible vectors. Thus, our results rule out the ancient relic hypothesis for the origin of L. mariae. PMID:22398168
High migration rates shape the postglacial history of amphi-Atlantic bryophytes.
Désamoré, Aurélie; Patiño, Jairo; Mardulyn, Patrick; Mcdaniel, Stuart F; Zanatta, Florian; Laenen, Benjamin; Vanderpoorten, Alain
2016-11-01
Paleontological evidence and current patterns of angiosperm species richness suggest that European biota experienced more severe bottlenecks than North American ones during the last glacial maximum. How well this pattern fits other plant species is less clear. Bryophytes offer a unique opportunity to contrast the impact of the last glacial maximum in North America and Europe because about 60% of the European bryoflora is shared with North America. Here, we use population genetic analyses based on approximate Bayesian computation on eight amphi-Atlantic species to test the hypothesis that North American populations were less impacted by the last glacial maximum, exhibiting higher levels of genetic diversity than European ones and ultimately serving as a refugium for the postglacial recolonization of Europe. In contrast with this hypothesis, the best-fit demographic model involved similar patterns of population size contractions, comparable levels of genetic diversity and balanced migration rates between European and North American populations. Our results thus suggest that bryophytes have experienced comparable demographic glacial histories on both sides of the Atlantic. Although a weak but significant genetic structure was systematically recovered between European and North American populations, evidence for migration from and towards both continents suggests that amphi-Atlantic bryophyte populations may function as a metapopulation network. Reconstructing the biogeographic history of either North American or European bryophyte populations therefore requires a large, trans-Atlantic geographic framework. © 2016 John Wiley & Sons Ltd.
Mellows, Andrew; Barnett, Ross; Dalén, Love; Sandoval-Castellanos, Edson; Linderholm, Anna; McGovern, Thomas H.; Church, Mike J.; Larson, Greger
2012-01-01
Previous studies have suggested that the presence of sea ice is an important factor in facilitating migration and determining the degree of genetic isolation among contemporary arctic fox populations. Because the extent of sea ice is dependent upon global temperatures, periods of significant cooling would have had a major impact on fox population connectivity and genetic variation. We tested this hypothesis by extracting and sequencing mitochondrial control region sequences from 17 arctic foxes excavated from two late-ninth-century to twelfth-century AD archaeological sites in northeast Iceland, both of which predate the Little Ice Age (approx. sixteenth to nineteenth century). Despite the fact that five haplotypes have been observed in modern Icelandic foxes, a single haplotype was shared among all of the ancient individuals. Results from simulations within an approximate Bayesian computation framework suggest that the rapid increase in Icelandic arctic fox haplotype diversity can only be explained by sea-ice-mediated fox immigration facilitated by the Little Ice Age. PMID:22977155
No evidence for systematic white matter correlates of dyslexia and dyscalculia.
Moreau, David; Wilson, Anna J; McKay, Nicole S; Nihill, Kasey; Waldie, Karen E
2018-01-01
Learning disabilities such as dyslexia, dyscalculia and their comorbid manifestation are prevalent, affecting as much as 15% of the population. Structural neuroimaging studies have indicated that these disorders can be related to differences in white matter integrity, although findings remain disparate. In this study, we used a unique design composed of individuals with dyslexia, dyscalculia, both disorders and controls, to systematically explore differences in fractional anisotropy across groups using diffusion tensor imaging. Specifically, we focused on the corona radiata and the arcuate fasciculus, two tracts associated with reading and mathematics in a number of previous studies. Using Bayesian hypothesis testing, we show that the present data favor the null model of no differences between groups for these particular tracts, a finding that seems to go against the current view but might be representative of the disparities within this field of research. Together, these findings suggest that structural differences associated with dyslexia and dyscalculia might not be as reliable as previously thought, with potential ramifications in terms of remediation.
Minimal effects of latitude on present-day speciation rates in New World birds
Rabosky, Daniel L.; Title, Pascal O.; Huang, Huateng
2015-01-01
The tropics contain far greater numbers of species than temperate regions, suggesting that rates of species formation might differ systematically between tropical and non-tropical areas. We tested this hypothesis by reconstructing the history of speciation in New World (NW) land birds using BAMM, a Bayesian framework for modelling complex evolutionary dynamics on phylogenetic trees. We estimated marginal distributions of present-day speciation rates for each of 2571 species of birds. The present-day rate of speciation varies approximately 30-fold across NW birds, but there is no difference in the rate distributions for tropical and temperate taxa. Using macroevolutionary cohort analysis, we demonstrate that clades with high tropical membership do not produce species more rapidly than temperate clades. For nearly any value of present-day speciation rate, there are far more species in the tropics than the temperate zone. Any effects of latitude on speciation rate are marginal in comparison to the dramatic variation in rates among clades. PMID:26019156
Kröger, Hannes
2017-08-01
The study investigates whether sickness absence is stratified by job level - understood as the authority and autonomy a worker holds - beyond the association with education, income, and occupation. A second objective is to establish the moderating role of gender and occupational gender composition on this stratification of sickness absence. Four competing hypotheses are developed that predict different patterns of moderation. Associations between job level and sickness absence are estimated for men and women in three groups of differing occupational gender composition, using data from the German Socio-Economic Panel Study (SOEP). For the purpose of moderation analysis, this study employs a new method based on Bayesian statistics, which enables the testing of complex moderation hypotheses. The data support the hypothesis that the stratification of sickness absence by job level is strongest for occupational minorities, meaning men in female-dominated and women in male-dominated occupations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Beyond statistical inference: A decision theory for science
Killeen, Peter R.
2008-01-01
Traditional null hypothesis significance testing does not yield the probability of the null or its alternative and, therefore, cannot logically ground scientific decisions. The decision theory proposed here calculates the expected utility of an effect on the basis of (1) the probability of replicating it and (2) a utility function on its size. It takes significance tests (which place all value on the replicability of an effect and none on its magnitude) as a special case, one in which the cost of a false positive is revealed to be an order of magnitude greater than the value of a true positive. More realistic utility functions credit both replicability and effect size, integrating them for a single index of merit. The analysis incorporates opportunity cost and is consistent with alternate measures of effect size, such as r² and information transmission, and with Bayesian model selection criteria. An alternate formulation is functionally equivalent to the formal theory, transparent, and easy to compute. PMID:17201351
Golubkov, Sergey M; Berezina, Nadezhda A; Gubelit, Yulia I; Demchuk, Anna S; Golubkov, Mikhail S; Tiunov, Alexei V
2018-01-01
We analyzed stable isotope composition of carbon and nitrogen of suspended organic matter (seston) and tissues of macroalgae, macroinvertebrates and fish from the coastal area of the highly eutrophic Neva Estuary to test a hypothesis that organic carbon of macroalgae Cladophora glomerata and Ulva intestinalis produced during green tides may be among primary sources supporting coastal food webs. The Stable Isotope Bayesian mixing model (SIAR) showed that consumers poorly use organic carbon produced by macroalgae. According to the results of SIAR modeling, benthic macroinvertebrates and fish mostly rely on pelagic derived carbon as a basal resource for their production. Only some species of macroinvertebrates consumed macroalgae. Fish used this resource directly consuming zooplankton or indirectly via benthic macroinvertebrates. This was consistent with the results of the gut content analysis, which revealed a high proportion of zooplankton in the guts of non-predatory fish. Copyright © 2017. Published by Elsevier Ltd.
Canedo, Clarissa; Haddad, Célio F B
2012-11-01
We present a phylogenetic hypothesis of the anuran clade Terrarana based on partial sequences of nuclear (Tyr and RAG1) and mitochondrial (12S, tRNA-Val, and 16S) genes, testing the monophyly of Ischnocnema and its species series. We performed maximum parsimony, maximum likelihood, and Bayesian inference analyses on 364 terminals: 11 outgroup terminals and 353 ingroup Terrarana terminals, including 139 Ischnocnema terminals (accounting for 29 of the 35 named Ischnocnema species) and 214 other Terrarana terminals within the families Brachycephalidae, Ceuthomantidae, Craugastoridae, and Eleutherodactylidae. Different optimality criteria produced similar results and mostly recovered the currently accepted families and genera. According to these topologies, Ischnocnema is not a monophyletic group. We propose new combinations for three species, relocating them to Pristimantis, and render Eleutherodactylus bilineatus Bokermann, 1975 incertae sedis status within Holoadeninae. The rearrangements in Ischnocnema place it outside the northernmost Brazilian Atlantic rainforest, where the fauna of Terrarana comprises typical Amazonian genera. Copyright © 2012 Elsevier Inc. All rights reserved.
Matching radio catalogues with realistic geometry: application to SWIRE and ATLAS
NASA Astrophysics Data System (ADS)
Fan, Dongwei; Budavári, Tamás; Norris, Ray P.; Hopkins, Andrew M.
2015-08-01
Cross-matching catalogues at different wavelengths is a difficult problem in astronomy, especially when the objects are not point-like. At radio wavelengths, an object can have several components corresponding, for example, to a core and lobes. Considering not all radio detections correspond to visible or infrared sources, matching these catalogues can be challenging. Traditionally, this is done by eye for better quality, which does not scale to the large data volumes expected from the next-generation of radio telescopes. We present a novel automated procedure, using Bayesian hypothesis testing, to achieve reliable associations by explicit modelling of a particular class of radio-source morphology. The new algorithm not only assesses the likelihood of an association between data at two different wavelengths, but also tries to assess whether different radio sources are physically associated, are double-lobed radio galaxies, or just distinct nearby objects. Application to the Spitzer Wide-Area Infrared Extragalactic and Australia Telescope Large Area Survey CDF-S catalogues shows that this method performs well without human intervention.
Blumenthal, Scott A.; Chritz, Kendra L.; Rothman, Jessica M.; Cerling, Thure E.
2012-01-01
We use stable isotope ratios in feces of wild mountain gorillas (Gorilla beringei) to test the hypothesis that diet shifts within a single year, as measured by dry mass intake, can be recovered. Isotopic separation of staple foods indicates that intraannual changes in the isotopic composition of feces reflect shifts in diet. Fruits are isotopically distinct compared with other staple foods, and peaks in fecal δ13C values are interpreted as periods of increased fruit feeding. Bayesian mixing model results demonstrate that, although the timing of these diet shifts match observational data, the modeled increase in proportional fruit feeding does not capture the full shift. Variation in the isotopic and nutritional composition of gorilla foods is largely independent, highlighting the difficulty for estimating nutritional intake with stable isotopes. Our results demonstrate the potential value of fecal sampling for quantifying short-term, intraindividual dietary variability in primates and other animals with high temporal resolution even when the diet is composed of C3 plants. PMID:23236160
Spiegelhalter, D J; Freedman, L S
1986-01-01
The 'textbook' approach to determining sample size in a clinical trial has some fundamental weaknesses which we discuss. We describe a new predictive method which takes account of prior clinical opinion about the treatment difference. The method adopts the point of clinical equivalence (determined by interviewing the clinical participants) as the null hypothesis. Decision rules at the end of the study are based on whether the interval estimate of the treatment difference (classical or Bayesian) includes the null hypothesis. The prior distribution is used to predict the probabilities of making the decisions to use one or other treatment or to reserve final judgement. It is recommended that sample size be chosen to control the predicted probability of the last of these decisions. An example is given from a multi-centre trial of superficial bladder cancer.
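The predictive idea can be sketched numerically. The example below is a simplified illustration, not the authors' procedure: it assumes a normally distributed outcome with known standard deviation, a normal prior on the treatment difference, two equal arms, and a classical 95% interval; every parameter value and the function name `predicted_decisions` are hypothetical. It computes the predicted probability of each of the three end-of-study decisions for a given total sample size.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def predicted_decisions(n_total, prior_mean, prior_sd, sigma, equivalence, z=1.96):
    """Predicted probabilities of the three decisions at the end of the trial."""
    se = sigma * math.sqrt(4.0 / n_total)        # SE of the difference, two equal arms
    pred_sd = math.sqrt(prior_sd ** 2 + se ** 2)  # predictive SD of the estimate
    upper_cut = equivalence + z * se  # estimate above this: interval lies above the null
    lower_cut = equivalence - z * se  # estimate below this: interval lies below the null
    p_new = 1.0 - normal_cdf((upper_cut - prior_mean) / pred_sd)
    p_std = normal_cdf((lower_cut - prior_mean) / pred_sd)
    return {"new": p_new, "standard": p_std, "undecided": 1.0 - p_new - p_std}

for n in (100, 400, 1600):
    result = predicted_decisions(n, prior_mean=0.3, prior_sd=0.2,
                                 sigma=1.0, equivalence=0.1)
    print(n, {k: round(v, 3) for k, v in result.items()})
```

Following the recommendation in the abstract, one would increase `n_total` until the predicted probability of the "undecided" outcome falls below an acceptable level.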
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
ERIC Educational Resources Information Center
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
[Dilemma of null hypothesis in ecological hypothesis's experiment test].
Li, Ji
2016-06-01
Experimental testing is one of the major methods for testing ecological hypotheses, though it remains much debated because of the null hypothesis. Quinn and Dunham (1983) analyzed the hypothesis deduction model from Platt (1964) and stated that there is no null hypothesis in ecology that can be strictly tested by experiments. Fisher's falsificationism and Neyman-Pearson (N-P)'s non-decisivity inhibit the statistical null hypothesis from being strictly tested. Moreover, since the null hypothesis H0 (α=1, β=0) and alternative hypothesis H1' (α'=1, β'=0) in ecological processes are different from those in classical physics, the ecological null hypothesis cannot be strictly tested experimentally either. These dilemmas of the null hypothesis could be relieved via the reduction of the P value, careful selection of the null hypothesis, non-centralization of the non-null hypothesis, and two-tailed tests. However, statistical null hypothesis significance testing (NHST) should not be equated with the logical test of causality in ecological hypotheses. Hence, the findings and conclusions of methodological studies and experimental tests based on NHST are not always logically reliable.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graves, Todd L; Hamada, Michael S
2008-01-01
Good estimates of the reliability of a system make use of test data and expert knowledge at all available levels. Furthermore, by integrating all these information sources, one can determine how best to allocate scarce testing resources to reduce uncertainty. Both of these goals are facilitated by modern Bayesian computational methods. We apply these tools to examples that were previously solvable only through the use of ingenious approximations, and use genetic algorithms to guide resource allocation.
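A minimal sketch of combining component-level test data into a system-level reliability estimate is given below. It assumes a series system of independent components, pass/fail test data, and uniform Beta(1, 1) priors; these are illustrative assumptions, not details taken from the report, and the test counts are made up. Expert knowledge could enter through informative priors in place of the uniform ones.

```python
import random

def system_reliability_draws(component_tests, draws=20_000, seed=7, prior=(1.0, 1.0)):
    """Monte Carlo draws of series-system reliability.

    component_tests: list of (successes, trials) per component.
    Each component reliability gets a Beta posterior; the system works
    only if every component works, so reliabilities are multiplied.
    """
    rng = random.Random(seed)
    a0, b0 = prior
    out = []
    for _ in range(draws):
        r = 1.0
        for successes, trials in component_tests:
            r *= rng.betavariate(a0 + successes, b0 + trials - successes)
        out.append(r)
    return out

tests = [(48, 50), (29, 30), (95, 100)]  # hypothetical (successes, trials)
samples = sorted(system_reliability_draws(tests))
mean = sum(samples) / len(samples)
lower = samples[int(0.05 * len(samples))]  # approximate 5th percentile
print(f"posterior mean system reliability = {mean:.3f}")
print(f"5th percentile                    = {lower:.3f}")
```

Rerunning the sketch with extra hypothetical trials added to one component at a time shows which additional tests would narrow the system-level uncertainty most, which is the resource-allocation question raised in the abstract.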
Castellanos-Morales, Gabriela; Gámez, Niza; Castillo-Gámez, Reyna A; Eguiarte, Luis E
2016-01-01
The hypothesis that endemic species could have originated by the isolation and divergence of peripheral populations of widespread species can be tested through the use of ecological niche models (ENMs) and statistical phylogeography. The joint use of these tools provides complementary perspectives on historical dynamics and allows testing hypotheses regarding the origin of endemic taxa. We used this approach to infer the historical processes that have influenced the origin of a species endemic to the Mexican Plateau (Cynomys mexicanus) and its divergence from a widespread ancestor (Cynomys ludovicianus), and to test whether this endemic species originated through peripatric speciation. We obtained genetic data for 295 individuals for two species of black-tailed prairie dogs (C. ludovicianus and C. mexicanus). Genetic data consisted of mitochondrial DNA sequences (cytochrome b and control region), and 10 nuclear microsatellite loci. We estimated dates of divergence between species and between lineages within each species and performed ecological niche modelling (Present, Last Glacial Maximum and Last Interglacial) to determine changes in the distribution range of both species during the Pleistocene. Finally, we used Bayesian inference methods (DIYABC) to test different hypotheses regarding the divergence and demographic history of these species. Data supported the hypothesis of the origin of C. mexicanus from a peripheral population isolated during the Pleistocene [∼230,000 years ago (0.1-0.43 Ma 95% HPD)], with a Pleistocene-Holocene (∼9,000-11,000 years ago) population expansion (∼10-fold increase in population size). We identified the presence of two possible refugia in the southern area of the distribution range of C. ludovicianus and another, consistent with the distribution range of C. mexicanus. Our analyses suggest that Pleistocene climate change had a strong impact on the distribution of these species, promoting peripatric speciation for the origin of C. mexicanus and lineage divergence within C. ludovicianus. Copyright © 2015 Elsevier Inc. All rights reserved.
The ranking probability approach and its usage in design and analysis of large-scale studies.
Kuo, Chia-Ling; Zaykin, Dmitri
2013-01-01
In experiments with many statistical tests there is a need to balance type I and type II error rates while taking multiplicity into account. In the traditional approach, the nominal [Formula: see text]-level such as 0.05 is adjusted by the number of tests, [Formula: see text], i.e., as 0.05/[Formula: see text]. Assuming that some proportion of tests represent "true signals", that is, originate from a scenario where the null hypothesis is false, power depends on the number of true signals and the respective distribution of effect sizes. One way to define power is for it to be the probability of making at least one correct rejection at the assumed [Formula: see text]-level. We advocate an alternative way of establishing how "well-powered" a study is. In our approach, useful for studies with multiple tests, the ranking probability [Formula: see text] is controlled, defined as the probability of making at least [Formula: see text] correct rejections while rejecting hypotheses with [Formula: see text] smallest P-values. The two approaches are statistically related. The probability that the smallest P-value is a true signal (i.e., [Formula: see text]) is equal to the power at the level [Formula: see text], to an excellent approximation. Ranking probabilities are also related to the false discovery rate and to the Bayesian posterior probability of the null hypothesis. We study properties of our approach when the effect size distribution is replaced for convenience by a single "typical" value taken to be the mean of the underlying distribution. We conclude that its performance is often satisfactory under this simplification; however, substantial imprecision is to be expected when [Formula: see text] is very large and [Formula: see text] is small. Precision is largely restored when three values with the respective abundances are used instead of a single typical effect size value.
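The two notions of power contrasted in this abstract can be compared with a small Monte Carlo sketch. This is only an illustration under assumptions that are not taken from the paper: test statistics are normal z-scores, every true signal shares one hypothetical effect size, and the function name and default values are invented for the example. It estimates (a) the probability of at least one correct rejection at the Bonferroni-adjusted level and (b) the probability that the smallest P-value belongs to a true signal.

```python
import random
from statistics import NormalDist


def simulate_power_and_ranking(n_tests=1000, n_true=10, effect=4.0,
                               alpha=0.05, n_rep=200, seed=1):
    """Monte Carlo estimate of two quantities (all settings are hypothetical).

    Returns (power, ranking), where
      power   = P(at least one true signal has p <= alpha / n_tests)
      ranking = P(the smallest p-value comes from a true signal)
    """
    rng = random.Random(seed)
    norm = NormalDist()
    threshold = alpha / n_tests  # Bonferroni-adjusted level
    power_hits = 0
    ranking_hits = 0
    for _ in range(n_rep):
        best_p = 2.0            # larger than any possible p-value
        best_is_true = False
        any_true_rejected = False
        for i in range(n_tests):
            is_true = i < n_true  # the first n_true tests are true signals
            z = rng.gauss(effect if is_true else 0.0, 1.0)
            p = 2.0 * norm.cdf(-abs(z))  # two-sided p-value
            if is_true and p <= threshold:
                any_true_rejected = True
            if p < best_p:
                best_p, best_is_true = p, is_true
        power_hits += any_true_rejected
        ranking_hits += best_is_true
    return power_hits / n_rep, ranking_hits / n_rep


if __name__ == "__main__":
    power, ranking = simulate_power_and_ranking()
    print(f"P(at least one correct rejection): {power:.3f}")
    print(f"P(smallest p-value is a true signal): {ranking:.3f}")
```

Under these assumed settings the two estimates should be close to each other, which is the kind of relationship the abstract describes; other effect sizes or numbers of tests may behave differently.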
Goldsmith, Elizabeth W.; Renshaw, Benjamin; Clement, Christopher J.; Himschoot, Elizabeth A.; Hundertmark, Kris J.; Hueffer, Karsten
2015-01-01
For pathogens that infect multiple species, the distinction between reservoir hosts and spillover hosts is often difficult. In Alaska, three variants of the arctic rabies virus exist with distinct spatial distributions. We test the hypothesis that rabies virus variant distribution corresponds to the population structure of the primary rabies hosts in Alaska, arctic foxes (Vulpes lagopus) and red foxes (V. vulpes) in order to possibly distinguish reservoir and spillover hosts. We used mitochondrial DNA (mtDNA) sequence and nine microsatellites to assess population structure in those two species. mtDNA structure did not correspond to rabies virus variant structure in either species. Microsatellite analyses gave varying results. Bayesian clustering found 2 groups of arctic foxes in the coastal tundra region, but for red foxes it identified tundra and boreal types. Spatial Bayesian clustering and spatial principal components analysis identified 3 and 4 groups of arctic foxes, respectively, closely matching the distribution of rabies virus variants in the state. Red foxes, conversely, showed eight clusters comprising 2 regions (boreal and tundra) with much admixture. These results run contrary to previous beliefs that arctic fox show no fine-scale spatial population structure. While we cannot rule out that the red fox is part of the maintenance host community for rabies in Alaska, the distribution of virus variants appears to be driven primarily by the arctic fox. Therefore, we show that host population genetics can be utilized to distinguish between maintenance and spillover hosts when used in conjunction with other approaches. PMID:26661691
Goldsmith, Elizabeth W; Renshaw, Benjamin; Clement, Christopher J; Himschoot, Elizabeth A; Hundertmark, Kris J; Hueffer, Karsten
2016-02-01
For pathogens that infect multiple species, the distinction between reservoir hosts and spillover hosts is often difficult. In Alaska, three variants of the arctic rabies virus exist with distinct spatial distributions. We tested the hypothesis that rabies virus variant distribution corresponds to the population structure of the primary rabies hosts in Alaska, arctic foxes (Vulpes lagopus) and red foxes (Vulpes vulpes) to possibly distinguish reservoir and spillover hosts. We used mitochondrial DNA (mtDNA) sequence and nine microsatellites to assess population structure in those two species. mtDNA structure did not correspond to rabies virus variant structure in either species. Microsatellite analyses gave varying results. Bayesian clustering found two groups of arctic foxes in the coastal tundra region, but for red foxes it identified tundra and boreal types. Spatial Bayesian clustering and spatial principal components analysis identified 3 and 4 groups of arctic foxes, respectively, closely matching the distribution of rabies virus variants in the state. Red foxes, conversely, showed eight clusters comprising two regions (boreal and tundra) with much admixture. These results run contrary to previous beliefs that arctic fox show no fine-scale spatial population structure. While we cannot rule out that the red fox is part of the maintenance host community for rabies in Alaska, the distribution of virus variants appears to be driven primarily by the arctic fox. Therefore, we show that host population genetics can be utilized to distinguish between maintenance and spillover hosts when used in conjunction with other approaches. © 2015 John Wiley & Sons Ltd.
Hoyal Cuthill, Jennifer F; Charleston, Michael
2015-12-01
Examples of long-term coevolution are rare among free-living organisms. Müllerian mimicry in Heliconius butterflies had been suggested as a key example of coevolution by early genetic studies. However, research over the last two decades has been dominated by the idea that the best-studied comimics, H. erato and H. melpomene, did not coevolve at all. Recently sequenced genes associated with wing color pattern phenotype offer a new opportunity to resolve this controversy. Here, we test the hypothesis of coevolution between H. erato and H. melpomene using Bayesian multilocus analysis of five color pattern genes and five neutral genetic markers. We first explore the extent of phylogenetic agreement versus conflict between the different genes. Coevolution is then tested against three aspects of the mimicry diversifications: phylogenetic branching patterns, divergence times, and, for the first time, phylogeographic histories. We show that all three lines of evidence are compatible with strict coevolution of the diverse mimicry wing patterns, contrary to some recent suggestions. Instead, these findings tally with a coevolutionary diversification driven primarily by the ecological force of Müllerian mimicry. © 2015 The Author(s). Evolution © 2015 The Society for the Study of Evolution.
Salas-Leiva, Dayana E.; Meerow, Alan W.; Calonje, Michael; Griffith, M. Patrick; Francisco-Ortega, Javier; Nakamura, Kyoko; Stevenson, Dennis W.; Lewis, Carl E.; Namoff, Sandra
2013-01-01
Background and aims: Despite a recent new classification, a stable phylogeny for the cycads has been elusive, particularly regarding resolution of Bowenia, Stangeria and Dioon. In this study, five single-copy nuclear genes (SCNGs) are applied to the phylogeny of the order Cycadales. The specific aim is to evaluate several gene tree–species tree reconciliation approaches for developing an accurate phylogeny of the order, to contrast them with concatenated parsimony analysis and to resolve the erstwhile problematic phylogenetic position of these three genera. Methods: DNA sequences of five SCNGs were obtained for 20 cycad species representing all ten genera of Cycadales. These were analysed with parsimony, maximum likelihood (ML) and three Bayesian methods of gene tree–species tree reconciliation, using Cycas as the outgroup. A calibrated date estimation was developed with Bayesian methods, and biogeographic analysis was also conducted. Key Results: Concatenated parsimony, ML and three species tree inference methods resolve exactly the same tree topology with high support at most nodes. Dioon and Bowenia are the first and second branches of Cycadales after Cycas, respectively, followed by an encephalartoid clade (Macrozamia–Lepidozamia–Encephalartos), which is sister to a zamioid clade, of which Ceratozamia is the first branch, and in which Stangeria is sister to Microcycas and Zamia. Conclusions: A single, well-supported phylogenetic hypothesis of the generic relationships of the Cycadales is presented. However, massive extinction events inferred from the fossil record that eliminated broader ancestral distributions within Zamiaceae compromise accurate optimization of ancestral biogeographical areas for that hypothesis. While major lineages of Cycadales are ancient, crown ages of all modern genera are no older than 12 million years, supporting a recent hypothesis of mostly Miocene radiations. 
This phylogeny can contribute to an accurate infrafamilial classification of Zamiaceae. PMID:23997230
Interactive activation and mutual constraint satisfaction in perception and cognition.
McClelland, James L; Mirman, Daniel; Bolger, Donald J; Khaitan, Pranav
2014-08-01
In a seminal 1977 article, Rumelhart argued that perception required the simultaneous use of multiple sources of information, allowing perceivers to optimally interpret sensory information at many levels of representation in real time as information arrives. Building on Rumelhart's arguments, we present the Interactive Activation hypothesis: the idea that the mechanism used in perception and comprehension to achieve these feats exploits an interactive activation process implemented through the bidirectional propagation of activation among simple processing units. We then examine the interactive activation model of letter and word perception and the TRACE model of speech perception, as early attempts to explore this hypothesis, and review the experimental evidence relevant to their assumptions and predictions. We consider how well these models address the computational challenge posed by the problem of perception, and we consider how consistent they are with evidence from behavioral experiments. We examine empirical and theoretical controversies surrounding the idea of interactive processing, including a controversy that swirls around the relationship between interactive computation and optimal Bayesian inference. Some of the implementation details of early versions of interactive activation models caused deviation from optimality and from aspects of human performance data. More recent versions of these models, however, overcome these deficiencies. Among these is a model called the multinomial interactive activation model, which explicitly links interactive activation and Bayesian computations. We also review evidence from neurophysiological and neuroimaging studies supporting the view that interactive processing is a characteristic of the perceptual processing machinery in the brain. In sum, we argue that a computational analysis, as well as behavioral and neuroscience evidence, all support the Interactive Activation hypothesis. 
The evidence suggests that contemporary versions of models based on the idea of interactive activation continue to provide a basis for efforts to achieve a fuller understanding of the process of perception. Copyright © 2014 Cognitive Science Society, Inc.
Bayesian relaxed clock estimation of divergence times in foraminifera.
Groussin, Mathieu; Pawlowski, Jan; Yang, Ziheng
2011-10-01
Accurate and precise estimation of divergence times during the Neo-Proterozoic is necessary to understand the speciation dynamics of early Eukaryotes. However, such deep divergences are difficult to date, as the molecular clock is seriously violated. Recent improvements in Bayesian molecular dating techniques allow the relaxation of the molecular clock hypothesis as well as incorporation of multiple and flexible fossil calibrations. Divergence times can then be estimated even when the evolutionary rate varies among lineages and even when the fossil calibrations involve substantial uncertainties. In this paper, we used a Bayesian method to estimate divergence times in Foraminifera, a group of unicellular eukaryotes, known for their excellent fossil record but also for the high evolutionary rates of their genomes. Based on multigene data we reconstructed the phylogeny of Foraminifera and dated their origin and the major radiation events. Our estimates suggest that Foraminifera emerged during the Cryogenian (650-920 Ma, Neo-Proterozoic), with a mean time around 770 Ma, about 220 Myr before the first appearance of reliable foraminiferal fossils in sediments (545 Ma). Most dates are in agreement with the fossil record, but in general our results suggest earlier origins of foraminiferal orders. We found that the posterior time estimates were robust to specifications of the prior. Our results highlight inter-species variations of evolutionary rates in Foraminifera. Their effect was partially overcome by using the partitioned Bayesian analysis to accommodate rate heterogeneity among data partitions and using the relaxed molecular clock to account for changing evolutionary rates. However, more coding genes appear necessary to obtain more precise estimates of divergence times and to resolve the conflicts between fossil and molecular date estimates. Copyright © 2011 Elsevier Inc. All rights reserved.
Zador, Zsolt; Huang, Wendy; Sperrin, Matthew; Lawton, Michael T
2018-06-01
Following the International Subarachnoid Aneurysm Trial (ISAT), evolving treatment modalities for acute aneurysmal subarachnoid hemorrhage (aSAH) have changed the case mix of patients undergoing urgent surgical clipping. We aimed to update our knowledge on outcome predictors by analyzing admission parameters in a pure surgical series using variable importance ranking and machine learning. We reviewed a single surgeon's case series of 226 patients suffering from aSAH treated with urgent surgical clipping. Predictions were made using logistic regression models, and predictive performance was assessed using areas under the receiver operating characteristic curve (AUC). We established variable importance ranking using partial Nagelkerke R2 scores. Probabilistic associations between variables were depicted using Bayesian networks, a method of machine learning. Importance ranking showed that World Federation of Neurosurgical Societies (WFNS) grade and age were the most influential outcome prognosticators. Inclusion of only these 2 predictors was sufficient to maintain model performance compared to when all variables were considered (AUC = 0.8222, 95% confidence interval (CI): 0.7646-0.88 vs 0.8218, 95% CI: 0.7616-0.8821, respectively, DeLong's P = .992). Bayesian networks showed that age and WFNS grade were associated with several variables such as laboratory results and cardiorespiratory parameters. Our study is the first to report early outcomes and formal predictor importance ranking following aSAH in a post-ISAT surgical case series. Models showed good predictive power with fewer relevant predictors than in similar size series. Bayesian networks proved to be a powerful tool in visualizing the widespread association of the 2 key predictors with admission variables, explaining their importance and demonstrating the potential for hypothesis generation.
Can Bayesian Theories of Autism Spectrum Disorder Help Improve Clinical Practice?
Haker, Helene; Schneebeli, Maya; Stephan, Klaas Enno
2016-01-01
Diagnosis and individualized treatment of autism spectrum disorder (ASD) represent major problems for contemporary psychiatry. Tackling these problems requires guidance by a pathophysiological theory. In this paper, we consider recent theories that re-conceptualize ASD from a “Bayesian brain” perspective, which posit that the core abnormality of ASD resides in perceptual aberrations due to a disbalance in the precision of prediction errors (sensory noise) relative to the precision of predictions (prior beliefs). This results in percepts that are dominated by sensory inputs and less guided by top-down regularization and shifts the perceptual focus to detailed aspects of the environment with difficulties in extracting meaning. While these Bayesian theories have inspired ongoing empirical studies, their clinical implications have not yet been carved out. Here, we consider how this Bayesian perspective on disease mechanisms in ASD might contribute to improving clinical care for affected individuals. Specifically, we describe a computational strategy, based on generative (e.g., hierarchical Bayesian) models of behavioral and functional neuroimaging data, for establishing diagnostic tests. These tests could provide estimates of specific cognitive processes underlying ASD and delineate pathophysiological mechanisms with concrete treatment targets. Written with a clinical audience in mind, this article outlines how the development of computational diagnostics applicable to behavioral and functional neuroimaging data in routine clinical practice could not only fundamentally alter our concept of ASD but eventually also transform the clinical management of this disorder. PMID:27378955
Bayesian learning and the psychology of rule induction
Endress, Ansgar D.
2014-01-01
In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to spell out the underlying assumptions, and to confront them with the empirical results Frank and Tenenbaum (2011) propose to simulate, as well as with novel experiments. While rule-learning is arguably well suited to rational Bayesian approaches, I show that their models are neither psychologically plausible nor ideal observer models. Further, I show that their central assumption is unfounded: humans do not always preferentially learn more specific rules, but, at least in some situations, those rules that happen to be more salient. Even when granting the unsupported assumptions, I show that all of the experiments modeled by Frank and Tenenbaum (2011) either contradict their models, or have a large number of more plausible interpretations. I provide an alternative account of the experimental data based on simple psychological mechanisms, and show that this account both describes the data better, and is easier to falsify. I conclude that, despite the recent surge in Bayesian models of cognitive phenomena, psychological phenomena are best understood by developing and testing psychological theories rather than models that can be fit to virtually any data. PMID:23454791
Bayesian inference of a historical bottleneck in a heavily exploited marine mammal.
Hoffman, J I; Grant, S M; Forcada, J; Phillips, C D
2011-10-01
Emerging Bayesian analytical approaches offer increasingly sophisticated means of reconstructing historical population dynamics from genetic data, but have been little applied to scenarios involving demographic bottlenecks. Consequently, we analysed a large mitochondrial and microsatellite dataset from the Antarctic fur seal Arctocephalus gazella, a species subjected to one of the most extreme examples of uncontrolled exploitation in history when it was reduced to the brink of extinction by the sealing industry during the late eighteenth and nineteenth centuries. Classical bottleneck tests, which exploit the fact that rare alleles are rapidly lost during demographic reduction, yielded ambiguous results. In contrast, a strong signal of recent demographic decline was detected using both Bayesian skyline plots and Approximate Bayesian Computation, the latter also allowing derivation of posterior parameter estimates that were remarkably consistent with historical observations. This was achieved using only contemporary samples, further emphasizing the potential of Bayesian approaches to address important problems in conservation and evolutionary biology. © 2011 Blackwell Publishing Ltd.
Number-Knower Levels in Young Children: Insights from Bayesian Modeling
ERIC Educational Resources Information Center
Lee, Michael D.; Sarnecka, Barbara W.
2011-01-01
Lee and Sarnecka (2010) developed a Bayesian model of young children's behavior on the Give-N test of number knowledge. This paper presents two new extensions of the model, and applies the model to new data. In the first extension, the model is used to evaluate competing theories about the conceptual knowledge underlying children's behavior. One,…
B.G. Marcot; J.D. Steventon; G.D. Sutherland; R.K. McCann
2006-01-01
We provide practical guidelines for developing, testing, and revising Bayesian belief networks (BBNs). Primary steps in this process include creating influence diagrams of the hypothesized "causal web" of key factors affecting a species or ecological outcome of interest; developing a first, alpha-level BBN model from the influence diagram; revising the model...
ERIC Educational Resources Information Center
Leventhal, Brian C.; Stone, Clement A.
2018-01-01
Interest in Bayesian analysis of item response theory (IRT) models has grown tremendously due to the appeal of the paradigm among psychometricians, advantages of these methods when analyzing complex models, and availability of general-purpose software. Possible models include models which reflect multidimensionality due to designed test structure,…
ERIC Educational Resources Information Center
Tsiouris, John; Mann, Rachel; Patti, Paul; Sturmey, Peter
2004-01-01
Clinicians need to know the likelihood of a condition given a positive or negative diagnostic test. In this study a Bayesian analysis of the Clinical Behavior Checklist for Persons with Intellectual Disabilities (CBCPID) to predict depression in people with intellectual disability was conducted. The CBCPID was administered to 92 adults with…
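The first sentence of this abstract describes a direct application of Bayes' theorem. The sketch below shows that calculation in general form; the function name is made up for illustration, and the sensitivity, specificity, and prevalence values are hypothetical, not figures from the CBCPID study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary diagnostic test via Bayes' theorem.

    PPV = P(condition | positive test), NPV = P(no condition | negative test).
    All three inputs are probabilities in (0, 1).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the arithmetic.
    ppv, npv = predictive_values(sensitivity=0.8, specificity=0.9, prevalence=0.2)
    print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these made-up inputs, a positive result raises the probability of the condition from 0.20 to about 0.67, while a negative result leaves about a 5% chance that the condition is present.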
Bayesian data fusion for spatial prediction of categorical variables in environmental sciences
NASA Astrophysics Data System (ADS)
Gengler, Sarah; Bogaert, Patrick
2014-12-01
First developed to predict continuous variables, Bayesian Maximum Entropy (BME) has become a complete framework in the context of space-time prediction since it has been extended to predict categorical variables and mixed random fields. This method proposes solutions to combine several sources of data whatever the nature of the information. However, the various attempts that were made to adapt the BME methodology to categorical variables and mixed random fields faced some limitations, such as a high computational burden. The main objective of this paper is to overcome this limitation by generalizing the Bayesian Data Fusion (BDF) theoretical framework to categorical variables, which is in some sense a simplification of the BME method through the convenient conditional independence hypothesis. The BDF methodology for categorical variables is first described and then applied to a practical case study: the estimation of soil drainage classes using a soil map and point observations in the sandy area of Flanders around the city of Mechelen (Belgium). The BDF approach is compared to BME along with more classical approaches, such as Indicator CoKriging (ICK) and logistic regression. Estimators are compared using various indicators, namely the Percentage of Correctly Classified locations (PCC) and the Average Highest Probability (AHP). Although the BDF methodology for categorical variables is in some sense a simplification of the BME approach, both methods lead to similar results and have strong advantages compared to ICK and logistic regression.
Approximate string matching algorithms for limited-vocabulary OCR output correction
NASA Astrophysics Data System (ADS)
Lasko, Thomas A.; Hauser, Susan E.
2000-12-01
Five methods for matching words mistranslated by optical character recognition to their most likely match in a reference dictionary were tested on data from the archives of the National Library of Medicine. The methods, including an adaptation of the cross correlation algorithm, the generic edit distance algorithm, the edit distance algorithm with a probabilistic substitution matrix, Bayesian analysis, and Bayesian analysis on an actively thinned reference dictionary, were implemented and their accuracy rates compared. Of the five, the Bayesian algorithm produced the most correct matches (87%), and had the advantage of producing scores that have a useful and practical interpretation.
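Of the methods listed in this abstract, the generic edit distance approach is the simplest to sketch. The example below is a standard Levenshtein distance with unit costs plus a nearest-entry lookup; it is an illustrative stand-in rather than the authors' implementation, and the function names and the tiny dictionary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_match(word, dictionary):
    """Return the dictionary entry with the smallest edit distance to word."""
    return min(dictionary, key=lambda entry: edit_distance(word, entry))


if __name__ == "__main__":
    # "rn" misread for "m" is used here as a plausible OCR confusion.
    reference = ["medicine", "medical", "median"]  # hypothetical dictionary
    print(best_match("rnedicine", reference))
```

A probabilistic substitution matrix, as mentioned in the abstract, would replace the fixed unit costs with costs that depend on which characters are confused; that variant is not shown here.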
Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model
Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.
2014-01-01
Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689
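The accumulation-to-bound mechanism described in this abstract can be simulated in a few lines. The sketch below is a plain Euler discretization of a drift-diffusion process with symmetric bounds; the parameter values, time step, and function names are assumptions chosen for illustration, and it does not implement the Bayesian model or the parameter mapping derived in the paper.

```python
import random


def simulate_trial(drift, bound, rng, noise_sd=1.0, dt=0.005, max_steps=200_000):
    """Simulate one trial; return (hit_upper_bound, decision_time_in_seconds).

    The decision variable starts at 0 and accumulates drift plus Gaussian
    noise until it reaches +bound or -bound. Returns (None, elapsed time)
    if no bound is reached within max_steps.
    """
    x = 0.0
    step_sd = noise_sd * dt ** 0.5  # noise scales with sqrt(dt)
    for step in range(1, max_steps + 1):
        x += drift * dt + rng.gauss(0.0, step_sd)
        if x >= bound:
            return True, step * dt
        if x <= -bound:
            return False, step * dt
    return None, max_steps * dt


def run_trials(n_trials, drift, bound, seed=0):
    """Return (proportion of upper-bound choices, mean decision time)."""
    rng = random.Random(seed)
    outcomes = [simulate_trial(drift, bound, rng) for _ in range(n_trials)]
    decided = [(choice, t) for choice, t in outcomes if choice is not None]
    accuracy = sum(choice for choice, _ in decided) / len(decided)
    mean_rt = sum(t for _, t in decided) / len(decided)
    return accuracy, mean_rt


if __name__ == "__main__":
    accuracy, mean_rt = run_trials(n_trials=300, drift=1.0, bound=1.0)
    print(f"accuracy = {accuracy:.2f}, mean decision time = {mean_rt:.2f} s")
```

With a positive drift the upper bound plays the role of the correct response, so the first returned value can be read as accuracy; with zero drift it should hover around one half.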
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may arise from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is that of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation allows us to combine past knowledge or experience in the form of an a priori distribution with life test data to make inferences about the parameter of interest. In this paper, we have investigated the application of Bayesian estimation to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed using Bayesian and maximum likelihood analyses. The simulation results show that changing the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. Given perfect information on the prior distribution, the Bayesian estimation methods are better than those of maximum likelihood. The sensitivity analyses show some amount of sensitivity to shifts of the prior locations. 
They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
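As a minimal illustration of the independent competing-risks setting described in this abstract (not the authors' estimators), the system survives to time t only if no cause has yet produced a failure, so its reliability is the product of the per-cause Weibull survival functions. The function names and the shape and scale values are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Probability that a single Weibull failure cause has not occurred by time t."""
    return math.exp(-((t / scale) ** shape))


def system_reliability(t, causes):
    """Reliability at time t under independent competing causes.

    `causes` is a list of (shape, scale) pairs, one per failure cause.
    Independence lets us multiply the per-cause survival probabilities.
    """
    reliability = 1.0
    for shape, scale in causes:
        reliability *= weibull_survival(t, shape, scale)
    return reliability


# Hypothetical example: a wear-out cause (shape > 1) and a random-failure cause (shape = 1).
example_causes = [(2.5, 1000.0), (1.0, 5000.0)]
print(system_reliability(500.0, example_causes))
```

Adding a cause can only lower the system reliability, because each extra survival factor is at most 1.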
Harlin-Cognato, April D; Honeycutt, Rodney L
2006-01-01
Background: Dolphins of the genus Lagenorhynchus are anti-tropically distributed in temperate to cool waters. Phylogenetic analyses of cytochrome b sequences have suggested that the genus is polyphyletic; however, many relationships were poorly resolved. In this study, we present a combined-analysis phylogenetic hypothesis for Lagenorhynchus and members of the subfamily Lissodelphininae, which is derived from two nuclear and two mitochondrial data sets and the addition of 34 individuals representing 9 species. In addition, we characterize with parsimony and Bayesian analyses the phylogenetic utility and interaction of characters with statistical measures, including the utility of highly consistent (non-homoplasious) characters as a conservative measure of phylogenetic robustness. We also explore the effects of removing sources of character conflict on phylogenetic resolution. Results: Overall, our study provides strong support for the monophyly of the subfamily Lissodelphininae and the polyphyly of the genus Lagenorhynchus. In addition, the simultaneous parsimony analysis resolved and/or improved resolution for 12 nodes including: (1) L. albirostris, L. acutus; (2) L. obscurus and L. obliquidens; and (3) L. cruciger and L. australis. In addition, the Bayesian analysis supported the monophyly of the Cephalorhynchus, and resolved ambiguities regarding the relationship of L. australis/L. cruciger to other members of the genus Lagenorhynchus. The frequency of highly consistent characters varied among data partitions, but the rate of evolution was consistent within data partitions. Although the control region was the greatest source of character conflict, removal of this data partition impeded phylogenetic resolution. Conclusion: The simultaneous analysis approach produced a more robust phylogenetic hypothesis for Lagenorhynchus than previous studies, thus supporting a phylogenetic approach employing multiple data partitions that vary in overall rate of evolution.
Even in cases where there was apparent conflict among characters, our data suggest a synergistic interaction in the simultaneous analysis, and speak against a priori exclusion of data because of potential conflicts, primarily because phylogenetic results can be less robust. For example, the removal of the control region, the putative source of character conflict, produced spurious results with inconsistencies among and within topologies from parsimony and Bayesian analyses. PMID:17078887
Ye, Zhen; Zhu, Gengping; Chen, Pingping; Zhang, Danli; Bu, Wenjun
2014-06-01
This study investigated the Pleistocene history of a semi-aquatic bug, Microvelia douglasi douglasi Scott, 1874 (Hemiptera: Veliidae) in East Asia. We used M. douglasi douglasi as a model species to explore the effects of historical climatic fluctuations on montane semi-aquatic invertebrate species. Two hypotheses were developed using ecological niche models (ENMs). First, we hypothesized that M. douglasi douglasi persisted in suitable habitats in southern Guizhou, southern Yunnan, Hainan, Taiwan and southeast China during the LIG. After that, the populations expanded (Hypothesis 1). As the spatial prediction in the LGM was significantly larger than in the LIG, we then hypothesized that the population expanded during the LIG to LGM transition (Hypothesis 2). We tested these hypotheses using mitochondrial data (COI+COII) and nuclear data (ITS1+5.8S+ITS2). Young lineages, relatively deep splits, lineage differentiation among mountain ranges in central, south and southwest China and high genetic diversities were observed in these suitable habitats. Evidence of mismatch distributions and neutrality tests indicate that a population expansion occurred in the late Pleistocene. The Bayesian skyline plot (BSP) revealed an unusual population expansion that likely happened during the cooling transition between LIG and LGM. The results of genetic data were mostly consistent with the spatial predictions from ENM, a finding that can profoundly improve phylogeographic research. The ecological requirements of M. douglasi douglasi, together with the geographical heterogeneity and climatic fluctuations of Pleistocene in East Asia, could have shaped this unusual demographic history. Our study contributes to our knowledge of semi-aquatic bug/invertebrate responses to Pleistocene climatic fluctuations in East Asia. © 2014 John Wiley & Sons Ltd.
Zou, W; Ouyang, H
2016-02-01
We propose a multiple estimation adjustment (MEA) method to correct effect overestimation due to selection bias from a hypothesis-generating study (HGS) in pharmacogenetics. MEA uses a hierarchical Bayesian approach to jointly model individual effect estimates from maximum likelihood estimation (MLE) in a region and shrinks them toward the regional effect. Unlike many methods that model a fixed selection scheme, MEA capitalizes on local multiplicity independent of selection. We compared mean square errors (MSEs) in simulated HGSs from naive MLE, MEA and a conditional likelihood adjustment (CLA) method that models threshold selection bias. We observed that MEA effectively reduced MSE from MLE on null effects with or without selection, and had a clear advantage over CLA on extreme MLE estimates from null effects under lenient threshold selection in small samples, which are common among 'top' associations from a pharmacogenetics HGS.
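The idea of shrinking individual estimates toward a regional effect can be sketched with a simple normal-normal model. This is not the MEA method itself: it assumes a known between-effect variance, and the function name, inputs and numbers are hypothetical.

```python
def shrink_toward_regional_mean(estimates, std_errors, between_var):
    """Shrink noisy effect estimates toward a precision-weighted regional mean.

    Assumes estimate_i ~ Normal(effect_i, se_i^2) and
    effect_i ~ Normal(regional_mean, between_var), with between_var known.
    """
    # Precision weights use the total (sampling + between-effect) variance.
    weights = [1.0 / (se ** 2 + between_var) for se in std_errors]
    regional_mean = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

    shrunk = []
    for estimate, se in zip(estimates, std_errors):
        # Shrinkage factor: noisier estimates are pulled harder toward the mean.
        factor = se ** 2 / (se ** 2 + between_var)
        shrunk.append(factor * regional_mean + (1.0 - factor) * estimate)
    return regional_mean, shrunk


# Hypothetical region with one extreme estimate among otherwise null effects.
regional, adjusted = shrink_toward_regional_mean([0.0, 0.0, 3.0], [1.0, 1.0, 1.0], 1.0)
print(regional, adjusted)
```

In this sketch the extreme estimate moves toward the regional mean while the others move slightly away from zero, which is the qualitative behaviour the abstract attributes to hierarchical shrinkage.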
Roelandt, S; Van der Stede, Y; Czaplicki, G; Van Loo, H; Van Driessche, E; Dewulf, J; Hooyberghs, J; Faes, C
2015-06-06
Currently, there are no perfect reference tests for the in vivo detection of Neospora caninum infection. Two commercial N caninum ELISA tests are currently used in Belgium for bovine sera (TEST A and TEST B). The goal of this study is to evaluate these tests, used at their current cut-offs, with a no gold standard approach, for the test purposes of (1) demonstration of freedom from infection at purchase and (2) diagnosis in aborting cattle. Sera of two study populations, Abortion population (n=196) and Purchase population (n=514), were selected and tested with both ELISAs. Test results were entered in a Bayesian model with informative priors on population prevalences only (Scenario 1). As a sensitivity analysis, two more models were used: one with informative priors on test diagnostic accuracy (Scenario 2) and one with all priors uninformative (Scenario 3). The accuracy parameters were estimated from the first model: diagnostic sensitivity (Test A: 93.54 per cent; Test B: 86.99 per cent) and specificity (Test A: 90.22 per cent; Test B: 90.15 per cent) were high and comparable (Bayesian P values >0.05). Based on predictive values in the two study populations, both tests were fit for purpose, despite an expected false negative fraction of ±0.5 per cent in the Purchase population and ±5 per cent in the Abortion population. In addition, a false positive fraction of ±3 per cent in the overall Purchase population and ±4 per cent in the overall Abortion population was found. British Veterinary Association.
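Predictive values such as those mentioned in this abstract follow from sensitivity, specificity and prevalence through Bayes' theorem. The sketch below is a point calculation rather than the authors' Bayesian latent-class model; the sensitivity and specificity are the Test A estimates quoted above, while the 10 per cent prevalence is a hypothetical value chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Test A accuracy from the abstract; prevalence of 0.10 is an assumption.
ppv_a, npv_a = predictive_values(0.9354, 0.9022, 0.10)
print(f"PPV = {ppv_a:.3f}, NPV = {npv_a:.3f}")
```

Because the predictive values depend on prevalence, the same test can be fit for one purpose (for example ruling out infection at purchase) and less suited to another.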
ERIC Educational Resources Information Center
Vrieze, Scott I.
2012-01-01
This article reviews the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in model selection and the appraisal of psychological theory. The focus is on latent variable models, given their growing use in theory testing and construction. Theoretical statistical results in regression are discussed, and more important…
ERIC Educational Resources Information Center
Kessler, Lawrence M.
2013-01-01
In this paper I propose Bayesian estimation of a nonlinear panel data model with a fractional dependent variable (bounded between 0 and 1). Specifically, I estimate a panel data fractional probit model which takes into account the bounded nature of the fractional response variable. I outline estimation under the assumption of strict exogeneity as…
Howard B. Stauffer; Cynthia J. Zabel; Jeffrey R. Dunk
2005-01-01
We compared a set of competing logistic regression habitat selection models for Northern Spotted Owls (Strix occidentalis caurina) in California. The habitat selection models were estimated, compared, evaluated, and tested using multiple sample datasets collected on federal forestlands in northern California. We used Bayesian methods in interpreting...
A Test of Bayesian Observer Models of Processing in the Eriksen Flanker Task
ERIC Educational Resources Information Center
White, Corey N.; Brown, Scott; Ratcliff, Roger
2012-01-01
Two Bayesian observer models were recently proposed to account for data from the Eriksen flanker task, in which flanking items interfere with processing of a central target. One model assumes that interference stems from a perceptual bias to process nearby items as if they are compatible, and the other assumes that the interference is due to…
ERIC Educational Resources Information Center
Kelava, Augustin; Nagengast, Benjamin
2012-01-01
Structural equation models with interaction and quadratic effects have become a standard tool for testing nonlinear hypotheses in the social sciences. Most of the current approaches assume normally distributed latent predictor variables. In this article, we present a Bayesian model for the estimation of latent nonlinear effects when the latent…
Phylogeny and biogeography of the amphi-Pacific genus Aphananthe
Yang, Mei-Qing; Li, De-Zhu; Wen, Jun; Yi, Ting-Shuang
2017-01-01
Aphananthe is a small genus of five species showing an intriguing amphi-Pacific distribution in eastern, southern and southeastern Asia, Australia, and Mexico, also with one species in Madagascar. The phylogenetic relationships of Aphananthe were reconstructed with two nuclear (ITS & ETS) and two plastid (psbA-trnH & trnL-trnF) regions. Clade divergence times were estimated with a Bayesian approach, and the ancestral areas were inferred using the dispersal-extinction-cladogenesis and Bayesian Binary MCMC analyses. Aphananthe was supported to be monophyletic, with the eastern Asian A. aspera resolved as sister to a clade of the remaining four species. Aphananthe was inferred to have originated in the Late Cretaceous (71.5 mya, with 95% HPD: 66.6–81.3 mya), and the crown age of the genus was dated to be in the early Miocene (19.1 mya, with 95% HPD: 12.4–28.9 mya). The fossil record indicates that Aphananthe was present in the high latitude thermophilic forests in the early Tertiary, and experienced extinctions from the middle Tertiary onwards. Aphananthe originated in Europe based on the inference that included fossil and extant species, but eastern Asia was estimated to be the ancestral area of the clade of the extant species of Aphananthe. Both the West Gondwanan vicariance hypothesis and the boreotropics hypothesis could be excluded as explanation for its amphi-Pacific distribution. Long-distance dispersals out of eastern Asia into North America, southern and southeastern Asia and Australia, and Madagascar during the Miocene account for its wide intercontinental disjunct distribution. PMID:28170425
Williams, Mary R; Sigman, Michael E; Lewis, Jennifer; Pitan, Kelly McHugh
2012-10-10
A Bayesian soft classification method combined with target factor analysis (TFA) is described and tested for the analysis of fire debris data. The method relies on analysis of the average mass spectrum across the chromatographic profile (i.e., the total ion spectrum, TIS) from multiple samples taken from a single fire scene. A library of TIS from reference ignitable liquids with assigned ASTM classification is used as the target factors in TFA. The class-conditional distributions of correlations between the target and predicted factors for each ASTM class are represented by kernel functions and analyzed by Bayesian decision theory. The soft classification approach assists in assessing the probability that ignitable liquid residue from a specific ASTM E1618 class is present in a set of samples from a single fire scene, even in the presence of unspecified background contributions from pyrolysis products. The method is demonstrated with sample data sets and then tested on laboratory-scale burn data and large-scale field test burns. The overall performance achieved in laboratory and field tests of the method is approximately 80% correct classification of fire debris samples. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Yu, Rongjie; Abdel-Aty, Mohamed
2013-07-01
The Bayesian inference method has been frequently adopted to develop safety performance functions. One advantage of the Bayesian inference is that prior information for the independent variables can be included in the inference procedures. However, there are few studies that discussed how to formulate informative priors for the independent variables and evaluated the effects of incorporating informative priors in developing safety performance functions. This paper addresses this deficiency by introducing four approaches of developing informative priors for the independent variables based on historical data and expert experience. Merits of these informative priors have been tested along with two types of Bayesian hierarchical models (Poisson-gamma and Poisson-lognormal models). Deviance information criterion (DIC), R-square values, and coefficients of variance for the estimations were utilized as evaluation measures to select the best model(s). Comparison across the models indicated that the Poisson-gamma model is superior with a better model fit and it is much more robust with the informative priors. Moreover, the two-stage Bayesian updating informative priors provided the best goodness-of-fit and coefficient estimation accuracies. Furthermore, informative priors for the inverse dispersion parameter have also been introduced and tested. Different types of informative priors' effects on the model estimations and goodness-of-fit have been compared and concluded. Finally, based on the results, recommendations for future research topics and study applications have been made. Copyright © 2013 Elsevier Ltd. All rights reserved.
Brase, Gary L.; Hill, W. Trey
2015-01-01
Bayesian reasoning, defined here as the updating of a posterior probability following new information, has historically been problematic for humans. Classic psychology experiments have tested human Bayesian reasoning through the use of word problems and have evaluated each participant’s performance against the normatively correct answer provided by Bayes’ theorem. The standard finding is of generally poor performance. Over the past two decades, though, progress has been made on how to improve Bayesian reasoning. Most notably, research has demonstrated that the use of frequencies in a natural sampling framework—as opposed to single-event probabilities—can improve participants’ Bayesian estimates. Furthermore, pictorial aids and certain individual difference factors also can play significant roles in Bayesian reasoning success. The mechanics of how to build tasks which show these improvements is not under much debate. The explanations for why naturally sampled frequencies and pictures help Bayesian reasoning remain hotly contested, however, with many researchers falling into ingrained “camps” organized around two dominant theoretical perspectives. The present paper evaluates the merits of these theoretical perspectives, including the weight of empirical evidence, theoretical coherence, and predictive power. By these criteria, the ecological rationality approach is clearly better than the heuristics and biases view. Progress in the study of Bayesian reasoning will depend on continued research that honestly, vigorously, and consistently engages across these different theoretical accounts rather than staying “siloed” within one particular perspective. The process of science requires an understanding of competing points of view, with the ultimate goal being integration. PMID:25873904
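The contrast between single-event probabilities and natural frequencies can be made concrete with a short sketch. The numbers are a commonly used textbook-style diagnostic example and do not come from this paper; the function names are hypothetical.

```python
def posterior_from_probabilities(base_rate, hit_rate, false_alarm_rate):
    """Bayes' theorem in the single-event probability format."""
    numerator = hit_rate * base_rate
    return numerator / (numerator + false_alarm_rate * (1.0 - base_rate))


def posterior_from_frequencies(positives_with_condition, positives_without_condition):
    """The same posterior from naturally sampled counts: hits over all positives."""
    return positives_with_condition / (positives_with_condition + positives_without_condition)


# Probability format: 1% base rate, 80% hit rate, 9.6% false-alarm rate.
p_format = posterior_from_probabilities(0.01, 0.80, 0.096)

# Frequency format: of 1000 people, 10 have the condition and 8 of them test
# positive; of the 990 without it, about 95 also test positive.
f_format = posterior_from_frequencies(8, 95)

print(p_format, f_format)
```

The frequency version needs only two counts and one division, which is one way to see why that format reduces the computational burden on the reasoner; the two results differ slightly only because the counts are rounded to whole people.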
A Bayesian pick-the-winner design in a randomized phase II clinical trial.
Chen, Dung-Tsa; Huang, Po-Yu; Lin, Hui-Yi; Chiappori, Alberto A; Gabrilovich, Dmitry I; Haura, Eric B; Antonia, Scott J; Gray, Jhanelle E
2017-10-24
Many phase II clinical trials evaluate unique experimental drugs/combinations through multi-arm design to expedite the screening process (early termination of ineffective drugs) and to identify the most effective drug (pick the winner) to warrant a phase III trial. Various statistical approaches have been developed for the pick-the-winner design but have been criticized for lack of objective comparison among the drug agents. We developed a Bayesian pick-the-winner design by integrating a Bayesian posterior probability with Simon two-stage design in a randomized two-arm clinical trial. The Bayesian posterior probability, as the rule to pick the winner, is defined as probability of the response rate in one arm higher than in the other arm. The posterior probability aims to determine the winner when both arms pass the second stage of the Simon two-stage design. When both arms are competitive (i.e., both passing the second stage), the Bayesian posterior probability performs better to correctly identify the winner compared with the Fisher exact test in the simulation study. In comparison to a standard two-arm randomized design, the Bayesian pick-the-winner design has a higher power to determine a clear winner. In application to two studies, the approach is able to perform statistical comparison of two treatment arms and provides a winner probability (Bayesian posterior probability) to statistically justify the winning arm. We developed an integrated design that utilizes Bayesian posterior probability, Simon two-stage design, and randomization into a unique setting. It gives objective comparisons between the arms to determine the winner.
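The winner probability described in this abstract, the posterior probability that the response rate in one arm exceeds that in the other, can be approximated by simulation. This is a minimal sketch that assumes independent binomial outcomes with Beta(1, 1) priors; the abstract does not state the authors' prior, and the response counts below are hypothetical.

```python
import random


def prob_arm_a_better(resp_a, n_a, resp_b, n_b,
                      prior_alpha=1.0, prior_beta=1.0,
                      draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_A > rate_B | data) under Beta priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Conjugate update: Beta(prior_alpha + responders, prior_beta + non-responders).
        rate_a = rng.betavariate(prior_alpha + resp_a, prior_beta + n_a - resp_a)
        rate_b = rng.betavariate(prior_alpha + resp_b, prior_beta + n_b - resp_b)
        if rate_a > rate_b:
            wins += 1
    return wins / draws


# Hypothetical trial: 12/30 responders in arm A versus 8/30 in arm B.
print(prob_arm_a_better(12, 30, 8, 30))
```

In the design described above, this probability would only be consulted when both arms have passed the second stage of the Simon two-stage design; the stopping rules themselves are not sketched here.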
A Bayesian Assessment of Seismic Semi-Periodicity Forecasts
NASA Astrophysics Data System (ADS)
Nava, F.; Quinteros, C.; Glowacka, E.; Frez, J.
2016-01-01
Among the schemes for earthquake forecasting, the search for semi-periodicity during large earthquakes in a given seismogenic region plays an important role. When considering earthquake forecasts based on semi-periodic sequence identification, the Bayesian formalism is a useful tool for: (1) assessing how well a given earthquake satisfies a previously made forecast; (2) re-evaluating the semi-periodic sequence probability; and (3) testing other prior estimations of the sequence probability. A comparison of Bayesian estimates with updated estimates of semi-periodic sequences that incorporate new data not used in the original estimates shows extremely good agreement, indicating that: (1) the probability that a semi-periodic sequence is not due to chance is an appropriate estimate for the prior sequence probability estimate; and (2) the Bayesian formalism does a very good job of estimating corrected semi-periodicity probabilities, using slightly less data than that used for updated estimates. The Bayesian approach is exemplified explicitly by its application to the Parkfield semi-periodic forecast, and results are given for its application to other forecasts in Japan and Venezuela.
Vogt, Martin; Bajorath, Jürgen
2008-01-01
Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimations for seismic sources and structures in Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.
Bayesian parameter estimation for chiral effective field theory
NASA Astrophysics Data System (ADS)
Wesolowski, Sarah; Furnstahl, Richard; Phillips, Daniel; Klco, Natalie
2016-09-01
The low-energy constants (LECs) of a chiral effective field theory (EFT) interaction in the two-body sector are fit to observable data using a Bayesian parameter estimation framework. By using Bayesian prior probability distributions (pdfs), we quantify relevant physical expectations such as LEC naturalness and include them in the parameter estimation procedure. The final result is a posterior pdf for the LECs, which can be used to propagate uncertainty resulting from the fit to data to the final observable predictions. The posterior pdf also allows an empirical test of operator redundancy and other features of the potential. We compare results of our framework with other fitting procedures, interpreting the underlying assumptions in Bayesian probabilistic language. We also compare results from fitting all partial waves of the interaction simultaneously to cross section data compared to fitting to extracted phase shifts, appropriately accounting for correlations in the data. Supported in part by the NSF and DOE.
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
DOE Office of Scientific and Technical Information (OSTI.GOV)
Warren, Harry P.; Byers, Jeff M.; Crump, Nicholas A.
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Cipoli, Daniel E; Martinez, Edson Z; Castro, Margaret de; Moreira, Ayrton C
2012-12-01
To estimate the pretest probability of Cushing's syndrome (CS) diagnosis by a Bayesian approach using intuitive clinical judgment. Physicians were requested, in seven endocrinology meetings, to answer three questions: "Based on your personal expertise, after obtaining clinical history and physical examination, without using laboratory tests, what is your probability of diagnosing Cushing's Syndrome?"; "For how long have you been practicing Endocrinology?"; and "Where do you work?". A Bayesian beta regression, using the WinBUGS software, was employed. We obtained 294 questionnaires. The mean pretest probability of CS diagnosis was 51.6% (95%CI: 48.7-54.3). The probability was directly related to experience in endocrinology, but not to the place of work. Pretest probability of CS diagnosis was estimated using a Bayesian methodology. Although pretest likelihood can be context-dependent, experience based on years of practice may help the practitioner to diagnose CS.
Bayesian analysis of CCDM models
NASA Astrophysics Data System (ADS)
Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow models to be compared considering goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH0 model, can be discarded from the current analysis. Three other scenarios are discarded either because of poor fit or because of an excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reduce parameter degeneracy is also developed.
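The trade-off between goodness of fit and number of free parameters can be sketched with the standard definitions AIC = 2k - 2 ln L_max and BIC = k ln n - 2 ln L_max. The log-likelihoods, parameter counts and sample size below are hypothetical and are not taken from the paper.

```python
import math


def aic(log_likelihood_max, n_params):
    """Akaike Information Criterion; lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood_max


def bic(log_likelihood_max, n_params, n_data):
    """Bayesian Information Criterion; the penalty per parameter grows with ln(n)."""
    return n_params * math.log(n_data) - 2 * log_likelihood_max


# Hypothetical comparison: model B fits slightly better but uses one more parameter.
n_points = 580
model_a = {"lnL": -281.0, "k": 1}
model_b = {"lnL": -280.6, "k": 2}

delta_aic = aic(model_b["lnL"], model_b["k"]) - aic(model_a["lnL"], model_a["k"])
delta_bic = (bic(model_b["lnL"], model_b["k"], n_points)
             - bic(model_a["lnL"], model_a["k"], n_points))
print(delta_aic, delta_bic)
```

A positive difference means the extra parameter is not justified by the improvement in fit; BIC penalizes it more strongly than AIC once n exceeds about 7 data points, since ln(n) > 2 there.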
Refining value-at-risk estimates using a Bayesian Markov-switching GJR-GARCH copula-EVT model.
Sampid, Marius Galabe; Hasim, Haslifah M; Dai, Hongsheng
2018-01-01
In this paper, we propose a model for forecasting Value-at-Risk (VaR) using a Bayesian Markov-switching GJR-GARCH(1,1) model with skewed Student's-t innovation, copula functions and extreme value theory. A Bayesian Markov-switching GJR-GARCH(1,1) model that identifies non-constant volatility over time and allows the GARCH parameters to vary over time following a Markov process, is combined with copula functions and EVT to formulate the Bayesian Markov-switching GJR-GARCH(1,1) copula-EVT VaR model, which is then used to forecast the level of risk on financial asset returns. We further propose a new method for threshold selection in EVT analysis, which we term the hybrid method. Empirical and back-testing results show that the proposed VaR models capture VaR reasonably well in periods of calm and in periods of crisis.
Bayesian modelling of lung function data from multiple-breath washout tests.
Mahar, Robert K; Carlin, John B; Ranganathan, Sarath; Ponsonby, Anne-Louise; Vuillermin, Peter; Vukcevic, Damjan
2018-05-30
Paediatric respiratory researchers have widely adopted the multiple-breath washout (MBW) test because it allows assessment of lung function in unsedated infants and is well suited to longitudinal studies of lung development and disease. However, a substantial proportion of MBW tests in infants fail current acceptability criteria. We hypothesised that a model-based approach to analysing the data, in place of traditional simple empirical summaries, would enable more efficient use of these tests. We therefore developed a novel statistical model for infant MBW data and applied it to 1197 tests from 432 individuals from a large birth cohort study. We focus on Bayesian estimation of the lung clearance index, the most commonly used summary of lung function from MBW tests. Our results show that the model provides an excellent fit to the data and shed further light on statistical properties of the standard empirical approach. Furthermore, the modelling approach enables the lung clearance index to be estimated by using tests with different degrees of completeness, something not possible with the standard approach. Our model therefore allows previously unused data to be used rather than discarded, as well as routine use of shorter tests without significant loss of precision. Beyond our specific application, our work illustrates a number of important aspects of Bayesian modelling in practice, such as the importance of hierarchical specifications to account for repeated measurements and the value of model checking via posterior predictive distributions. Copyright © 2018 John Wiley & Sons, Ltd.
Zhang, Jingyang; Chaloner, Kathryn; McLinden, James H.; Stapleton, Jack T.
2013-01-01
Reconciling two quantitative ELISA tests for an antibody to an RNA virus, in a situation without a gold standard and where false negatives may occur, is the motivation for this work. False negatives occur when access of the antibody to the binding site is blocked. Based on the mechanism of the assay, a mixture of four bivariate normal distributions is proposed with the mixture probabilities depending on a two-stage latent variable model including the prevalence of the antibody in the population and the probabilities of blocking on each test. There is prior information on the prevalence of the antibody, and also on the probability of false negatives, and so a Bayesian analysis is used. The dependence between the two tests is modeled to be consistent with the biological mechanism. Bayesian decision theory is utilized for classification. The proposed method is applied to the motivating data set to classify the data into two groups: those with and those without the antibody. Simulation studies describe the properties of the estimation and the classification. Sensitivity to the choice of the prior distribution is also addressed by simulation. The same model with two levels of latent variables is applicable in other testing procedures such as quantitative polymerase chain reaction tests where false negatives occur when there is a mutation in the primer sequence. PMID:23592433
Theory of Mind: Did Evolution Fool Us?
Devaine, Marie; Hollard, Guillaume; Daunizeau, Jean
2014-01-01
Theory of Mind (ToM) is the ability to attribute mental states (e.g., beliefs and desires) to other people in order to understand and predict their behaviour. If others are rewarded to compete or cooperate with you, then what they will do depends upon what they believe about you. This is the reason why social interaction induces recursive ToM, of the sort “I think that you think that I think, etc.”. Critically, recursion is the common notion behind the definition of sophistication of human language, strategic thinking in games, and, arguably, ToM. Although sophisticated ToM is believed to have high adaptive fitness, broad experimental evidence from behavioural economics, experimental psychology and linguistics points towards limited recursivity in representing others’ beliefs. In this work, we test whether such apparent limitation may not in fact be proven to be adaptive, i.e. optimal in an evolutionary sense. First, we propose a meta-Bayesian approach that can predict the behaviour of ToM sophistication phenotypes who engage in social interactions. Second, we measure their adaptive fitness using evolutionary game theory. Our main contribution is to show that one does not have to appeal to biological costs to explain our limited ToM sophistication. In fact, the evolutionary cost/benefit ratio of ToM sophistication is nontrivial. This is partly because an informational cost prevents highly sophisticated ToM phenotypes from fully exploiting less sophisticated ones (in a competitive context). In addition, cooperation surprisingly favours lower levels of ToM sophistication. Taken together, these quantitative corollaries of the “social Bayesian brain” hypothesis provide an evolutionary account for both the limitation of ToM sophistication in humans and the persistence of low ToM sophistication levels. PMID:24505296
Gogoshin, Grigoriy; Boerwinkle, Eric
2017-01-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, software scalability and usability on less-than-exotic computer hardware are a priority, as is the applicability of the algorithm and software to heterogeneous datasets containing many data types: single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, phenotypes, etc. PMID:27681505
Research implications of science-informed, value-based decision making.
Dowie, Jack
2004-01-01
In 'Hard' science, scientists correctly operate as the 'guardians of certainty', using hypothesis testing formulations and value judgements about error rates and time discounting that make classical inferential methods appropriate. But these methods can neither generate most of the inputs needed by decision makers in their time frame, nor generate them in a form that allows them to be integrated into the decision in an analytically coherent and transparent way. The need for transparent accountability in public decision making under uncertainty and value conflict means the analytical coherence provided by the stochastic Bayesian decision analytic approach, drawing on the outputs of Bayesian science, is needed. If scientific researchers are to play the role they should be playing in informing value-based decision making, they need to see themselves also as 'guardians of uncertainty', ensuring that the best possible current posterior distributions on relevant parameters are made available for decision making, irrespective of the state of the certainty-seeking research. The paper distinguishes the actors employing different technologies in terms of the focus of the technology (knowledge, values, choice); the 'home base' mode of their activity on the cognitive continuum of varying analysis-to-intuition ratios; and the underlying value judgements of the activity (especially error loss functions and time discount rates). Those who propose any principle of decision making other than the banal 'Best Principle', including the 'Precautionary Principle', are properly interpreted as advocates seeking to have their own value judgements and preferences regarding mode location apply. The task for accountable decision makers, and their supporting technologists, is to determine the best course of action under the universal conditions of uncertainty and value difference/conflict.
Xie, Lei; Yang, Zhi-Yun; Wen, Jun; Li, De-Zhu; Yi, Ting-Shuang
2014-08-01
Pistacia L. exhibits a disjunct distribution in Mediterranean Eurasia and adjacent North Africa, eastern Asia, and North to Central America. The spatio-temporal diversification history of Pistacia was assessed to test hypotheses on the Madrean-Tethyan and the Eurasian Tethyan disjunctions through phylogenetic and biogeographic analyses. Maximum parsimony and Bayesian methods were employed to analyze sequences of multiple nuclear and plastid loci of Pistacia species. Bayesian dating analysis was conducted to estimate the divergence times of clades. The likelihood method LAGRANGE was used to infer ancestral areas. The New World species of Pistacia formed a clade sister to the Old World clade in all phylogenetic analyses. The eastern Asian Pistacia weinmannifolia-P. cucphuongensis clade was sister to a clade of the remaining Old World species, which were further resolved into three subclades. Pistacia was estimated to have originated at 37.60 mya (with 95% highest posterior density interval (HPD): 25.42-48.51 mya). A vicariance event in the early Miocene (19.79 mya with 95% HPD: 10.88-30.36 mya) was inferred to account for the intercontinental disjunction between the New World and the Old World species, which is consistent with the Madrean-Tethyan hypothesis. The two Old World eastern Asian-Tethyan disjunctions are best explained by one vicariance event in the early Miocene (15.87 mya with 95% HPD: 8.36-24.36 mya) and one dispersal event in late Miocene (5.89 mya with 95% HPD: 2.68-9.16 mya). The diversification of the Old World Pistacia species was significantly affected by extensive geological and climatic changes in the Qinghai-Tibetan plateau (QTP) and in the Mediterranean region. Copyright © 2014 Elsevier Inc. All rights reserved.
Lira-Noriega, Andrés; Toro-Núñez, Oscar; Oaks, Jamie R; Mort, Mark E
2015-01-01
• A recurrent explanation for phylogeographic discontinuities in the Baja California Peninsula and the Sonoran Desert Region has been the association of vicariant events with Pliocene and Pleistocene seaway breaks. Nevertheless, despite its relevance for plant dispersal, other explanations such as ecological and paleoclimatic factors have received little attention. Here, we analyzed the role of several of these factors to describe the phylogeographic patterns of the desert mistletoe, Phoradendron californicum. • Using noncoding chloroplast regions, we assess the marginal probability of 19 a priori hypotheses related to geological and ecological factors to predict the cpDNA variation in P. californicum using a Bayesian coalescent framework. Complementarily, we used the macrofossil record and niche model projections on Last Glacial Maximum climatic conditions for hosts, mistletoe, and a bird specialist to interpret phylogeographic patterns. • Genealogical reconstructions revealed five clades, which suggest a combination of cryptic divergence, long-distance seed dispersal, and isolating postdivergence events. Bayesian hypothesis test favored a series of Pliocene and Pleistocene geological events related to the formation of the Baja California Peninsula and seaways across the peninsula as the most supported explanation for this genealogical pattern. However, age estimates, niche projections, and fossil records show dynamic host-mistletoe interactions and evidence of host races, indicating that ecological and geological factors have been interacting during the formation and structuring of phylogeographic divergence. • Variation in cpDNA across the species range results from the interplay of vicariant events, past climatic oscillations, and more dynamic factors related to ecological processes at finer temporal and spatial scales. © 2015 Botanical Society of America, Inc.
Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S
2017-04-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, software scalability and usability on less-than-exotic computer hardware are a priority, as is the applicability of the algorithm and software to heterogeneous datasets containing many data types: single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, phenotypes, etc.
Romero-Severson, Ethan O.; Bulla, Ingo; Hengartner, Nick; Bártolo, Inês; Abecasis, Ana; Azevedo-Pereira, José M.; Taveira, Nuno; Leitner, Thomas
2017-01-01
Diversity of the founding population of Human Immunodeficiency Virus Type 1 (HIV-1) transmissions raises many important biological, clinical, and epidemiological issues. In up to 40% of sexual infections, there is clear evidence for multiple founding variants, which can influence the efficacy of putative prevention methods, and the reconstruction of epidemiologic histories. To infer who-infected-whom, and to compute the probability of alternative transmission scenarios while explicitly taking phylogenetic uncertainty into account, we created an approximate Bayesian computation (ABC) method based on a set of statistics measuring phylogenetic topology, branch lengths, and genetic diversity. We applied our method to a suspected heterosexual transmission case involving three individuals, showing a complex monophyletic-paraphyletic-polyphyletic phylogenetic topology. We detected that seven phylogenetic lineages had been transmitted between two of the individuals based on the available samples, implying that many more unsampled lineages had also been transmitted. Testing whether the lineages had been transmitted at one time or over some length of time suggested that an ongoing superinfection process over several years was most likely. While one individual was found unlinked to the other two, surprisingly, when evaluating two competing epidemiological priors, the donor of the two that did infect each other was not identified by the host root-label, and was also not the primary suspect in that transmission. This highlights that it is important to take epidemiological information into account when analyzing support for one transmission hypothesis over another, as results may be nonintuitive and sensitive to details about sampling dates relative to possible infection dates. 
Our study provides a formal inference framework to include information on infection and sampling times, and to investigate ancestral node-label states, transmission direction, transmitted genetic diversity, and frequency of transmission. PMID:28912340
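The approximate Bayesian computation (ABC) idea mentioned in the abstract above can be illustrated with a minimal rejection-sampling sketch. This is not the authors' method: their summary statistics measure phylogenetic topology, branch lengths, and genetic diversity, whereas the toy below assumes a single binomial success count as the summary statistic and a uniform prior. The function name `abc_rejection` and all parameter names are hypothetical.

```python
import random


def abc_rejection(observed_successes, n_trials, n_draws=20000, tolerance=0, seed=1):
    """Toy ABC rejection sampler for a binomial success probability.

    Assumptions (illustrative only): uniform prior on p, and the number of
    successes as the sole summary statistic.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw a candidate parameter from the prior
        # simulate a data set under the candidate parameter
        simulated = sum(rng.random() < p for _ in range(n_trials))
        # keep the candidate only if simulated and observed summaries agree
        if abs(simulated - observed_successes) <= tolerance:
            accepted.append(p)
    return accepted


posterior_sample = abc_rejection(observed_successes=7, n_trials=10)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The accepted draws approximate the posterior; loosening `tolerance` accepts more draws at the cost of a coarser approximation.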
Bayesian analysis of multiple direct detection experiments
NASA Astrophysics Data System (ADS)
Arina, Chiara
2014-12-01
Bayesian methods offer a coherent and efficient framework for implementing uncertainties into induction problems. In this article, we review how this approach applies to the analysis of dark matter direct detection experiments. In particular, we discuss the exclusion limit of XENON100 and the debated hints of detection under the hypothesis of a WIMP signal. Within parameter inference, marginalizing consistently over uncertainties to extract robust posterior probability distributions, we find that the claimed tension between XENON100 and the other experiments can be partially alleviated in the isospin-violating scenario, while the elastic scattering model appears to be compatible with the frequentist statistical approach. We then move to model comparison, for which Bayesian methods are particularly well suited. Firstly, we investigate the annual modulation seen in CoGeNT data, finding that there is weak evidence for a modulation. Modulation models due to other physics compare unfavorably with the WIMP models, paying the price for their excessive complexity. Secondly, we confront several coherent scattering models to determine the current best physical scenario compatible with the experimental hints. We find that exothermic and inelastic dark matter are moderately disfavored against the elastic scenario, while the isospin-violating model has similar evidence. Lastly, the Bayes factor gives inconclusive evidence for an incompatibility between the data sets of XENON100 and the hints of detection. The same question assessed with goodness of fit would indicate a 2σ discrepancy. This suggests that more data are needed to settle this question.
Explorations in statistics: hypothesis tests and P values.
Curran-Everett, Douglas
2009-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This second installment of Explorations in Statistics delves into test statistics and P values, two concepts fundamental to the test of a scientific null hypothesis. The essence of a test statistic is that it compares what we observe in the experiment to what we expect to see if the null hypothesis is true. The P value associated with the magnitude of that test statistic answers this question: if the null hypothesis is true, what proportion of possible values of the test statistic are at least as extreme as the one I got? Although statisticians continue to stress the limitations of hypothesis tests, there are two realities we must acknowledge: hypothesis tests are ingrained within science, and the simple test of a null hypothesis can be useful. As a result, it behooves us to explore the notions of hypothesis tests, test statistics, and P values.
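The definition of the P value given above (the proportion of possible test-statistic values at least as extreme as the one observed, if the null hypothesis is true) can be made concrete with a small sketch. It assumes a one-sample sign-flip randomization test on paired differences, with the absolute mean as the test statistic and a null hypothesis that the differences are symmetric about zero. That test is an illustrative choice rather than anything taken from the article, and the function name `sign_flip_p_value` is hypothetical.

```python
from itertools import product


def sign_flip_p_value(differences):
    """Exact two-sided P value from enumerating every sign assignment.

    Under the assumed null (differences symmetric about zero), each of the
    2**n sign patterns is equally likely, so the P value is the proportion
    of patterns whose statistic is at least as extreme as the observed one.
    """
    n = len(differences)
    observed = abs(sum(differences) / n)
    at_least_as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, differences)) / n)
        if stat >= observed - 1e-12:  # small slack for floating-point ties
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


p = sign_flip_p_value([1, 2, 3, 4])
```

Full enumeration is practical only for small samples; for larger ones a random subset of sign patterns is usually drawn instead.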
Cox, Tony; Popken, Douglas; Ricci, Paolo F
2013-01-01
Exposures to fine particulate matter (PM2.5) in air (C) have been suspected of contributing causally to increased acute (e.g., same-day or next-day) human mortality rates (R). We tested this causal hypothesis in 100 United States cities using the publicly available NMMAPS database. Although a significant, approximately linear, statistical C-R association exists in simple statistical models, closer analysis suggests that it is not causal. Surprisingly, conditioning on other variables that have been extensively considered in previous analyses (usually using splines or other smoothers to approximate their effects), such as month of the year and mean daily temperature, suggests that they create strong, nonlinear confounding that explains the statistical association between PM2.5 and mortality rates in this data set. As this finding disagrees with conventional wisdom, we apply several different techniques to examine it. Conditional independence tests for potential causation, non-parametric classification tree analysis, Bayesian Model Averaging (BMA), and Granger-Sims causality testing, show no evidence that PM2.5 concentrations have any causal impact on increasing mortality rates. This apparent absence of a causal C-R relation, despite their statistical association, has potentially important implications for managing and communicating the uncertain health risks associated with, but not necessarily caused by, PM2.5 exposures. PMID:23983662
Bayesian non-parametric inference for stochastic epidemic models using Gaussian Processes.
Xu, Xiaoguang; Kypraios, Theodore; O'Neill, Philip D
2016-10-01
This paper considers novel Bayesian non-parametric methods for stochastic epidemic models. Many standard modeling and data analysis methods use underlying assumptions (e.g. concerning the rate at which new cases of disease will occur) which are rarely challenged or tested in practice. To relax these assumptions, we develop a Bayesian non-parametric approach using Gaussian Processes, specifically to estimate the infection process. The methods are illustrated with both simulated and real data sets, the former illustrating that the methods can recover the true infection process quite well in practice, and the latter illustrating that the methods can be successfully applied in different settings. © The Author 2016. Published by Oxford University Press.
Cross-view gait recognition using joint Bayesian
NASA Astrophysics Data System (ADS)
Li, Chao; Sun, Shouqian; Chen, Xiaoyu; Min, Xin
2017-07-01
Human gait, as a soft biometric, helps to recognize people by the way they walk. To further improve recognition performance under cross-view conditions, we propose Joint Bayesian to model the view variance. We evaluated our proposed method on the largest population (OULP) dataset, which makes our results statistically reliable. We confirmed that our proposed method significantly outperformed state-of-the-art approaches for both identification and verification tasks. Finally, a sensitivity analysis on the number of training subjects was conducted; we find that Joint Bayesian can achieve competitive results even with a small subset of training subjects (100 subjects). For further comparison, experimental results, learning models, and test codes are available.
NASA Astrophysics Data System (ADS)
Kiyan, Duygu; Rath, Volker; Delhaye, Robert
2017-04-01
The frequency- and time-domain airborne electromagnetic (AEM) data collected under the Tellus projects of the Geological Survey of Ireland (GSI) represent a wealth of information on the multi-dimensional electrical structure of Ireland's near-surface. Our project, which was funded by GSI under the framework of their Short Call Research Programme, aims to develop and implement inverse techniques based on various Bayesian methods for these densely sampled data. We have developed a highly flexible toolbox in Python for the one-dimensional inversion of AEM data along the flight lines. The computational core is based on an adapted frequency- and time-domain forward modelling core derived from the well-tested open-source code AirBeo, which was developed by the CSIRO (Australia) and the AMIRA consortium. Three different inversion methods have been implemented: (i) Tikhonov-type inversion including optimal regularisation methods (Aster et al., 2012; Zhdanov, 2015), (ii) Bayesian MAP inversion in parameter and data space (e.g. Tarantola, 2005), and (iii) full Bayesian inversion with Markov Chain Monte Carlo (Sambridge and Mosegaard, 2002; Mosegaard and Sambridge, 2002), all including different forms of spatial constraints. The methods have been tested on synthetic and field data. This contribution will introduce the toolbox and present case studies on the AEM data from the Tellus projects.
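As a rough illustration of the first inversion method listed above, the following sketch solves a linear Tikhonov-regularised least-squares problem with NumPy. It is not the toolbox's code: AEM inversion is nonlinear and relies on the AirBeo-derived forward model, whereas this toy assumes a linear forward operator `G`, a regularisation operator `L` (identity by default), and a fixed weight `alpha`. All of these names, and `tikhonov_solve` itself, are hypothetical.

```python
import numpy as np


def tikhonov_solve(G, d, alpha, L=None):
    """Minimise ||G m - d||^2 + alpha^2 ||L m||^2 for a linear toy problem.

    Solved through the normal equations
        (G^T G + alpha^2 L^T L) m = G^T d.
    """
    G = np.asarray(G, dtype=float)
    d = np.asarray(d, dtype=float)
    n_params = G.shape[1]
    if L is None:
        L = np.eye(n_params)  # zeroth-order (damping) regularisation
    lhs = G.T @ G + alpha**2 * (L.T @ L)
    rhs = G.T @ d
    return np.linalg.solve(lhs, rhs)


# With an identity forward operator and alpha = 1, each datum is halved.
model = tikhonov_solve(np.eye(2), [2.0, 4.0], alpha=1.0)
```

Replacing `L` with a first-difference matrix would penalise roughness between neighbouring model parameters, which is one simple way a spatial constraint can enter.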
Li, Xi; Jang, Tae-Soo; Temsch, Eva M; Kato, Hidetoshi; Takayama, Koji; Schneeweiss, Gerald M
2017-03-01
Molecular phylogenetic studies have greatly improved our understanding of phylogenetic relationships of non-photosynthetic parasitic broomrapes (Orobanche and related genera, Orobanchaceae), but a few genera have remained unstudied. One of those is Platypholis, whose sole species, Platypholis boninsimae, is restricted to the Bonin Islands (Ogasawara Islands) about 1000 km southeast of Japan. Based on overall morphological similarity, Platypholis has been merged with Orobanche, but this hypothesis has never been tested with molecular data. Employing maximum likelihood and Bayesian analyses on a family-wide data set (two plastid markers, matK and rps2, and three nuclear markers, ITS, phyA and phyB) as well as on an ITS data set focusing on Orobanche s. str., it is shown that P. boninsimae Maxim. is phylogenetically closely linked to or even nested within Orobanche s. str. This position is supported both by morphological evidence and by the newly obtained chromosome number of 2n = 38, which is characteristic for the genus Orobanche s. str.
NASA Astrophysics Data System (ADS)
Jones, Bernard J. T.
2017-04-01
Preface; Notation and conventions; Part I. 100 Years of Cosmology: 1. Emerging cosmology; 2. The cosmic expansion; 3. The cosmic microwave background; 4. Recent cosmology; Part II. Newtonian Cosmology: 5. Newtonian cosmology; 6. Dark energy cosmological models; 7. The early universe; 8. The inhomogeneous universe; 9. The inflationary universe; Part III. Relativistic Cosmology: 10. Minkowski space; 11. The energy momentum tensor; 12. General relativity; 13. Space-time geometry and calculus; 14. The Einstein field equations; 15. Solutions of the Einstein equations; 16. The Robertson-Walker solution; 17. Congruences, curvature and Raychaudhuri; 18. Observing and measuring the universe; Part IV. The Physics of Matter and Radiation: 19. Physics of the CMB radiation; 20. Recombination of the primeval plasma; 21. CMB polarisation; 22. CMB anisotropy; Part V. Precision Tools for Precision Cosmology: 23. Likelihood; 24. Frequentist hypothesis testing; 25. Statistical inference: Bayesian; 26. CMB data processing; 27. Parametrising the universe; 28. Precision cosmology; 29. Epilogue; Appendix A. SI, CGS and Planck units; Appendix B. Magnitudes and distances; Appendix C. Representing vectors and tensors; Appendix D. The electromagnetic field; Appendix E. Statistical distributions; Appendix F. Functions on a sphere; Appendix G. Acknowledgements; References; Index.
Are resting state spectral power measures related to executive functions in healthy young adults?
Gordon, Shirley; Todder, Doron; Deutsch, Inbal; Garbi, Dror; Getter, Nir; Meiran, Nachshon
2018-01-08
Resting-state electroencephalogram (rsEEG) has been found to be associated with psychopathology, intelligence, problem solving, and academic performance, and is sometimes used as a supportive physiological indicator of enhancement in cognitive training interventions (e.g. neurofeedback, working memory training). In the current study, we measured rsEEG spectral power measures (relative power, between-band ratios and asymmetry) in 165 young adults who were also tested on a battery of executive function (EF) tasks. We specifically focused on upper Alpha, Theta and Beta frequency bands given their putative role in EF. Our indices enabled finding correlations since they had decent-to-excellent internal and retest reliability and very little range restriction relative to a nation-wide representative large sample. Nonetheless, Bayesian statistical inference indicated support for the null hypothesis concerning lack of monotonic correlation between EF and rsEEG spectral power measures. Therefore, we conclude that, contrary to the quite common interpretation, these rsEEG spectral power measures do not indicate individual differences in the measured EF abilities. Copyright © 2017 Elsevier Ltd. All rights reserved.
Race, Ethnicity, and Exposure to Alcohol Outlets.
Morrison, Christopher; Gruenewald, Paul J; Ponicki, William R
2016-01-01
Prior studies suggest that Black and Hispanic minority populations are exposed to greater concentrations of alcohol outlets, potentially contributing to health disparities between these populations and the White majority. We tested the alternative hypothesis that urban economic systems cause outlets to concentrate in low-income areas and, controlling for these effects, lower demand among minority populations leads to fewer outlets. Market potential for alcohol sales, a surrogate for demand, was estimated from survey and census data across census block groups for 50 California cities. Hierarchical Bayesian conditional autoregressive Poisson models then estimated relationships between observed geographic distributions of outlets and the market potential for alcohol, income, population size, and racial and ethnic composition. Market potentials were significantly smaller among lower income Black, Hispanic, and Asian populations. Block groups with greater market potential and lower income had greater concentrations of outlets. When we controlled for these effects, the racial and ethnic group composition of block groups was mostly unrelated to outlet concentrations. Health disparities related to exposure to alcohol outlets are primarily driven by distributions of income and population density across neighborhoods.
Distinct Processes Drive Diversification in Different Clades of Gesneriaceae.
Roalson, Eric H; Roberts, Wade R
2016-07-01
Using a time-calibrated phylogenetic hypothesis including 768 Gesneriaceae species (out of approximately 3300 species) and more than 29,000 aligned bases from 26 gene regions, we test Gesneriaceae for diversification rate shifts and the possible proximal drivers of these shifts: geographic distributions, growth forms, and pollination syndromes. Bayesian Analysis of Macroevolutionary Mixtures analyses found five significant rate shifts in Beslerieae, core Nematanthus, core Columneinae, core Streptocarpus, and Pacific Cyrtandra. These rate shifts correspond with shifts in diversification rates, as inferred by the Binary State Speciation and Extinction Model and the Geographic State Speciation and Extinction model, associated with hummingbird pollination, epiphytism, unifoliate growth, and geographic area. Our results suggest that diversification processes are extremely variable across Gesneriaceae clades, with different combinations of characters influencing diversification rates in different clades. Diversification patterns between New and Old World lineages show dramatic differences, suggesting that the processes of diversification in Gesneriaceae are very different in these two geographic regions. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
Race, Ethnicity, and Exposure to Alcohol Outlets
Morrison, Christopher; Gruenewald, Paul J.; Ponicki, William R.
2016-01-01
Objective: Prior studies suggest that Black and Hispanic minority populations are exposed to greater concentrations of alcohol outlets, potentially contributing to health disparities between these populations and the White majority. We tested the alternative hypothesis that urban economic systems cause outlets to concentrate in low-income areas and, controlling for these effects, lower demand among minority populations leads to fewer outlets. Method: Market potential for alcohol sales, a surrogate for demand, was estimated from survey and census data across census block groups for 50 California cities. Hierarchical Bayesian conditional autoregressive Poisson models then estimated relationships between observed geographic distributions of outlets and the market potential for alcohol, income, population size, and racial and ethnic composition. Results: Market potentials were significantly smaller among lower income Black, Hispanic, and Asian populations. Block groups with greater market potential and lower income had greater concentrations of outlets. When we controlled for these effects, the racial and ethnic group composition of block groups was mostly unrelated to outlet concentrations. Conclusions: Health disparities related to exposure to alcohol outlets are primarily driven by distributions of income and population density across neighborhoods. PMID:26751356
Livistona palms in Australia: ancient relics or opportunistic immigrants?
Crisp, Michael D; Isagi, Yuji; Kato, Yohei; Cook, Lyn G; Bowman, David M J S
2010-02-01
Eighteen of the 34 species of the fan palm genus Livistona (Arecaceae) are restricted to Australia and southern New Guinea, east of Wallace's Line, an ancient biogeographic boundary between the former supercontinents Laurasia and Gondwana. The remaining species extend from SE Asia to Africa, west of Wallace's Line. Competing hypotheses contend that Livistona is (a) ancient, its current distribution a relict of the supercontinents, or (b) a Miocene immigrant from the north into Australia as it drifted towards Asia. We have tested these hypotheses using Bayesian and penalized likelihood molecular dating based on 4Kb of nuclear and chloroplast DNA sequences with multiple fossil calibration points. Ancestral areas and biomes were reconstructed using parsimony and maximum likelihood. We found strong support for the second hypothesis, that a single Livistona ancestor colonized Australia from the north about 10-17Ma. Spread and diversification of the genus within Australia was likely favoured by a transition from the aseasonal wet to monsoonal biome, to which it could have been preadapted by fire-tolerance. Copyright (c) 2009 Elsevier Inc. All rights reserved.
Hrbek, Tomas; Stölting, Kai N; Bardakci, Fevzi; Küçük, Fahrettin; Wildekamp, Rudolf H; Meyer, Axel
2004-07-01
We investigated the phylogenetic relationships of Pseudophoxinus (Cyprinidae: Leuciscinae) species from central Anatolia, Turkey to test the hypothesis of geographic speciation driven by early Pliocene orogenic events. We analyzed 1141 aligned base pairs of the complete cytochrome b mitochondrial gene. Phylogenetic relationships reconstructed by maximum likelihood, Bayesian likelihood, and maximum parsimony methods are identical, and generally well supported. Species and clades are restricted to geologically well-defined units, and are deeply divergent from each other. The basal diversification of central Anatolian Pseudophoxinus is estimated to have occurred approximately 15 million years ago. Our results are in agreement with a previous study of the Anatolian fish genus Aphanius that also shows a diversification pattern driven by the Pliocene orogenic events. The distribution of clades of Aphanius and Pseudophoxinus overlap, and areas of distribution comprise the same geological units. The geological history of Anatolia is likely to have had a major impact on the diversification history of many taxa occupying central Anatolia; many of these taxa are likely to be still unrecognized as distinct. Copyright 2004 Elsevier Inc.
NASA Astrophysics Data System (ADS)
Massoud, E. C.; Vrugt, J. A.
2015-12-01
Trees and forests play a key role in controlling the water and energy balance at the land-air surface. This study reports on the calibration of an integrated soil-tree-atmosphere continuum (STAC) model using Bayesian inference with the DREAM algorithm and temporal observations of soil moisture content, matric head, sap flux, and leaf water potential from the King's River Experimental Watershed (KREW) in the southern Sierra Nevada mountain range in California. Water flow through the coupled system is described using the Richards' equation with both the soil and tree modeled as a porous medium with nonlinear soil and tree water relationships. Most of the model parameters appear to be reasonably well defined by calibration against the observed data. The posterior mean simulation reproduces the observed soil and tree data quite accurately, but a systematic mismatch is observed between early afternoon measured and simulated sap fluxes. We will show how this points to a structural error in the STAC-model and suggest and test an alternative hypothesis for root water uptake that alleviates this problem.
Testing for ontological errors in probabilistic forecasting models of natural systems
Marzocchi, Warner; Jordan, Thomas H.
2014-01-01
Probabilistic forecasting models describe the aleatory variability of natural systems as well as our epistemic uncertainty about how the systems work. Testing a model against observations exposes ontological errors in the representation of a system and its uncertainties. We clarify several conceptual issues regarding the testing of probabilistic forecasting models for ontological errors: the ambiguity of the aleatory/epistemic dichotomy, the quantification of uncertainties as degrees of belief, the interplay between Bayesian and frequentist methods, and the scientific pathway for capturing predictability. We show that testability of the ontological null hypothesis derives from an experimental concept, external to the model, that identifies collections of data, observed and not yet observed, that are judged to be exchangeable when conditioned on a set of explanatory variables. These conditional exchangeability judgments specify observations with well-defined frequencies. Any model predicting these behaviors can thus be tested for ontological error by frequentist methods; e.g., using P values. In the forecasting problem, prior predictive model checking, rather than posterior predictive checking, is desirable because it provides more severe tests. We illustrate experimental concepts using examples from probabilistic seismic hazard analysis. Severe testing of a model under an appropriate set of experimental concepts is the key to model validation, in which we seek to know whether a model replicates the data-generating process well enough to be sufficiently reliable for some useful purpose, such as long-term seismic forecasting. Pessimistic views of system predictability fail to recognize the power of this methodology in separating predictable behaviors from those that are not. PMID:25097265
Evaluating impacts using a BACI design, ratios, and a Bayesian approach with a focus on restoration.
Conner, Mary M; Saunders, W Carl; Bouwes, Nicolaas; Jordan, Chris
2015-10-01
Before-after-control-impact (BACI) designs are an effective method to evaluate natural and human-induced perturbations on ecological variables when treatment sites cannot be randomly chosen. While effect sizes of interest can be tested with frequentist methods, using Bayesian Markov chain Monte Carlo (MCMC) sampling methods, probabilities of effect sizes, such as a ≥20 % increase in density after restoration, can be directly estimated. Although BACI and Bayesian methods are used widely for assessing natural and human-induced impacts for field experiments, the application of hierarchical Bayesian modeling with MCMC sampling to BACI designs is less common. Here, we combine these approaches and extend the typical presentation of results with an easy to interpret ratio, which provides an answer to the main study question: "How much impact did a management action or natural perturbation have?" As an example of this approach, we evaluate the impact of a restoration project, which implemented beaver dam analogs, on survival and density of juvenile steelhead. Results indicated the probabilities of a ≥30 % increase were high for survival and density after the dams were installed, 0.88 and 0.99, respectively, while probabilities for a higher increase of ≥50 % were variable, 0.17 and 0.82, respectively. This approach demonstrates a useful extension of Bayesian methods that can easily be generalized to other study designs from simple (e.g., single factor ANOVA, paired t test) to more complicated block designs (e.g., crossover, split-plot). This approach is valuable for estimating the probabilities of restoration impacts or other management actions.
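The probabilities quoted in the abstract above (for example, the probability of a ≥30 % increase) are the kind of quantity that can be read directly off posterior draws. The following is a minimal sketch of that step only, not the authors' hierarchical model: it assumes a hypothetical list `ratio_draws` of MCMC samples of an impact ratio (1.0 meaning no change), and it fills that list with simulated lognormal values purely for illustration.

```python
import math
import random


def prob_increase_at_least(ratio_draws, pct):
    """Fraction of posterior draws whose ratio implies an increase of at least pct percent."""
    if not ratio_draws:
        raise ValueError("need at least one posterior draw")
    threshold = 1.0 + pct / 100.0
    return sum(1 for r in ratio_draws if r >= threshold) / len(ratio_draws)


# Hypothetical stand-in for real MCMC output: draws centred on a 50 % increase.
rng = random.Random(0)
ratio_draws = [rng.lognormvariate(math.log(1.5), 0.2) for _ in range(10_000)]

p30 = prob_increase_at_least(ratio_draws, 30)  # P(increase >= 30 %)
p50 = prob_increase_at_least(ratio_draws, 50)  # P(increase >= 50 %)
print(f"P(>=30%) = {p30:.2f}, P(>=50%) = {p50:.2f}")
```

Because the same draws are reused for every threshold, the probability for a larger effect can never exceed the probability for a smaller one, which matches the pattern of the reported values.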
Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan
2015-01-01
Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to or in many cases superior to the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.
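For readers unfamiliar with the f2 statistic mentioned in the abstract above, the sketch below computes it in its conventional form, f2 = 50·log10(100 / sqrt(1 + mean squared difference)), from two dissolution profiles measured at the same time points. This is the commonly used statistic only, not the authors' Bayesian F2 procedure, and the profile values are made up for illustration.

```python
import math


def f2_similarity(reference, test):
    """Conventional f2 similarity factor for two dissolution profiles (percent dissolved)."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    # Mean squared difference between the profiles across time points.
    msd = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + msd))


# Hypothetical percent-dissolved values at four shared time points.
reference_profile = [20.0, 45.0, 70.0, 85.0]
test_profile = [18.0, 41.0, 67.0, 84.0]

f2_value = f2_similarity(reference_profile, test_profile)
print(f"f2 = {f2_value:.1f}")
```

Identical profiles give f2 = 100, and an f2 of 50 or more is the conventional similarity cutoff. Because f2 is a point statistic with no attached uncertainty, a model-based alternative of the kind the abstract proposes can be attractive.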
Boehm, Udo; Steingroever, Helen; Wagenmakers, Eric-Jan
2018-06-01
Quantitative models that represent different cognitive variables in terms of model parameters are an important tool in the advancement of cognitive science. To evaluate such models, their parameters are typically tested for relationships with behavioral and physiological variables that are thought to reflect specific cognitive processes. However, many models do not come equipped with the statistical framework needed to relate model parameters to covariates. Instead, researchers often revert to classifying participants into groups depending on their values on the covariates, and subsequently comparing the estimated model parameters between these groups. Here we develop a comprehensive solution to the covariate problem in the form of a Bayesian regression framework. Our framework can be easily added to existing cognitive models and allows researchers to quantify the evidential support for relationships between covariates and model parameters using Bayes factors. Moreover, we present a simulation study that demonstrates the superiority of the Bayesian regression framework to the conventional classification-based approach.
Mushet, David M.; Euliss, Ned H.; Chen, Yongjiu; Stockwell, Craig A.
2013-01-01
In contrast to most local amphibian populations, northeastern populations of the Northern Leopard Frog (Lithobates pipiens) have displayed uncharacteristically high levels of genetic diversity that have been attributed to large, stable populations. However, this widely distributed species also occurs in areas known for great climatic fluctuations that should be reflected in corresponding fluctuations in population sizes and reduced genetic diversity. To test our hypothesis that Northern Leopard Frog genetic diversity would be reduced in areas subjected to significant climate variability, we examined the genetic diversity of L. pipiens collected from 12 sites within the Prairie Pothole Region of North Dakota. Despite the region's fluctuating climate that includes periods of recurring drought and deluge, we found unexpectedly high levels of genetic diversity approaching that of northeastern populations. Further, genetic structure at a landscape scale was strikingly homogeneous; genetic differentiation estimates (Dest) averaged 0.10 (SD = 0.036) across the six microsatellite loci we studied, and two Bayesian assignment tests (STRUCTURE and BAPS) failed to reveal the development of significant population structure across the 68 km breadth of our study area. These results suggest that L. pipiens in the Prairie Pothole Region consists of a large, panmictic population capable of maintaining high genetic diversity in the face of marked climate variability.
A data driven model for dune morphodynamics
NASA Astrophysics Data System (ADS)
Palmsten, M.; Brodie, K.; Spore, N.
2016-12-01
Dune morphology results from a number of competing feedbacks between wave, Aeolian, and biologic processes. Only now are conceptual and numerical models for dunes beginning to incorporate all aspects of the processes driving morphodynamics. Drawing on a 35-year record of observations of dune morphology and forcing conditions at the Army Corps of Engineers Field Research Facility (FRF) at Duck, NC, USA, we hypothesize that local dune morphology results from the competition between dune growth during dry windy periods and erosion during storms. We test our hypothesis by developing a data driven model using a Bayesian network to hindcast dune-crest elevation change, dune position change, and shoreline position change. Model inputs include a description of dune morphology from dune-crest elevation, dune-base elevation, dune width, and beach width. Wave forcing and the effect of moisture is parameterized in terms of the maximum total water level and period that waves impact the dunes, along with precipitation. Aeolian forcing is parameterized in terms of maximum wind speed, direction and period that wind exceeds a critical value for sediment transport. We test the sensitivity of our model to forcing parameters and hindcast the 35-year record of dune morphodynamics at the FRF. We also discuss the role of vegetation on dune morphologic differences observed at the FRF.
Attigala, Lakshmi; Wysocki, William P; Duvall, Melvin R; Clark, Lynn G
2016-08-01
We explored phylogenetic relationships among the twelve lineages of the temperate woody bamboo clade (tribe Arundinarieae) based on plastid genome (plastome) sequence data. A representative sample of 28 taxa was used and maximum parsimony, maximum likelihood and Bayesian inference analyses were conducted to estimate the Arundinarieae phylogeny. All the previously recognized clades of Arundinarieae were supported, with Ampelocalamus calcareus (Clade XI) as sister to the rest of the temperate woody bamboos. Well supported sister relationships between Bergbambos tessellata (Clade I) and Thamnocalamus spathiflorus (Clade VII) and between Kuruna (Clade XII) and Chimonocalamus (Clade III) were revealed by the current study. The plastome topology was tested by taxon removal experiments and alternative hypothesis testing and the results supported the current plastome phylogeny as robust. Neighbor-net analyses showed few phylogenetic signal conflicts, but suggested some potentially complex relationships among these taxa. Analyses of morphological character evolution of rhizomes and reproductive structures revealed that pachymorph rhizomes were most likely the ancestral state in Arundinarieae. In contrast, leptomorph rhizomes either evolved once with reversions to the pachymorph condition or multiple times in Arundinarieae. Further, pseudospikelets evolved independently at least twice in the Arundinarieae, but the ancestral state is ambiguous. Copyright © 2016 Elsevier Inc. All rights reserved.
Feng, Hao; Conneely, Karen N.; Wu, Hao
2014-01-01
DNA methylation is an important epigenetic modification that has essential roles in cellular processes including gene regulation, development and disease and is widely dysregulated in most types of cancer. Recent advances in sequencing technology have enabled the measurement of DNA methylation at single nucleotide resolution through methods such as whole-genome bisulfite sequencing and reduced representation bisulfite sequencing. In DNA methylation studies, a key task is to identify differences under distinct biological contexts, for example, between tumor and normal tissue. A challenge in sequencing studies is that the number of biological replicates is often limited by the costs of sequencing. The small number of replicates leads to unstable variance estimation, which can reduce accuracy to detect differentially methylated loci (DML). Here we propose a novel statistical method to detect DML when comparing two treatment groups. The sequencing counts are described by a lognormal-beta-binomial hierarchical model, which provides a basis for information sharing across different CpG sites. A Wald test is developed for hypothesis testing at each CpG site. Simulation results show that the proposed method yields improved DML detection compared to existing methods, particularly when the number of replicates is low. The proposed method is implemented in the Bioconductor package DSS. PMID:24561809
Study of the top quark electric charge at the CDF experiment (in Slovak)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bartos, Pavol
We report on the measurement of the top quark electric charge using the jet charge tagging method on events containing a single lepton collected by the CDF II detector at Fermilab between February 2002 and February 2010 at the center-of-mass energy √s = 1.96 TeV. There are three main components to this measurement: determining the charge of the W (using the charge of the lepton), pairing the W with the b-jet to ensure that they are from the same top decay branch and finally determining the charge of the b-jet using the Jet Charge algorithm. We found, on a sample of 5.6 fb⁻¹ of data, that the p-value under the standard model hypothesis is equal to 13.4%, while the p-value under the exotic model hypothesis is equal to 0.014%. Using the a priori criteria generally accepted by the CDF collaboration, we can say that the result is consistent with the standard model, while we exclude an exotic quark hypothesis with 95% confidence. Using the Bayesian approach, we obtain for the Bayes factor (2ln(BF)) a value of 19.6, which very strongly favors the SM hypothesis over the XM one. The presented method has the highest sensitivity to the top quark electric charge among the top quark charge analyses presented so far.
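To make the reported evidence concrete, the sketch below converts the 2ln(BF) value of 19.6 quoted in the abstract above into a plain Bayes factor and labels it using the Kass and Raftery (1995) categories. The function names are illustrative, and the choice of that scale is an assumption; the abstract does not state which scale the author applied.

```python
import math


def bayes_factor_from_2ln(two_ln_bf):
    """Invert the 2*ln(BF) transformation to recover the Bayes factor."""
    return math.exp(two_ln_bf / 2.0)


def kass_raftery_label(two_ln_bf):
    """Evidence category on the 2*ln(BF) scale of Kass and Raftery (1995)."""
    if two_ln_bf < 0:
        return "favors the other hypothesis"
    if two_ln_bf <= 2:
        return "not worth more than a bare mention"
    if two_ln_bf <= 6:
        return "positive"
    if two_ln_bf <= 10:
        return "strong"
    return "very strong"


bf = bayes_factor_from_2ln(19.6)
label = kass_raftery_label(19.6)
print(f"BF = {bf:.0f} ({label})")
```

On this reading the data are roughly eighteen thousand times more probable under the standard model hypothesis than under the exotic one, which agrees with the abstract's wording that the SM is very strongly favored.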
Bayesian convolutional neural network based MRI brain extraction on nonhuman primates.
Zhao, Gengyan; Liu, Fang; Oler, Jonathan A; Meyerand, Mary E; Kalin, Ned H; Birn, Rasmus M
2018-07-15
Brain extraction or skull stripping of magnetic resonance images (MRI) is an essential step in neuroimaging studies, the accuracy of which can severely affect subsequent image processing procedures. Current automatic brain extraction methods demonstrate good results on human brains, but are often far from satisfactory on nonhuman primates, which are a necessary part of neuroscience research. To overcome the challenges of brain extraction in nonhuman primates, we propose a fully-automated brain extraction pipeline combining deep Bayesian convolutional neural network (CNN) and fully connected three-dimensional (3D) conditional random field (CRF). The deep Bayesian CNN, Bayesian SegNet, is used as the core segmentation engine. As a probabilistic network, it is not only able to perform accurate high-resolution pixel-wise brain segmentation, but also capable of measuring the model uncertainty by Monte Carlo sampling with dropout in the testing stage. Then, fully connected 3D CRF is used to refine the probability result from Bayesian SegNet in the whole 3D context of the brain volume. The proposed method was evaluated with a manually brain-extracted dataset comprising T1w images of 100 nonhuman primates. Our method outperforms six popular publicly available brain extraction packages and three well-established deep learning based methods with a mean Dice coefficient of 0.985 and a mean average symmetric surface distance of 0.220 mm. A better performance against all the compared methods was verified by statistical tests (all p-values < 10⁻⁴, two-sided, Bonferroni corrected). The maximum uncertainty of the model on nonhuman primate brain extraction has a mean value of 0.116 across all the 100 subjects. The behavior of the uncertainty was also studied, which shows the uncertainty increases as the training set size decreases, the number of inconsistent labels in the training set increases, or the inconsistency between the training set and the testing set increases.
Copyright © 2018 Elsevier Inc. All rights reserved.
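The Dice coefficient reported in the abstract above measures the overlap between a predicted brain mask and a manual one as twice the intersection divided by the sum of the two mask sizes. A minimal sketch on flattened binary masks follows; the tiny masks are invented for illustration, and real use would operate on full 3D volumes.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two equally sized binary masks given as flat sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical six-voxel masks: 1 marks brain, 0 marks background.
predicted_mask = [1, 1, 1, 0, 0, 0]
manual_mask = [1, 1, 0, 0, 0, 1]

dice = dice_coefficient(predicted_mask, manual_mask)
print(f"Dice = {dice:.3f}")
```

A value of 1.0 means the masks coincide exactly, so the reported mean of 0.985 indicates near-complete agreement with the manual extraction.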
Optimal Bayesian Adaptive Design for Test-Item Calibration.
van der Linden, Wim J; Ren, Hao
2015-06-01
An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.
Learning oncogenetic networks by reducing to mixed integer linear programming.
Shahrabi Farahani, Hossein; Lagergren, Jens
2013-01-01
Cancer can be a result of accumulation of different types of genetic mutations such as copy number aberrations. The data from tumors are cross-sectional and do not contain the temporal order of the genetic events. Finding the order in which the genetic events have occurred and the progression pathways is of vital importance in understanding the disease. In order to model cancer progression, we propose Progression Networks, a special case of Bayesian networks, that are tailored to model disease progression. Progression networks have similarities with Conjunctive Bayesian Networks (CBNs) [1], a variation of Bayesian networks also proposed for modeling disease progression. We also describe a learning algorithm for learning Bayesian networks in general and progression networks in particular. We reduce the hard problem of learning the Bayesian and progression networks to Mixed Integer Linear Programming (MILP). MILP is a Non-deterministic Polynomial-time complete (NP-complete) problem for which very good heuristics exist. We tested our algorithm on synthetic and real cytogenetic data from renal cell carcinoma. We also compared our learned progression networks with the networks proposed in earlier publications. The software is available on the website https://bitbucket.org/farahani/diprog.
Kalil, Andre C; Sun, Junfeng
2014-10-01
To review Bayesian methodology and its utility to clinical decision making and research in the critical care field. Clinical, epidemiological, and biostatistical studies on Bayesian methods in PubMed and Embase from their inception to December 2013. Bayesian methods have been extensively used by a wide range of scientific fields, including astronomy, engineering, chemistry, genetics, physics, geology, paleontology, climatology, cryptography, linguistics, ecology, and computational sciences. The application of medical knowledge in clinical research is analogous to the application of medical knowledge in clinical practice. Bedside physicians have to make most diagnostic and treatment decisions on critically ill patients every day without clear-cut evidence-based medicine (more subjective than objective evidence). Similarly, clinical researchers have to make most decisions about trial design with limited available data. Bayesian methodology allows both subjective and objective aspects of knowledge to be formally measured and transparently incorporated into the design, execution, and interpretation of clinical trials. In addition, various degrees of knowledge and several hypotheses can be tested at the same time in a single clinical trial without the risk of multiplicity. Notably, the Bayesian technology is naturally suited for the interpretation of clinical trial findings for the individualized care of critically ill patients and for the optimization of public health policies. We propose that the application of the versatile Bayesian methodology in conjunction with the conventional statistical methods is not only ripe for actual use in critical care clinical research but it is also a necessary step to maximize the performance of clinical trials and its translation to the practice of critical care medicine.
Palmprint identification using FRIT
NASA Astrophysics Data System (ADS)
Kisku, D. R.; Rattani, A.; Gupta, P.; Hwang, C. J.; Sing, J. K.
2011-06-01
This paper proposes a palmprint identification system using the Finite Ridgelet Transform (FRIT) and a Bayesian classifier. FRIT is applied to the region of interest (ROI) extracted from the palmprint image to obtain a set of distinctive features. These features are then classified with a Bayesian classifier. The proposed system has been tested on the CASIA and IIT Kanpur palmprint databases. The experimental results reveal better performance compared to all well known systems.
Impact of Federal drug law enforcement on the supply of heroin in Australia.
Smithson, Michael; McFadden, Michael; Mwesigye, Sue-Ellen
2005-08-01
To conduct an empirical investigation of the efficacy of law enforcement in reducing heroin supply in Australia. Specifically, this paper addresses the question of whether heroin purity levels in the Australian Capital Territory (ACT) could be predicted by heroin seizures at the national level by the Australian Federal Police (AFP) in the preceding year. We considered two forms of evidence. First, a Bayesian Markov Chain Monte Carlo (MCMC) change-point model was used to discover (a) if there was a substantial increase in heroin seizures by the AFP, (b) when the increase began and (c) whether it occurred after increased funding to the Australian Federal Police for the purpose of drug law enforcement. Second, standard time-series methods were used to ascertain whether fluctuations in heroin seizure weights or the frequency of large-scale seizures after the aforementioned changes in seizure levels predicted fluctuations in heroin purity levels in the ACT after autocorrelation had been removed from the purity series. A Bayesian MCMC change-point model supported the hypothesis that heroin seizures rapidly increased about a year before the estimated decline in heroin purity and after the increased funding of AFP. The autoregression models suggested that 10-20% of the variance in the residuals of the heroin purity series was predicted by appropriately lagged residuals of the seizure-number and log-weight series, after autocorrelation had been removed. The overall results are consistent with the hypothesis that large-scale heroin seizures by the AFP reduce street-level heroin supply a year or so later, although the short-term dynamics suggest an 'opponent' response to residual fluctuations in seizures. To our knowledge, this is the first time a connection has been identified between large-scale heroin seizures and street-level supply.
NASA Astrophysics Data System (ADS)
Le Bras, Ronan; Kushida, Noriyuki; Mialle, Pierrick; Tomuta, Elena; Arora, Nimar
2017-04-01
The Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) has been developing a Bayesian method and software to perform the key step of automatic association of seismological, hydroacoustic, and infrasound (SHI) parametric data. In our preliminary testing in the CTBTO, NET-VISA shows much better performance than its currently operating automatic association module, with a rate for automatic events matching the analyst-reviewed events increased by 10%, signifying that the percentage of missed events is lowered by 40%. Initial tests involving analysts also showed that the new software will complete the automatic bulletins of the CTBTO by adding previously missed events. Because products by the CTBTO are also widely distributed to its member States as well as throughout the seismological community, the introduction of a new technology must be carried out carefully, and the first step of operational integration is to use NET-VISA results within the interactive analysts' software so that the analysts can check the robustness of the Bayesian approach. We report on the latest results both on the progress for automatic processing and for the initial introduction of NET-VISA results in the analyst review process.
Clinical trial designs for testing biomarker-based personalized therapies
Lai, Tze Leung; Lavori, Philip W; Shih, Mei-Chiung I; Sikic, Branimir I
2014-01-01
Background Advances in molecular therapeutics in the past decade have opened up new possibilities for treating cancer patients with personalized therapies, using biomarkers to determine which treatments are most likely to benefit them, but there are difficulties and unresolved issues in the development and validation of biomarker-based personalized therapies. We develop a new clinical trial design to address some of these issues. The goal is to capture the strengths of the frequentist and Bayesian approaches to address this problem in the recent literature and to circumvent their limitations. Methods We use generalized likelihood ratio tests of the intersection null and enriched strategy null hypotheses to derive a novel clinical trial design for the problem of advancing promising biomarker-guided strategies toward eventual validation. We also investigate the usefulness of adaptive randomization (AR) and futility stopping proposed in the recent literature. Results Simulation studies demonstrate the advantages of testing both the narrowly focused enriched strategy null hypothesis related to validating a proposed strategy and the intersection null hypothesis that can accommodate to a potentially successful strategy. AR and early termination of ineffective treatments offer increased probability of receiving the preferred treatment and better response rates for patients in the trial, at the expense of more complicated inference under small-to-moderate total sample sizes and some reduction in power. Limitations The binary response used in the development phase may not be a reliable indicator of treatment benefit on long-term clinical outcomes. In the proposed design, the biomarker-guided strategy (BGS) is not compared to ‘standard of care’, such as physician’s choice that may be informed by patient characteristics. Therefore, a positive result does not imply superiority of the BGS to ‘standard of care’. The proposed design and tests are valid asymptotically. 
Simulations are used to examine small-to-moderate sample properties. Conclusion Innovative clinical trial designs are needed to address the difficulties and issues in the development and validation of biomarker-based personalized therapies. The article shows the advantages of using likelihood inference and interim analysis to meet the challenges in the sample size needed and in the constantly evolving biomarker landscape and genomic and proteomic technologies. PMID:22397801
Probabilistic Model for Untargeted Peak Detection in LC-MS Using Bayesian Statistics.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2015-07-21
We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography-mass spectroscopy (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a chromatogram are affected by a chromatographic peak and which ones are only affected by noise. The use of probabilities contrasts with the traditional method in which a binary answer is given, relying on a threshold. By contrast, with the Bayesian peak detection presented here, the values of probability can be further propagated into other preprocessing steps, which will increase (or decrease) the importance of chromatographic regions into the final results. The present work is based on the use of the statistical overlap theory of component overlap from Davis and Giddings (Davis, J. M.; Giddings, J. Anal. Chem. 1983, 55, 418-424) as prior probability in the Bayesian formulation. The algorithm was tested on LC-MS Orbitrap data and was able to successfully distinguish chemical noise from actual peaks without any data preprocessing.
Seeking health information on the web: positive hypothesis testing.
Kayhan, Varol Onur
2013-04-01
The goal of this study is to investigate positive hypothesis testing among consumers of health information when they search the Web. After demonstrating the extent of positive hypothesis testing using Experiment 1, we conduct Experiment 2 to test the effectiveness of two debiasing techniques. A total of 60 undergraduate students searched a tightly controlled online database developed by the authors to test the validity of a hypothesis. The database had four abstracts that confirmed the hypothesis and three abstracts that disconfirmed it. Findings of Experiment 1 showed that the majority of participants (85%) exhibited positive hypothesis testing. In Experiment 2, we found that the recommendation technique was not effective in reducing positive hypothesis testing since none of the participants assigned to this server could retrieve disconfirming evidence. Experiment 2 also showed that the incorporation technique successfully reduced positive hypothesis testing since 75% of the participants could retrieve disconfirming evidence. Positive hypothesis testing on the Web is an understudied topic. More studies are needed to validate the effectiveness of the debiasing techniques discussed in this study and to develop new techniques. Search engine developers should consider developing new options for users so that both confirming and disconfirming evidence can be presented in search results as users test hypotheses using search engines. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Combining evidence using likelihood ratios in writer verification
NASA Astrophysics Data System (ADS)
Srihari, Sargur; Kovalenko, Dimitry; Tang, Yi; Ball, Gregory
2013-01-01
Forensic identification is the task of determining whether or not observed evidence arose from a known source. It involves determining a likelihood ratio (LR) - the ratio of the joint probability of the evidence and source under the identification hypothesis (that the evidence came from the source) and under the exclusion hypothesis (that the evidence did not arise from the source). In LR-based decision methods, particularly handwriting comparison, a variable number of input evidences is used. A decision based on many pieces of evidence can result in nearly the same LR as one based on few pieces of evidence. We consider methods for distinguishing between such situations. One of these is to provide confidence intervals together with the decisions and another is to combine the inputs using weights. We propose a new method that generalizes the Bayesian approach and uses an explicitly defined discount function. Empirical evaluation with several data sets including synthetically generated ones and handwriting comparison shows greater flexibility of the proposed method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of six local lung patterns: normal lung and five regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and an SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. 
The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with an SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving a large number of images, as is the case in a multicenter trial with different scanners.
Baeza, J Antonio
2013-10-01
The 'Tomlinson-Ghiselin' hypothesis (TGh) predicts that outcrossing simultaneous hermaphroditism (SH) is advantageous when population density is low because the probability of finding sexual partners is negligible. In shrimps from the family Lysmatidae, Bauer's historical contingency hypothesis (HCh) suggests that SH evolved in an ancestral tropical species that adopted a symbiotic lifestyle with, e.g., sea anemones and became a specialized fish-cleaner. Restricted mobility of shrimps due to their association with a host, and hence, reduced probability of encountering mating partners, would have favored SH. The HCh is a special case of the TGh. Herein, I examined within a phylogenetic framework whether the TGh/HCh explains the origin of SH in shrimps. A phylogeny of caridean broken-back shrimps in the families Lysmatidae, Barbouriidae, and Merguiidae was first developed using nuclear and mitochondrial markers. Complete evidence phylogenetic analyses using maximum likelihood (ML) and Bayesian inference (BI) demonstrated that Lysmatidae+Barbouriidae are monophyletic. In turn, Merguiidae is sister to the Lysmatidae+Barbouriidae. ML and BI ancestral character-state reconstruction in the resulting phylogenetic trees indicated that the ancestral Lysmatidae was either gregarious or lived in small groups and was not symbiotic. Four different evolutionary transitions from a free-living to a symbiotic lifestyle occurred in shrimps. Therefore, the evolution of SH in shrimps cannot be explained by the TGh/HCh; reduced probability of encountering mating partners in an ancestral species due to its association with a sessile host did not favor SH in the Lysmatidae. It is proposed that two conditions acting together in the past, low male mating opportunities and brooding constraints, might have favored SH in the ancestral Lysmatidae+Barbouriidae. 
Additional studies on the life history and phylogenetics of broken-back shrimps are needed to understand the evolution of SH in the ecologically diverse Caridea. Copyright © 2013 Elsevier Inc. All rights reserved.
Testing the Millennial-Scale Holocene Solar-Climate Connection in the Indo-Pacific Warm Pool
NASA Astrophysics Data System (ADS)
Khider, D.; Emile-Geay, J.; McKay, N.; Jackson, C. S.; Routson, C.
2016-12-01
The existence of 1000 and 2500-year periodicities found in reconstructions of total solar irradiance (TSI) and a number of Holocene climate records has led to the hypothesis of a causal relationship. However, attributing Holocene millennial-scale variability to solar forcing requires a mechanism by which small changes in total irradiance can influence a global climate response. One possible amplifier within the climate system is the ocean. If this is the case, then we need to know more about where and how this may be occurring. On the other hand, the similarity in spectral peaks could be merely coincidental, and this should be made apparent by a lack of coherence in how that power and phasing are distributed in time and space. The plausibility of the solar forcing hypothesis is assessed through a Bayesian model of the age uncertainties affecting marine sedimentary records that is propagated through spectral analysis of the climate and forcing signals at key frequencies. Preliminary work on Mg/Ca and alkenone records from the Indo-Pacific Warm Pool suggests that despite large uncertainties in the location of the spectral peaks within each individual record arising from age model uncertainty, sea surface variability on timescales of 1025±36 years and 2427±133 years (±standard error of the mean of the median periodicity in each record) are present in at least 95% and 70% of the ensemble spectra, respectively. However, we find a long phase delay between the peak in forcing and the maximum response in at least one of the records, challenging the solar forcing hypothesis and requiring further investigation between low- and high-latitude signals. Remarkably, all records suggest a periodicity near 1470±85 years, reminiscent of the cycles characteristic of Marine Isotope Stage 3; these cycles are absent from existing records of TSI, further questioning the millennial solar-climate connection.
NASA Astrophysics Data System (ADS)
Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.; Hammond, Glenn E.; Rockhold, Mark L.; Zachara, John M.; Rubin, Yoram
2012-06-01
Tracer tests performed under natural or forced gradient flow conditions can provide useful information for characterizing subsurface properties, through monitoring, modeling, and interpretation of the tracer plume migration in an aquifer. Nonreactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter tests. A Bayesian data assimilation technique, the method of anchored distributions (MAD) (Rubin et al., 2010), was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using constant-rate injection and borehole flowmeter test data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field tracer data at the Hanford 300 Area demonstrates that inverting for spatial heterogeneity of hydraulic conductivity under transient flow conditions is challenging and more work is needed.
Two-Stage Bayesian Model Averaging in Endogenous Variable Models*
Lenkoski, Alex; Eicher, Theo S.; Raftery, Adrian E.
2013-01-01
Economic modeling in the presence of endogeneity is subject to model uncertainty at both the instrument and covariate level. We propose a Two-Stage Bayesian Model Averaging (2SBMA) methodology that extends the Two-Stage Least Squares (2SLS) estimator. By constructing a Two-Stage Unit Information Prior in the endogenous variable model, we are able to efficiently combine established methods for addressing model uncertainty in regression models with the classic technique of 2SLS. To assess the validity of instruments in the 2SBMA context, we develop Bayesian tests of the identification restriction that are based on model averaged posterior predictive p-values. A simulation study showed that 2SBMA has the ability to recover structure in both the instrument and covariate set, and substantially improves the sharpness of resulting coefficient estimates in comparison to 2SLS using the full specification in an automatic fashion. Due to the increased parsimony of the 2SBMA estimate, the Bayesian Sargan test had a power of 50 percent in detecting a violation of the exogeneity assumption, while the method based on 2SLS using the full specification had negligible power. We apply our approach to the problem of development accounting, and find support not only for institutions, but also for geography and integration as development determinants, once both model uncertainty and endogeneity have been jointly addressed. PMID:24223471
A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies
Antoneli, Fernando; Passos, Fernando M.; Lopes, Luciano R.
2018-01-01
Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees and the well-known non-parametric (one-sample and two-sample) Kolmogorov-Smirnov (KS) goodness-of-fit tests. In the strict clock case, the method consists in using the one-sample KS test to directly test if the phylogeny is clock-like, in other words, if it follows a Poisson law. The ECD is computed from the discretized branch lengths and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for the auto-correlation in the ensemble of trees and pseudo-replication we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, it is observed that tree topologies with very long or very short branches lead to Poisson mixtures and in this case we propose the use of the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. Moreover, in this second form the test can also be applied to test for relaxed clock models. The use of a statistically equivalent ensemble of phylogenies to obtain the branch lengths ECD, instead of one consensus tree, yields considerable reduction of the effects of small sample size and provides a gain of power. PMID:29300759
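As an illustration of the one-sample comparison described in this abstract, the following minimal sketch computes the KS distance between the empirical cumulative distribution of discretized branch lengths and a Poisson distribution whose λ is the sample mean. It is an assumption-laden toy, not the authors' implementation: the function names are hypothetical, only the test statistic is computed (no p-value), and the thinning and effective-sample-size corrections mentioned above are not shown.

```python
import math


def poisson_cdf(k, lam):
    """Cumulative probability P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


def ks_statistic_poisson(counts):
    """Return (lam, D) for a list of non-negative integer branch lengths.

    lam is the sample mean; D is the largest absolute gap between the
    empirical CDF and the Poisson(lam) CDF. Both are step functions with
    jumps at the integers, so checking integer points is sufficient.
    """
    n = len(counts)
    lam = sum(counts) / n
    d = 0.0
    for k in range(max(counts) + 1):
        ecdf = sum(1 for c in counts if c <= k) / n
        d = max(d, abs(ecdf - poisson_cdf(k, lam)))
    return lam, d
```

A small D indicates branch lengths that are compatible with a single Poisson law; turning D into a decision would additionally require a reference distribution suited to discrete data, which this sketch leaves out.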
Saha, Sreemanti; Narang, Rahul; Deshmukh, Pradeep; Pote, Kiran; Anvikar, Anup; Narang, Pratibha
2017-01-01
The diagnostic techniques for malaria are undergoing a change depending on the availability of newer diagnostics and annual parasite index of infection in a particular area. At the country level, guidelines are available for selection of diagnostic tests; however, at the local level, this decision is made based on the malaria situation in the area. The tests are evaluated against the gold standard, and if that standard has limitations, it becomes difficult to compare other available tests. Bayesian latent class analysis computes its internal standard rather than using the conventional gold standard and helps comparison of various tests including the conventional gold standard. In a cross-sectional study conducted in a tertiary care hospital setting, we have evaluated smear microscopy, rapid diagnostic test (RDT), and polymerase chain reaction (PCR) for diagnosis of malaria using Bayesian latent class analysis. We found the magnitude of malaria to be 17.7% (95% confidence interval: 12.5%-23.9%) among the study subjects. In the present study, the sensitivity of microscopy was 63%, but it had very high specificity (99.4%). Sensitivity and specificity of RDT and PCR were high with RDT having a marginally higher sensitivity (94% vs. 90%) and specificity (99% vs. 95%). On comparison of likelihood ratios (LRs), RDT had the highest LR for positive test result (175) and the lowest LR for negative test result (0.058) among the three tests. In settings like ours, conventional smear microscopy may be replaced with RDT, and as we move toward elimination and facilities become available, PCR may be roped in to detect cases with lower parasitaemia.
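The likelihood ratios quoted in this abstract follow from the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below shows that arithmetic and how an LR converts a pre-test probability into a post-test probability; the function names are hypothetical. Feeding in the rounded percentages from the abstract will not reproduce the reported 175 and 0.058 exactly, presumably because those were computed from unrounded estimates.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test.

    Both inputs are proportions in [0, 1]; specificity must be > 0.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    if specificity < 1.0:
        lr_positive = sensitivity / (1.0 - specificity)
    else:
        lr_positive = float("inf")  # a perfectly specific test never gives false positives
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via the odds form of Bayes' rule."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)
```

For example, `likelihood_ratios(0.94, 0.99)` uses the rounded RDT figures, and `post_test_probability(0.177, lr)` applies the resulting ratio to the 17.7% prevalence reported for the study subjects.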
Bayesian Tracking of Emerging Epidemics Using Ensemble Optimal Statistical Interpolation
Cobb, Loren; Krishnamurthy, Ashok; Mandel, Jan; Beezley, Jonathan D.
2014-01-01
We present a preliminary test of the Ensemble Optimal Statistical Interpolation (EnOSI) method for the statistical tracking of an emerging epidemic, with a comparison to its popular relative for Bayesian data assimilation, the Ensemble Kalman Filter (EnKF). The spatial data for this test was generated by a spatial susceptible-infectious-removed (S-I-R) epidemic model of an airborne infectious disease. Both tracking methods in this test employed Poisson rather than Gaussian noise, so as to handle epidemic data more accurately. The EnOSI and EnKF tracking methods worked well on the main body of the simulated spatial epidemic, but the EnOSI was able to detect and track a distant secondary focus of infection that the EnKF missed entirely. PMID:25113590
Bayesian inference to identify parameters in viscoelasticity
NASA Astrophysics Data System (ADS)
Rappel, Hussein; Beex, Lars A. A.; Bordas, Stéphane P. A.
2017-08-01
This contribution discusses Bayesian inference (BI) as an approach to identify parameters in viscoelasticity. The aims are: (i) to show that the prior has a substantial influence for viscoelasticity, (ii) to show that this influence decreases for an increasing number of measurements and (iii) to show how different types of experiments influence the identified parameters and their uncertainties. The standard linear solid model is the material description of interest and a relaxation test, a constant strain-rate test and a creep test are the tensile experiments focused on. The experimental data are artificially created, allowing us to make a one-to-one comparison between the input parameters and the identified parameter values. Besides dealing with the aforementioned issues, we believe that this contribution forms a comprehensible start for those interested in applying BI in viscoelasticity.
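Aim (ii) above, that the influence of the prior shrinks as measurements accumulate, can be illustrated with a much simpler model than the standard linear solid: a single unknown parameter with a Gaussian prior and Gaussian measurement noise of known standard deviation. The sketch below is such a toy, chosen only because its posterior has a closed form; the function name and all numbers used with it are hypothetical and are not taken from the paper.

```python
def gaussian_posterior(prior_mean, prior_sd, data_mean, noise_sd, n):
    """Posterior mean and standard deviation of one parameter.

    Assumes a normal prior N(prior_mean, prior_sd^2) and n independent
    measurements with known noise_sd whose average is data_mean.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_variance = 1.0 / (prior_precision + data_precision)
    post_mean = post_variance * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_variance ** 0.5
```

With a prior centred at 100 and measurements averaging 200 (both with standard deviation 10), one measurement places the posterior mean halfway between the two, while 99 measurements pull it to about 199, so the prior's weight falls as 1 / (1 + n) in this setting.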
Trigram-based algorithms for OCR result correction
NASA Astrophysics Data System (ADS)
Bulatov, Konstantin; Manzhikov, Temudzhin; Slavin, Oleg; Faradjev, Igor; Janiszewski, Igor
2017-03-01
In this paper we consider the task of improving optical character recognition (OCR) results of document fields on low-quality and average-quality images using N-gram models. Cyrillic fields of the Russian Federation internal passport are analyzed as an example. Two approaches are presented: the first is based on the hypothesis that a symbol depends on its two adjacent symbols, and the second is based on the calculation of marginal distributions and Bayesian network computation. A comparison of the algorithms and experimental results within a real document OCR system are presented; it is shown that the document field OCR accuracy can be improved by more than 6% for low-quality images.
Identification of transmissivity fields using a Bayesian strategy and perturbative approach
NASA Astrophysics Data System (ADS)
Zanini, Andrea; Tanda, Maria Giovanna; Woodbury, Allan D.
2017-10-01
The paper deals with the crucial problem of groundwater parameter estimation, which is the basis for efficient modeling and reclamation activities. A hierarchical Bayesian approach is developed: it uses Akaike's Bayesian Information Criterion in order to estimate the hyperparameters (related to the covariance model chosen) and to quantify the unknown noise variance. The transmissivity identification proceeds in two steps: the first, called empirical Bayesian interpolation, uses Y* (Y = lnT) observations to interpolate Y values on a specified grid; the second, called empirical Bayesian update, improves the previous Y estimate through the addition of hydraulic head observations. The relationship between the head and the lnT has been linearized through a perturbative solution of the flow equation. In order to test the proposed approach, synthetic aquifers from the literature have been considered. The aquifers in question contain a variety of boundary conditions (both Dirichlet and Neumann type) and scales of heterogeneity (σ_Y^2 = 1.0 and σ_Y^2 = 5.3). The estimated transmissivity fields were compared to the true one. The joint use of Y* and head measurements improves the estimation of Y for both degrees of heterogeneity. Even if the variance of the strong transmissivity field can be considered high for the application of the perturbative approach, the results show the same order of approximation as the non-linear methods proposed in the literature. The procedure allows computing the posterior probability distribution of the target quantities and quantifying the uncertainty in the model prediction. Bayesian updating has advantages related both to the Monte-Carlo (MC) and non-MC approaches. In fact, like the MC methods, Bayesian updating allows computing the direct posterior probability distribution of the target quantities, and like non-MC methods it has computational times on the order of seconds.
NASA Astrophysics Data System (ADS)
Chung, Hye Won; Guha, Saikat; Zheng, Lizhong
2017-07-01
We study the problem of designing optical receivers to discriminate between multiple coherent states using coherent processing receivers—i.e., one that uses arbitrary coherent feedback control and quantum-noise-limited direct detection—which was shown by Dolinar to achieve the minimum error probability in discriminating any two coherent states. We first derive and reinterpret Dolinar's binary-hypothesis minimum-probability-of-error receiver as the one that optimizes the information efficiency at each time instant, based on recursive Bayesian updates within the receiver. Using this viewpoint, we propose a natural generalization of Dolinar's receiver design to discriminate M coherent states, each of which could now be a codeword, i.e., a sequence of N coherent states, each drawn from a modulation alphabet. We analyze the channel capacity of the pure-loss optical channel with a general coherent-processing receiver in the low-photon number regime and compare it with the capacity achievable with direct detection and the Holevo limit (achieving the latter would require a quantum joint-detection receiver). We show compelling evidence that despite the optimal performance of Dolinar's receiver for the binary coherent-state hypothesis test (either in error probability or mutual information), the asymptotic communication rate achievable by such a coherent-processing receiver is only as good as direct detection. This suggests that in the infinitely long codeword limit, all potential benefits of coherent processing at the receiver can be obtained by designing a good code and direct detection, with no feedback within the receiver.
Bayesian characterization of uncertainty in species interaction strengths.
Wolf, Christopher; Novak, Mark; Gitelman, Alix I
2017-06-01
Considerable effort has been devoted to the estimation of species interaction strengths. This effort has focused primarily on statistical significance testing and obtaining point estimates of parameters that contribute to interaction strength magnitudes, leaving the characterization of uncertainty associated with those estimates unconsidered. We consider a means of characterizing the uncertainty of a generalist predator's interaction strengths by formulating an observational method for estimating a predator's prey-specific per capita attack rates as a Bayesian statistical model. This formulation permits the explicit incorporation of multiple sources of uncertainty. A key insight is the informative nature of several so-called non-informative priors that have been used in modeling the sparse data typical of predator feeding surveys. We introduce to ecology a new neutral prior and provide evidence for its superior performance. We use a case study to consider the attack rates in a New Zealand intertidal whelk predator, and we illustrate not only that Bayesian point estimates can be made to correspond with those obtained by frequentist approaches, but also that estimation uncertainty as described by 95% intervals is more useful and biologically realistic using the Bayesian method. In particular, unlike in bootstrap confidence intervals, the lower bounds of the Bayesian posterior intervals for attack rates do not include zero when a predator-prey interaction is in fact observed. We conclude that the Bayesian framework provides a straightforward, probabilistic characterization of interaction strength uncertainty, enabling future considerations of both the deterministic and stochastic drivers of interaction strength and their impact on food webs.
NASA Astrophysics Data System (ADS)
Kushida, N.; Kebede, F.; Feitio, P.; Le Bras, R.
2016-12-01
The Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) has been developing and testing NET-VISA (Arora et al., 2013), a Bayesian automatic event detection and localization program, and evaluating its performance in a realistic operational mode. In our preliminary testing at the CTBTO, NET-VISA shows better performance than its currently operating automatic localization program. However, given CTBTO's role and its international context, a new technology should be introduced cautiously when it replaces a key piece of the automatic processing. We integrated the results of NET-VISA into the Analyst Review Station, extensively used by the analysts so that they can check the accuracy and robustness of the Bayesian approach. We expect the workload of the analysts to be reduced because of the better performance of NET-VISA in finding missed events and getting a more complete set of stations than the current system which has been operating for nearly twenty years. The results of a series of tests indicate that the expectations born from the automatic tests, which show an overall overlap improvement of 11%, meaning that the missed events rate is cut by 42%, hold for the integrated interactive module as well. New events are found by analysts, which qualify for the CTBTO Reviewed Event Bulletin, beyond the ones analyzed through the standard procedures. Arora, N., Russell, S., and Sudderth, E., NET-VISA: Network Processing Vertically Integrated Seismic Analysis, 2013, Bull. Seismol. Soc. Am., 103, 709-729.
Analytical study to define a helicopter stability derivative extraction method, volume 1
NASA Technical Reports Server (NTRS)
Molusis, J. A.
1973-01-01
A method is developed for extracting six degree-of-freedom stability and control derivatives from helicopter flight data. Different combinations of filtering and derivative estimate are investigated and used with a Bayesian approach for derivative identification. The combination of filtering and estimate found to yield the most accurate time response match to flight test data is determined and applied to CH-53A and CH-54B flight data. The method found to be most accurate consists of (1) filtering flight test data with a digital filter, followed by an extended Kalman filter, (2) identifying a derivative estimate with a least square estimator, and (3) obtaining derivatives with the Bayesian derivative extraction method.
A Bayesian observer replicates convexity context effects in figure-ground perception.
Goldreich, Daniel; Peterson, Mary A
2012-01-01
Peterson and Salvagio (2008) demonstrated convexity context effects in figure-ground perception. Subjects shown displays consisting of unfamiliar alternating convex and concave regions identified the convex regions as foreground objects progressively more frequently as the number of regions increased; this occurred only when the concave regions were homogeneously colored. The origins of these effects have been unclear. Here, we present a two-free-parameter Bayesian observer that replicates convexity context effects. The Bayesian observer incorporates two plausible expectations regarding three-dimensional scenes: (1) objects tend to be convex rather than concave, and (2) backgrounds tend (more than foreground objects) to be homogeneously colored. The Bayesian observer estimates the probability that a depicted scene is three-dimensional, and that the convex regions are figures. It responds stochastically by sampling from its posterior distributions. Like human observers, the Bayesian observer shows convexity context effects only for images with homogeneously colored concave regions. With optimal parameter settings, it performs similarly to the average human subject on the four display types tested. We propose that object convexity and background color homogeneity are environmental regularities exploited by human visual perception; vision achieves figure-ground perception by interpreting ambiguous images in light of these and other expected regularities in natural scenes.
Tracking composite material damage evolution using Bayesian filtering and flash thermography data
NASA Astrophysics Data System (ADS)
Gregory, Elizabeth D.; Holland, Steve D.
2016-05-01
We propose a method for tracking the condition of a composite part using Bayesian filtering of flash thermography data over the lifetime of the part. In this demonstration, composite panels were fabricated; impacted to induce subsurface delaminations; and loaded in compression over multiple time steps, causing the delaminations to grow in size. Flash thermography data was collected between each damage event to serve as a time history of the part. The flash thermography indicated some areas of damage but provided little additional information as to the exact nature or depth of the damage. Computed tomography (CT) data was also collected after each damage event and provided a high resolution volume model of damage that acted as truth. After each cycle, the condition estimate, from the flash thermography data and the Bayesian filter, was compared to 'ground truth'. The Bayesian process builds on the lifetime history of flash thermography scans and can give better estimates of material condition as compared to the most recent scan alone, which is common practice in the aerospace industry. Bayesian inference provides probabilistic estimates of damage condition that are updated as each new set of data becomes available. The method was tested on simulated data and then on an experimental data set.
Natural frequencies facilitate diagnostic inferences of managers
Hoffrage, Ulrich; Hafenbrädl, Sebastian; Bouquet, Cyril
2015-01-01
In Bayesian inference tasks, information about base rates as well as hit rate and false-alarm rate needs to be integrated according to Bayes’ rule after the result of a diagnostic test becomes known. Numerous studies have found that presenting information in a Bayesian inference task in terms of natural frequencies leads to better performance compared to variants with information presented in terms of probabilities or percentages. Natural frequencies are the tallies in a natural sample in which hit rate and false-alarm rate are not normalized with respect to base rates. The present research replicates the beneficial effect of natural frequencies with four tasks from the domain of management, and with management students as well as experienced executives as participants. The percentage of Bayesian responses was almost twice as high when information was presented in natural frequencies compared to a presentation in terms of percentages. In contrast to most tasks previously studied, the majority of numerical responses were lower than the Bayesian solutions. Having heard of Bayes’ rule prior to the study did not affect Bayesian performance. An implication of our work is that textbooks explaining Bayes’ rule should teach how to represent information in terms of natural frequencies instead of how to plug probabilities or percentages into a formula. PMID:26157397
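The contrast between the two presentation formats can be made concrete with a short sketch. Both functions below return the same posterior probability; the first plugs probabilities into Bayes' rule, while the second works from the tallies of a natural sample, where the computation reduces to hits divided by all positive results. The function names and the numbers in the usage note are hypothetical and do not come from the study's tasks.

```python
def posterior_from_probabilities(base_rate, hit_rate, false_alarm_rate):
    """Bayes' rule in probability format: P(condition | positive result)."""
    true_positive = base_rate * hit_rate
    false_positive = (1.0 - base_rate) * false_alarm_rate
    return true_positive / (true_positive + false_positive)


def posterior_from_natural_frequencies(hits, false_alarms):
    """Same quantity from natural-frequency tallies.

    hits: cases with the condition and a positive result.
    false_alarms: cases without the condition but with a positive result.
    """
    return hits / (hits + false_alarms)
```

Consider an invented sample of 1000 cases with a 10% base rate, an 80% hit rate, and a 10% false-alarm rate: 100 cases have the condition, of which 80 test positive, and 90 of the remaining 900 also test positive. The natural-frequency form is then simply 80 / (80 + 90), which matches `posterior_from_probabilities(0.1, 0.8, 0.1)`.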
Bayesian approach for counting experiment statistics applied to a neutrino point source analysis
NASA Astrophysics Data System (ADS)
Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.
2013-12-01
In this paper we present a model-independent analysis method following Bayesian statistics to analyse data from a generic counting experiment and apply it to the search for neutrinos from point sources. We discuss a test statistic defined following a Bayesian framework that will be used in the search for a signal. In case no signal is found, we derive an upper limit without the introduction of approximations. The Bayesian approach allows us to obtain the full probability density function for both the background and the signal rate. As such, we have direct access to any signal upper limit. The upper limit derivation directly compares with a frequentist approach and is robust in the case of low-counting observations. Furthermore, it also allows previous upper limits obtained by other analyses to be accounted for via the concept of prior information, without the need for the ad hoc application of trial factors. To investigate the validity of the presented Bayesian approach, we have applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and we have obtained a flux upper limit, which is in agreement with the upper limits determined via a frequentist approach. Furthermore, the upper limit obtained compares well with the previously published result of IceCube, using the same data set.
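A Bayesian signal upper limit for a counting experiment can be sketched as follows. This is a simplified illustration rather than the paper's method: it assumes a Poisson likelihood, a known and strictly positive expected background, and a flat prior on the signal rate, and it integrates the posterior numerically on a grid. The function name, grid settings and example counts are assumptions.

```python
import math


def upper_limit(n_observed, background, credibility=0.90, s_max=50.0, steps=50_000):
    """Credible upper limit on the signal rate s for n ~ Poisson(s + background).

    Assumes a flat prior on s >= 0 and background > 0 (both are simplifications).
    """
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]
    # Unnormalised log-posterior; the n! term is constant in s and is dropped.
    log_post = [n_observed * math.log(s + background) - (s + background) for s in grid]
    peak = max(log_post)
    density = [math.exp(v - peak) for v in log_post]
    # Trapezoid areas between neighbouring grid points.
    areas = [0.5 * (density[i] + density[i + 1]) * ds for i in range(steps)]
    total = sum(areas)
    running = 0.0
    for i, area in enumerate(areas):
        running += area
        if running >= credibility * total:
            return grid[i + 1]
    return s_max


limit_no_events = upper_limit(n_observed=0, background=1.5)
limit_three_events = upper_limit(n_observed=3, background=1.5)
print(f"90% upper limit, n=0: {limit_no_events:.3f}")
print(f"90% upper limit, n=3: {limit_three_events:.3f}")
```

With zero observed events the posterior under these assumptions is proportional to exp(-s), so the 90% limit is ln(10), roughly 2.30, independent of the background; observing more events raises the limit.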
Moran, Rosalyn J; Symmonds, Mkael; Dolan, Raymond J; Friston, Karl J
2014-01-01
The aging brain shows a progressive loss of neuropil, which is accompanied by subtle changes in neuronal plasticity, sensory learning and memory. Neurophysiologically, aging attenuates evoked responses--including the mismatch negativity (MMN). This is accompanied by a shift in cortical responsivity from sensory (posterior) regions to executive (anterior) regions, which has been interpreted as a compensatory response for cognitive decline. Theoretical neurobiology offers a simpler explanation for all of these effects--from a Bayesian perspective, as the brain is progressively optimized to model its world, its complexity will decrease. A corollary of this complexity reduction is an attenuation of Bayesian updating or sensory learning. Here we confirmed this hypothesis using magnetoencephalographic recordings of the mismatch negativity elicited in a large cohort of human subjects, in their third to ninth decade. Employing dynamic causal modeling to assay the synaptic mechanisms underlying these non-invasive recordings, we found a selective age-related attenuation of synaptic connectivity changes that underpin rapid sensory learning. In contrast, baseline synaptic connectivity strengths were consistently strong over the decades. Our findings suggest that the lifetime accrual of sensory experience optimizes functional brain architectures to enable efficient and generalizable predictions of the world.
The performance of matched-field track-before-detect methods using shallow-water Pacific data.
Tantum, Stacy L; Nolte, Loren W; Krolik, Jeffrey L; Harmanci, Kerem
2002-07-01
Matched-field track-before-detect processing, which extends the concept of matched-field processing to include modeling of the source dynamics, has recently emerged as a promising approach for maintaining the track of a moving source. In this paper, optimal Bayesian and minimum variance beamforming track-before-detect algorithms which incorporate a priori knowledge of the source dynamics in addition to the underlying uncertainties in the ocean environment are presented. A Markov model is utilized for the source motion as a means of capturing the stochastic nature of the source dynamics without assuming uniform motion. In addition, the relationship between optimal Bayesian track-before-detect processing and minimum variance track-before-detect beamforming is examined, revealing how an optimal tracking philosophy may be used to guide the modification of existing beamforming techniques to incorporate track-before-detect capabilities. Further, the benefits of implementing an optimal approach over conventional methods are illustrated through application of these methods to shallow-water Pacific data collected as part of the SWellEX-1 experiment. The results show that incorporating Markovian dynamics for the source motion provides marked improvement in the ability to maintain target track without the use of a uniform velocity hypothesis.
NASA Astrophysics Data System (ADS)
Kopka, Piotr; Wawrzynczak, Anna; Borysiewicz, Mieczyslaw
2016-11-01
In this paper the Bayesian methodology, known as Approximate Bayesian Computation (ABC), is applied to the problem of the atmospheric contamination source identification. The algorithm input data are on-line arriving concentrations of the released substance registered by the distributed sensors network. This paper presents the Sequential ABC algorithm in detail and tests its efficiency in estimation of probabilistic distributions of atmospheric release parameters of a mobile contamination source. The developed algorithms are tested using the data from Over-Land Atmospheric Diffusion (OLAD) field tracer experiment. The paper demonstrates estimation of seven parameters characterizing the contamination source, i.e.: contamination source starting position (x,y), the direction of the motion of the source (d), its velocity (v), release rate (q), start time of release (ts) and its duration (td). The online-arriving new concentrations dynamically update the probability distributions of search parameters. The atmospheric dispersion Second-order Closure Integrated PUFF (SCIPUFF) Model is used as the forward model to predict the concentrations at the sensors locations.
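The basic idea behind Approximate Bayesian Computation can be sketched with plain rejection sampling. This is deliberately simpler than the Sequential ABC algorithm described above and does not use SCIPUFF: the toy forward model, the sensor distances, the uniform prior range, the tolerance and all names are hypothetical, and only a single parameter (the release rate) is inferred.

```python
import random

# Hypothetical distances from the source to three sensors.
SENSOR_DISTANCES = [1.0, 2.0, 3.0]


def forward_model(release_rate):
    """Toy stand-in for a dispersion model: concentration falls off with distance."""
    return [release_rate / (1.0 + d * d) for d in SENSOR_DISTANCES]


def abc_rejection(observed, n_draws, tolerance, rng, prior_low=0.0, prior_high=10.0):
    """Keep prior draws whose simulated concentrations lie close to the observations."""
    accepted = []
    for _ in range(n_draws):
        candidate = rng.uniform(prior_low, prior_high)   # draw from the prior
        simulated = forward_model(candidate)             # simulate sensor readings
        distance = max(abs(s - o) for s, o in zip(simulated, observed))
        if distance <= tolerance:
            accepted.append(candidate)
    return accepted


rng = random.Random(0)
true_rate = 4.0
observed = forward_model(true_rate)
samples = abc_rejection(observed, n_draws=20_000, tolerance=0.1, rng=rng)
estimate = sum(samples) / len(samples)
print(f"accepted {len(samples)} draws, posterior mean release rate ~ {estimate:.2f}")
```

The accepted draws approximate the posterior without ever evaluating a likelihood; sequential variants refine this by shrinking the tolerance and reusing accepted draws as new data arrive.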
Pidlisecky, Adam; Haines, S.S.
2011-01-01
Conventional processing methods for seismic cone penetrometer data present several shortcomings, most notably the absence of a robust velocity model uncertainty estimate. We propose a new seismic cone penetrometer testing (SCPT) data-processing approach that employs Bayesian methods to map measured data errors into quantitative estimates of model uncertainty. We first calculate travel-time differences for all permutations of seismic trace pairs. That is, we cross-correlate each trace at each measurement location with every trace at every other measurement location to determine travel-time differences that are not biased by the choice of any particular reference trace and to thoroughly characterize data error. We calculate a forward operator that accounts for the different ray paths for each measurement location, including refraction at layer boundaries. We then use a Bayesian inversion scheme to obtain the most likely slowness (the reciprocal of velocity) and a distribution of probable slowness values for each model layer. The result is a velocity model that is based on correct ray paths, with uncertainty bounds that are based on the data error. © NRC Research Press 2011.
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn, in an unsupervised manner, the behavior of individuals and of the crowd as a single entity, and to explore the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals is described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states, yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance to existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
The multicategory case of the sequential Bayesian pixel selection and estimation procedure
NASA Technical Reports Server (NTRS)
Pore, M. D.; Dennis, T. B. (Principal Investigator)
1980-01-01
A Bayesian technique for stratified proportion estimation and a sampling based on minimizing the mean squared error of this estimator were developed and tested on LANDSAT multispectral scanner data using the beta density function to model the prior distribution in the two-class case. An extension of this procedure to the k-class case is considered. A generalization of the beta function is shown to be a density function for the general case which allows the procedure to be extended.
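The step from two classes to k classes can be illustrated with the Dirichlet distribution, the standard k-class generalization of the beta density. Whether the report uses exactly this form is an assumption; the sketch below only shows the conjugate update of class proportions from labelled pixel counts, with hypothetical priors and counts.

```python
def dirichlet_posterior_mean(prior_alpha, class_counts):
    """Posterior mean class proportions for a Dirichlet prior and multinomial counts.

    With two classes this reduces to the familiar beta-binomial update.
    """
    if len(prior_alpha) != len(class_counts):
        raise ValueError("one prior parameter is needed per class")
    posterior_alpha = [a + c for a, c in zip(prior_alpha, class_counts)]
    total = sum(posterior_alpha)
    return [a / total for a in posterior_alpha]


# Two-class case: uniform Beta(1, 1) prior, 3 pixels of class A and 1 of class B.
two_class = dirichlet_posterior_mean([1.0, 1.0], [3, 1])

# Three-class case: uniform Dirichlet(1, 1, 1) prior and hypothetical pixel counts.
three_class = dirichlet_posterior_mean([1.0, 1.0, 1.0], [5, 3, 2])

print(two_class)
print(three_class)
```

The same function handles any number of classes, which mirrors how a k-class density lets a two-class procedure be extended.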
Exploiting Cross-sensitivity by Bayesian Decoding of Mixed Potential Sensor Arrays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kreller, Cortney
LANL mixed-potential electrochemical sensor (MPES) device arrays were coupled with advanced Bayesian inference treatment of the physical model of relevant sensor-analyte interactions. We demonstrated that our approach could be used to uniquely discriminate the composition of ternary gas sensors with three discrete MPES sensors with an average error of less than 2%. We also observed that the MPES exhibited excellent stability over a year of operation at elevated temperatures in the presence of test gases.
Bayesian networks and statistical analysis application to analyze the diagnostic test accuracy
NASA Astrophysics Data System (ADS)
Orzechowski, P.; Makal, Jaroslaw; Onisko, A.
2005-02-01
A computer-aided BPH diagnosis system based on a Bayesian network is described in the paper. First results are compared to a given statistical method. Different statistical methods have been used successfully in medicine for years. However, the undoubted advantages of probabilistic methods make them useful in newly created systems, which are frequent in medicine but do not have full and competent knowledge. The article presents the advantages of the computer-aided BPH diagnosis system in clinical practice for urologists.
New methods of testing nonlinear hypothesis using iterative NLLS estimator
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper discusses the method of testing nonlinear hypothesis using iterative Nonlinear Least Squares (NLLS) estimator. Takeshi Amemiya [1] explained this method. However, in the present research paper, a modified Wald test statistic due to Engle, Robert [6] is proposed to test the nonlinear hypothesis using iterative NLLS estimator. An alternative method for testing nonlinear hypothesis using iterative NLLS estimator based on nonlinear studentized residuals has been proposed. In this research article an innovative method of testing nonlinear hypothesis using iterative restricted NLLS estimator is derived. Pesaran and Deaton [10] explained the methods of testing nonlinear hypothesis. This paper uses asymptotic properties of nonlinear least squares estimator proposed by Jenrich [8]. The main purpose of this paper is to provide very innovative methods of testing nonlinear hypothesis using iterative NLLS estimator, iterative NLLS estimator based on nonlinear studentized residuals and iterative restricted NLLS estimator. Eakambaram et al. [12] discussed least absolute deviation estimations versus nonlinear regression model with heteroscedastic errors and also studied the problem of heteroscedasticity with reference to nonlinear regression models with suitable illustration. William Grene [13] examined the interaction effect in nonlinear models discussed by Ai and Norton [14] and suggested ways to examine the effects that do not involve statistical testing. Peter [15] provided guidelines for identifying composite hypothesis and addressing the probability of false rejection for multiple hypotheses.
A program for the Bayesian Neural Network in the ROOT framework
NASA Astrophysics Data System (ADS)
Zhong, Jiahang; Huang, Run-Sheng; Lee, Shih-Chang
2011-12-01
We present a Bayesian Neural Network algorithm implemented in the TMVA package (Hoecker et al., 2007 [1]), within the ROOT framework (Brun and Rademakers, 1997 [2]). Compared to the conventional utilization of a Neural Network as a discriminator, this new implementation has more advantages as a non-parametric regression tool, particularly for fitting probabilities. It provides functionalities including cost function selection, complexity control and uncertainty estimation. An example of such an application in High Energy Physics is shown. The algorithm is available with ROOT release later than 5.29.
Program summary
Program title: TMVA-BNN
Catalogue identifier: AEJX_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJX_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: BSD license
No. of lines in distributed program, including test data, etc.: 5094
No. of bytes in distributed program, including test data, etc.: 1,320,987
Distribution format: tar.gz
Programming language: C++
Computer: Any computer system or cluster with C++ compiler and UNIX-like operating system
Operating system: Most UNIX/Linux systems. The application programs were thoroughly tested under Fedora and Scientific Linux CERN.
Classification: 11.9
External routines: ROOT package version 5.29 or higher (http://root.cern.ch)
Nature of problem: Non-parametric fitting of multivariate distributions
Solution method: An implementation of Neural Network following the Bayesian statistical interpretation. Uses Laplace approximation for the Bayesian marginalizations. Provides the functionalities of automatic complexity control and uncertainty estimation.
Running time: Time consumption for the training depends substantially on the size of input sample, the NN topology, the number of training iterations, etc. For the example in this manuscript, about 7 min was used on a PC/Linux with 2.0 GHz processors.
A Bayesian Framework for Reliability Analysis of Spacecraft Deployments
NASA Technical Reports Server (NTRS)
Evans, John W.; Gallo, Luis; Kaminsky, Mark
2012-01-01
Deployable subsystems are essential to mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications for which various standardized data sources are not applicable for estimating reliability and for assessing risks. In this study, a two stage sequential Bayesian framework for reliability estimation of spacecraft deployment was developed for this purpose. This process was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the Optical Telescope Element. Initially, detailed studies of NASA deployment history, "heritage information", were conducted, extending over 45 years of spacecraft launches. This information was then coupled to a non-informative prior and a binomial likelihood function to create a posterior distribution for deployments of various subsystems using Monte Carlo Markov Chain sampling. Select distributions were then coupled to a subsequent analysis, using test data and anomaly occurrences on successive ground test deployments of scale model test articles of JWST hardware, to update the NASA heritage data. This allowed for a realistic prediction for the reliability of the complex Sunshield deployment, with credibility limits, within this two stage Bayesian framework.
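The two-stage structure (heritage data first, then ground-test data) can be sketched with a conjugate beta-binomial update. This swaps in the closed-form conjugate result in place of the Markov chain sampling used in the study, assumes a Jeffreys Beta(0.5, 0.5) prior as one possible non-informative choice, and uses entirely hypothetical success and failure counts; the function names are illustrative.

```python
import random


def update_beta(alpha, beta, successes, failures):
    """Conjugate update: Beta prior plus binomial data gives a Beta posterior."""
    return alpha + successes, beta + failures


def lower_credible_bound(alpha, beta, level=0.05, draws=20_000, seed=0):
    """Monte Carlo estimate of a lower credible bound on reliability."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    return samples[int(level * draws)]


# Assumed non-informative starting point: Jeffreys prior Beta(0.5, 0.5).
prior = (0.5, 0.5)

# Stage 1: hypothetical heritage record of 48 successful deployments and 2 failures.
stage_one = update_beta(*prior, successes=48, failures=2)

# Stage 2: the stage-1 posterior becomes the prior for 10 hypothetical ground tests.
stage_two = update_beta(*stage_one, successes=10, failures=0)

posterior_mean = stage_two[0] / sum(stage_two)
lower_bound = lower_credible_bound(*stage_two)
print(f"posterior mean reliability: {posterior_mean:.4f}")
print(f"5% lower credible bound:    {lower_bound:.4f}")

# Sanity check: updating in two stages matches a single pooled update.
pooled = update_beta(*prior, successes=58, failures=2)
```

In this conjugate setting the sequential and pooled updates coincide; a sampling-based approach becomes necessary once the heritage data are weighted, partially relevant, or modelled hierarchically.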
Phylogeny and Evolutionary Patterns in the Dwarf Crayfish Subfamily (Decapoda: Cambarellinae)
Pedraza-Lara, Carlos; Doadrio, Ignacio; Breinholt, Jesse W.; Crandall, Keith A.
2012-01-01
The Dwarf crayfish, or Cambarellinae, is a morphologically singular subfamily of decapod crustaceans that contains only one genus, Cambarellus. Its intriguing distribution, along the river basins of the Gulf Coast of the United States (Gulf Group) and into Central México (Mexican Group), has until now lacked a satisfactory explanation. This study provides a comprehensive sampling of most of the extant species of Cambarellus and sheds light on its evolutionary history, systematics and biogeography. We tested the impact of Gulf Group versus Mexican Group geography on rates of cladogenesis using a maximum likelihood framework, testing different models of birth/extinction of lineages. We propose a comprehensive phylogenetic hypothesis for the subfamily based on mitochondrial and nuclear loci (3,833 bp) using Bayesian and Maximum Likelihood methods. The phylogenetic structure found two phylogenetic groups associated with the two main geographic components (Gulf Group and Mexican Group) and is partially consistent with the historical structure of river basins. The previous hypothesis, which divided the genus into three subgenera based on genitalia morphology, was only partially supported (P = 0.047), resulting in a paraphyletic subgenus Pandicambarus. We found at least two cases in which phylogenetic structure failed to recover monophyly of recognized species while detecting several cases of cryptic diversity, corresponding to lineages not assigned to any described species. Cladogenetic patterns in the entire subfamily are better explained by an allopatric model of speciation. Diversification analyses showed similar cladogenesis patterns between both groups and did not significantly differ from the constant rate models. While cladogenesis in the Gulf Group is coincident in time with changes in the sea levels, in the Mexican Group, cladogenesis is congruent with the formation of the Trans-Mexican Volcanic Belt. 
Our results show how similar allopatric divergence in freshwater organisms can be promoted through diverse vicariant factors. PMID:23155379
Ritchie, Andrew M; Lo, Nathan; Ho, Simon Y W
2017-05-01
In Bayesian phylogenetic analyses of genetic data, prior probability distributions need to be specified for the model parameters, including the tree. When Bayesian methods are used for molecular dating, available tree priors include those designed for species-level data, such as the pure-birth and birth-death priors, and coalescent-based priors designed for population-level data. However, molecular dating methods are frequently applied to data sets that include multiple individuals across multiple species. Such data sets violate the assumptions of both the speciation and coalescent-based tree priors, making it unclear which should be chosen and whether this choice can affect the estimation of node times. To investigate this problem, we used a simulation approach to produce data sets with different proportions of within- and between-species sampling under the multispecies coalescent model. These data sets were then analyzed under pure-birth, birth-death, constant-size coalescent, and skyline coalescent tree priors. We also explored the ability of Bayesian model testing to select the best-performing priors. We confirmed the applicability of our results to empirical data sets from cetaceans, phocids, and coregonid whitefish. Estimates of node times were generally robust to the choice of tree prior, but some combinations of tree priors and sampling schemes led to large differences in the age estimates. In particular, the pure-birth tree prior frequently led to inaccurate estimates for data sets containing a mixture of inter- and intraspecific sampling, whereas the birth-death and skyline coalescent priors produced stable results across all scenarios. Model testing provided an adequate means of rejecting inappropriate tree priors. Our results suggest that tree priors do not strongly affect Bayesian molecular dating results in most cases, even when severely misspecified. 
However, the choice of tree prior can be significant for the accuracy of dating results in the case of data sets with mixed inter- and intraspecies sampling. [Bayesian phylogenetic methods; model testing; molecular dating; node time; tree prior.]. © The authors 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.