NASA Astrophysics Data System (ADS)
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatorics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramér-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
ERIC Educational Resources Information Center
Vos, Hans J.
An approach to simultaneous optimization of assignments of subjects to treatments followed by an end-of-mastery test is presented using the framework of Bayesian decision theory. Focus is on demonstrating how rules for the simultaneous optimization of sequences of decisions can be found. The main advantages of the simultaneous approach, compared…
Groth, Katrina M.; Smith, Curtis L.; Swiler, Laura P.
2014-04-05
In the past several years, several international agencies have begun to collect data on human performance in nuclear power plant simulators [1]. This data provides a valuable opportunity to improve human reliability analysis (HRA), but these improvements will not be realized without implementation of Bayesian methods. Bayesian methods are widely used to incorporate sparse data into models in many parts of probabilistic risk assessment (PRA), but they have not been adopted by the HRA community. In this article, we provide a Bayesian methodology to formally use simulator data to refine the human error probabilities (HEPs) assigned by existing HRA methods. We demonstrate the methodology with a case study, wherein we use simulator data from the Halden Reactor Project to update the probability assignments from the SPAR-H method. The case study demonstrates the ability to use performance data, even sparse data, to improve existing HRA methods. Furthermore, this paper also serves as a demonstration of the value of Bayesian methods to improve the technical basis of HRA.
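A minimal sketch of the kind of update the abstract describes, using a conjugate Beta-Binomial model: a nominal HEP from an existing HRA method becomes a Beta prior that sparse simulator counts then refine. All numbers (prior strength, error counts) are invented for illustration and are not the SPAR-H/Halden values.

```python
from scipy import stats

# Nominal HEP from an existing HRA method (e.g., SPAR-H), encoded as a
# Beta prior whose mean matches the nominal value. Values are illustrative.
nominal_hep = 0.01
prior_strength = 50                      # pseudo-observations behind the prior
a0 = nominal_hep * prior_strength        # prior "errors"
b0 = (1 - nominal_hep) * prior_strength  # prior "successes"

# Sparse simulator data: 2 errors observed in 40 crew trials (invented).
errors, trials = 2, 40

# Conjugate update: Beta(a0, b0) prior + Binomial data -> Beta posterior.
a_post, b_post = a0 + errors, b0 + (trials - errors)
posterior = stats.beta(a_post, b_post)

print(f"posterior mean HEP = {posterior.mean():.4f}")
print("95% credible interval =", posterior.interval(0.95))
```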
Porter, Teresita M; Gibson, Joel F; Shokralla, Shadi; Baird, Donald J; Golding, G Brian; Hajibabaei, Mehrdad
2014-01-01
Current methods to identify unknown insect (class Insecta) cytochrome c oxidase (COI barcode) sequences often rely on thresholds of distances that can be difficult to define, sequence similarity cut-offs, or monophyly. Some of the most commonly used metagenomic classification methods do not provide a measure of confidence for the taxonomic assignments they provide. The aim of this study was to use a naïve Bayesian classifier (Wang et al. Applied and Environmental Microbiology, 2007; 73: 5261) to automate taxonomic assignments for large batches of insect COI sequences such as data obtained from high-throughput environmental sequencing. This method provides rank-flexible taxonomic assignments with an associated bootstrap support value, and it is faster than the BLAST-based methods commonly used in environmental sequence surveys. We have developed and rigorously tested the performance of three different training sets using leave-one-out cross-validation, two field data sets, and targeted testing of Lepidoptera, Diptera and Mantodea sequences obtained from the Barcode of Life Data system. We found that type I error rates, incorrect taxonomic assignments with a high bootstrap support, were already relatively low but could be lowered further by ensuring that all query taxa are actually present in the reference database. Choosing bootstrap support cut-offs according to query length and summarizing taxonomic assignments to more inclusive ranks can also help to reduce error while retaining the maximum number of assignments. Additionally, we highlight gaps in the taxonomic and geographic representation of insects in public sequence databases that will require further work by taxonomists to improve the quality of assignments generated using any method.
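A toy, single-rank sketch of an RDP-style naïve Bayesian classifier in the spirit of Wang et al. (2007): 8-mer word likelihoods per taxon plus a bootstrap support value obtained by re-classifying random subsets of the query's words. The smoothing and subset size are simplified assumptions, not the trained pipeline the authors evaluated.

```python
import random
from collections import Counter, defaultdict
from math import log

K = 8  # word (k-mer) size used by RDP-style classifiers

def kmers(seq):
    return {seq[i:i + K] for i in range(len(seq) - K + 1)}

def train(reference):                       # reference: list of (taxon, sequence)
    counts, sizes = defaultdict(Counter), Counter()
    for taxon, seq in reference:
        sizes[taxon] += 1
        counts[taxon].update(kmers(seq))
    # Smoothed P(word | taxon); the 0.5/+1 smoothing is a simplification.
    return {t: {w: (c + 0.5) / (sizes[t] + 1) for w, c in counts[t].items()}
            for t in counts}, sizes

def classify(model, sizes, query):
    words = list(kmers(query))
    def best(ws):
        scores = {t: sum(log(model[t].get(w, 0.5 / (sizes[t] + 1))) for w in ws)
                  for t in model}
        return max(scores, key=scores.get)
    call = best(words)
    # Bootstrap support: refit the call on random 1/8 subsets of the words.
    n = max(1, len(words) // 8)
    boot = sum(best(random.choices(words, k=n)) == call for _ in range(100))
    return call, boot / 100
```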
Optimal Bayesian Adaptive Design for Test-Item Calibration.
van der Linden, Wim J; Ren, Hao
2015-06-01
An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.
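The following sketch illustrates the D-optimality idea for a single assignment decision under assumed 2PL field-test items: posterior draws of the examinee's ability determine each item's expected Fisher information, and the item whose accumulated information matrix gains the most log-determinant is assigned. The data structures are hypothetical, and the paper's full alternating MCMC scheme is not reproduced.

```python
import numpy as np

def p2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def expected_info(theta_draws, a, b):
    """Expected Fisher information about item parameters (a, b) from one
    response, averaged over posterior draws of the examinee's ability."""
    I = np.zeros((2, 2))
    for th in theta_draws:
        p = p2pl(th, a, b)
        g = np.array([th - b, -a])          # gradient of the logit wrt (a, b)
        I += p * (1 - p) * np.outer(g, g)
    return I / len(theta_draws)

def pick_item(theta_draws, items, acc):
    """D-optimal choice among field-test items (list of (a, b) tuples).
    acc[i] is the information accumulated so far for item i; initialize each
    entry as a small ridge (e.g. 1e-3 * np.eye(2)) so the determinant is finite."""
    gains = []
    for i, (a, b) in enumerate(items):
        new = acc[i] + expected_info(theta_draws, a, b)
        gains.append(np.linalg.slogdet(new)[1] - np.linalg.slogdet(acc[i])[1])
    return int(np.argmax(gains))
```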
Williams, Mary R; Sigman, Michael E; Lewis, Jennifer; Pitan, Kelly McHugh
2012-10-10
A Bayesian soft classification method combined with target factor analysis (TFA) is described and tested for the analysis of fire debris data. The method relies on analysis of the average mass spectrum across the chromatographic profile (i.e., the total ion spectrum, TIS) from multiple samples taken from a single fire scene. A library of TIS from reference ignitable liquids with assigned ASTM classification is used as the target factors in TFA. The class-conditional distributions of correlations between the target and predicted factors for each ASTM class are represented by kernel functions and analyzed by Bayesian decision theory. The soft classification approach assists in assessing the probability that ignitable liquid residue from a specific ASTM E1618 class is present in a set of samples from a single fire scene, even in the presence of unspecified background contributions from pyrolysis products. The method is demonstrated with sample data sets and then tested on laboratory-scale burn data and large-scale field test burns. The overall performance achieved in laboratory and field tests of the method is approximately 80% correct classification of fire debris samples.
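A schematic of the soft-classification step, assuming the TFA target-predicted correlations for reference samples of each ASTM class are already available: kernel density estimates stand in for the class-conditional distributions, and Bayes' rule converts a new correlation into class probabilities. Container names and the uniform priors are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fit_class_conditionals(correlations_by_class):
    """correlations_by_class: dict mapping each ASTM class to an array of
    target-predicted correlations from reference data (illustrative)."""
    return {c: gaussian_kde(r) for c, r in correlations_by_class.items()}

def soft_classify(kdes, r_new, priors=None):
    """Posterior probability of each ASTM class for a new correlation value,
    via Bayes' rule over the kernel density estimates."""
    classes = list(kdes)
    priors = priors or {c: 1.0 / len(classes) for c in classes}
    joint = np.array([kdes[c](r_new)[0] * priors[c] for c in classes])
    return dict(zip(classes, joint / joint.sum()))
```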
Laminar fMRI and computational theories of brain function.
Stephan, K E; Petzschner, F H; Kasper, L; Bayer, J; Wellstein, K V; Stefanics, G; Pruessmann, K P; Heinzle, J
2017-11-02
Recently developed methods for functional MRI at the resolution of cortical layers (laminar fMRI) offer a novel window into neurophysiological mechanisms of cortical activity. Beyond physiology, laminar fMRI also offers an unprecedented opportunity to test influential theories of brain function. Specifically, hierarchical Bayesian theories of brain function, such as predictive coding, assign specific computational roles to different cortical layers. Combined with computational models, laminar fMRI offers a unique opportunity to test these proposals noninvasively in humans. This review provides a brief overview of predictive coding and related hierarchical Bayesian theories, summarises their predictions with regard to layered cortical computations, examines how these predictions could be tested by laminar fMRI, and considers methodological challenges. We conclude by discussing the potential of laminar fMRI for clinically useful computational assays of layer-specific information processing.
Luce, Bryan R; Connor, Jason T; Broglio, Kristine R; Mullins, C Daniel; Ishak, K Jack; Saunders, Elijah; Davis, Barry R
2016-09-20
Background: Bayesian and adaptive clinical trial designs offer the potential for more efficient processes that result in lower sample sizes and shorter trial durations than traditional designs. Objective: To explore the use and potential benefits of Bayesian adaptive clinical trial designs in comparative effectiveness research. Design: Virtual execution of ALLHAT (Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial) as if it had been done according to a Bayesian adaptive trial design. Setting: Comparative effectiveness trial of antihypertensive medications. Patients: Patient data sampled from the more than 42 000 patients enrolled in ALLHAT with publicly available data. Measurements: Number of patients randomly assigned between groups, trial duration, observed numbers of events, and overall trial results and conclusions. Results: The Bayesian adaptive approach and original design yielded similar overall trial conclusions. The Bayesian adaptive trial randomly assigned more patients to the better-performing group and would probably have ended slightly earlier. Limitations: This virtual trial execution required limited resampling of ALLHAT patients for inclusion in RE-ADAPT (REsearch in ADAptive methods for Pragmatic Trials). Involvement of a data monitoring committee and other trial logistics were not considered. Conclusion: In a comparative effectiveness research trial, Bayesian adaptive trial designs are a feasible approach and potentially generate earlier results and allocate more patients to better-performing groups. Primary Funding Source: National Heart, Lung, and Blood Institute.
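A minimal sketch of the response-adaptive allocation such designs can use, here Thompson-sampling-style randomization with Beta posteriors on the event rates; the trial state below is invented, and this is not the RE-ADAPT algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def allocate(events, n, n_draws=2000):
    """Response-adaptive randomization sketch: allocate the next patient to
    each arm with probability equal to the posterior probability that the
    arm has the LOWER event rate (independent Beta(1,1) priors)."""
    draws = np.column_stack([rng.beta(1 + e, 1 + ni - e, n_draws)
                             for e, ni in zip(events, n)])
    p_best = np.bincount(draws.argmin(axis=1), minlength=len(n)) / n_draws
    return rng.choice(len(n), p=p_best)

# Illustrative state of a two-arm trial: (events, patients) per arm.
events, n = [30, 22], [400, 410]
print("next patient assigned to arm", allocate(events, n))
```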
A critique of statistical hypothesis testing in clinical research
Raha, Somik
2011-01-01
Many have documented the difficulty of using the current paradigm of Randomized Controlled Trials (RCTs) to test and validate the effectiveness of alternative medical systems such as Ayurveda. This paper critiques the applicability of RCTs for all clinical knowledge-seeking endeavors, of which Ayurveda research is a part. This is done by examining statistical hypothesis testing, the underlying foundation of RCTs, from a practical and philosophical perspective. In the philosophical critique, the two main worldviews of probability are the Bayesian and the frequentist. The frequentist worldview is a special case of the Bayesian worldview, requiring the unrealistic assumptions of knowing nothing about the universe and believing that all observations are unrelated to each other. Many have claimed that the first belief is necessary for science, and this claim is debunked by comparing variations in learning with different prior beliefs. Moving beyond the Bayesian and frequentist worldviews, the notion of hypothesis testing itself is challenged on the grounds that a hypothesis is an unclear distinction, and assigning a probability to an unclear distinction is an exercise that does not lead to clarity of action. This critique is of the theory itself and not of any particular application of statistical hypothesis testing. A decision-making frame is proposed as a way of both addressing this critique and transcending ideological debates on probability. An example of a Bayesian decision-making approach is shown as an alternative to statistical hypothesis testing, utilizing data from a past clinical trial that studied the effect of aspirin on heart attacks in a sample population of doctors. Because a major reason for the prevalence of RCTs in academia is legislation requiring their use, the ethics of legislating the use of statistical methods for clinical research is also examined. PMID:22022152
A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2013-01-01
The regression discontinuity (RD) design (Thistlethwaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…
A combined Fuzzy and Naive Bayesian strategy can be used to assign event codes to injury narratives.
Marucci-Wellman, H; Lehto, M; Corns, H
2011-12-01
Bayesian methods show promise for classifying injury narratives from large administrative datasets into cause groups. This study examined a combined approach where two Bayesian models (Fuzzy and Naïve) were used to either classify a narrative or select it for manual review. Injury narratives were extracted from claims filed with a workers' compensation insurance provider between January 2002 and December 2004. Narratives were separated into a training set (n=11,000) and prediction set (n=3,000). Expert coders assigned two-digit Bureau of Labor Statistics Occupational Injury and Illness Classification event codes to each narrative. Fuzzy and Naïve Bayesian models were developed using manually classified cases in the training set. Two semi-automatic machine coding strategies were evaluated. The first strategy assigned cases for manual review if the Fuzzy and Naïve models disagreed on the classification. The second strategy selected additional cases for manual review from the Agree dataset using prediction strength to reach a level of 50% computer coding and 50% manual coding. When agreement alone was used as the filtering strategy, the majority were coded by the computer (n=1,928, 64%), leaving 36% for manual review. The overall combined (human plus computer) sensitivity was 0.90 and positive predictive value (PPV) was >0.90 for 11 of 18 two-digit event categories. Implementing the second strategy improved results, with an overall sensitivity of 0.95 and PPV >0.90 for 17 of 18 categories. A combined Naïve-Fuzzy Bayesian approach can classify some narratives with high accuracy and identify others most beneficial for manual review, reducing the burden on human coders.
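A schematic of the two triage strategies, assuming the Fuzzy and Naïve predictions and a prediction-strength score per case have already been computed; the function name and containers are hypothetical.

```python
def triage(narratives, fuzzy, naive, strength, target_auto=0.5):
    """Split cases into computer-coded vs manual review.
    fuzzy/naive: dicts mapping case id -> predicted event code;
    strength: dict mapping case id -> prediction strength of the agreed code.
    Strategy 1 sends all disagreements to manual review; strategy 2 also
    reviews the weakest agreements until only `target_auto` is auto-coded."""
    agree = [c for c in narratives if fuzzy[c] == naive[c]]
    manual = [c for c in narratives if fuzzy[c] != naive[c]]
    # Strategy 2: keep only the strongest agreements for automatic coding.
    budget = int(target_auto * len(narratives))
    agree.sort(key=lambda c: strength[c], reverse=True)
    auto, manual = agree[:budget], manual + agree[budget:]
    return auto, manual
```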
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), which is the normalizing constant in the denominator of Bayes' theorem; while it is fundamental for model selection, the evidence is not required for Bayesian parameter inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger a model's evidence, the more support the model receives among a collection of hypotheses, as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones selected by information criteria that incorporate only the likelihood maximum. Since the evidence is not particularly easy to estimate in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
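A simplified sketch of the estimator's ingredients: a Gaussian mixture fitted to posterior samples serves as the importance distribution for the marginal likelihood. For brevity, plain importance sampling is shown where GMIS proper uses the more robust bridge-sampling identity; the toy target below has a known evidence for checking.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def evidence_gmis_like(post_samples, log_unnorm_post, n_is=20000):
    """Estimate the marginal likelihood Z from posterior samples (e.g. DREAM
    output). A Gaussian mixture fitted to the samples is the importance
    distribution q; GMIS itself replaces the plain estimator below with
    bridge sampling."""
    gm = GaussianMixture(n_components=3).fit(post_samples)
    x = gm.sample(n_is)[0]                      # draws from q
    log_q = gm.score_samples(x)                 # log q(x)
    log_w = log_unnorm_post(x) - log_q          # log p(y|x)p(x) - log q(x)
    m = log_w.max()                             # log-sum-exp for stability
    return m + np.log(np.mean(np.exp(log_w - m)))

# Toy check: unnormalized 2-D Gaussian "posterior" with known log Z = 0.
log_post = lambda x: -0.5 * np.sum(x**2, axis=1) - np.log(2 * np.pi)
samples = np.random.default_rng(0).normal(size=(5000, 2))
print(evidence_gmis_like(samples, log_post))    # should be close to 0
```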
Tracing Asian Seabass Individuals to Single Fish Farms Using Microsatellites
Yue, Gen Hua; Xia, Jun Hong; Liu, Peng; Liu, Feng; Sun, Fei; Lin, Grace
2012-01-01
Traceability through physical labels is well established, but it is not highly reliable, as physical labels can be easily changed or lost. Application of DNA markers to the traceability of food plays an increasingly important role in consumer protection and confidence building. In this study, we tested the efficiency of 16 polymorphic microsatellites and their combinations for tracing 368 fish to the four populations from which they originated. Using maximum likelihood and Bayesian methods, the three most efficient microsatellites were required to assign over 95% of fish to the correct populations. Selection of markers based on the assignment score estimated with the software WHICHLOCI was most effective in choosing markers for individual assignment, followed by selection based on the allele number of individual markers. By combining rapid DNA extraction and high-throughput genotyping of selected microsatellites, it is possible to conduct routine genetic traceability with high accuracy in Asian seabass. PMID:23285169
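A minimal sketch of the individual-assignment step, assuming baseline allele frequencies per population are available: the multilocus genotype likelihood is computed under Hardy-Weinberg and linkage equilibrium, and the fish is assigned to the maximizing population (a Bayesian variant would additionally weight by population priors).

```python
import numpy as np

def assign_individual(genotype, allele_freqs):
    """Maximum-likelihood assignment of one individual to a source population.
    genotype: dict locus -> (allele1, allele2);
    allele_freqs: dict pop -> locus -> allele -> frequency (from baselines).
    Returns the population maximizing the multilocus genotype likelihood."""
    eps = 1e-6                                   # guard for unobserved alleles
    scores = {}
    for pop, loci in allele_freqs.items():
        ll = 0.0
        for locus, (a1, a2) in genotype.items():
            p, q = loci[locus].get(a1, eps), loci[locus].get(a2, eps)
            ll += np.log(p * q * (2.0 if a1 != a2 else 1.0))  # HWE genotype prob
        scores[pop] = ll
    return max(scores, key=scores.get), scores
```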
NASA Astrophysics Data System (ADS)
Arnst, M.; Abello Álvarez, B.; Ponthot, J.-P.; Boman, R.
2017-11-01
This paper is concerned with the characterization and the propagation of errors associated with data limitations in polynomial-chaos-based stochastic methods for uncertainty quantification. Such an issue can arise in uncertainty quantification when only a limited amount of data is available. When the available information does not suffice to accurately determine the probability distributions that must be assigned to the uncertain variables, the Bayesian method for assigning these probability distributions becomes attractive because it allows the stochastic model to account explicitly for insufficiency of the available information. In previous work, such applications of the Bayesian method had already been implemented by using the Metropolis-Hastings and Gibbs Markov Chain Monte Carlo (MCMC) methods. In this paper, we present an alternative implementation, which uses an alternative MCMC method built around an Itô stochastic differential equation (SDE) that is ergodic for the Bayesian posterior. We draw together from the mathematics literature a number of formal properties of this Itô SDE that lend support to its use in the implementation of the Bayesian method, and we describe its discretization, including the choice of the free parameters, by using the implicit Euler method. We demonstrate the proposed methodology on a problem of uncertainty quantification in a complex nonlinear engineering application relevant to metal forming.
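As a toy illustration of sampling with an SDE that is ergodic for the posterior, the sketch below discretizes the overdamped Langevin equation, one standard choice of such an Itô SDE. Note it uses the explicit Euler-Maruyama step for brevity, whereas the paper argues for and uses the implicit Euler method.

```python
import numpy as np

def langevin_sampler(grad_log_post, x0, n_steps=50000, dt=1e-3, seed=0):
    """Sample a posterior by discretizing the overdamped Langevin SDE
        dX = grad log pi(X) dt + sqrt(2) dW,
    which is ergodic for pi. Explicit Euler-Maruyama is used here; the
    implicit Euler scheme of the paper is more stable for stiff problems."""
    rng = np.random.default_rng(seed)
    x, chain = np.array(x0, float), []
    for _ in range(n_steps):
        x = x + dt * grad_log_post(x) + np.sqrt(2 * dt) * rng.normal(size=x.shape)
        chain.append(x.copy())
    return np.array(chain)

# Toy posterior N(0, 1): grad log pi(x) = -x; sample variance should be ~1.
chain = langevin_sampler(lambda x: -x, [0.0])
print(chain[10000:].var())
```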
A test of geographic assignment using isotope tracers in feathers of known origin
Wunder, Michael B.; Kester, C.L.; Knopf, F.L.; Rye, R.O.
2005-01-01
We used feathers of known origin collected from across the breeding range of a migratory shorebird to test the use of isotope tracers for assigning breeding origins. We analyzed δD, δ13C, and δ15N in feathers from 75 mountain plover (Charadrius montanus) chicks sampled in 2001 and from 119 chicks sampled in 2002. We estimated parameters for continuous-response inverse regression models and for discrete-response Bayesian probability models from data for each year independently. We evaluated model predictions with both the training data and by using the alternate year as an independent test dataset. Our results provide weak support for modeling latitude and isotope values as monotonic functions of one another, especially when data are pooled over known sources of variation such as sample year or location. We were unable to make even qualitative statements, such as north versus south, about the likely origin of birds using both δD and δ13C in inverse regression models; results were no better than random assignment. Probability models provided better results and a more natural framework for the problem. Correct assignment rates were highest when considering all three isotopes in the probability framework, but the use of even a single isotope was better than random assignment. The method appears relatively robust to temporal effects and is most sensitive to the isotope discrimination gradients over which samples are taken. We offer that the problem of using isotope tracers to infer geographic origin is best framed as one of assignment, rather than prediction.
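A compact sketch of a discrete-response Bayesian assignment model in its simplest form: independent Gaussian likelihoods per isotope at each candidate site, combined by Bayes' rule into a posterior over sites. The site statistics and the independence assumption are illustrative simplifications, not the authors' fitted models.

```python
import numpy as np

def assign_origin(feather, site_stats, priors=None):
    """Posterior over candidate breeding sites for one feather.
    feather: dict isotope -> measured value (e.g. 'dD', 'd13C', 'd15N');
    site_stats: dict site -> {isotope: (mean, sd)} from known-origin feathers."""
    sites = list(site_stats)
    priors = priors or {s: 1.0 / len(sites) for s in sites}
    def loglik(s):
        return sum(-0.5 * ((feather[i] - m) / sd) ** 2 - np.log(sd)
                   for i, (m, sd) in site_stats[s].items())
    logp = np.array([np.log(priors[s]) + loglik(s) for s in sites])
    post = np.exp(logp - logp.max())             # stabilized normalization
    return dict(zip(sites, post / post.sum()))
```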
Instruction in information structuring improves Bayesian judgment in intelligence analysts.
Mandel, David R
2015-01-01
An experiment was conducted to test the effectiveness of brief instruction in information structuring (i.e., representing and integrating information) for improving the coherence of probability judgments and binary choices among intelligence analysts. Forty-three analysts were presented with comparable sets of Bayesian judgment problems before and immediately after instruction. After instruction, analysts' probability judgments were more coherent (i.e., more additive and compliant with Bayes theorem). Instruction also improved the coherence of binary choices regarding category membership: after instruction, subjects were more likely to invariably choose the category to which they assigned the higher probability of a target's membership. The research provides a rare example of evidence-based validation of effectiveness in instruction to improve the statistical assessment skills of intelligence analysts. Such instruction could also be used to improve the assessment quality of other types of experts who are required to integrate statistical information or make probabilistic assessments.
de Nazelle, Audrey; Arunachalam, Saravanan; Serre, Marc L
2010-08-01
States in the USA are required to demonstrate future compliance of criteria air pollutant standards by using both air quality monitors and model outputs. In the case of ozone, the demonstration tests aim at relying heavily on measured values, due to their perceived objectivity and enforceable quality. Weight given to numerical models is diminished by integrating them in the calculations only in a relative sense. For unmonitored locations, the EPA has suggested the use of a spatial interpolation technique to assign current values. We demonstrate that this approach may lead to erroneous assignments of nonattainment and may make it difficult for States to establish future compliance. We propose a method that combines different sources of information to map air pollution, using the Bayesian Maximum Entropy (BME) Framework. The approach gives precedence to measured values and integrates modeled data as a function of model performance. We demonstrate this approach in North Carolina, using the State's ozone monitoring network in combination with outputs from the Multiscale Air Quality Simulation Platform (MAQSIP) modeling system. We show that the BME data integration approach, compared to a spatial interpolation of measured data, improves the accuracy and the precision of ozone estimations across the state.
Bayesian network interface for assisting radiology interpretation and education
NASA Astrophysics Data System (ADS)
Duda, Jeffrey; Botzolakis, Emmanuel; Chen, Po-Hao; Mohan, Suyash; Nasrallah, Ilya; Rauschecker, Andreas; Rudie, Jeffrey; Bryan, R. Nick; Gee, James; Cook, Tessa
2018-03-01
In this work, we present the use of Bayesian networks for radiologist decision support during clinical interpretation. This computational approach has the advantage of avoiding incorrect diagnoses that result from known human cognitive biases such as anchoring bias, framing effect, availability bias, and premature closure. To integrate Bayesian networks into clinical practice, we developed an open-source web application that provides diagnostic support for a variety of radiology disease entities (e.g., basal ganglia diseases, bone lesions). The Clinical tool presents the user with a set of buttons representing clinical and imaging features of interest. These buttons are used to set the value for each observed feature. As features are identified, the conditional probabilities for each possible diagnosis are updated in real time. Additionally, using sensitivity analysis, the interface may be set to inform the user which remaining imaging features provide maximum discriminatory information to choose the most likely diagnosis. The Case Submission tools allow the user to submit a validated case and the associated imaging features to a database, which can then be used for future tuning/testing of the Bayesian networks. These submitted cases are then reviewed by an assigned expert using the provided QC tool. The Research tool presents users with cases with previously labeled features and a chosen diagnosis, for the purpose of performance evaluation. Similarly, the Education page presents cases with known features, but provides real time feedback on feature selection.
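To make the updating and sensitivity-analysis ideas concrete, here is a deliberately reduced, naive-Bayes-style stand-in for the full Bayesian network: button presses populate the observed dictionary, diagnosis posteriors update in real time, and the next most informative feature is the one with the largest expected entropy reduction. The CPT structure and names are hypothetical, not the application's actual networks.

```python
import numpy as np

def posterior(prior, cpt, observed):
    """prior: dict dx -> P(dx); cpt: dict (feature, dx) -> P(feature=1 | dx);
    observed: dict feature -> 0/1."""
    post = {}
    for dx, p in prior.items():
        for f, v in observed.items():
            q = cpt[(f, dx)]
            p *= q if v else (1 - q)
        post[dx] = p
    z = sum(post.values())
    return {dx: p / z for dx, p in post.items()}

def entropy(dist):
    p = np.array(list(dist.values()))
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def most_informative(prior, cpt, observed, remaining):
    """Suggest the unobserved feature whose answer is expected to reduce
    diagnostic entropy the most (a simple sensitivity-analysis analogue)."""
    base, scores = posterior(prior, cpt, observed), {}
    for f in remaining:
        p1 = sum(base[dx] * cpt[(f, dx)] for dx in base)  # P(feature present)
        h = sum(pv * entropy(posterior(prior, cpt, {**observed, f: v}))
                for v, pv in ((1, p1), (0, 1 - p1)))
        scores[f] = entropy(base) - h
    return max(scores, key=scores.get)
```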
Liu, Zhihong; Zheng, Minghao; Yan, Xin; Gu, Qiong; Gasteiger, Johann; Tijhuis, Johan; Maas, Peter; Li, Jiabo; Xu, Jun
2014-09-01
Predicting compound chemical stability is important because unstable compounds can lead to either false positive or false negative conclusions in bioassays. Experimental data (COMDECOM) measured from DMSO/H2O solutions stored at 50 °C for 105 days were used to predict stability by applying rule-embedded naïve Bayesian learning, based upon atom center fragment (ACF) features. To build the naïve Bayesian classifier, we derived ACF features from 9,746 compounds in the COMDECOM dataset. By recursively applying naïve Bayesian learning to the data set, each ACF is assigned an expected stable probability (p(s)) and an unstable probability (p(uns)). 13,340 ACFs, together with their p(s) and p(uns) data, were stored in a knowledge base for use by the Bayesian classifier. For a given compound, its ACFs were derived from its structure connection table with the same protocol used to derive ACFs from the training data. Then, the Bayesian classifier assigned p(s) and p(uns) values to the compound ACFs by a structural pattern recognition algorithm, which was implemented in-house. Compound instability is calculated, with Bayes' theorem, based upon the p(s) and p(uns) values of the compound ACFs. We were able to achieve performance with an AUC value of 84% and a tenfold cross validation accuracy of 76.5%. To reduce false negatives, a rule-based approach has been embedded in the classifier. The rule-based module allows the program to improve its predictivity by expanding its compound instability knowledge base, thus further reducing the possibility of false negatives. To our knowledge, this is the first in silico prediction service for the prediction of the stabilities of organic compounds.
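A minimal sketch of the scoring step, assuming a knowledge base mapping each ACF to its learned stable/unstable class-conditional probabilities; the prior and the handling of unseen fragments are invented simplifications of the published classifier.

```python
from math import log, exp

def instability_probability(acfs, kb, prior_unstable=0.2):
    """Score a compound from its atom-center fragments (ACFs).
    kb: knowledge base mapping ACF -> (p_stable, p_unstable); ACFs absent
    from the knowledge base are skipped. The prior is illustrative, not
    the COMDECOM-derived value."""
    log_u = log(prior_unstable)
    log_s = log(1 - prior_unstable)
    for acf in acfs:
        if acf in kb:
            p_s, p_uns = kb[acf]
            log_s += log(p_s)
            log_u += log(p_uns)
    return 1.0 / (1.0 + exp(log_s - log_u))    # P(unstable | ACFs), Bayes' rule
```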
Diagnosis of combined faults in Rotary Machinery by Non-Naive Bayesian approach
NASA Astrophysics Data System (ADS)
Asr, Mahsa Yazdanian; Ettefagh, Mir Mohammad; Hassannejad, Reza; Razavi, Seyed Naser
2017-02-01
When combined faults occur in different parts of a rotating machine, their features are strongly dependent. Experts are familiar with the characteristics of individual faults, and enough data are available for single faults; the problem arises when faults are combined and the separation of their characteristics becomes complex. The experts therefore cannot give exact information about the symptoms of a combined fault. In this paper, a novel method is proposed to overcome this drawback. The core idea is to detect combined faults without using combined-fault features as a training data set; only individual-fault features are applied in the training step. For this purpose, after data acquisition and resampling of the obtained vibration signals, Empirical Mode Decomposition (EMD) is used to decompose the multi-component signals into Intrinsic Mode Functions (IMFs). Proper IMFs for feature extraction are selected using the correlation coefficient. In the feature extraction step, the Shannon energy entropy of the IMFs is extracted along with statistical features. Since most of the extracted features are strongly dependent, a Non-Naive Bayesian Classifier (NNBC) is adopted, which relaxes the fundamental assumption of naive Bayes, i.e., the independence among features. To demonstrate the superiority of NNBC, counterpart methods, including the normal naive Bayesian classifier, the kernel naive Bayesian classifier, and back-propagation neural networks, were applied and the classification results compared. Experimental vibration signals collected from an automobile gearbox were used to verify the effectiveness of the proposed method. During the classification process, only features related individually to the healthy state, bearing failure, and gear failures were used to train the classifier, while combined-fault features (combined gear and bearing failures) were examined as test data. The achieved probabilities for the test data show that the combined fault can be identified with a high success rate.
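The following sketch shows the core distinction the abstract draws: a "non-naive" Gaussian Bayes classifier keeps the full feature covariance per class instead of the diagonal covariance a naive classifier assumes. Feature extraction (EMD, IMF selection, entropy) is presumed done; the ridge term and class layout are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

class NonNaiveBayes:
    """Gaussian Bayes classifier with a full covariance matrix per class,
    so dependence among features (here, EMD/IMF-derived statistics) is kept;
    a naive Bayesian classifier would force the covariances to be diagonal."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.models = {c: multivariate_normal(X[y == c].mean(0),
                                              np.cov(X[y == c].T)
                                              + 1e-6 * np.eye(X.shape[1]))
                       for c in self.classes}
        self.logpri = {c: np.log(np.mean(y == c)) for c in self.classes}
        return self

    def predict_proba(self, x):
        logp = np.array([self.logpri[c] + self.models[c].logpdf(x)
                         for c in self.classes])
        p = np.exp(logp - logp.max())
        return dict(zip(self.classes, p / p.sum()))
```

Trained only on single-fault classes, such a classifier can flag a combined fault through the posterior mass it spreads across the constituent fault classes.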
Moran, Paul; Bromaghin, Jeffrey F.; Masuda, Michele
2014-01-01
Many applications in ecological genetics involve sampling individuals from a mixture of multiple biological populations and subsequently associating those individuals with the populations from which they arose. Analytical methods that assign individuals to their putative population of origin have utility in both basic and applied research, providing information about population-specific life history and habitat use, ecotoxins, pathogen and parasite loads, and many other non-genetic ecological, or phenotypic traits. Although the question is initially directed at the origin of individuals, in most cases the ultimate desire is to investigate the distribution of some trait among populations. Current practice is to assign individuals to a population of origin and study properties of the trait among individuals within population strata as if they constituted independent samples. It seemed that approach might bias population-specific trait inference. In this study we made trait inferences directly through modeling, bypassing individual assignment. We extended a Bayesian model for population mixture analysis to incorporate parameters for the phenotypic trait and compared its performance to that of individual assignment with a minimum probability threshold for assignment. The Bayesian mixture model outperformed individual assignment under some trait inference conditions. However, by discarding individuals whose origins are most uncertain, the individual assignment method provided a less complex analytical technique whose performance may be adequate for some common trait inference problems. Our results provide specific guidance for method selection under various genetic relationships among populations with different trait distributions. PMID:24905464
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP (http://www.jasp-stats.org), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
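For readers curious about what sits behind such a mouse click, below is a sketch of the one-sample JZS Bayes factor of Rouder et al. (2009), on which the BayesFactor package (and hence part of JASP) is based. The Cauchy scale r = 1 version is shown; JASP's current default uses a smaller scale, so values will differ.

```python
import numpy as np
from scipy import integrate

def jzs_bf10(t, n):
    """One-sample JZS Bayes factor (Rouder et al., 2009), scale r = 1:
    marginal likelihood under H1 (Cauchy prior on effect size, induced by a
    normal-on-delta with inverse-chi^2 mixing over g) divided by that under H0."""
    nu = n - 1
    def integrand(g):
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * nu)) ** (-(nu + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))
    marginal_h1, _ = integrate.quad(integrand, 0, np.inf)
    marginal_h0 = (1 + t**2 / nu) ** (-(nu + 1) / 2)
    return marginal_h1 / marginal_h0

print(jzs_bf10(t=2.2, n=30))   # modest evidence for H1 at this t and n
```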
Assignment of a non-informative prior when using a calibration function
NASA Astrophysics Data System (ADS)
Lira, I.; Grientschnig, D.
2012-01-01
The evaluation of measurement uncertainty associated with the use of calibration functions was addressed in a talk at the 19th IMEKO World Congress 2009 in Lisbon (Proceedings, pp 2346-51). Therein, an example involving a cubic function was analysed by a Bayesian approach and by the Monte Carlo method described in Supplement 1 to the 'Guide to the Expression of Uncertainty in Measurement'. Results were found to be discrepant. In this paper we examine a simplified version of the example and show that the reported discrepancy is caused by the choice of the prior in the Bayesian analysis, which does not conform to formal rules for encoding the absence of prior knowledge. Two options for assigning a non-informative prior free from this shortcoming are considered; they are shown to be equivalent.
Drake, Brandon Lee; Wills, Wirt H.; Hamilton, Marian I.; Dorshow, Wetherbee
2014-01-01
Strontium isotope sourcing has become a common and useful method for assigning sources to archaeological artifacts. In Chaco Canyon, an Ancestral Pueblo regional center in New Mexico, previous studies using these methods have suggested that a significant portion of maize and wood originated in the Chuska Mountains region, 75 km to the west. In the present manuscript, these results were tested using both frequentist methods (to determine if geochemical sources can truly be differentiated) and Bayesian methods (to address uncertainty in geochemical source attribution). It was found that Chaco Canyon and the Chuska Mountains region are not easily distinguishable based on radiogenic strontium isotope values. The strontium profiles of many geochemical sources in the region overlap, making it difficult to definitively identify any one particular geochemical source for the canyon's prehistoric maize. Bayesian mixing models support the argument that some spruce and fir wood originated in the San Mateo Mountains, but this cannot explain all 87Sr/86Sr values in Chaco timber. Overall, radiogenic strontium isotope data do not clearly identify a single major geochemical source for maize, ponderosa, and most spruce/fir timber. As such, the degree to which Chaco Canyon relied upon outside support for both food and construction material remains ambiguous. PMID:24854352
Hao, Jie; Astle, William; De Iorio, Maria; Ebbels, Timothy M D
2012-08-01
Nuclear Magnetic Resonance (NMR) spectra are widely used in metabolomics to obtain metabolite profiles in complex biological mixtures. Common methods used to assign and estimate concentrations of metabolites involve either expert manual peak fitting or extra pre-processing steps, such as peak alignment and binning. Peak fitting is very time-consuming and subject to human error. Conversely, alignment and binning can introduce artefacts and limit immediate biological interpretation of models. We present the Bayesian automated metabolite analyser for NMR spectra (BATMAN), an R package that deconvolutes peaks from one-dimensional NMR spectra, automatically assigns them to specific metabolites from a target list and obtains concentration estimates. The Bayesian model incorporates information on characteristic peak patterns of metabolites and is able to account for shifts in the position of peaks commonly seen in NMR spectra of biological samples. It applies a Markov chain Monte Carlo algorithm to sample from a joint posterior distribution of the model parameters and obtains concentration estimates with reduced error compared with conventional numerical integration and comparable to manual deconvolution by experienced spectroscopists. Availability: http://www1.imperial.ac.uk/medicine/people/t.ebbels/. Contact: t.ebbels@imperial.ac.uk.
The impossibility of probabilities
NASA Astrophysics Data System (ADS)
Zimmerman, Peter D.
2017-11-01
This paper discusses the problem of assigning probabilities to the likelihood of nuclear terrorism events, in particular examining the limitations of using Bayesian priors for this purpose. It suggests an alternate approach to analyzing the threat of nuclear terrorism.
Chaves, Camila L; Degen, Bernd; Pakull, Birte; Mader, Malte; Honorio, Euridice; Ruas, Paulo; Tysklind, Niklas; Sebbenn, Alexandre M
2018-06-27
Deforestation, reinforced by illegal logging, is a serious problem in many tropical regions and causes pervasive environmental and economic damage. Existing laws that aim to reduce illegal logging need efficient, fraud-resistant control methods. We developed a genetic reference database for Jatoba (Hymenaea courbaril), an important, high-value timber species from the Neotropics. The data set can be used to check declarations of wood origin. Samples from 308 Hymenaea trees from 12 locations in Brazil, Bolivia, Peru, and French Guiana were collected and genotyped at 10 nuclear microsatellites (nSSRs), 13 chloroplast SNPs (cpSNPs), and 1 chloroplast indel marker. The chloroplast gene markers were developed using Illumina DNA sequencing. Bayesian cluster analysis divided the individuals into 8 genetic groups based on the nSSRs. Using self-assignment tests, the power of the genetic reference database to verify declarations of origin was tested for 3 different assignment methods. We observed strong genetic differentiation among locations, leading to high and reliable self-assignment rates of 50% to 100% (average 88%). Although all 3 assignment methods produced similar mean self-assignment rates, there were differences for some locations linked to the level of genetic diversity, differentiation, and heterozygosity. Our results show that the nuclear and chloroplast gene markers are effective for a genetic certification system and can provide national and international authorities with a robust tool to confirm the legality of timber.
UNIFORMLY MOST POWERFUL BAYESIAN TESTS
Johnson, Valen E.
2014-01-01
Uniformly most powerful tests are statistical hypothesis tests that provide the greatest power against a fixed null hypothesis among all tests of a given size. In this article, the notion of uniformly most powerful tests is extended to the Bayesian setting by defining uniformly most powerful Bayesian tests to be tests that maximize the probability that the Bayes factor, in favor of the alternative hypothesis, exceeds a specified threshold. Like their classical counterpart, uniformly most powerful Bayesian tests are most easily defined in one-parameter exponential family models, although extensions outside of this class are possible. The connection between uniformly most powerful tests and uniformly most powerful Bayesian tests can be used to provide an approximate calibration between p-values and Bayes factors. Finally, issues regarding the strong dependence of resulting Bayes factors and p-values on sample size are discussed. PMID:24659829
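A small numeric sketch of the idea for the one-sided z test with known sigma, using the result that the UMPBT(gamma) alternative sits at mu0 + sigma*sqrt(2 ln gamma / n) and the induced rejection rule is z > sqrt(2 ln gamma); the gamma chosen below illustrates the p-value/Bayes-factor calibration the abstract mentions.

```python
import numpy as np

def umpbt_ztest(mu0, sigma, n, gamma):
    """UMPBT for a one-sided z test (Johnson, 2013): the point alternative is
    chosen so that P(BF10 > gamma) is maximized simultaneously for all
    data-generating parameter values; the rejection rule follows from it."""
    mu1 = mu0 + sigma * np.sqrt(2 * np.log(gamma) / n)   # UMPBT alternative
    xbar_crit = mu0 + sigma * np.sqrt(2 * np.log(gamma)) / np.sqrt(n)
    z_crit = np.sqrt(2 * np.log(gamma))                  # same rule in z units
    return mu1, xbar_crit, z_crit

# gamma = exp(1.645^2 / 2) ~ 3.87 reproduces z = 1.645, a size-0.05 z test:
print(umpbt_ztest(mu0=0.0, sigma=1.0, n=25, gamma=np.exp(1.645**2 / 2)))
```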
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
A Bayesian Nonparametric Approach to Test Equating
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2009-01-01
A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…
Zeng, Jianyang; Roberts, Kyle E.; Zhou, Pei
2011-01-01
A major bottleneck in protein structure determination via nuclear magnetic resonance (NMR) is the lengthy and laborious process of assigning resonances and nuclear Overhauser effect (NOE) cross peaks. Recent studies have shown that accurate backbone folds can be determined using sparse NMR data, such as residual dipolar couplings (RDCs) or backbone chemical shifts. This opens a question of whether we can also determine the accurate protein side-chain conformations using sparse or unassigned NMR data. We attack this question by using unassigned nuclear Overhauser effect spectroscopy (NOESY) data, which records the through-space dipolar interactions between protons nearby in three-dimensional (3D) space. We propose a Bayesian approach with a Markov random field (MRF) model to integrate the likelihood function derived from observed experimental data, with prior information (i.e., empirical molecular mechanics energies) about the protein structures. We unify the side-chain structure prediction problem with the side-chain structure determination problem using unassigned NMR data, and apply the deterministic dead-end elimination (DEE) and A* search algorithms to provably find the global optimum solution that maximizes the posterior probability. We employ a Hausdorff-based measure to derive the likelihood of a rotamer or a pairwise rotamer interaction from unassigned NOESY data. In addition, we apply a systematic and rigorous approach to estimate the experimental noise in NMR data, which also determines the weighting factor of the data term in the scoring function derived from the Bayesian framework. We tested our approach on real NMR data of three proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), and human ubiquitin. The promising results indicate that our algorithm can be applied in high-resolution protein structure determination. Since our approach does not require any NOE assignment, it can accelerate the NMR structure determination process. PMID:21970619
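As a concrete piece of the search machinery, here is a sketch of one pass of Goldstein dead-end elimination over generic self and pairwise energies; in the paper's framework the "energies" would be the negative-log terms of the Bayesian scoring function, and the data layout here is assumed for illustration.

```python
def dee_prune(E_self, E_pair, rotamers):
    """One pass of Goldstein dead-end elimination. Rotamer r at position i is
    provably not part of the global optimum if some competitor t satisfies
        E(i_r) - E(i_t) + sum_j min_s [E(i_r, j_s) - E(i_t, j_s)] > 0.
    E_self[i][r]: self energy; E_pair[i][j][r][s]: pairwise energy (assumed
    stored for all position pairs i != j); rotamers[i]: surviving rotamers."""
    pruned = False
    for i in list(rotamers):
        for r in list(rotamers[i]):
            for t in rotamers[i]:
                if t == r:
                    continue
                gap = E_self[i][r] - E_self[i][t]
                gap += sum(min(E_pair[i][j][r][s] - E_pair[i][j][t][s]
                               for s in rotamers[j])
                           for j in rotamers if j != i)
                if gap > 0:                      # r is dead-ending: remove it
                    rotamers[i].remove(r)
                    pruned = True
                    break
    return pruned                                # repeat passes until False
```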
A default Bayesian hypothesis test for mediation.
Nuijten, Michèle B; Wetzels, Ruud; Matzke, Dora; Dolan, Conor V; Wagenmakers, Eric-Jan
2015-03-01
In order to quantify the relationship between multiple variables, researchers often carry out a mediation analysis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estimation and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301-322, 2009), who outlined a Bayesian parameter estimation procedure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the Jeffreys-Zellner-Siow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo techniques implemented in JAGS. All Bayesian tests are implemented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014).
Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests
ERIC Educational Resources Information Center
Veldkamp, Bernard P.
2010-01-01
Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…
A Bayesian test for Hardy–Weinberg equilibrium of biallelic X-chromosomal markers
Puig, X; Ginebra, J; Graffelman, J
2017-01-01
The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy–Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy–Weinberg equilibrium with biallelic markers at the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male to female allele frequency ratio are computed, and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test, and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used. PMID:28900292
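To show the flavor of a Bayesian HWE test in the simplest setting, the sketch below computes the Bayes factor of HWE against the saturated genotype model for one autosomal biallelic marker with uniform priors; the paper's X-chromosomal test additionally models male hemizygotes and the male-to-female allele-frequency ratio, which this toy version omits.

```python
import numpy as np
from scipy.special import betaln, gammaln

def log_bf_hwe(n_aa, n_ab, n_bb):
    """Log Bayes factor of HWE vs. the saturated genotype model for one
    autosomal biallelic marker; the multinomial coefficient cancels between
    the two marginal likelihoods, so it is omitted."""
    # HWE: p ~ Beta(1, 1); genotype probs (p^2, 2p(1-p), (1-p)^2).
    log_m_hwe = (n_ab * np.log(2.0)
                 + betaln(2 * n_aa + n_ab + 1, 2 * n_bb + n_ab + 1))
    # Saturated: genotype probs ~ Dirichlet(1, 1, 1).
    log_m_sat = (gammaln(n_aa + 1) + gammaln(n_ab + 1) + gammaln(n_bb + 1)
                 - gammaln(n_aa + n_ab + n_bb + 3) + gammaln(3))
    return log_m_hwe - log_m_sat

# Strong heterozygote deficit (24 observed vs ~100 expected): log BF << 0.
print(log_bf_hwe(88, 24, 88))
```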
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood
ERIC Educational Resources Information Center
Karabatsos, George
2017-01-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon…
Awad, Lara; Fady, Bruno; Khater, Carla; Roig, Anne; Cheddadi, Rachid
2014-01-01
The threatened conifer Abies cilicica currently persists in Lebanon in geographically isolated forest patches. The impact of demographic and evolutionary processes on population genetic diversity and structure was assessed using 10 nuclear microsatellite loci. All 15 remnant local populations revealed low genetic variation but a high recent effective population size. FST-based measures of population genetic differentiation revealed low spatial genetic structure, but Bayesian analysis of population structure identified a significant Northeast-Southwest population structure. Populations showed significant but weak isolation-by-distance, indicating non-equilibrium conditions between dispersal and genetic drift. Bayesian assignment tests detected asymmetric Northeast-Southwest migration involving some long-distance dispersal events. We suggest that the persistence and Northeast-Southwest geographic structure of Abies cilicica in Lebanon are the result of at least two demographic processes during its recent evolutionary history: (1) recent migration to currently marginal populations and (2) local persistence through altitudinal shifts along a mountainous topography. These results might help us better understand the mechanisms involved in the species' response to expected climate change. PMID:24587219
Dembo, Mana; Radovčić, Davorka; Garvin, Heather M; Laird, Myra F; Schroeder, Lauren; Scott, Jill E; Brophy, Juliet; Ackermann, Rebecca R; Musiba, Charles M; de Ruiter, Darryl J; Mooers, Arne Ø; Collard, Mark
2016-08-01
Homo naledi is a recently discovered species of fossil hominin from South Africa. A considerable amount is already known about H. naledi but some important questions remain unanswered. Here we report a study that addressed two of them: "Where does H. naledi fit in the hominin evolutionary tree?" and "How old is it?" We used a large supermatrix of craniodental characters for both early and late hominin species and Bayesian phylogenetic techniques to carry out three analyses. First, we performed a dated Bayesian analysis to generate estimates of the evolutionary relationships of fossil hominins including H. naledi. Then we employed Bayes factor tests to compare the strength of support for hypotheses about the relationships of H. naledi suggested by the best-estimate trees. Lastly, we carried out a resampling analysis to assess the accuracy of the age estimate for H. naledi yielded by the dated Bayesian analysis. The analyses strongly supported the hypothesis that H. naledi forms a clade with the other Homo species and Australopithecus sediba. The analyses were more ambiguous regarding the position of H. naledi within the (Homo, Au. sediba) clade. A number of hypotheses were rejected, but several others were not. Based on the available craniodental data, Homo antecessor, Asian Homo erectus, Homo habilis, Homo floresiensis, Homo sapiens, and Au. sediba could all be the sister taxon of H. naledi. According to the dated Bayesian analysis, the most likely age for H. naledi is 912 ka. This age estimate was supported by the resampling analysis. Our findings have a number of implications. Most notably, they support the assignment of the new specimens to Homo, cast doubt on the claim that H. naledi is simply a variant of H. erectus, and suggest H. naledi is younger than has been previously proposed.
Abdul-Latiff, Muhammad Abu Bakar; Ruslin, Farhani; Fui, Vun Vui; Abu, Mohd-Hashim; Rovie-Ryan, Jeffrine Japning; Abdul-Patah, Pazil; Lakim, Maklarin; Roos, Christian; Yaakop, Salmah; Md-Zain, Badrul Munir
2014-01-01
Phylogenetic relationships among Malaysia’s long-tailed macaques have yet to be established, despite abundant genetic studies of the species worldwide. The aims of this study are to examine the phylogenetic relationships of Macaca fascicularis in Malaysia and to test its classification as a morphological subspecies. A total of 25 genetic samples of M. fascicularis yielding 383 bp of Cytochrome b (Cyt b) sequences were used in phylogenetic analysis along with one sample each of M. nemestrina and M. arctoides used as outgroups. Sequence character analysis reveals that Cyt b locus is a highly conserved region with only 23% parsimony informative character detected among ingroups. Further analysis indicates a clear separation between populations originating from different regions; the Malay Peninsula versus Borneo Insular, the East Coast versus West Coast of the Malay Peninsula, and the island versus mainland Malay Peninsula populations. Phylogenetic trees (NJ, MP and Bayesian) portray a consistent clustering paradigm as Borneo’s population was distinguished from Peninsula’s population (99% and 100% bootstrap value in NJ and MP respectively and 1.00 posterior probability in Bayesian trees). The East coast population was separated from other Peninsula populations (64% in NJ, 66% in MP and 0.53 posterior probability in Bayesian). West coast populations were divided into 2 clades: the North-South (47%/54% in NJ, 26/26% in MP and 1.00/0.80 posterior probability in Bayesian) and Island-Mainland (93% in NJ, 90% in MP and 1.00 posterior probability in Bayesian). The results confirm the previous morphological assignment of 2 subspecies, M. f. fascicularis and M. f. argentimembris, in the Malay Peninsula. These populations should be treated as separate genetic entities in order to conserve the genetic diversity of Malaysia’s M. fascicularis. These findings are crucial in aiding the conservation management and translocation process of M. fascicularis populations in Malaysia. PMID:24899832
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
Bayesian Approaches to Imputation, Hypothesis Testing, and Parameter Estimation
ERIC Educational Resources Information Center
Ross, Steven J.; Mackey, Beth
2015-01-01
This chapter introduces three applications of Bayesian inference to common and novel issues in second language research. After a review of the critiques of conventional hypothesis testing, our focus centers on ways Bayesian inference can be used for dealing with missing data, for testing theory-driven substantive hypotheses without a default null…
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG-based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and with classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood, and by 34% compared to random classification. We found that the original DRG, the coder, and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
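The abstract does not include the authors' model specification; as a rough illustration of how a weakly informative Gaussian prior stabilizes logistic regression coefficients, the sketch below computes a MAP estimate by gradient ascent. The simulated data, the prior scale of 2.5, and the learning rate are all hypothetical.

```python
import numpy as np

def map_logistic(X, y, prior_sd=2.5, lr=0.1, iters=2000):
    """MAP estimate for logistic regression with independent
    Normal(0, prior_sd^2) priors on the coefficients (a weakly
    informative prior; equivalent to L2 regularization)."""
    n, k = X.shape
    beta = np.zeros(k)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        # gradient of log-likelihood plus log-prior
        grad = X.T @ (y - p) - beta / prior_sd**2
        beta += lr * grad / n
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
true_beta = np.array([-1.0, 2.0, 0.5])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ true_beta)))
print(map_logistic(X, y))
```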
Development of dynamic Bayesian models for web application test management
NASA Astrophysics Data System (ADS)
Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.
2018-03-01
The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool for modelling complex stochastic dynamic processes. According to the results of the research, the mathematical models and methods of dynamic Bayesian networks cover a wide range of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. A formalized representation of the discrete test process as a dynamic Bayesian model organizes the logical connections between individual test assets across multiple time slices. This approach makes it possible to represent testing as a discrete process with defined structural components responsible for generating test assets. Dynamic Bayesian network-based models allow individual units and testing components with different functionalities, which directly influence each other during comprehensive testing of various groups of software bugs, to be combined within a single management framework. The proposed models support a consistent approach to formalizing test principles and procedures, methods for treating situational error signs, and methods for producing analytical conclusions from test results.
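As a toy illustration of the two-slice structure described here (not the authors' model), the sketch below filters a hidden defect state through a sequence of observed test outcomes; the transition and emission probabilities are hypothetical.

```python
import numpy as np

# Minimal two-slice DBN: a hidden "defect present" state evolving over
# test cycles, observed through pass/fail test outcomes.
T = np.array([[0.9, 0.1],   # transition: P(state_t | state_{t-1})
              [0.3, 0.7]])
E = np.array([[0.8, 0.2],   # emission: P(observation | state)
              [0.1, 0.9]])
belief = np.array([0.5, 0.5])
for obs in [1, 1, 0, 1]:                  # observed outcomes per time slice
    belief = E[:, obs] * (T.T @ belief)   # predict, then condition on obs
    belief /= belief.sum()
    print(belief)
```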
An agglomerative hierarchical clustering approach to visualisation in Bayesian clustering problems
Dawson, Kevin J.; Belkhir, Khalid
2009-01-01
Clustering problems (including the clustering of individuals into outcrossing populations, hybrid generations, full-sib families and selfing lines) have recently received much attention in population genetics. In these clustering problems, the parameter of interest is a partition of the set of sampled individuals: the sample partition. In a fully Bayesian approach to clustering problems of this type, our knowledge about the sample partition is represented by a probability distribution on the space of possible sample partitions. Since the number of possible partitions grows very rapidly with the sample size, we cannot visualise this probability distribution in its entirety unless the sample is very small. As a solution to this visualisation problem, we recommend using an agglomerative hierarchical clustering algorithm, which we call the exact linkage algorithm. This algorithm is a special case of the maximin clustering algorithm that we introduced previously. The exact linkage algorithm is now implemented in our software package Partition View. It takes the posterior co-assignment probabilities as input and yields as output a rooted binary tree, or more generally a forest of such trees. Each node of this forest defines a set of individuals, and the node height is the posterior co-assignment probability of this set. This provides a useful visual representation of the uncertainty associated with the assignment of individuals to categories. It is also a useful starting point for a more detailed exploration of the posterior distribution in terms of the co-assignment probabilities. PMID:19337306
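The exact linkage algorithm itself is implemented in Partition View; as an off-the-shelf stand-in, the sketch below builds a dendrogram from a hypothetical co-assignment matrix by treating 1 minus the co-assignment probability as a distance and applying SciPy's complete linkage, so node heights approximate the quantity described above.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Hypothetical posterior co-assignment probabilities for four individuals
P = np.array([[1.0, 0.9, 0.2, 0.1],
              [0.9, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.8],
              [0.1, 0.2, 0.8, 1.0]])

# Distance = 1 - co-assignment probability; complete linkage is used as a
# stand-in for the authors' exact linkage algorithm.
Z = linkage(squareform(1.0 - P, checks=False), method="complete")
dendrogram(Z, labels=["ind1", "ind2", "ind3", "ind4"])
plt.ylabel("1 - posterior co-assignment probability")
plt.show()
```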
Adjaye-Gbewonyo, Dzifa; Bednarczyk, Robert A; Davis, Robert L; Omer, Saad B
2014-02-01
To validate classification of race/ethnicity based on the Bayesian Improved Surname Geocoding method (BISG) and assess variations in validity by gender and age. Secondary data on members of Kaiser Permanente Georgia, an integrated managed care organization, through 2010. For 191,494 members with self-reported race/ethnicity, probabilities of belonging to each of six race/ethnicity categories predicted from the BISG algorithm were used to assign individuals to a race/ethnicity category over a range of cutoffs greater than a probability of 0.50. Overall as well as gender- and age-stratified sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curves were generated and used to identify optimal cutoffs for race/ethnicity assignment. The overall cutoffs for assignment that optimized sensitivity and specificity ranged from 0.50 to 0.57 for the four main racial/ethnic categories (White, Black, Asian/Pacific Islander, Hispanic). Corresponding sensitivity, specificity, PPV, and NPV ranged from 64.4 to 81.4 percent, 80.8 to 99.7 percent, 75.0 to 91.6 percent, and 79.4 to 98.0 percent, respectively. Accuracy of assignment was better among males and among individuals aged 65 years or older. BISG may be useful for classifying race/ethnicity of health plan members when needed for health care studies. © Health Research and Educational Trust.
Tian, Ting; McLachlan, Geoffrey J.; Dieters, Mark J.; Basford, Kaye E.
2015-01-01
It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach based on hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, a normal distribution model, a normal regression model, and predictive mean matching. The latter three models used both Bayesian and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the record with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing, and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher estimation accuracy than those using non-Bayesian analysis, but were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the best overall performance. PMID:26689369
On the robustness of a Bayes estimate. [in reliability theory
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1974-01-01
This paper examines the robustness of a Bayes estimator with respect to the assigned prior distribution. A Bayesian analysis for a stochastic scale parameter of a Weibull failure model is summarized, in which the natural conjugate is assigned as the prior distribution of the random parameter. The sensitivity analysis is carried out by the Monte Carlo method in which, although an inverted gamma is the assigned prior, realizations are generated using distribution functions of varying shape. For several distributional forms, and even for some fixed values of the parameter, simulated mean squared errors of the Bayes and minimum variance unbiased estimators are determined and compared. Results indicate that the Bayes estimator remains superior in terms of mean squared error and appears to be largely robust to the form of the assigned prior distribution.
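A minimal sketch of this kind of sensitivity experiment, simplified to an exponential failure model (a Weibull with known shape 1) so that the conjugate inverted-gamma posterior mean has a closed form; the prior parameters and the mis-specified generating distribution are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 3.0, 4.0          # inverted-gamma prior assigned to the scale
n, reps = 10, 20000

def simulate(draw_theta):
    """Simulated MSE of the Bayes and MVU estimators when the true scale
    is generated by draw_theta, which may differ from the assigned prior."""
    se_bayes, se_mvu = 0.0, 0.0
    for _ in range(reps):
        theta = draw_theta()
        x = rng.exponential(theta, size=n)    # Weibull with known shape 1
        bayes = (b + x.sum()) / (a + n - 1)   # posterior mean, conjugate prior
        mvu = x.mean()                        # minimum variance unbiased
        se_bayes += (bayes - theta) ** 2
        se_mvu += (mvu - theta) ** 2
    return se_bayes / reps, se_mvu / reps

# Prior correctly specified vs. a different generating distribution
print(simulate(lambda: b / rng.gamma(a)))          # inverted gamma
print(simulate(lambda: rng.lognormal(0.3, 0.5)))   # mis-specified prior
```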
Spertus, Jacob V; Normand, Sharon-Lise T
2018-04-23
High-dimensional data provide many potential confounders that may bolster the plausibility of the ignorability assumption in causal inference problems. Propensity score methods are powerful causal inference tools, which are popular in health care research and are particularly useful for high-dimensional data. Recent interest has surrounded a Bayesian treatment of propensity scores in order to flexibly model the treatment assignment mechanism and summarize posterior quantities while incorporating variance from the treatment model. We discuss methods for Bayesian propensity score analysis of binary treatments, focusing on modern methods for high-dimensional Bayesian regression and the propagation of uncertainty. We introduce a novel and simple estimator for the average treatment effect that capitalizes on conjugacy of the beta and binomial distributions. Through simulations, we show the utility of horseshoe priors and Bayesian additive regression trees paired with our new estimator, while demonstrating the importance of including variance from the treatment regression model. An application to cardiac stent data with almost 500 confounders and 9000 patients illustrates approaches and facilitates comparison with existing alternatives. As measured by a falsifiability endpoint, we improved confounder adjustment compared with past observational research of the same problem. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
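The paper's estimator is embedded in propensity-score machinery, but the beta-binomial conjugacy it exploits can be illustrated in isolation: with hypothetical event counts in two arms, posterior draws of each arm's event rate give a posterior for the risk difference.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical outcome counts: (events, non-events) in treated and control
s1, f1, s0, f0 = 45, 155, 30, 170

# Conjugate Beta posteriors for each arm's event rate under uniform
# Beta(1, 1) priors; the risk difference is summarized from posterior
# draws (illustrating the conjugacy only, not the propensity-score model).
p1 = rng.beta(1 + s1, 1 + f1, size=100000)
p0 = rng.beta(1 + s0, 1 + f0, size=100000)
ate = p1 - p0
print(ate.mean(), np.percentile(ate, [2.5, 97.5]))
```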
NASA Astrophysics Data System (ADS)
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert-provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference', the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which make it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert-provided information, (2) it allows uncertainty and imprecision to be modelled distinctly, and (3) it presents a framework for fusing expert-provided information regarding the various inputs of the Bayesian inference algorithm. However, an important obstacle to employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes prohibitively large. In this paper, a novel approach to accelerating the fuzzy Bayesian inference algorithm is proposed, based on using approximate posterior distributions derived from surrogate modeling as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. The proposed approach is then applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.
Torruella, Guifré; Derelle, Romain; Paps, Jordi; Lang, B. Franz; Roger, Andrew J.; Shalchian-Tabrizi, Kamran; Ruiz-Trillo, Iñaki
2012-01-01
Many of the eukaryotic phylogenomic analyses published to date were based on alignments of hundreds to thousands of genes. In such analyses, the most realistic evolutionary models currently available are often used to minimize the impact of systematic error. However, controversy remains over whether idiosyncratic gene family dynamics (i.e., gene duplications and losses) and incorrect orthology assignments are always appropriately taken into account. In this paper, we present an innovative strategy for overcoming orthology assignment problems. Rather than identifying and eliminating genes with paralogy problems, we have constructed a data set composed exclusively of conserved single-copy protein domains that, unlike most of the commonly used phylogenomic data sets, should be less confounded by orthology mis-assignments. To evaluate the power of this approach, we performed maximum likelihood and Bayesian analyses to infer the evolutionary relationships within the opisthokonts (which include Metazoa, Fungi, and related unicellular lineages). We used this approach to test 1) whether Filasterea and Ichthyosporea form a clade, 2) the interrelationships of early-branching metazoans, and 3) the relationships among early-branching fungi. We also assessed the impact of some methods that are known to minimize systematic error, including reducing the distance between the outgroup and ingroup taxa or using the CAT evolutionary model. Overall, our analyses support the Filozoa hypothesis, in which Ichthyosporea are the first holozoan lineage to emerge, followed by Filasterea, Choanoflagellata, and Metazoa. Blastocladiomycota appears as a lineage separate from Chytridiomycota, although this result is not strongly supported. These results represent independent tests of previous phylogenetic hypotheses, highlighting the importance of sophisticated approaches for orthology assignment in phylogenomic analyses. PMID:21771718
Eurasiaplex: a forensic SNP assay for differentiating European and South Asian ancestries.
Phillips, C; Freire Aradas, A; Kriegel, A K; Fondevila, M; Bulbul, O; Santos, C; Serrulla Rech, F; Perez Carceles, M D; Carracedo, Á; Schneider, P M; Lareu, M V
2013-05-01
We have selected a set of single nucleotide polymorphisms (SNPs) with the specific aim of differentiating European and South Asian ancestries. The SNPs were combined into a 23-plex SNaPshot primer extension assay: Eurasiaplex, designed to complement an existing 34-plex forensic ancestry test, with both marker sets occupying well-spaced genomic positions, enabling their combination as single profile submissions to the Bayesian Snipper forensic ancestry inference system. We analyzed the ability of Eurasiaplex plus 34-plex SNPs to assign ancestry to a total of 1648 profiles from 16 European, 7 Middle East, 13 Central-South Asian and 21 East Asian populations. Ancestry assignment likelihoods were estimated from Snipper using training sets of five-group data (three Eurasian groups, East Asian and African genotypes) and four-group data (Middle East genotypes removed). Five-group differentiations gave assignment success of 91% for NW European populations, 72% for Middle East populations and 39% for Central-South Asian populations, indicating Middle East individuals are not reliably differentiated from either Europeans or Central-South Asians. Four-group differentiations provided markedly improved assignment success rates of 97% for most continental Europeans tested (excluding Turkish and Adygei at the far eastern edge of Europe) and 95% for Central-South Asians, despite applying a probability threshold for the highest likelihood ratio above '100 times more likely'. As part of the assessment of the sensitivity of Eurasiaplex to analyze challenging forensic material, we detail Eurasiaplex and 34-plex SNP typing to infer the ancestry of a cranium recovered from the sea, achieving 82% SNP genotype completeness. Eurasiaplex therefore provides an informative and forensically robust approach to the differentiation of European and South Asian ancestries amongst Eurasian populations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Revised standards for statistical evidence.
Johnson, Valen E
2013-11-26
Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25-50:1, and to 100-200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
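For a one-sided z-test, Johnson's correspondence gives the UMPBT evidence threshold matching a test of size alpha as exp(z_alpha^2 / 2); the sketch below (assuming SciPy is available) reproduces the 25-50:1 and 100-200:1 figures quoted above.

```python
from math import exp
from scipy.stats import norm

# UMPBT evidence threshold for a one-sided z-test of size alpha:
# gamma = exp(z_alpha^2 / 2), per Johnson's correspondence.
for alpha in (0.05, 0.005, 0.001):
    z = norm.ppf(1 - alpha)
    print(f"alpha = {alpha}: Bayes factor threshold ~ {exp(z * z / 2):.0f}")
# alpha = 0.005 gives ~28 (inside 25-50:1); 0.001 gives ~119 (inside 100-200:1)
```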
Probabilistic Cross-identification of Cosmic Events
NASA Astrophysics Data System (ADS)
Budavári, Tamás
2011-08-01
I discuss a novel approach to identifying cosmic events in separate and independent observations. The focus is on the true events, such as supernova explosions, that happen once and, hence, whose measurements are not repeatable. Their classification and analysis must make the best use of all available data. Bayesian hypothesis testing is used to associate streams of events in space and time. Probabilities are assigned to the matches by studying their rates of occurrence. A case study of Type Ia supernovae illustrates how to use light curves in the cross-identification process. Constraints from realistic light curves happen to be well approximated by Gaussians in time, which makes the matching process very efficient. Model-dependent associations are computationally more demanding but can further boost one's confidence.
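A hedged sketch of the Gaussian-in-time matching idea (not the paper's full formalism): marginalizing a common event epoch over a uniform window of length T gives a closed-form Bayes factor for whether two timing measurements belong to the same event.

```python
import numpy as np

def coincidence_bayes_factor(t1, s1, t2, s2, T):
    """Bayes factor that two event-time measurements t1 +/- s1 and
    t2 +/- s2 refer to the same event, against independent events,
    assuming a uniform prior on event times over a window of length T.
    Marginalizing the common epoch gives
    BF = T * Normal(t1 - t2; 0, s1^2 + s2^2)."""
    var = s1 ** 2 + s2 ** 2
    return T * np.exp(-(t1 - t2) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Two detections 0.8 days apart, with 1.0- and 1.5-day timing uncertainty,
# in a one-year survey window (all numbers hypothetical)
print(coincidence_bayes_factor(100.0, 1.0, 100.8, 1.5, T=365.0))
```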
NASA Astrophysics Data System (ADS)
Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens
2018-06-01
This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter, along with high-resolution sub-bottom profiling, for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea, Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder, and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting at 6 and 12 kHz. The high density of seafloor soundings allowed backscatter layers to be extracted for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle follow a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA analysis is based on the transformation of these five layers into two principal components that capture most of the data variability. These principal components were clustered into five classes after running an external cluster validation test. In general, both MLC and PCA separated the various sediment types effectively, showing good agreement (kappa > 0.7) with the Bayesian approach, which in turn correlates well with ground-truth data (r2 > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joint interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for better understanding seafloor backscatter patchiness and for discriminating acoustically similar classes in different geological/bathymetric settings.
Bayesian Model Comparison for the Order Restricted RC Association Model
ERIC Educational Resources Information Center
Iliopoulos, G.; Kateri, M.; Ntzoufras, I.
2009-01-01
Association models constitute an attractive alternative to the usual log-linear models for modeling the dependence between classification variables. They impose special structure on the underlying association by assigning scores on the levels of each classification variable, which can be fixed or parametric. Under the general row-column (RC)…
NASA Astrophysics Data System (ADS)
Ogorodnikov, Yuri; Khachay, Michael; Pljonkin, Anton
2018-04-01
We describe the possibility of employing a special case of the 3-SAT problem, stemming from the well-known integer factorization problem, for quantum cryptography. It is known that for every instance of our 3-SAT setting the given 3-CNF is satisfiable by a unique truth assignment, and the goal is to find this assignment. Since the complexity status of the factorization problem is still undefined, the development of approximation algorithms and heuristics attracts the interest of numerous researchers. One promising approach to constructing approximation techniques is based on a real-valued relaxation of the given 3-CNF, followed by minimization of an appropriate differentiable loss function and subsequent rounding of the fractional minimizer obtained. Algorithms developed this way differ in the rounding scheme applied at their final stage. We propose a new rounding scheme based on Bayesian learning. The article shows that the proposed method can be used to assess security in quantum key distribution systems, where Shannon's rules are applied and the factorization problem is paramount when decrypting secret keys.
Bayesian modeling of consumer behavior in the presence of anonymous visits
NASA Astrophysics Data System (ADS)
Novak, Julie Esther
Tailoring content to consumers has become a hallmark of marketing and digital media, particularly as it has become easier to identify customers across usage or purchase occasions. However, across a wide variety of contexts, companies find that customers do not consistently identify themselves, leaving a substantial fraction of anonymous visits. We develop a Bayesian hierarchical model that allows us to probabilistically assign anonymous sessions to users. These probabilistic assignments take into account a customer's demographic information, frequency of visitation, activities taken when visiting, and times of arrival. We present two studies, one with synthetic and one with real data, where we demonstrate improved performance over two popular practices (nearest-neighbor matching and deleting the anonymous visits) due to increased efficiency and reduced bias driven by the non-ignorability of which types of events are more likely to be anonymous. Using our proposed model, we avoid potential bias in understanding the effect of a firm's marketing on its customers, improve inference about the total number of customers in the dataset, and provide more precise targeted marketing to both previously observed and unobserved customers.
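As a toy illustration of probabilistic session assignment (the hierarchical model in the abstract is far richer), the sketch below combines a visit-frequency prior with a multinomial likelihood over session activities; all users, rates, and activity mixes are hypothetical.

```python
import numpy as np

# Hypothetical per-user profiles: visit rate (prior weight) and the
# probability of each of three activity types during a session.
users = {
    "u1": {"rate": 0.8, "act": [0.7, 0.2, 0.1]},
    "u2": {"rate": 0.2, "act": [0.1, 0.3, 0.6]},
}

def assign(session_acts):
    """Posterior over known users for one anonymous session: prior
    proportional to visit frequency, likelihood from the activity mix."""
    post = {}
    for u, prof in users.items():
        lik = np.prod([prof["act"][a] for a in session_acts])
        post[u] = prof["rate"] * lik
    z = sum(post.values())
    return {u: v / z for u, v in post.items()}

print(assign([0, 0, 2]))  # session with two type-0 activities and one type-2
```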
Harsch, Tobias; Schneider, Philipp; Kieninger, Bärbel; Donaubauer, Harald; Kalbitzer, Hans Robert
2017-02-01
Side chain amide protons of asparagine and glutamine residues in random-coil peptides are characterized by large chemical shift differences and can be stereospecifically assigned on the basis of their chemical shift values only. The bimodal chemical shift distributions stored in the Biological Magnetic Resonance Data Bank (BMRB) do not allow such an assignment. However, an analysis of the BMRB shows that a substantial part of all stored stereospecific assignments is not correct. We show here that in most cases stereospecific assignment can also be done for folded proteins using an unbiased artificial chemical shift database (UACSB). For a separation of the chemical shifts of the two amide resonance lines with differences ≥0.40 ppm for asparagine and ≥0.42 ppm for glutamine, the downfield-shifted resonance lines can be assigned to Hδ21 and Hε21, respectively, at a confidence level >95%. A classifier derived from UACSB can also be used to correct the BMRB data. The program tool AssignmentChecker implemented in AUREMOL calculates the Bayesian probability for a given stereospecific assignment and automatically corrects the assignments for a given list of chemical shifts.
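The decision rule reported in the abstract is simple enough to state as code; the thresholds are taken directly from the text, while the function name and interface are invented for illustration.

```python
def assign_side_chain(shift_a, shift_b, residue):
    """Assign the downfield-shifted amide proton to Hd21 (Asn) or He21
    (Gln) when the separation reaches the thresholds reported in the
    abstract (>= 0.40 ppm for Asn, >= 0.42 ppm for Gln); otherwise
    leave the pair unassigned."""
    threshold = {"ASN": 0.40, "GLN": 0.42}[residue]
    if abs(shift_a - shift_b) < threshold:
        return None  # below the ~95%-confidence separation
    downfield = max(shift_a, shift_b)  # larger ppm value is downfield
    label = "HD21" if residue == "ASN" else "HE21"
    return {label: downfield}

print(assign_side_chain(7.62, 6.95, "ASN"))  # {'HD21': 7.62}
```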
NASA Astrophysics Data System (ADS)
Plant, N. G.; Thieler, E. R.; Gutierrez, B.; Lentz, E. E.; Zeigler, S. L.; Van Dongeren, A.; Fienen, M. N.
2016-12-01
We evaluate the strengths and weaknesses of Bayesian networks that have been used to address scientific and decision-support questions related to coastal geomorphology. We provide an overview of coastal geomorphology research that has used Bayesian networks and describe what this approach can do and when it works (or fails to work). Over the past decade, Bayesian networks have been formulated to analyze the multivariate structure and evolution of coastal morphology and associated human and ecological impacts. The approach relates observable system variables to each other by estimating discrete correlations. The resulting Bayesian networks make predictions that propagate errors, conduct inference via Bayes' rule, or both. In scientific applications, the model results are useful for hypothesis testing, using confidence estimates to gauge the strength of tests, while applications to coastal resource management are aimed at decision support, where the probabilities of desired ecosystem outcomes are evaluated. The range of Bayesian network applications to coastal morphology includes emulation of high-resolution wave transformation models to make oceanographic predictions, morphologic response to storms and/or sea-level rise, groundwater response to sea-level rise and morphologic variability, habitat suitability for endangered species, and assessment of monetary or human-life risk associated with storms. All of these examples are based on vast observational data sets, numerical model output, or both. We discuss the progression of our experiments, which has included testing whether the Bayesian network approach can be implemented and is appropriate for addressing basic and applied scientific problems, and evaluating the hindcast and forecast skill of these implementations. We present and discuss calibration/validation tests used to assess the robustness of Bayesian network models and compare these results to tests of other models. This demonstrates how Bayesian networks are used to extract new insights about coastal morphologic behavior, assess impacts to societal and ecological systems, and communicate probabilistic predictions to decision makers.
Precise Network Modeling of Systems Genetics Data Using the Bayesian Network Webserver.
Ziebarth, Jesse D; Cui, Yan
2017-01-01
The Bayesian Network Webserver (BNW, http://compbio.uthsc.edu/BNW ) is an integrated platform for Bayesian network modeling of biological datasets. It provides a web-based network modeling environment that seamlessly integrates advanced algorithms for probabilistic causal modeling and reasoning with Bayesian networks. BNW is designed for precise modeling of relatively small networks that contain fewer than 20 nodes. The structure learning algorithms used by BNW guarantee the discovery of the best (most probable) network structure given the data. To facilitate network modeling across multiple biological levels, BNW provides a very flexible interface that allows users to assign network nodes to different tiers and define the relationships between and within the tiers. This function is particularly useful for modeling systems genetics datasets that often consist of multiscalar heterogeneous genotype-to-phenotype data. BNW enables users to go, within seconds or minutes, from a simply formatted input file containing a dataset to using a network model to make predictions about the interactions between variables and the potential effects of experimental interventions. In this chapter, we introduce the functions of BNW and show how to model systems genetics datasets with BNW.
Bayesian reconstruction and use of anatomical a priori information for emission tomography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowsher, J.E.; Johnson, V.E.; Turkington, T.G.
1996-10-01
A Bayesian method is presented for simultaneously segmenting and reconstructing emission computed tomography (ECT) images and for incorporating high-resolution, anatomical information into those reconstructions. The anatomical information is often available from other imaging modalities such as computed tomography (CT) or magnetic resonance imaging (MRI). The Bayesian procedure models the ECT radiopharmaceutical distribution as consisting of regions, such that radiopharmaceutical activity is similar throughout each region. It estimates the number of regions, the mean activity of each region, and the region classification and mean activity of each voxel. Anatomical information is incorporated by assigning higher prior probabilities to ECT segmentations in which each ECT region stays within a single anatomical region. This approach is effective because anatomical tissue type often strongly influences radiopharmaceutical uptake. The Bayesian procedure is evaluated using physically acquired single-photon emission computed tomography (SPECT) projection data and MRI for the three-dimensional (3-D) Hoffman brain phantom. A clinically realistic count level is used. A cold lesion within the brain phantom is created during the SPECT scan but not during the MRI to demonstrate that the estimation procedure can detect ECT structure that is not present anatomically.
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019
NASA Astrophysics Data System (ADS)
Fuchs, Christopher A.; Schack, Rüdiger
2013-10-01
In the quantum-Bayesian interpretation of quantum theory (or QBism), the Born rule cannot be interpreted as a rule for setting measurement-outcome probabilities from an objective quantum state. But if not, what is the role of the rule? In this paper, the argument is given that it should be seen as an empirical addition to Bayesian reasoning itself. Particularly, it is shown how to view the Born rule as a normative rule in addition to usual Dutch-book coherence. It is a rule that takes into account how one should assign probabilities to the consequences of various intended measurements on a physical system, but explicitly in terms of prior probabilities for and conditional probabilities consequent upon the imagined outcomes of a special counterfactual reference measurement. This interpretation is exemplified by representing quantum states in terms of probabilities for the outcomes of a fixed, fiducial symmetric informationally complete measurement. The extent to which the general form of the new normative rule implies the full state-space structure of quantum mechanics is explored.
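For orientation, the concrete form of this normative rule for a d-dimensional system with a SIC reference measurement, as it usually appears in the QBist literature (stated here from memory of that literature, so treat it as a sketch), relates the probability of an outcome of an intended measurement to the probabilities assigned to the counterfactual reference measurement:

```latex
Q(D_j) \;=\; \sum_{i=1}^{d^2} \left[ (d+1)\,P(H_i) - \frac{1}{d} \right] P(D_j \mid H_i)
```

Here P(H_i) are the prior probabilities for the outcomes of the fiducial SIC measurement and P(D_j | H_i) the conditional probabilities for the intended measurement's outcomes; the deviation from the classical law of total probability is the "empirical addition" the abstract describes.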
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference for Receiver Operating Characteristic (ROC) data. The model is a simple version of a general non-linear regression model. Unlike the Dorfman model, it uses a probit link function with a zero-one covariate to express the binormal distributions in a single formula, and it also includes a scale parameter. Bayesian inference is implemented by the Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach treats model parameters as random variables characterized by prior distributions. With the substantial number of simulated samples generated by the sampling algorithm, posterior distributions of the parameters, as well as the parameters themselves, can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires that the probability density function (pdf) from which samples are drawn be log-concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log-concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed with the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.
Informative priors on fetal fraction increase power of the noninvasive prenatal screen.
Xu, Hanli; Wang, Shaowei; Ma, Lin-Lin; Huang, Shuai; Liang, Lin; Liu, Qian; Liu, Yang-Yang; Liu, Ke-Di; Tan, Ze-Min; Ban, Hao; Guan, Yongtao; Lu, Zuhong
2017-11-09
Purpose: Noninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls, where the information of fetal fraction is not fully utilized. Here we present a Bayesian method that leverages informative priors on the fetal fraction. Method: Our Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. The Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values. Results: Our Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives. Conclusion: Bayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values. Genetics in Medicine advance online publication, 9 November 2017; doi:10.1038/gim.2017.186.
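A hedged numeric sketch of the idea: under euploidy the dosage z-score is standard normal, while under trisomy its shift grows with the fetal fraction f; integrating a prior on f (learned, per the abstract, from the sex chromosomes) yields a Bayes factor. The shift constant k and the prior parameters below are hypothetical, not the paper's calibration.

```python
import numpy as np
from scipy.stats import norm

def bayes_factor(z, f_prior_mean=0.10, f_prior_sd=0.03, k=80.0):
    """Bayes factor for trisomy vs euploidy given a chromosome-dosage
    z-score. Euploidy: z ~ N(0, 1). Trisomy: z ~ N(k * f, 1), where f is
    the fetal fraction and k a hypothetical depth-dependent constant.
    The prior on f is integrated out on a grid."""
    f = np.linspace(0.01, 0.30, 500)
    prior = norm.pdf(f, f_prior_mean, f_prior_sd)
    prior /= prior.sum()                       # discrete prior weights
    lik_h1 = norm.pdf(z, loc=k * f, scale=1.0)
    return (lik_h1 * prior).sum() / norm.pdf(z, 0.0, 1.0)

print(bayes_factor(4.0))  # evidence for trisomy at z = 4
```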
Application of Bayesian Methods for Detecting Fraudulent Behavior on Tests
ERIC Educational Resources Information Center
Sinharay, Sandip
2018-01-01
Producers and consumers of test scores are increasingly concerned about fraudulent behavior before and during the test. There exist several statistical or psychometric methods for detecting fraudulent behavior on tests. This paper provides a review of the Bayesian approaches among them. Four hitherto-unpublished real data examples are provided to…
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood.
Karabatsos, George
2018-06-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon previous methods because it provides an omnibus test of the entire hierarchy of cancellation axioms, beyond double cancellation. It does so while accounting for the posterior uncertainty inherent in the empirical orderings implied by these axioms taken together. The new method is illustrated through a test of the cancellation axioms on a classic survey data set, and through the analysis of simulated data.
Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina
2013-01-01
In the last decade, different statistical techniques have been introduced to improve the assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. First, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Second, we tested the accuracy of our model in differentiating between patients who had real cognitive deficits (cognitively impaired group) and participants from the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores, although overall accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients in the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(z_i|D), of examinees' effort levels. In conclusion, both the high classification accuracy and the Bayesian individual estimates of effort may be very useful for clinicians when assessing effort in medico-legal settings.
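A minimal sketch of a latent-group analysis of a 2AFC score (not the authors' full model): the posterior mass of the accuracy parameter below chance, against the mass at or above chance, gives a crude posterior probability of feigning. The counts and the uniform grid prior are hypothetical.

```python
import numpy as np

correct, trials = 18, 50   # hypothetical 2AFC forced-choice score
p = np.linspace(1e-3, 1 - 1e-3, 1000)          # grid over true accuracy
lik = p ** correct * (1 - p) ** (trials - correct)

# Latent groups: honest responders perform at or above chance (p >= 0.5),
# feigners below chance; equal prior mass on the two groups, uniform
# within each group.
honest = np.where(p >= 0.5, lik, 0.0).sum()
feign = np.where(p < 0.5, lik, 0.0).sum()
print("posterior P(feigning) =", feign / (honest + feign))
```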
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software, called ABrox, is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Finally, throughout the paper, we introduce ABrox using the accompanying graphical user interface.
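ABrox wraps this workflow in a GUI; the core rejection-ABC loop it relies on can be sketched in a few lines. The model, prior, summary statistic, and tolerance below are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
observed = rng.normal(0.4, 1.0, size=50)   # stand-in for observed data
obs_mean = observed.mean()

def abc_rejection(n_sims=100000, eps=0.01):
    """Rejection ABC for the mean of a Normal(mu, 1) model: draw mu from
    the prior, simulate data of the same size, and keep mu whenever the
    simulated summary statistic lands within eps of the observed one."""
    mu = rng.normal(0, 2, size=n_sims)                      # prior on mu
    sims = rng.normal(mu, 1.0, size=(50, n_sims)).mean(axis=0)
    return mu[np.abs(sims - obs_mean) < eps]

posterior = abc_rejection()
print(posterior.mean(), posterior.std())   # approximate posterior summary
```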
Bayesian inference for disease prevalence using negative binomial group testing
Pritchard, Nicholas A.; Tebbs, Joshua M.
2011-01-01
Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308
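The paper derives closed-form posteriors; for illustration, the same posterior can be approximated on a grid. With pools of size k and inverse sampling run until r positive pools are observed, the likelihood kernel in the pool-positivity probability pi = 1 - (1 - p)^k is negative binomial. All counts below are hypothetical.

```python
import numpy as np

# Hypothetical inverse-sampling design: pools of size k tested until
# r positive pools are observed; y negative pools were seen on the way.
k, r, y = 10, 5, 42

p = np.linspace(1e-6, 0.2, 2000)           # grid over disease prevalence
pi = 1.0 - (1.0 - p) ** k                  # P(a pool tests positive)
prior = np.ones_like(p)                    # flat prior; swap in a Beta pdf
post = prior * pi ** r * (1.0 - pi) ** y   # negative binomial kernel
post /= post.sum() * (p[1] - p[0])         # normalize on the grid
print("posterior mean prevalence:", (p * post).sum() * (p[1] - p[0]))
```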
Zhao, Rui; Catalano, Paul; DeGruttola, Victor G.; Michor, Franziska
2017-01-01
The dynamics of tumor burden, secreted proteins or other biomarkers over time, is often used to evaluate the effectiveness of therapy and to predict outcomes for patients. Many methods have been proposed to investigate longitudinal trends to better characterize patients and to understand disease progression. However, most approaches assume a homogeneous patient population and a uniform response trajectory over time and across patients. Here, we present a mixture piecewise linear Bayesian hierarchical model, which takes into account both population heterogeneity and nonlinear relationships between biomarkers and time. Simulation results show that our method was able to classify subjects according to their patterns of treatment response with greater than 80% accuracy in the three scenarios tested. We then applied our model to a large randomized controlled phase III clinical trial of multiple myeloma patients. Analysis results suggest that the longitudinal tumor burden trajectories in multiple myeloma patients are heterogeneous and nonlinear, even among patients assigned to the same treatment cohort. In addition, between cohorts, there are distinct differences in terms of the regression parameters and the distributions among categories in the mixture. Those results imply that longitudinal data from clinical trials may harbor unobserved subgroups and nonlinear relationships; accounting for both may be important for analyzing longitudinal data. PMID:28723910
Using Bayesian Learning to Classify College Algebra Students by Understanding in Real-Time
ERIC Educational Resources Information Center
Cousino, Andrew
2013-01-01
The goal of this work is to provide instructors with detailed information about their classes at each assignment during the term. The information is both on an individual level and at the aggregate level. We used the large number of grades, which are available online these days, along with data-mining techniques to build our models. This enabled…
A fast combination method in DSmT and its application to recommender system
Liu, Yihai
2018-01-01
In many applications involving epistemic uncertainties, usually modeled by belief functions, it is often necessary to approximate general (non-Bayesian) basic belief assignments (BBAs) by subjective probabilities (called Bayesian BBAs). This necessity occurs if one needs to embed the fusion result in a system based on the probabilistic framework and Bayesian inference (e.g. tracking systems), or if one needs to make a decision in decision-making problems. In this paper, we present a new fast combination method, called modified rigid coarsening (MRC), to obtain the final Bayesian BBAs based on hierarchical decomposition (coarsening) of the frame of discernment. In this method, focal elements with probabilities are coarsened efficiently to reduce the computational complexity of combination by using a disagreement vector and a simple dichotomous approach. In order to prove the practicality of our approach, the new method is applied to combine users' soft preferences in recommender systems (RSs). Additionally, to make a comprehensive performance comparison, the proportional conflict redistribution rule #6 (PCR6) is used as a baseline in a range of experiments. According to the results of these experiments, MRC is more effective than the original rigid coarsening (RC) method in recommendation accuracy, while comparable in computational time. PMID:29351297
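MRC itself is specified in the paper; a standard alternative for the same task of turning a general BBA into a Bayesian one is the pignistic transform, sketched here for comparison.

```python
def pignistic(bba):
    """Approximate a general basic belief assignment (BBA) by a Bayesian
    one via the pignistic transform BetP(x) = sum_{A : x in A} m(A)/|A|.
    A standard transform, shown as a stand-in for the paper's MRC
    coarsening, which pursues the same goal more efficiently."""
    betp = {}
    for focal, mass in bba.items():
        for x in focal:
            betp[x] = betp.get(x, 0.0) + mass / len(focal)
    return betp

# Hypothetical BBA over the frame {'a', 'b', 'c'}
m = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frozenset("abc"): 0.2}
print(pignistic(m))  # approximately {'a': 0.72, 'b': 0.22, 'c': 0.07}
```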
Holm Hansen, Christian; Warner, Pamela; Parker, Richard A; Walker, Brian R; Critchley, Hilary Od; Weir, Christopher J
2017-12-01
It is often unclear what specific adaptive trial design features lead to an efficient design which is also feasible to implement. This article describes the preparatory simulation study for a Bayesian response-adaptive dose-finding trial design. Dexamethasone for Excessive Menstruation aims to assess the efficacy of Dexamethasone in reducing excessive menstrual bleeding and to determine the best dose for further study. To maximise learning about the dose response, patients receive placebo or an active dose with randomisation probabilities adapting based on evidence from patients already recruited. The dose-response relationship is estimated using a flexible Bayesian Normal Dynamic Linear Model. Several competing design options were considered including: number of doses, proportion assigned to placebo, adaptation criterion, and number and timing of adaptations. We performed a fractional factorial study using SAS software to simulate virtual trial data for candidate adaptive designs under a variety of scenarios and to invoke WinBUGS for Bayesian model estimation. We analysed the simulated trial results using Normal linear models to estimate the effects of each design feature on empirical type I error and statistical power. Our readily-implemented approach using widely available statistical software identified a final design which performed robustly across a range of potential trial scenarios.
Gu, Hairong; Kim, Woojae; Hou, Fang; Lesmes, Luis Andres; Pitt, Mark A.; Lu, Zhong-Lin; Myung, Jay I.
2016-01-01
Measurement efficiency is of concern when a large number of observations are required to obtain reliable estimates for parametric models of vision. The standard entropy-based Bayesian adaptive testing procedures addressed the issue by selecting the most informative stimulus in sequential experimental trials. Noninformative, diffuse priors were commonly used in those tests. Hierarchical adaptive design optimization (HADO; Kim, Pitt, Lu, Steyvers, & Myung, 2014) further improves the efficiency of the standard Bayesian adaptive testing procedures by constructing an informative prior using data from observers who have already participated in the experiment. The present study represents an empirical validation of HADO in estimating the human contrast sensitivity function. The results show that HADO significantly improves the accuracy and precision of parameter estimates, and therefore requires many fewer observations to obtain reliable inference about contrast sensitivity, compared to the method of quick contrast sensitivity function (Lesmes, Lu, Baek, & Albright, 2010), which uses the standard Bayesian procedure. The improvement with HADO was maintained even when the prior was constructed from heterogeneous populations or a relatively small number of observers. These results of this case study support the conclusion that HADO can be used in Bayesian adaptive testing by replacing noninformative, diffuse priors with statistically justified informative priors without introducing unwanted bias. PMID:27105061
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
Exact Bayesian p-values for a test of independence in a 2 × 2 contingency table with missing data.
Lin, Yan; Lipsitz, Stuart R; Sinha, Debajyoti; Fitzmaurice, Garrett; Lipshultz, Steven
2017-01-01
Altham (Altham PME. Exact Bayesian analysis of a 2 × 2 contingency table, and Fisher's "exact" significance test. J R Stat Soc B 1969; 31: 261-269) showed that a one-sided p-value from Fisher's exact test of independence in a 2 × 2 contingency table is equal to the posterior probability of negative association in the 2 × 2 contingency table under a Bayesian analysis using an improper prior. We derive an extension of Fisher's exact test p-value in the presence of missing data, assuming the missing data mechanism is ignorable (i.e., missing at random or completely at random). Further, we propose Bayesian p-values for a test of independence in a 2 × 2 contingency table with missing data using alternative priors; we also present results from a simulation study exploring the Type I error rate and power of the proposed exact test p-values. An example, using data on the association between blood pressure and a cardiac enzyme, is presented to illustrate the methods.
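Altham's equality can be checked numerically: the one-sided Fisher p-value sits close to the posterior probability of negative association computed from independent Beta posteriors. Exact equality requires the particular improper prior she used; Jeffreys priors are substituted below, and the table is hypothetical.

```python
import numpy as np
from scipy.stats import fisher_exact

a, b, c, d = 12, 3, 5, 10          # hypothetical 2x2 table (complete data)
_, p_one_sided = fisher_exact([[a, b], [c, d]], alternative="greater")

rng = np.random.default_rng(3)
# Posterior draws for the two row probabilities under Jeffreys Beta(0.5,
# 0.5) priors; Pr(p1 <= p2 | data) approximates the posterior probability
# of negative association.
p1 = rng.beta(a + 0.5, b + 0.5, size=200000)
p2 = rng.beta(c + 0.5, d + 0.5, size=200000)
print(p_one_sided, np.mean(p1 <= p2))   # the two values should be close
```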
Jiménez, Rosa Alicia
2016-01-01
The influence of geologic history and Pleistocene glacial cycles might result in complex morphological and genetic scenarios in the biota of the Mesoamerican region. We tested whether berylline, blue-tailed and steely-blue hummingbirds, Amazilia beryllina, Amazilia cyanura and Amazilia saucerottei, show evidence of historical or current introgression as their plumage colour variation might suggest. We also analysed the role of past and present climatic events in promoting genetic introgression and species diversification. We collected mitochondrial DNA (mtDNA) sequence data and microsatellite loci scores for populations throughout the range of the three Amazilia species, as well as morphological and ecological data. Haplotype network, Bayesian phylogenetic and divergence time inference, historical demography, palaeodistribution modelling, and niche divergence tests were used to reconstruct the evolutionary history of this Amazilia species complex. An isolation-with-migration coalescent model and Bayesian assignment analysis were used to assess historical introgression and current genetic admixture. mtDNA haplotypes were geographically unstructured, with haplotypes from disparate areas interspersed on a shallow tree and an unresolved haplotype network. Assignment analysis of the nuclear genome (nuDNA) supported three genetic groups with signs of genetic admixture, corresponding to: (1) A. beryllina populations located west of the Isthmus of Tehuantepec; (2) A. cyanura populations between the Isthmus of Tehuantepec and the Nicaraguan Depression (Nuclear Central America); and (3) A. saucerottei populations southeast of the Nicaraguan Depression. Gene flow and divergence time estimates, and demographic and palaeodistribution patterns, suggest an evolutionary history of introgression mediated by Quaternary climatic fluctuations. High levels of gene flow were indicated by mtDNA and asymmetrical isolation-with-migration, whereas the microsatellite analyses found evidence for three genetic clusters with distributions corresponding to isolation by the Isthmus of Tehuantepec and the Nicaraguan Depression and signs of admixture. Historical levels of migration between genetically distinct groups estimated using microsatellites were higher than contemporary levels of migration. These results support the scenario of secondary contact and range contact during the glacial periods of the Pleistocene and strongly imply that the high levels of structure currently observed are a consequence of the limited dispersal of these hummingbirds across the isthmus and depression barriers. PMID:26788433
NASA Astrophysics Data System (ADS)
Richards, Vincent P.; Bernard, Andrea M.; Feldheim, Kevin A.; Shivji, Mahmood S.
2016-09-01
Sponges are one of the dominant fauna on Florida and Caribbean coral reefs, with species diversity often exceeding that of scleractinian corals. Despite the key role of sponges as structural components, habitat providers, and nutrient recyclers in reef ecosystems, their dispersal dynamics are little understood. We used ten microsatellite markers to study the population structure and dispersal patterns of a prominent reef species, the giant barrel sponge (Xestospongia muta), the long-lived "redwood" of the reef, throughout Florida and the Caribbean. F-statistics, exact tests of population differentiation, and Bayesian multi-locus genotype analyses revealed high levels of overall genetic partitioning (FST = 0.12, P = 0.001) and grouped 363 individuals collected from the Bahamas, Honduras, US Virgin Islands, Key Largo (Florida), and the remainder of the Florida reef tract into at least five genetic clusters (K = 5). Exact tests, however, revealed further differentiation, grouping sponges sampled from five locations across the Florida reef tract (~250 km) into three populations, suggesting a total of six genetic populations across the eight locations sampled. Assignment tests showed dispersal over ecological timescales to be limited to relatively short distances, as the only migration detected among populations was within the Florida reef tract. Consequently, populations of this major coral reef benthic constituent appear largely self-recruiting. A combination of levels of genetic differentiation, genetic distance, and assignment tests supports the important role of the Caribbean and Florida currents in shaping patterns of contemporary and historical gene flow in this widespread coral reef species.
Using Bayesian Networks to Improve Knowledge Assessment
ERIC Educational Resources Information Center
Millan, Eva; Descalco, Luis; Castillo, Gladys; Oliveira, Paula; Diogo, Sandra
2013-01-01
In this paper, we describe the integration and evaluation of an existing generic Bayesian student model (GBSM) into an existing computerized testing system within the Mathematics Education Project (PmatE--Projecto Matematica Ensino) of the University of Aveiro. This generic Bayesian student model had been previously evaluated with simulated…
Using Alien Coins to Test Whether Simple Inference Is Bayesian
ERIC Educational Resources Information Center
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Uncertain deduction and conditional reasoning.
Evans, Jonathan St B T; Thompson, Valerie A; Over, David E
2015-01-01
There has been a paradigm shift in the psychology of deductive reasoning. Many researchers no longer think it is appropriate to ask people to assume premises and decide what necessarily follows, with the results evaluated by binary extensional logic. Most everyday and scientific inference is made from more or less confidently held beliefs and not assumptions, and the relevant normative standard is Bayesian probability theory. We argue that the study of "uncertain deduction" should directly ask people to assign probabilities to both premises and conclusions, and report an experiment using this method. We assess this reasoning by two Bayesian metrics: probabilistic validity and coherence according to probability theory. On both measures, participants perform above chance in conditional reasoning, but they do much better when statements are grouped as inferences, rather than evaluated in separate tasks. PMID:25904888
Bayesian Learning and the Psychology of Rule Induction
ERIC Educational Resources Information Center
Endress, Ansgar D.
2013-01-01
In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to…
The researcher and the consultant: from testing to probability statements.
Hamra, Ghassan B; Stang, Andreas; Poole, Charles
2015-09-01
In the first instalment of this series, Stang and Poole provided an overview of Fisher significance testing (ST), Neyman-Pearson null hypothesis testing (NHT), and their unfortunate and unintended offspring, null hypothesis significance testing. In addition to elucidating the distinction between the first two and the evolution of the third, the authors alluded to alternative models of statistical inference; namely, Bayesian statistics. Bayesian inference has experienced a revival in recent decades, with many researchers advocating for its use as both a complement and an alternative to NHT and ST. This article will continue in the direction of the first instalment, providing practicing researchers with an introduction to Bayesian inference. Our work will draw on the examples and discussion of the previous dialogue.
Bayesian Inference in the Modern Design of Experiments
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
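As a concrete, if much simplified, instance of the prior-to-posterior revision the tutorial describes, the following sketch performs a conjugate normal-normal update of a measured quantity; the prior, the noise level, and the observations are all hypothetical.

    import numpy as np

    # Hypothetical prior knowledge about a measured coefficient.
    mu0, var0 = 0.250, 0.010**2       # prior mean and variance
    sigma2 = 0.008**2                 # assumed known measurement variance
    y = np.array([0.262, 0.258, 0.265, 0.259])  # hypothetical new observations

    # Conjugate normal-normal update: posterior precision is the sum of prior
    # and data precisions; the posterior mean is the precision-weighted average.
    n = len(y)
    var_post = 1.0 / (1.0 / var0 + n / sigma2)
    mu_post = var_post * (mu0 / var0 + y.sum() / sigma2)

    print(f"posterior: mean={mu_post:.4f}, sd={np.sqrt(var_post):.4f}")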
Bayesian Estimation Supersedes the "t" Test
ERIC Educational Resources Information Center
Kruschke, John K.
2013-01-01
Bayesian estimation for 2 groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data. The method handles outliers. The decision rule can accept the null value (unlike traditional "t" tests) when certainty in the estimate is…
Lim, Cherry; Wannapinij, Prapass; White, Lisa; Day, Nicholas P J; Cooper, Ben S; Peacock, Sharon J; Limmathurotsakul, Direk
2013-01-01
Estimates of the sensitivity and specificity for new diagnostic tests based on evaluation against a known gold standard are imprecise when the accuracy of the gold standard is imperfect. Bayesian latent class models (LCMs) can be helpful under these circumstances, but the necessary analysis requires expertise in computational programming. Here, we describe open-access web-based applications that allow non-experts to apply Bayesian LCMs to their own data sets via a user-friendly interface. Applications for Bayesian LCMs were constructed on a web server using R and WinBUGS programs. The models provided (http://mice.tropmedres.ac) include two Bayesian LCMs: the two-tests in two-population model (Hui and Walter model) and the three-tests in one-population model (Walter and Irwig model). Both models are available with simplified and advanced interfaces. In the former, all settings for Bayesian statistics are fixed as defaults. Users input their data set into a table provided on the webpage. Disease prevalence and accuracy of diagnostic tests are then estimated using the Bayesian LCM, and provided on the web page within a few minutes. With the advanced interfaces, experienced researchers can modify all settings in the models as needed. These settings include correlation among diagnostic test results and prior distributions for all unknown parameters. The web pages provide worked examples with both models using the original data sets presented by Hui and Walter in 1980, and by Walter and Irwig in 1988. We also illustrate the utility of the advanced interface using the Walter and Irwig model on a data set from a recent melioidosis study. The results obtained from the web-based applications were comparable to those published previously. The newly developed web-based applications are open-access and provide an important new resource for researchers worldwide to evaluate new diagnostic tests.
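For readers who want to see the model rather than the interface, the sketch below writes out the likelihood kernel of the Hui and Walter two-tests-in-two-populations model under conditional independence. The cell layout and counts are invented, and a full Bayesian analysis would place priors on the six parameters and sample the posterior (e.g., with MCMC in WinBUGS, as the applications do) rather than merely evaluating this function.

    import numpy as np

    def hui_walter_loglik(params, counts):
        # Log-likelihood kernel of the two-tests, two-populations model.
        # params: (prev1, prev2, se1, sp1, se2, sp2)
        # counts: 2x4 array; row k holds counts of test-result patterns
        #         (+ +), (+ -), (- +), (- -) in population k.
        # Assumes the tests are conditionally independent given true status.
        prev1, prev2, se1, sp1, se2, sp2 = params
        ll = 0.0
        for k, prev in enumerate((prev1, prev2)):
            probs = []
            for t1 in (1, 0):
                for t2 in (1, 0):
                    p_dis = (se1 if t1 else 1 - se1) * (se2 if t2 else 1 - se2)
                    p_non = ((1 - sp1) if t1 else sp1) * ((1 - sp2) if t2 else sp2)
                    probs.append(prev * p_dis + (1 - prev) * p_non)
            ll += np.sum(counts[k] * np.log(probs))
        return ll

    # Hypothetical cross-classified counts for two populations.
    counts = np.array([[35, 8, 6, 151],
                       [12, 9, 7, 272]])
    print(hui_walter_loglik((0.2, 0.05, 0.9, 0.95, 0.85, 0.98), counts))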
Steingroever, Helen; Pachur, Thorsten; Šmíra, Martin; Lee, Michael D
2018-06-01
The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.
Jakob, Sabine S.; Rödder, Dennis; Engler, Jan O.; Shaaf, Salar; Özkan, Hakan; Blattner, Frank R.; Kilian, Benjamin
2014-01-01
Studies of Hordeum vulgare subsp. spontaneum, the wild progenitor of cultivated barley, have mostly relied on materials collected decades ago and maintained since then ex situ in germplasm repositories. We analyzed spatial genetic variation in wild barley populations collected rather recently, exploring sequence variations at seven single-copy nuclear loci, and inferred the relationships among these populations and toward the genepool of the crop. The wild barley collection covers the whole natural distribution area from the Mediterranean to Middle Asia. In contrast to earlier studies, Bayesian assignment analyses revealed three population clusters, in the Levant, Turkey, and east of Turkey, respectively. Genetic diversity was exceptionally high in the Levant, while eastern populations were depleted of private alleles. Species distribution modeling based on climate parameters and extant occurrence points of the taxon inferred suitable habitat conditions during the ice-age, particularly in the Levant and Turkey. Together with the ecologically wide range of habitats, they might contribute to structured but long-term stable populations in this region and their high genetic diversity. For recently collected individuals, Bayesian assignment to geographic clusters was generally unambiguous, but materials from genebanks often showed accessions that were not placed according to their assumed geographic origin or showed traces of introgression from cultivated barley. We assign this to gene flow among accessions during ex situ maintenance. Evolutionary studies based on such materials might therefore result in wrong conclusions regarding the history of the species or the origin and mode of domestication of the crop, depending on the accessions included. PMID:24586028
Is Bayesian Estimation Proper for Estimating the Individual's Ability? Research Report 80-3.
ERIC Educational Resources Information Center
Samejima, Fumiko
The effect of prior information in Bayesian estimation is considered, mainly from the standpoint of objective testing. In the estimation of a parameter belonging to an individual, the prior information is, in most cases, the density function of the population to which the individual belongs. Bayesian estimation was compared with maximum likelihood…
ERIC Educational Resources Information Center
Wu, Haiyan
2013-01-01
General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…
Estimation of post-test probabilities by residents: Bayesian reasoning versus heuristics?
Hall, Stacey; Phang, Sen Han; Schaefer, Jeffrey P; Ghali, William; Wright, Bruce; McLaughlin, Kevin
2014-08-01
Although the process of diagnosing invariably begins with a heuristic, we encourage our learners to support their diagnoses by analytical cognitive processes, such as Bayesian reasoning, in an attempt to mitigate the effects of heuristics on diagnosing. There are, however, limited data on the use and impact of Bayesian reasoning on the accuracy of disease probability estimates. In this study, our objective was to explore whether Internal Medicine residents use a Bayesian process to estimate disease probabilities by comparing their disease probability estimates to literature-derived Bayesian post-test probabilities. We gave 35 Internal Medicine residents four clinical vignettes in the form of a referral letter and asked them to estimate the post-test probability of the target condition in each case. We then compared these to literature-derived probabilities. For each vignette the estimated probability was significantly different from the literature-derived probability. For the two cases with low literature-derived probability our participants significantly overestimated the probability of these target conditions being the correct diagnosis, whereas for the two cases with high literature-derived probability the estimated probability was significantly lower than the calculated value. Our results suggest that residents generate inaccurate post-test probability estimates. Possible explanations for this include ineffective application of Bayesian reasoning, attribute substitution whereby a complex cognitive task is replaced by an easier one (e.g., a heuristic), or systematic rater bias, such as central tendency bias. Further studies are needed to identify the reasons for inaccuracy of disease probability estimates and to explore ways of improving accuracy.
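The literature-derived benchmarks in studies of this kind come from a direct application of Bayes' theorem in odds form. A minimal sketch, with hypothetical vignette numbers:

    def post_test_probability(pretest, sensitivity, specificity, positive=True):
        # Bayes' theorem in odds form: post-test odds = pre-test odds x LR.
        lr = (sensitivity / (1 - specificity) if positive
              else (1 - sensitivity) / specificity)
        pre_odds = pretest / (1 - pretest)
        post_odds = pre_odds * lr
        return post_odds / (1 + post_odds)

    # Hypothetical vignette: pre-test probability 10%, test Se=0.90, Sp=0.85.
    print(f"{post_test_probability(0.10, 0.90, 0.85):.3f}")         # positive test
    print(f"{post_test_probability(0.10, 0.90, 0.85, False):.3f}")  # negative test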
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check whether our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. Finally, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
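The decision step of such an approach can be illustrated independently of any particular model fit. In the sketch below, the two-component mixture for the test statistics is assumed fully known (standard normal null, shifted normal alternative, fixed mixing weight), which is an idealization; hypotheses are then rejected in order of posterior null probability while the running average of those probabilities, a Bayesian FDR estimate, stays below the target level.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    # Simulated z-statistics: 90% null N(0,1), 10% alternative N(3,1) (assumed).
    n, pi0, mu1 = 2000, 0.9, 3.0
    is_null = rng.random(n) < pi0
    z = np.where(is_null, rng.normal(0, 1, n), rng.normal(mu1, 1, n))

    # Posterior probability of the null for each statistic (mixture assumed known).
    f0, f1 = norm.pdf(z, 0, 1), norm.pdf(z, mu1, 1)
    post_null = pi0 * f0 / (pi0 * f0 + (1 - pi0) * f1)

    # Bayesian FDR control: reject the statistics with the smallest posterior
    # null probabilities while their running mean stays below alpha.
    alpha = 0.10
    order = np.argsort(post_null)
    running_mean = np.cumsum(post_null[order]) / np.arange(1, n + 1)
    k = np.sum(running_mean <= alpha)          # number of rejections
    rejected = order[:k]
    print(f"rejections: {k}, realized FDR: {is_null[rejected].mean():.3f}")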
Children Can Solve Bayesian Problems: The Role of Representation in Mental Computation
ERIC Educational Resources Information Center
Zhu, Liqi; Gigerenzer, Gerd
2006-01-01
Can children reason the Bayesian way? We argue that the answer to this question depends on how numbers are represented, because a representation can do part of the computation. We test, for the first time, whether Bayesian reasoning can be elicited in children by means of natural frequencies. We show that when information was presented to fourth,…
ERIC Educational Resources Information Center
Jenkins, Gavin W.; Samuelson, Larissa K.; Smith, Jodi R.; Spencer, John P.
2015-01-01
It is unclear how children learn labels for multiple overlapping categories such as "Labrador," "dog," and "animal." Xu and Tenenbaum (2007a) suggested that learners infer correct meanings with the help of Bayesian inference. They instantiated these claims in a Bayesian model, which they tested with preschoolers and…
With or without you: predictive coding and Bayesian inference in the brain
Aitchison, Laurence; Lengyel, Máté
2018-01-01
Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic / representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly. PMID:28942084
Sa-Ngamuang, Chaitawat; Haddawy, Peter; Luvira, Viravarn; Piyaphanee, Watcharapong; Iamsirithaworn, Sopon; Lawpoolsri, Saranath
2018-06-18
Differentiating dengue patients from other acute febrile illness patients is a great challenge for physicians. Several dengue diagnosis methods are recommended by WHO. The application of specific laboratory tests is still limited due to high cost, lack of equipment, and uncertain validity. Therefore, clinical diagnosis remains a common practice, especially in resource-limited settings. Bayesian networks have been shown to be a useful tool for diagnostic decision support. This study aimed to construct Bayesian network models using basic demographic, clinical, and laboratory profiles of acute febrile illness patients to diagnose dengue. Data from 397 acute undifferentiated febrile illness patients who visited the fever clinic of the Bangkok Hospital for Tropical Diseases, Thailand, were used for model construction and validation. The two best final models were selected: one with and one without the NS1 rapid test result. The diagnostic accuracy of the models was compared with that of physicians on the same set of patients. The Bayesian network models provided good diagnostic accuracy for dengue infection, with ROC AUCs of 0.80 and 0.75 for the models with and without the NS1 rapid test result, respectively. The models had approximately 80% specificity and 70% sensitivity, similar to the diagnostic accuracy of the hospital's fellows in infectious disease. Including information on the NS1 rapid test improved the specificity but reduced the sensitivity, in both model and physician diagnoses. The Bayesian network model developed in this study could be useful to assist physicians in diagnosing dengue, particularly in regions where experienced physicians and laboratory confirmation tests are limited.
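A Bayesian network of this kind reduces, in its simplest naive form, to multiplying conditional probabilities of findings given disease status. The sketch below is not the authors' network; it is a minimal naive-Bayes illustration with entirely hypothetical findings, prior, and conditional probability tables, intended only to show the inference step.

    # Hypothetical conditional probability tables: P(finding present | status).
    P_DENGUE = 0.35                     # assumed prior from clinic case mix
    CPT = {                             # (P(x | dengue), P(x | other febrile))
        "rash":         (0.50, 0.10),
        "leukopenia":   (0.70, 0.25),
        "ns1_positive": (0.75, 0.05),
    }

    def p_dengue(findings):
        # Posterior P(dengue | findings) in a naive Bayes network.
        pd, po = P_DENGUE, 1 - P_DENGUE
        for name, present in findings.items():
            p1, p0 = CPT[name]
            pd *= p1 if present else 1 - p1
            po *= p0 if present else 1 - p0
        return pd / (pd + po)

    print(f"{p_dengue({'rash': True, 'leukopenia': True, 'ns1_positive': False}):.3f}")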
Moore, Jean-Sébastien; Bourret, Vincent; Dionne, Mélanie; Bradbury, Ian; O'Reilly, Patrick; Kent, Matthew; Chaput, Gérald; Bernatchez, Louis
2014-12-01
Anadromous Atlantic salmon (Salmo salar) is a species of major conservation and management concern in North America, where population abundance has been declining over the past 30 years. Effective conservation actions require the delineation of conservation units to appropriately reflect the spatial scale of intraspecific variation and local adaptation. Towards this goal, we used the most comprehensive genetic and genomic database for Atlantic salmon to date, covering the entire North American range of the species. The database included microsatellite data from 9142 individuals from 149 sampling locations and data from a medium-density SNP array providing genotypes for >3000 SNPs for 50 sampling locations. We used neutral and putatively selected loci to integrate adaptive information in the definition of conservation units. Bayesian clustering with the microsatellite data set and with neutral SNPs identified regional groupings largely consistent with previously published regional assessments. The use of outlier SNPs did not result in major differences in the regional groupings, suggesting that neutral markers can reflect the geographic scale of local adaptation despite not being under selection. We also performed assignment tests to compare power obtained from microsatellites, neutral SNPs and outlier SNPs. Using SNP data substantially improved power compared to microsatellites, and an assignment success of 97% to the population of origin and of 100% to the region of origin was achieved when all SNP loci were used. Using outlier SNPs only resulted in minor improvements to assignment success to the population of origin but improved regional assignment. We discuss the implications of these new genetic resources for the conservation and management of Atlantic salmon in North America. © 2014 John Wiley & Sons Ltd.
Sequential Probability Ratio Test for Collision Avoidance Maneuver Decisions
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis
2010-01-01
When facing a conjunction between space objects, decision makers must choose whether to maneuver for collision avoidance or not. We apply a well-known decision procedure, the sequential probability ratio test, to this problem. We propose two approaches to solving the problem, one based on a frequentist method and the other on a Bayesian method. The frequentist method does not require any prior knowledge concerning the conjunction, while the Bayesian method assumes knowledge of prior probability densities. Our results show that both methods achieve desired missed detection rates, but the frequentist method's false alarm performance is inferior to the Bayesian method's.
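A minimal sketch of the sequential probability ratio test itself, using generic Gaussian observations rather than the conjunction-specific likelihoods of the paper; the thresholds follow Wald's approximations from the desired false-alarm and missed-detection rates.

    import numpy as np

    def sprt(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
        # Wald's SPRT for H0: mean = mu0 vs H1: mean = mu1, known sigma.
        # Returns ('H0' or 'H1', samples used), or ('undecided', n).
        a = np.log(beta / (1 - alpha))        # lower (accept H0) threshold
        b = np.log((1 - beta) / alpha)        # upper (accept H1) threshold
        llr = 0.0
        for n, x in enumerate(samples, start=1):
            llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
            if llr <= a:
                return "H0", n
            if llr >= b:
                return "H1", n
        return "undecided", n

    rng = np.random.default_rng(7)
    obs = rng.normal(1.0, 1.0, size=1000)     # data actually drawn under H1
    print(sprt(obs, mu0=0.0, mu1=1.0, sigma=1.0))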
Meinzer, Caitlyn; Martin, Renee; Suarez, Jose I
2017-09-08
In phase II trials, the most efficacious dose is usually not known. Moreover, given limited resources, it is difficult to robustly identify a dose while also testing for a signal of efficacy that would support a phase III trial. Recent designs have sought to be more efficient by exploring multiple doses through the use of adaptive strategies. However, the added flexibility may potentially increase the risk of making incorrect assumptions and reduce the total amount of information available across the dose range as a function of imbalanced sample size. To balance these challenges, a novel placebo-controlled design is presented in which a restricted Bayesian response adaptive randomization (RAR) is used to allocate a majority of subjects to the optimal dose of active drug, defined as the dose with the lowest probability of poor outcome. However, the allocation between subjects who receive active drug or placebo is held constant to retain the maximum possible power for a hypothesis test of overall efficacy comparing the optimal dose to placebo. The design properties and optimization of the design are presented in the context of a phase II trial for subarachnoid hemorrhage. For a fixed total sample size, a trade-off exists between the ability to select the optimal dose and the probability of rejecting the null hypothesis. This relationship is modified by the allocation ratio between active and control subjects, the choice of RAR algorithm, and the number of subjects allocated to an initial fixed allocation period. While a responsive RAR algorithm improves the ability to select the correct dose, there is an increased risk of assigning more subjects to a worse arm as a function of ephemeral trends in the data. A subarachnoid treatment trial is used to illustrate how this design can be customized for specific objectives and available data. Bayesian adaptive designs are a flexible approach to addressing multiple questions surrounding the optimal dose for treatment efficacy within the context of limited resources. While the design is general enough to apply to many situations, future work is needed to address interim analyses and the incorporation of models for dose response.
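One plausible reading of the restricted allocation described above, offered as a hedged sketch rather than the authors' algorithm: the placebo arm keeps a fixed share of each cohort, while the active share is split across doses in proportion to each dose's posterior probability of having the lowest poor-outcome rate, computed here from independent Beta posteriors on invented interim data.

    import numpy as np

    rng = np.random.default_rng(3)

    # Beta(1,1) priors on P(poor outcome) for three active doses (data invented).
    poor, total = np.array([4, 7, 9]), np.array([20, 20, 20])

    # Posterior probability that each dose has the lowest poor-outcome rate.
    draws = rng.beta(1 + poor, 1 + total - poor, size=(100_000, 3))
    p_best = np.bincount(np.argmin(draws, axis=1), minlength=3) / draws.shape[0]

    # Restricted RAR: placebo share held constant (e.g., 1:1 with active
    # overall); the active share is split proportionally to p_best.
    placebo_share = 0.5
    alloc = np.concatenate(([placebo_share], (1 - placebo_share) * p_best))
    print(dict(zip(["placebo", "dose1", "dose2", "dose3"], alloc.round(3))))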
Bayesian networks for satellite payload testing
NASA Astrophysics Data System (ADS)
Przytula, Krzysztof W.; Hagen, Frank; Yung, Kar
1999-11-01
Satellite payloads are fast increasing in complexity, resulting in commensurate growth in the cost of manufacturing and operation. A need exists for a software tool that would assist engineers in the production and operation of satellite systems. We have designed and implemented a software tool that performs part of this task. The tool aids a test engineer in debugging satellite payloads during system testing. At this stage of satellite integration and testing, both the tested payload and the testing equipment are complicated systems consisting of a very large number of components and devices. When an error is detected during execution of a test procedure, the tool presents the engineer with a ranked list of potential sources of the error and a list of recommended further tests. On this basis, the engineer decides whether to perform some of the recommended additional tests or to replace the suspect component. The tool has been installed in a payload testing facility. The tool is based on Bayesian networks, a graphical method of representing uncertainty in terms of probabilistic influences. The Bayesian network was configured using detailed flow diagrams of testing procedures and block diagrams of the payload and testing hardware. The conditional and prior probability values were initially obtained from experts and refined in later stages of design. The Bayesian network provided a very informative model of the payload and testing equipment and inspired many new ideas regarding future test procedures and testing equipment configurations. The tool is the first step in developing a family of tools for various phases of satellite integration and operation.
Moscoso del Prado Martín, Fermín
2013-12-01
I introduce the Bayesian assessment of scaling (BAS), a simple but powerful Bayesian hypothesis contrast methodology that can be used to test hypotheses on the scaling regime exhibited by a sequence of behavioral data. Rather than comparing parametric models, as typically done in previous approaches, the BAS offers a direct, nonparametric way to test whether a time series exhibits fractal scaling. The BAS provides a simpler and faster test than do previous methods, and the code for making the required computations is provided. The method also enables testing of finely specified hypotheses on the scaling indices, something that was not possible with the previously available methods. I then present 4 simulation studies showing that the BAS methodology outperforms the other methods used in the psychological literature. I conclude with a discussion of methodological issues on fractal analyses in experimental psychology. PsycINFO Database Record (c) 2014 APA, all rights reserved.
A Bayesian Method for Evaluating Passing Scores: The PPoP Curve
ERIC Educational Resources Information Center
Wainer, Howard; Wang, X. A.; Skorupski, William P.; Bradlow, Eric T.
2005-01-01
In this note, we demonstrate an interesting use of the posterior distributions (and corresponding posterior samples of proficiency) that are yielded by fitting a fully Bayesian test scoring model to a complex assessment. Specifically, we examine the efficacy of the test in combination with the specific passing score that was chosen through expert…
Bayesian Ideal Types: Integration of Psychometric Data for Visually Impaired Persons.
ERIC Educational Resources Information Center
Jones, W. P.
1991-01-01
A model is proposed for the clinical synthesis of data from psychological tests of persons with visual impairments. The model integrates the concepts of the ideal type and Bayesian probability and compares actual test scores with ideal scores through use of a pattern similarity coefficient. A pilot study with Business Enterprise Program operators…
Fazzi-Gomes, P F; Melo, N; Palheta, G; Guerreiro, S; Amador, M; Ribeiro-Dos-Santos, A K; Santos, S; Hamoy, I
2017-02-08
Genetic variability is one of the important criteria for species conservation decisions. This study aimed to analyze the genetic diversity and population differentiation of two natural populations of Arapaima gigas, a species with a long history of commercial exploitation. We collected 87 samples of A. gigas from Grande Curuai Lake and Paru Lake, located in the Lower Amazon region of Amazônia, Brazil, and genotyped these samples using a multiplex panel of microsatellite markers. Our results showed that the populations of A. gigas analyzed had high levels of genetic variability, similar to those described in previous studies. The two populations showed significant population differentiation, supported by the estimates of FST and RST (0.06), by Bayesian analysis (K = 2), and by population assignment tests, which revealed a moderate genetic distance.
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Tenenbaum, Joshua B.
2011-01-01
Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.
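As a generative glimpse of the BNP machinery the package exposes, the sketch below draws truncated stick-breaking weights from a Dirichlet process prior; the concentration parameter and truncation level are arbitrary illustrative choices, not package defaults.

    import numpy as np

    def stick_breaking(alpha, k_max, rng):
        # Truncated stick-breaking construction of Dirichlet process weights:
        # v_k ~ Beta(1, alpha); w_k = v_k * prod_{j<k} (1 - v_j).
        v = rng.beta(1.0, alpha, size=k_max)
        return v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))

    rng = np.random.default_rng(5)
    w = stick_breaking(alpha=2.0, k_max=20, rng=rng)
    print(w.round(3), w.sum().round(3))   # weights decay; the sum approaches 1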
Bayesian data analysis tools for atomic physics
NASA Astrophysics Data System (ADS)
Trassinelli, Martino
2017-10-01
We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that allows one to assign probabilities to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We also show how to calculate the probability distribution of the main spectral component without having to determine the spectrum modeling uniquely. For these two studies, we implement the program Nested_fit to calculate the different probability distributions and other related quantities. Nested_fit is a Fortran90/Python code developed over recent years for the analysis of atomic spectra. As indicated by the name, it is based on the nested sampling algorithm, which is presented in detail together with the program itself.
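A minimal sketch of the nested sampling iteration underlying such codes, written here in Python with a toy one-dimensional Gaussian likelihood and uniform prior; the plain rejection step used to draw a new live point above the likelihood constraint is workable only for toy problems, not the constrained samplers used in production codes such as Nested_fit.

    import numpy as np

    rng = np.random.default_rng(42)

    def loglike(theta):
        # Toy likelihood: a narrow Gaussian at 0.5 over the unit-interval prior.
        return -0.5 * ((theta - 0.5) / 0.05) ** 2 - np.log(0.05 * np.sqrt(2 * np.pi))

    n_live, n_iter = 100, 600
    live = rng.random(n_live)               # live points from the uniform prior
    live_ll = loglike(live)

    log_z = -np.inf                         # accumulated log-evidence
    for i in range(1, n_iter + 1):
        worst = np.argmin(live_ll)
        # Deterministic prior-volume shrinkage X_i = exp(-i / n_live);
        # the weight is the width of the shell between successive volumes.
        log_w = np.log(np.exp(-(i - 1) / n_live) - np.exp(-i / n_live))
        log_z = np.logaddexp(log_z, log_w + live_ll[worst])
        # Replace the worst point with a prior draw above the likelihood
        # constraint (plain rejection sampling; toy problems only).
        while True:
            cand = rng.random()
            if loglike(cand) > live_ll[worst]:
                live[worst], live_ll[worst] = cand, loglike(cand)
                break

    # Contribution of the remaining live points.
    log_z = np.logaddexp(log_z, np.log(np.exp(-n_iter / n_live) / n_live)
                         + np.logaddexp.reduce(live_ll))
    print(f"log-evidence estimate: {log_z:.3f} (analytic value is ~0 here)")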
A Rapid Item-Search Procedure for Bayesian Adaptive Testing.
1977-05-01
Vale, C. David; Weiss, David J. Research Report 77-n. Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program. Only fragments of the scanned abstract are legible; they note that, whatever the statistical properties of the procedure, rapid item-search procedures might introduce undesirable psychological effects on test scores (e.g., Betz & Weiss, 1976a, 1976b), and they cite an earlier report on adaptive ability testing (Research Rep. 76-4).
Chapinal, Núria; Schumaker, Brant A; Joly, Damien O; Elkin, Brett T; Stephen, Craig
2015-07-01
We estimated the sensitivity and specificity of the caudal-fold skin test (CFT), the fluorescent polarization assay (FPA), and the rapid lateral-flow test (RT) for the detection of Mycobacterium bovis in free-ranging wild wood bison (Bison bison athabascae), in the absence of a gold standard, by using Bayesian analysis, and then used those estimates to forecast the performance of a pairwise combination of tests in parallel. In 1998-99, 212 wood bison from Wood Buffalo National Park (Canada) were tested for M. bovis infection using CFT and two serologic tests (FPA and RT). The sensitivity and specificity of each test were estimated using a three-test, one-population, Bayesian model allowing for conditional dependence between FPA and RT. The sensitivity and specificity of the combination of CFT and each serologic test in parallel were calculated assuming conditional independence. The test performance estimates were influenced by the prior values chosen. However, the rank of tests and combinations of tests based on those estimates remained constant. The CFT was the most sensitive test and the FPA was the least sensitive, whereas RT was the most specific test and CFT was the least specific. In conclusion, given the fact that gold standards for the detection of M. bovis are imperfect and difficult to obtain in the field, Bayesian analysis holds promise as a tool to rank tests and combinations of tests based on their performance. Combining a skin test with an animal-side serologic test, such as RT, increases sensitivity in the detection of M. bovis and is a good approach to enhance disease eradication or control in wild bison.
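The parallel-combination step rests on the standard conditional-independence formulas: an either-test-positive rule raises sensitivity at the cost of specificity. A minimal sketch with hypothetical point estimates:

    def parallel_combination(se1, sp1, se2, sp2):
        # Se/Sp of an either-positive rule, assuming conditional independence.
        se = 1 - (1 - se1) * (1 - se2)   # positive if either test is positive
        sp = sp1 * sp2                   # negative only if both are negative
        return se, sp

    # Hypothetical point estimates for a skin test and a serologic test.
    print(parallel_combination(0.80, 0.90, 0.70, 0.97))  # -> (0.94, 0.873)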
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrews, Stephen A.; Sigeti, David E.
These are a set of slides about Bayesian hypothesis testing in settings where many hypotheses are tested. The conclusions are the following: the value of the Bayes factor obtained when using the median of the posterior marginal is almost the minimum value of the Bayes factor; the value of τ² which minimizes the Bayes factor is a reasonable choice for this parameter; and this allows a likelihood ratio to be computed that is the least favorable to H0.
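The minimum-Bayes-factor idea can be sketched generically. For a test statistic z that is standard normal under H0 and N(0, 1 + τ²) marginally under H1 (a normal prior of variance τ² on the standardized effect), the Bayes factor B01 can be minimized over τ² to obtain the value least favorable to H0; this setup is an illustration, not the specific model in the slides.

    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import norm

    def bf01(tau2, z):
        # BF in favor of H0: z ~ N(0, 1) under H0, z ~ N(0, 1 + tau2) under H1.
        return norm.pdf(z, 0, 1) / norm.pdf(z, 0, np.sqrt(1 + tau2))

    z = 2.5  # hypothetical observed test statistic
    res = minimize_scalar(bf01, bounds=(1e-6, 100.0), args=(z,), method="bounded")
    print(f"minimizing tau^2 = {res.x:.3f}, minimum BF01 = {res.fun:.4f}")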
Navarrete, Gorka; Correia, Rut; Sirota, Miroslav; Juanchich, Marie; Huepe, David
2015-01-01
Most of the research on Bayesian reasoning aims to answer theoretical questions about the extent to which people are able to update their beliefs according to Bayes' Theorem, about the evolutionary nature of Bayesian inference, or about the role of cognitive abilities in Bayesian inference. Few studies aim to answer practical, mainly health-related questions, such as, “What does it mean to have a positive test in a context of cancer screening?” or “What is the best way to communicate a medical test result so a patient will understand it?”. This type of research aims to translate empirical findings into effective ways of providing risk information. In addition, the applied research often adopts the paradigms and methods of the theoretically-motivated research. But sometimes it works the other way around, and the theoretical research borrows the importance of the practical question in the medical context. The study of Bayesian reasoning is relevant to risk communication in that, to be as useful as possible, applied research should employ specifically tailored methods and contexts specific to the recipients of the risk information. In this paper, we concentrate on the communication of the result of medical tests and outline the epidemiological and test parameters that affect the predictive power of a test—whether it is correct or not. Building on this, we draw up recommendations for better practice to convey the results of medical tests that could inform health policy makers (What are the drawbacks of mass screenings?), be used by health practitioners and, in turn, help patients to make better and more informed decisions. PMID:26441711
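The screening arithmetic behind "what does a positive test mean?" is easiest to convey with natural frequencies. The sketch below uses purely hypothetical screening numbers:

    def positive_predictive_value(prevalence, sensitivity, specificity, n=100_000):
        # PPV worked as natural frequencies in an imagined cohort of n people.
        sick = prevalence * n
        true_pos = sensitivity * sick
        false_pos = (1 - specificity) * (n - sick)
        return true_pos / (true_pos + false_pos)

    # Hypothetical mass-screening setting: 0.5% prevalence, Se=0.90, Sp=0.95.
    print(f"PPV = {positive_predictive_value(0.005, 0.90, 0.95):.3f}")
    # Even a fairly accurate test yields mostly false alarms at low prevalence.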
Numerical study on the sequential Bayesian approach for radioactive materials detection
NASA Astrophysics Data System (ADS)
Qingpei, Xiang; Dongfeng, Tian; Jianyu, Zhu; Fanhua, Hao; Ge, Ding; Jun, Zeng
2013-01-01
A new detection method, based on the sequential Bayesian approach proposed by Candy et al., offers new horizons for research on radioactive material detection. Compared with commonly adopted detection methods based on classical statistical theory, the sequential Bayesian approach offers the advantage of shorter verification times when analyzing spectra that contain low total counts, especially for complex radionuclide compositions. In this paper, a simulation experiment platform implementing the methodology of the sequential Bayesian approach was developed. Event sequences of γ-rays associated with the true parameters of a LaBr3(Ce) detector were obtained from an event-sequence generator using Monte Carlo sampling to study the performance of the sequential Bayesian approach. The numerical experimental results are in accordance with those of Candy. Moreover, the relationship between the detection model and the event generator, represented respectively by the expected detection rate (Am) and the tested detection rate (Gm) parameters, is investigated. To achieve optimal performance for this processor, the interval of the tested detection rate as a function of the expected detection rate is also presented.
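The flavor of a sequential Bayesian detector can be conveyed with a much simplified Poisson version (not Candy's event-mode formulation): after each counting interval, the posterior probability that a source is present is updated from the Poisson likelihoods under background-only and background-plus-source rates. All rates and data here are hypothetical.

    import numpy as np
    from scipy.stats import poisson

    rng = np.random.default_rng(11)

    bkg_rate, src_rate = 5.0, 2.0        # hypothetical counts per interval
    p_source = 0.5                       # prior probability a source is present

    counts = rng.poisson(bkg_rate + src_rate, size=20)  # source actually present
    for k in counts:
        l1 = poisson.pmf(k, bkg_rate + src_rate)   # likelihood: source present
        l0 = poisson.pmf(k, bkg_rate)              # likelihood: background only
        p_source = p_source * l1 / (p_source * l1 + (1 - p_source) * l0)
    print(f"posterior P(source present) after 20 intervals: {p_source:.4f}")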
Bayesian Estimation of Combined Accuracy for Tests with Verification Bias
Broemeling, Lyle D.
2011-01-01
This presentation emphasizes the estimation of the combined accuracy of two or more tests when verification bias is present. Verification bias occurs when some of the subjects are not verified by the gold standard. The approach is Bayesian, where the estimation of test accuracy is based on the posterior distribution of the relevant parameter. The accuracy of two combined binary tests is estimated employing either the "believe the positive" or the "believe the negative" rule, and the true and false positive fractions for each rule are then computed for the two tests. In order to perform the analysis, the missing at random assumption is imposed, and an interesting example is provided by estimating the combined accuracy of CT and MRI to diagnose lung cancer. The Bayesian approach is extended to two ordinal tests when verification bias is present, and the accuracy of the combined tests is based on the ROC area of the risk function. An example involving mammography with two readers with extreme verification bias illustrates the estimation of the combined test accuracy for ordinal tests. PMID:26859487
Bayesian model checking: A comparison of tests
NASA Astrophysics Data System (ADS)
Lucy, L. B.
2018-06-01
Two procedures for checking Bayesian models are compared using a simple test problem based on the local Hubble expansion. Over four orders of magnitude, p-values derived from a global goodness-of-fit criterion for posterior probability density functions agree closely with posterior predictive p-values. The former can therefore serve as an effective proxy for the difficult-to-calculate posterior predictive p-values.
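For reference, a posterior predictive p-value can be computed by simulating replicate data sets from posterior draws and comparing a discrepancy statistic against its observed value. The sketch below does this for a deliberately simple normal-mean model with known variance; the model, statistic, and data are illustrative assumptions, not the Hubble-expansion test problem of the paper.

    import numpy as np

    rng = np.random.default_rng(8)
    y = rng.normal(0.0, 1.0, size=50)          # "observed" data (hypothetical)

    # Model: y_i ~ N(mu, 1) with a flat prior, so mu | y ~ N(ybar, 1/n).
    n, ybar = y.size, y.mean()
    T_obs = np.max(np.abs(y))                  # discrepancy statistic (a choice)

    reps = 10_000
    mu_draws = rng.normal(ybar, 1 / np.sqrt(n), size=reps)
    T_rep = np.array([np.max(np.abs(rng.normal(m, 1.0, size=n)))
                      for m in mu_draws])
    print(f"posterior predictive p = {np.mean(T_rep >= T_obs):.3f}")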
Classical and Bayesian Seismic Yield Estimation: The 1998 Indian and Pakistani Tests
NASA Astrophysics Data System (ADS)
Shumway, R. H.
2001-10-01
The nuclear tests in May, 1998, in India and Pakistan have stimulated a renewed interest in yield estimation, based on limited data from uncalibrated test sites. We study here the problem of estimating yields using classical and Bayesian methods developed by Shumway (1992), utilizing calibration data from the Semipalatinsk test site and measured magnitudes for the 1998 Indian and Pakistani tests given by Murphy (1998). Calibration is done using multivariate classical or Bayesian linear regression, depending on the availability of measured magnitude-yield data and prior information. Confidence intervals for the classical approach are derived applying an extension of Fieller's method suggested by Brown (1982). In the case where prior information is available, the posterior predictive magnitude densities are inverted to give posterior intervals for yield. Intervals obtained using the joint distribution of magnitudes are comparable to the single-magnitude estimates produced by Murphy (1998) and reinforce the conclusion that the announced yields of the Indian and Pakistani tests were too high.
Luo, Shu-Jin; Johnson, Warren E; Martenson, Janice; Antunes, Agostinho; Martelli, Paolo; Uphyrkina, Olga; Traylor-Holzer, Kathy; Smith, James L D; O'Brien, Stephen J
2008-04-22
Tigers (Panthera tigris) are disappearing rapidly from the wild, from over 100,000 in the 1900s to as few as 3000. Javan (P.t. sondaica), Bali (P.t. balica), and Caspian (P.t. virgata) subspecies are extinct, whereas the South China tiger (P.t. amoyensis) persists only in zoos. By contrast, captive tigers are flourishing, with 15,000-20,000 individuals worldwide, outnumbering their wild relatives five to seven times. We assessed subspecies genetic ancestry of 105 captive tigers from 14 countries and regions by using Bayesian analysis and diagnostic genetic markers defined by a prior analysis of 134 voucher tigers of significant genetic distinctiveness. We assigned 49 tigers to one of five subspecies (Bengal P.t. tigris, Sumatran P.t. sumatrae, Indochinese P.t. corbetti, Amur P.t. altaica, and Malayan P.t. jacksoni tigers) and determined 52 had admixed subspecies origins. The tested captive tigers retain appreciable genomic diversity unobserved in their wild counterparts, perhaps a consequence of large population size, century-long introduction of new founders, and managed-breeding strategies to retain genetic variability. Assessment of verified subspecies ancestry offers a powerful tool that, if applied to tigers of uncertain background, may considerably increase the number of purebred tigers suitable for conservation management.
NASA Astrophysics Data System (ADS)
Strolger, Louis-Gregory; Porter, Sophia; Lagerstrom, Jill; Weissman, Sarah; Reid, I. Neill; Garcia, Michael
2017-04-01
The Proposal Auto-Categorizer and Manager (PACMan) tool was written to respond to concerns about subjective flaws and potential biases in some aspects of the proposal review process for time allocation for the Hubble Space Telescope (HST), and to partially alleviate some of the anticipated additional workload from the James Webb Space Telescope (JWST) proposal review. PACMan is essentially a mixed-method Naive Bayesian spam filtering routine, with multiple pools representing scientific categories, that utilizes the Robinson method for combining token (or word) probabilities. PACMan was trained to make similar programmatic decisions in science category sorting, panelist selection, and proposal-to-panelist assignments to those made by individuals and committees in the Science Policies Group (SPG) at the Space Telescope Science Institute. Based on training from the previous cycle's proposals, PACMan made the same science category assignments as the SPG for 87% of Cycle 24 proposals on average. Tests for similar science categorizations, based on training using proposals from additional cycles, show that this accuracy can be further improved, to the >95% level. This tool will be used to augment or replace key functions in the Time Allocation Committee review processes in future HST and JWST cycles.
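A toy version of the core sorting step, hedged: a multinomial naive Bayes classifier over abstract tokens, with add-one smoothing standing in for PACMan's Robinson-style combination of token probabilities. The categories and training snippets are invented.

    import math
    from collections import Counter, defaultdict

    # Invented training data: (science category, abstract snippet).
    train = [
        ("stellar", "white dwarf cooling tracks and stellar atmospheres"),
        ("stellar", "spectroscopy of massive stars in young clusters"),
        ("cosmology", "weak lensing constraints on dark energy and structure"),
        ("cosmology", "baryon acoustic oscillations in galaxy surveys"),
    ]

    word_counts = defaultdict(Counter)
    cat_counts = Counter()
    for cat, text in train:
        cat_counts[cat] += 1
        word_counts[cat].update(text.lower().split())

    vocab = {w for c in word_counts.values() for w in c}

    def classify(text):
        # Naive Bayes with add-one (Laplace) smoothing over token counts.
        scores = {}
        for cat in cat_counts:
            total = sum(word_counts[cat].values())
            score = math.log(cat_counts[cat] / sum(cat_counts.values()))
            for w in text.lower().split():
                score += math.log((word_counts[cat][w] + 1) / (total + len(vocab)))
            scores[cat] = score
        return max(scores, key=scores.get)

    print(classify("dark energy constraints from galaxy clustering"))  # cosmology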
None of the above: A Bayesian account of the detection of novel categories.
Navarro, Daniel J; Kemp, Charles
2017-10-01
Every time we encounter a new object, action, or event, there is some chance that we will need to assign it to a novel category. We describe and evaluate a class of probabilistic models that detect when an object belongs to a category that has not previously been encountered. The models incorporate a prior distribution that is influenced by the distribution of previous objects among categories, and we present 2 experiments that demonstrate that people are also sensitive to this distributional information. Two additional experiments confirm that distributional information is combined with similarity when both sources of information are available. We compare our approach to previous models of unsupervised categorization and to several heuristic-based models, and find that a hierarchical Bayesian approach provides the best account of our data. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
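One standard way to give "none of the above" nonzero prior probability, offered as an illustration rather than the authors' exact model, is a Chinese-restaurant-process prior, under which the predictive probability of each existing category grows with its size while a brand-new category keeps probability proportional to a concentration parameter:

    def crp_predictive(category_sizes, alpha=1.0):
        # CRP prior predictive for the next observation. Returns probabilities
        # for each existing category plus the probability of a new category;
        # alpha is the concentration parameter (an assumed value).
        n = sum(category_sizes)
        existing = [size / (n + alpha) for size in category_sizes]
        p_new = alpha / (n + alpha)
        return existing, p_new

    # Ten previous objects spread over three categories.
    print(crp_predictive([6, 3, 1], alpha=1.0))
    # -> ([0.545..., 0.272..., 0.0909...], 0.0909...)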
A Bayesian multi-stage cost-effectiveness design for animal studies in stroke research
Cai, Chunyan; Ning, Jing; Huang, Xuelin
2017-01-01
Much progress has been made in the area of adaptive designs for clinical trials. However, little has been done regarding adaptive designs to identify optimal treatment strategies in animal studies. Motivated by an animal study of a novel strategy for treating strokes, we propose a Bayesian multi-stage cost-effectiveness design to simultaneously identify the optimal dose and determine the therapeutic treatment window for administrating the experimental agent. We consider a non-monotonic pattern for the dose-schedule-efficacy relationship and develop an adaptive shrinkage algorithm to assign more cohorts to admissible strategies. We conduct simulation studies to evaluate the performance of the proposed design by comparing it with two standard designs. These simulation studies show that the proposed design yields a significantly higher probability of selecting the optimal strategy, while it is generally more efficient and practical in terms of resource usage. PMID:27405325
Entanglement-enhanced Neyman-Pearson target detection using quantum illumination
NASA Astrophysics Data System (ADS)
Zhuang, Quntao; Zhang, Zheshen; Shapiro, Jeffrey H.
2017-08-01
Quantum illumination (QI) provides entanglement-based target detection, in an entanglement-breaking environment, whose performance is significantly better than that of optimum classical-illumination target detection. QI's performance advantage was established in a Bayesian setting with the target presumed equally likely to be absent or present and error probability employed as the performance metric. Radar theory, however, eschews that Bayesian approach, preferring the Neyman-Pearson performance criterion to avoid the difficulties of accurately assigning prior probabilities to target absence and presence and appropriate costs to false-alarm and miss errors. We have recently reported an architecture, based on sum-frequency generation (SFG) and feedforward (FF) processing, for minimum error-probability QI target detection with arbitrary prior probabilities for target absence and presence. In this paper, we use our results for FF-SFG reception to determine the receiver operating characteristic (detection probability versus false-alarm probability) for optimum QI target detection under the Neyman-Pearson criterion.
A voxel-based investigation for MRI-only radiotherapy of the brain using ultra short echo times
NASA Astrophysics Data System (ADS)
Edmund, Jens M.; Kjer, Hans M.; Van Leemput, Koen; Hansen, Rasmus H.; Andersen, Jon AL; Andreasen, Daniel
2014-12-01
Radiotherapy (RT) based on magnetic resonance imaging (MRI) as the only modality, so-called MRI-only RT, would remove the systematic registration error between MR and computed tomography (CT) and provide co-registered MRI for assessment of treatment response and adaptive RT. Electron densities, however, need to be assigned to the MRI images for dose calculation and for patient setup based on digitally reconstructed radiographs (DRRs). Here, we investigate the geometric and dosimetric performance of a number of popular voxel-based methods for generating a so-called pseudo CT (pCT). Five patients receiving cranial irradiation, each with a co-registered MRI and CT scan, were included. An ultra-short echo time MRI sequence for bone visualization was used. Six methods were investigated, covering three popular types of voxel-based approaches: (1) threshold-based segmentation, (2) Bayesian segmentation, and (3) statistical regression. Each approach contained two methods. Approach 1 used bulk density assignment of MRI voxels into air, soft tissue, and bone based on logical masks and the transverse relaxation time T2 of the bone. Approach 2 used similar bulk density assignments with Bayesian statistics including or excluding additional spatial information. Approach 3 used a statistical regression correlating MRI voxels with their corresponding CT voxels. A similar photon and proton treatment plan was generated for a target positioned between the nasal cavity and the brainstem for all patients. The agreement of each method's pCT with the CT was quantified and compared with the other methods, both geometrically and dosimetrically, using a number of previously reported metrics as well as some novel ones. The best geometrical agreement with CT was obtained with the statistical regression methods, which performed significantly better than the threshold and Bayesian segmentation methods (excluding spatial information). All methods agreed significantly better with CT than a reference water MRI comparison. The mean dosimetric deviation for photons and protons compared to the CT was about 2% and was highest in the gradient dose region of the brainstem. Both the threshold-based method and the statistical regression methods showed the highest dosimetric agreement. Generation of pCTs using statistical regression seems to be the most promising candidate for MRI-only RT of the brain. Furthermore, the total amount of different tissues needs to be taken into account for dosimetric considerations regardless of their correct geometrical position.
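The threshold-based bulk-assignment idea (approach 1) is the simplest to sketch. The fragment below assigns nominal Hounsfield-unit values to air, soft tissue, and bone from intensity and T2 masks; the thresholds and HU values are placeholders, not the study's calibrated settings.

    import numpy as np

    def bulk_pseudo_ct(ute_intensity, t2_map, air_thresh=50.0, bone_t2=1.5):
        # Bulk-density pCT: classify voxels, then assign nominal HU values.
        # ute_intensity: UTE image (a.u.); t2_map: transverse relaxation (ms).
        # All thresholds are hypothetical placeholders.
        pct = np.full(ute_intensity.shape, 0.0)        # soft tissue ~ 0 HU
        pct[ute_intensity < air_thresh] = -1000.0      # air
        bone = (ute_intensity >= air_thresh) & (t2_map < bone_t2)
        pct[bone] = 700.0                              # nominal compact bone
        return pct

    # Tiny synthetic example.
    rng = np.random.default_rng(0)
    img = rng.uniform(0, 400, (4, 4))
    t2 = rng.uniform(0.5, 50, (4, 4))
    print(bulk_pseudo_ct(img, t2))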
A Primer on Bayesian Analysis for Experimental Psychopathologists
Krypotos, Angelos-Miltiadis; Blanken, Tessa F.; Arnaudova, Inna; Matzke, Dora; Beckers, Tom
2016-01-01
The principal goals of experimental psychopathology (EPP) research are to offer insights into the pathogenic mechanisms of mental disorders and to provide a stable ground for the development of clinical interventions. The main message of the present article is that those goals are better served by the adoption of Bayesian statistics than by the continued use of null-hypothesis significance testing (NHST). In the first part of the article we list the main disadvantages of NHST and explain why those disadvantages limit the conclusions that can be drawn from EPP research. Next, we highlight the advantages of Bayesian statistics. To illustrate, we then pit NHST and Bayesian analysis against each other using an experimental data set from our lab. Finally, we discuss some challenges when adopting Bayesian statistics. We hope that the present article will encourage experimental psychopathologists to embrace Bayesian statistics, which could strengthen the conclusions drawn from EPP research. PMID:28748068
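To make the NHST-versus-Bayes contrast concrete, a minimal sketch pitting a t-test against a BIC-approximated Bayes factor (the BIC approximation of Wagenmakers, 2007; the simulated effect size and sample sizes are hypothetical):

```python
# Pit NHST against a Bayesian test on the same two-condition data set,
# using the BIC approximation BF10 ~= exp((BIC0 - BIC1) / 2).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 40)      # control condition
b = rng.normal(0.4, 1.0, 40)      # experimental condition, assumed effect

t, p = stats.ttest_ind(a, b)

# BIC for H0 (one common mean) vs H1 (two means), normal likelihoods.
x = np.concatenate([a, b]); n = x.size
rss0 = np.sum((x - x.mean())**2)
rss1 = np.sum((a - a.mean())**2) + np.sum((b - b.mean())**2)
bic0 = n * np.log(rss0 / n) + 1 * np.log(n)
bic1 = n * np.log(rss1 / n) + 2 * np.log(n)
bf10 = np.exp((bic0 - bic1) / 2)  # evidence for H1 over H0

print(f"NHST: t = {t:.2f}, p = {p:.4f}")
print(f"Bayes: BF10 ~= {bf10:.2f} (BIC approximation)")
```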
Hefke, Gwynneth; Davison, Sean; D'Amato, Maria Eugenia
2015-12-01
The utilization of binary markers in human individual identification is gaining ground in forensic genetics. We analyzed the polymorphisms of the first commercial indel kit, Investigator DIPplex (Qiagen), in 512 individuals of Afrikaner, Indian, admixed Cape Colored, and native Bantu (Xhosa and Zulu) origin in South Africa, and evaluated forensic and population genetics parameters for their forensic application in South Africa. The levels of genetic diversity and the forensic parameters in the South African populations are similar to other published data, with lower diversity values for the native Bantu. Departures from Hardy-Weinberg expectations were observed at HLD97 in Indians, Admixed and Bantus, along with 6.83% null homozygotes in the Bantu populations. Sequencing of the flanking regions showed a previously reported transition G>A in rs17245568. Strong population structure was detected with Fst, AMOVA, and the Bayesian unsupervised clustering method in STRUCTURE. We therefore evaluated the efficiency of individual assignments to population groups using the ancestral membership proportions from STRUCTURE and the Bayesian classification algorithm in the Snipper App Suite. Both methods showed low cross-assignment error (0-4%) between Bantus and either Afrikaners or Indians. The differentiation between populations seems to be driven by four loci under positive selection pressure. Based on these results, we draw recommendations for the application of this kit in South Africa. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Ha, Taesung
A probabilistic risk assessment (PRA) was conducted for a loss of coolant accident (LOCA) in the McMaster Nuclear Reactor (MNR). A level 1 PRA was completed, including event sequence modeling, system modeling, and quantification. To support the quantification of the accident sequences identified, data analysis using the Bayesian method and human reliability analysis (HRA) using the accident sequence evaluation procedure (ASEP) approach were performed. Since human performance in research reactors differs significantly from that in power reactors, a time-oriented HRA model (reliability physics model) was applied to estimate the human error probability (HEP) for the core relocation. This model is based on two competing random variables: phenomenological time and performance time. The response surface and direct Monte Carlo simulation with Latin hypercube sampling were applied to estimate the phenomenological time, whereas the performance time was obtained from interviews with operators. An appropriate probability distribution for the phenomenological time was selected using statistical goodness-of-fit tests, and the HEP for the core relocation was estimated from the two competing quantities. The sensitivity of each probability distribution in the human reliability estimation was investigated. To quantify the uncertainty in the predicted HEPs, a Bayesian approach was selected for its capability of incorporating uncertainties in the model itself and in its parameters. The HEP from the current time-oriented model was compared with that from the ASEP approach, and both results were used to evaluate the sensitivity of alternative human reliability modeling for the manual core relocation in the LOCA risk model. This exercise demonstrated the applicability of a reliability physics model supplemented with a Bayesian approach for modeling human reliability, and its potential usefulness for quantifying model uncertainty through sensitivity analysis in the PRA model.
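A minimal sketch of the competing-random-variables idea described above (the distributions below are hypothetical placeholders, not the study's fitted response-surface results):

```python
# Reliability-physics HEP: the human error probability is
# P(performance time > phenomenological time), estimated by Monte Carlo.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
t_phenom = rng.weibull(2.0, n) * 60.0            # minutes to core damage (assumed)
t_perform = rng.lognormal(np.log(30.0), 0.5, n)  # operator action time (assumed)

hep = np.mean(t_perform > t_phenom)
print(f"Estimated HEP = {hep:.4f}")
```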
Bayesian analyses of time-interval data for environmental radiation monitoring.
Luo, Peng; Sharp, Julia L; DeVol, Timothy A
2013-01-01
Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and conventional frequentist analyses of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performance of the three methods was compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode; simulated data were obtained using Monte Carlo techniques as random samples from the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a detection probability similar to that of Bayesian analysis of count information, but allowed a decision to be made with fewer pulses at relatively high radiation levels. In addition, for cases in which the source is present only briefly (< count time), time-interval information is more sensitive for detecting a change than count information, because the source counts are averaged with the background counts over the entire count time. The relationships among the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
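A toy version of the pulse-by-pulse updating that time-interval data make possible, using gamma-exponential conjugacy (the prior parameters and true rate are hypothetical):

```python
# With a gamma prior on the count rate, exponential inter-arrival times
# update the posterior after every pulse, instead of waiting for a fixed
# count time to elapse.
import numpy as np

rng = np.random.default_rng(0)
true_rate = 5.0                      # counts per second (assumed source + bg)
alpha, beta = 1.0, 1.0               # gamma prior (shape, rate), assumed vague

intervals = rng.exponential(1.0 / true_rate, size=20)
for k, dt in enumerate(intervals, start=1):
    alpha += 1.0                     # one pulse observed
    beta += dt                       # elapsed time accrued
    if k % 5 == 0:
        print(f"after {k:2d} pulses: posterior mean rate = {alpha / beta:.2f}/s")
```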
Editorial: Bayesian benefits for child psychology and psychiatry researchers.
Oldehinkel, Albertine J
2016-09-01
For many scientists, performing statistical tests has become an almost automated routine. However, p-values are frequently used and interpreted incorrectly; and even when used appropriately, p-values tend to provide answers that do not match researchers' questions and hypotheses well. Bayesian statistics present an elegant and often more suitable alternative. The Bayesian approach has rarely been applied in child psychology and psychiatry research so far, but the development of user-friendly software packages and tutorials has now placed it well within reach. Because Bayesian analyses require a more refined definition of the hypothesized probabilities of possible outcomes than the classical approach, going Bayesian may offer the additional benefit of sparking the development and refinement of theoretical models in our field. © 2016 Association for Child and Adolescent Mental Health.
Cai, C; Rodet, T; Legoupil, S; Mohammad-Djafari, A
2013-11-01
Dual-energy computed tomography (DECT) makes it possible to obtain two fractions of basis materials without segmentation: a soft-tissue-equivalent water fraction and a hard-matter-equivalent bone fraction. Practical DECT measurements are usually obtained with polychromatic x-ray beams. Existing reconstruction approaches based on linear forward models that do not account for the beam polychromaticity fail to estimate the correct decomposition fractions and produce beam-hardening artifacts (BHA). Existing BHA-correction approaches either require calibration measurements or suffer from the noise amplification caused by the negative-log preprocessing and the ill-conditioned water-bone separation problem. To overcome these problems, statistical DECT reconstruction approaches based on nonlinear forward models that account for the beam polychromaticity show great potential for producing accurate fraction images. This work proposes a full-spectral Bayesian reconstruction approach that allows the reconstruction of high-quality fraction images from ordinary polychromatic measurements. The approach is based on a Gaussian noise model with unknown variance assigned directly to the projections without taking the negative log. Following Bayesian inference, the decomposition fractions and the observation variance are estimated jointly by maximum a posteriori (MAP) estimation. With an adaptive prior model assigned to the variance, the joint estimation problem reduces to a single minimization problem with a nonquadratic cost function, which is solved with a monotone conjugate gradient algorithm with suboptimal descent steps. The performance of the proposed approach is analyzed with both simulated and experimental data. The results show that the proposed Bayesian approach is robust to noise and across materials, although accurate spectrum information about the source-detector system is also required; when dealing with experimental data, the spectrum can be predicted by a Monte Carlo simulator. For materials between water and bone, separation errors of less than 5% are observed in the estimated decomposition fractions. In summary, the proposed approach is a statistical reconstruction approach based on a nonlinear forward model accounting for the full beam polychromaticity and applied directly to the projections without taking the negative log. Compared with approaches based on linear forward models and with BHA-correction approaches, it offers advantages in noise robustness and reconstruction accuracy.
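A minimal sketch of why a polychromatic (nonlinear) forward model is needed: the measured projection is the negative log of a spectrally weighted transmission, and is therefore not linear in the material path lengths. The spectrum and attenuation curves below are toy placeholders:

```python
# Polychromatic DECT forward model: -log of a spectrum-weighted sum of
# exponentials, illustrating the beam-hardening nonlinearity.
import numpy as np

energies = np.linspace(20, 120, 50)                 # keV grid (assumed)
spectrum = np.exp(-(energies - 60.0)**2 / 800.0)    # toy source spectrum
spectrum /= spectrum.sum()

mu_water = 0.3 * (60.0 / energies)**1.5             # toy attenuation (1/cm)
mu_bone = 0.6 * (60.0 / energies)**2.0

def projection(a_water, a_bone):
    """Expected -log transmission for given material path lengths (cm)."""
    trans = np.sum(spectrum * np.exp(-a_water * mu_water - a_bone * mu_bone))
    return -np.log(trans)

# Beam hardening in action: doubling thickness less than doubles the signal.
print(projection(5.0, 1.0), 2 * projection(2.5, 0.5))
```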
NASA Astrophysics Data System (ADS)
Alehosseini, Ali; A. Hejazi, Maryam; Mokhtari, Ghassem; B. Gharehpetian, Gevork; Mohammadi, Mohammad
2015-06-01
In this paper, a Bayesian classifier is used to detect and classify the radial deformation and axial displacement of transformer windings. The proposed method is tested on a transformer model for different magnitudes of radial deformation and axial displacement. In this method, an ultra-wideband (UWB) signal is sent to the simplified model of the transformer winding, and the signal received from the winding model is recorded and used to train and test the Bayesian classifier in different axial displacement and radial deformation states of the winding. The proposed method is shown to have good accuracy in detecting and classifying the axial displacement and radial deformation of the winding.
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
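A minimal sketch of the core correction for imperfect tests, here as maximum likelihood with fixed sensitivity and specificity rather than the paper's full Bayesian treatment (all numbers are hypothetical):

```python
# Misclassification-adjusted logistic regression: model the probability of
# a positive TEST as Se * p + (1 - Sp) * (1 - p), with p = logistic(X @ beta)
# the true disease probability.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-1.0, 0.8])
se, sp = 0.85, 0.95                      # assumed test sensitivity/specificity

p_true = expit(X @ beta_true)            # true infection probability
p_obs = se * p_true + (1 - sp) * (1 - p_true)
y = rng.binomial(1, p_obs)               # observed (imperfect) test results

def negloglik(beta):
    p = expit(X @ beta)
    q = se * p + (1 - sp) * (1 - p)      # probability of a positive test
    return -np.sum(y * np.log(q) + (1 - y) * np.log(1 - q))

fit = minimize(negloglik, x0=np.zeros(2))
print("estimated beta:", fit.x, " true beta:", beta_true)
```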
Huang, Jen-Pan; Knowles, L Lacey
2016-07-01
With the recent attention and focus on quantitative methods for species delimitation, an overlooked but equally important issue concerns what has actually been delimited. This study investigates the apparent arbitrariness of some taxonomic distinctions, and in particular how species and subspecies are assigned. Specifically, we use a recently developed Bayesian model-based approach to show that in the Hercules beetles (genus Dynastes) there is no statistical difference in the probability that putative taxa represent different species, irrespective of whether they were given species or subspecies designations. By considering multiple data types, as opposed to relying on genetic data alone, we also show that both previously recognized species and subspecies represent a variety of points along the speciation spectrum (i.e., previously recognized species are not systematically further along the continuum than subspecies). For example, based on evolutionary models of divergence, some taxa are statistically distinguishable on more than one axis of differentiation (e.g., along both phenotypic and genetic dimensions), whereas other taxa can only be delimited statistically from a single data type. Because both phenotypic and genetic data are analyzed in a common Bayesian framework, our study provides a means for investigating whether disagreements in species boundaries among data types reflect (i) actual discordance with the history of lineage splitting, or instead (ii) differences among data types in the amount of time required for differentiation to become apparent among the delimited taxa. We discuss what the answers to these questions imply about what characters are used to delimit species, as well as the diverse processes involved in the origin and maintenance of species boundaries. With this in mind, we then reflect more generally on how quantitative methods for species delimitation are used to assign taxonomic status. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Multiple murder and criminal careers: a latent class analysis of multiple homicide offenders.
Vaughn, Michael G; DeLisi, Matt; Beaver, Kevin M; Howard, Matthew O
2009-01-10
To construct an empirically rigorous typology of multiple homicide offenders (MHOs). The current study conducted latent class analysis of the official records of 160 MHOs sampled from eight states to evaluate their criminal careers. A 3-class solution best fit the data (-2LL=-1123.61, Bayesian Information Criterion (BIC)=2648.15, df=81, L(2)=1179.77). Class 1 (n=64, class assignment probability=.999) was the low-offending group marked by little criminal record and delayed arrest onset. Class 2 (n=51, class assignment probability=.957) was the severe group that represents the most violent and habitual criminals. Class 3 (n=45, class assignment probability=.959) was the moderate group whose offending careers were similar to Class 2. A sustained criminal career with involvement in versatile forms of crime was observed for two of three classes of MHOs. Linkages to extant typologies and recommendations for additional research that incorporates clinical constructs are proffered.
Pannacciulli, Federica G; Maltagliati, Ferruccio; de Guttry, Christian; Achituv, Yair
2017-01-01
The model marine broadcast-spawner barnacle Chthamalus montagui was investigated to understand its genetic structure, to quantify levels of population divergence, and to make inferences on historical demography in terms of time of divergence and changes in population size. We collected specimens from rocky shores of the north-east Atlantic Ocean (4 locations), Mediterranean Sea (8) and Black Sea (1). The 312 sequences (537 bp) of the mitochondrial cytochrome c oxidase I gene allowed the detection of 130 haplotypes. High within-location genetic variability was recorded, with haplotype diversity ranging between h = 0.750 and 0.967. Parameters of genetic divergence, a haplotype network and Bayesian assignment analysis were consistent in rejecting the hypothesis of panmixia. C. montagui is genetically structured in three geographically discrete populations, corresponding to the north-eastern Atlantic Ocean, the western-central Mediterranean Sea, and the Aegean Sea-Black Sea. These populations are separated by two main effective barriers to gene flow, located at the Almeria-Oran Front and at the Cyclades Islands. According to the 'isolation with migration' model, adjacent population pairs diverged during the early to middle Pleistocene transition, a period in which geological events provoked significant changes in the structure and composition of palaeocommunities. Mismatch distributions, neutrality tests and Bayesian skyline plots showed past population expansions, which started approximately in the Mindel-Riss interglacial, when ecological conditions were favourable for temperate species and calcium-uptaking marine organisms.
Integration of individual and social information for decision-making in groups of different sizes.
Park, Seongmin A; Goïame, Sidney; O'Connor, David A; Dreher, Jean-Claude
2017-06-01
When making judgments in a group, individuals often revise their initial beliefs about the best judgment to make given what others believe. Despite the ubiquity of this phenomenon, we know little about how the brain updates beliefs when integrating personal judgments (individual information) with those of others (social information). Here, we investigated the neurocomputational mechanisms of how we adapt our judgments to those made by groups of different sizes, in the context of jury decisions about a criminal case. By testing different theoretical models, we showed that a social Bayesian inference model captured changes in judgments better than 2 other models. Our results showed that participants updated their beliefs by appropriately weighting individual and social sources of information according to their respective credibility. When investigating 2 fundamental computations of Bayesian inference, belief updates and credibility estimates of social information, we found that the dorsal anterior cingulate cortex (dACC) computed the level of belief updates, while the bilateral frontopolar cortex (FPC) was more engaged in individuals who assigned greater credibility to the judgments of a larger group. Moreover, increased functional connectivity between these 2 brain regions reflected a greater influence of group size on the relative credibility of social information. These results provide a mechanistic understanding of the computational roles of the FPC-dACC network in steering judgment adaptation to a group's opinion. Taken together, these findings provide a computational account of how the human brain integrates individual and social information for decision-making in groups.
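A schematic of the precision-weighting logic consistent with the social Bayesian inference model described above (the per-judge precision and all other numbers are hypothetical):

```python
# Revised judgment as a precision-weighted average of one's own estimate
# and the group's, with the group's precision growing with its size.
def revise(own, own_prec, group, group_size, per_judge_prec=1.0):
    group_prec = group_size * per_judge_prec   # larger groups -> more credible
    w = group_prec / (group_prec + own_prec)
    return (1 - w) * own + w * group

for size in (1, 4, 12):
    print(size, revise(own=7.0, own_prec=2.0, group=3.0, group_size=size))
```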
Hosseini, Marjan; Kerachian, Reza
2017-09-01
This paper presents a new methodology for analyzing the spatiotemporal variability of water table levels and redesigning a groundwater level monitoring network (GLMN) using the Bayesian Maximum Entropy (BME) technique and a multi-criteria decision-making approach based on ordered weighted averaging (OWA). The spatial sampling is determined using a hexagonal gridding pattern, and a new method is proposed to assign a removal priority number to each pre-existing station. To design the temporal sampling, a new approach is applied that accounts for the uncertainty caused by lack of information. In this approach, different time lag values are tested against another source of information, namely the simulation results of a numerical groundwater flow model. Furthermore, to incorporate the uncertainties in the available monitoring data, the flexibility of the BME interpolation technique is exploited by applying soft data, improving the accuracy of the calculations. To examine the methodology, it is applied to the Dehgolan plain in northwestern Iran. Based on the results, a configuration of 33 monitoring stations on a regular hexagonal grid of side length 3600 m is proposed, with a time lag of 5 weeks between samples. Since the variance estimation errors of the BME method are almost identical for the redesigned and existing networks, the redesigned monitoring network is more cost-effective and efficient than the existing network of 52 stations with monthly sampling frequency.
Value assignment and uncertainty evaluation for single-element reference solutions
NASA Astrophysics Data System (ADS)
Possolo, Antonio; Bodnar, Olha; Butler, Therese A.; Molloy, John L.; Winchester, Michael R.
2018-06-01
A Bayesian statistical procedure is proposed for value assignment and uncertainty evaluation for the mass fraction of the elemental analytes in single-element solutions distributed as NIST standard reference materials. The principal novelty that we describe is the use of information about relative differences observed historically between the measured values obtained via gravimetry and via high-performance inductively coupled plasma optical emission spectrometry, to quantify the uncertainty component attributable to between-method differences. This information is encapsulated in a prior probability distribution for the between-method uncertainty component, and it is then used, together with the information provided by current measurement data, to produce a probability distribution for the value of the measurand from which an estimate and evaluation of uncertainty are extracted using established statistical procedures.
Gustafsson, Mats G; Wallman, Mikael; Wickenberg Bolin, Ulrika; Göransson, Hanna; Fryknäs, M; Andersson, Claes R; Isaksson, Anders
2010-06-01
Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Recently, many promising approaches for determining an upper bound on the error rate of a single classifier have been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI, however, becomes unacceptably large in real-world applications where test set sizes are below a few hundred. The source of this problem is the fact that the CI is determined exclusively by the results on the test examples: the uniform prior density distribution employed reflects a complete lack of prior knowledge about the unknown error rate and therefore contributes no information. The aim of the study reported here was therefore to investigate a maximum entropy (ME) based approach to improved prior knowledge and tighter Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice. We show how a refined non-uniform prior density distribution can be obtained by means of the ME principle, using empirical results from a few designs and tests on non-overlapping sets of examples. Experimental results show that ME-based priors improve the CIs when applied to four quite different simulated data sets and two real-world data sets. An empirically derived ME prior thus seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier. Copyright 2010 Elsevier B.V. All rights reserved.
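A minimal sketch contrasting a uniform prior with an informative prior for the error-rate CI (the informative prior here is hand-picked for illustration, not derived via the ME principle):

```python
# Bayesian credibility interval for a classifier error rate from k errors
# in n holdout cases, under Beta(1,1) vs an informative Beta prior.
from scipy.stats import beta

k, n = 8, 40                       # errors observed on a small test set

for name, (a0, b0) in {"uniform": (1, 1), "informative": (4, 16)}.items():
    post = beta(a0 + k, b0 + n - k)
    lo, hi = post.ppf(0.025), post.ppf(0.975)
    print(f"{name:12s} 95% CI for error rate: [{lo:.3f}, {hi:.3f}]")
```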
A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.
Yu, Qingzhao; Zhu, Lin; Zhu, Han
2017-11-01
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates that allocates newly recruited patients to treatment arms more efficiently. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate chosen to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. A sensitivity analysis is implemented to check the influence of the prior distributions on the design. Simulation studies compare the proposed method with traditional methods in terms of power and actual sample size, and show that, when the total sample size is fixed, the proposed design can achieve greater power and/or a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with those of the Bayesian sequential design without adaptive randomization; the proposed method further reduces the required sample size. Copyright © 2017 John Wiley & Sons, Ltd.
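For binary outcomes, the variance-minimizing target allocation is the Neyman allocation; a minimal sketch with hypothetical interim response-rate estimates:

```python
# Neyman allocation: assign patients in proportion to each arm's outcome
# standard deviation, minimizing the variance of the estimated difference.
import numpy as np

p1, p2 = 0.55, 0.40                      # interim response-rate estimates
s1, s2 = np.sqrt(p1 * (1 - p1)), np.sqrt(p2 * (1 - p2))
r = s1 / (s1 + s2)                       # fraction of patients to arm 1
print(f"allocate {r:.2%} of new patients to arm 1")
```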
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics
Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.
2015-01-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564
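A minimal sketch of a CAVIARBF-style Bayes factor computed from marginal z-scores and an LD matrix, under the standard approximation that the z-scores are multivariate normal. The LD matrix, z-scores, prior effect variance, and the simplified covariance structure below are all assumptions for illustration:

```python
# Bayes factor for a causal configuration C vs the null: under the null
# z ~ N(0, R); under C, integrating out normal effects gives approximately
# z ~ N(0, R + sigma2 * R[:, C] @ R[C, :]).
import numpy as np
from scipy.stats import multivariate_normal

m = 5
R = 0.5 * np.ones((m, m)) + 0.5 * np.eye(m)   # toy LD (correlation) matrix
z = np.array([1.2, 4.1, 3.8, 0.9, 0.3])       # hypothetical marginal z-scores
sigma2 = 5.0                                  # assumed prior effect variance

def log_bf(causal):
    """log Bayes factor for a causal configuration vs the null model."""
    Rc = R[:, causal]
    cov1 = R + sigma2 * Rc @ Rc.T
    return (multivariate_normal.logpdf(z, cov=cov1)
            - multivariate_normal.logpdf(z, cov=R))

for c in ([1], [2], [1, 2]):
    print(c, round(log_bf(c), 2))
```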
Application of Poisson random effect models for highway network screening.
Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer
2014-02-01
In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data have become popular in traffic safety research. This study employs random effect Poisson log-normal models for crash risk hotspot identification, considering both the temporal and spatial correlations of crash data. The Potential for Safety Improvement (PSI) was adopted as the measure of crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared with the traditional empirical Bayes (EB) method and the conventional Bayesian Poisson log-normal model. A series of evaluation tests was conducted to assess the performance of the different approaches, including the previously developed site consistency test, method consistency test, total rank difference test, and modified total score test, as well as a newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only temporal random effects, and both are superior to the conventional Poisson log-normal model (PLN) and the EB model in fitting the crash data. Additionally, the evaluation tests indicate that the PTSRE model is significantly superior to the PLN and EB models in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification. Copyright © 2013 Elsevier Ltd. All rights reserved.
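A minimal sketch of the EB shrinkage estimate and the PSI measure used for hotspot screening (the SPF prediction, overdispersion parameter, and crash count are hypothetical):

```python
# Empirical Bayes safety estimate and Potential for Safety Improvement:
# PSI is the EB-adjusted expected crash count minus the count predicted by
# the safety performance function (SPF).
mu = 4.0        # SPF-predicted crashes/year for sites like this one
phi = 2.0       # assumed negative-binomial overdispersion parameter
y = 9           # observed crashes at the site

w = 1.0 / (1.0 + mu / phi)      # EB shrinkage weight
eb = w * mu + (1 - w) * y       # EB expected crash frequency
psi = eb - mu                   # excess crashes -> hotspot if clearly > 0
print(f"EB = {eb:.2f}, PSI = {psi:.2f}")
```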
deWaard, Jeremy R; Mitchell, Andrew; Keena, Melody A; Gopurenko, David; Boykin, Laura M; Armstrong, Karen F; Pogue, Michael G; Lima, Joao; Floyd, Robin; Hanner, Robert H; Humble, Leland M
2010-12-09
Detecting and controlling the movements of invasive species, such as insect pests, relies upon rapid and accurate species identification in order to initiate containment procedures by the appropriate authorities. Many species in the tussock moth genus Lymantria are significant forestry pests, including the gypsy moth Lymantria dispar L., and consequently have been a focus for the development of molecular diagnostic tools to assist in identifying species and source populations. In this study we expand the taxonomic and geographic coverage of the DNA barcode reference library, and further test the utility of this diagnostic method, both for species/subspecies assignment and for determination of geographic provenance of populations. Cytochrome oxidase I (COI) barcodes were obtained from 518 individuals and 36 species of Lymantria, including sequences assembled and generated from previous studies, vouchered material in public collections, and intercepted specimens obtained from surveillance programs in Canada. A maximum likelihood tree was constructed, revealing high bootstrap support for 90% of species clusters. Bayesian species assignment was also tested, and resulted in correct assignment to species and subspecies in all instances. The performance of barcoding was also compared against the commonly employed NB restriction digest system (also based on COI); while the latter is informative for discriminating gypsy moth subspecies, COI barcode sequences provide greater resolution and generality by encompassing a greater number of haplotypes across all Lymantria species, none shared between species. This study demonstrates the efficacy of DNA barcodes for diagnosing species of Lymantria and reinforces the view that the approach is an under-utilized resource with substantial potential for biosecurity and surveillance. Biomonitoring agencies currently employing the NB restriction digest system would gather more information by transitioning to the use of DNA barcoding, a change which could be made relatively seamlessly as the same gene region underlies both protocols.
Development of a genetic tool for product regulation in the diverse British pig breed market.
Wilkinson, Samantha; Archibald, Alan L; Haley, Chris S; Megens, Hendrik-Jan; Crooijmans, Richard P M A; Groenen, Martien A M; Wiener, Pamela; Ogden, Rob
2012-11-15
The application of DNA markers for the identification of biological samples from both human and non-human species is widespread and includes use in food authentication. In the food industry, the financial incentive to substitute the true name of a food product with a higher-value alternative is driving food fraud. This applies to British pork products, where products derived from traditional pig breeds command a premium. The objective of this study was to develop a genetic assay for regulatory authentication of traditional pig breed-labelled products in the porcine food industry in the United Kingdom. The dataset comprised a comprehensive coverage of breed types present in Britain: 460 individuals from 7 traditional breeds, 5 commercial purebreds, 1 imported European breed and 1 imported Asian breed were genotyped using the PorcineSNP60 beadchip. Following breed-informative SNP selection, assignment power was calculated for increasing SNP panel sizes. A 96-plex assay created using the most informative SNPs revealed remarkably high genetic differentiation between the British pig breeds, with an average FST of 0.54; Bayesian clustering analysis also indicated that they are distinct, homogeneous populations. The posterior probability that an individual of presumed origin actually originated from that breed, rather than an alternative breed, was > 99.5% in 174 out of 182 contrasts at a test value of log(LR) > 0. Validation of the 96-plex assay using independent test samples of known origin was successful, and a subsequent survey of market samples revealed a high level of breed label conformity. The newly created 96-plex assay using selected markers from the PorcineSNP60 beadchip enables powerful assignment of samples to traditional breed origin and can effectively identify mislabelling, providing a highly effective tool for DNA analysis in food forensics.
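A minimal sketch of likelihood-ratio breed assignment from SNP allele frequencies under Hardy-Weinberg equilibrium (the breeds, frequencies, and genotype below are hypothetical placeholders, not the study's panel):

```python
# Compute log P(genotype | breed) from breed allele frequencies, then the
# log likelihood ratio between the best breed and the best alternative.
import numpy as np

freqs = {                          # allele-1 frequencies at 4 SNPs per breed
    "Tamworth": np.array([0.9, 0.1, 0.8, 0.7]),
    "Berkshire": np.array([0.2, 0.8, 0.3, 0.4]),
}
genotype = np.array([2, 0, 2, 1])  # counts of allele 1 (0, 1 or 2)

def loglik(g, p):
    # Hardy-Weinberg genotype probabilities: p^2, 2p(1-p), (1-p)^2
    probs = np.where(g == 2, p**2, np.where(g == 1, 2*p*(1-p), (1-p)**2))
    return np.log(probs).sum()

lls = {b: loglik(genotype, p) for b, p in freqs.items()}
best = max(lls, key=lls.get)
alt = min(lls, key=lls.get)
print(f"assigned to {best}, log(LR) vs {alt} = {lls[best] - lls[alt]:.2f}")
```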
Win-Stay, Lose-Sample: a simple sequential algorithm for approximating Bayesian inference.
Bonawitz, Elizabeth; Denison, Stephanie; Gopnik, Alison; Griffiths, Thomas L
2014-11-01
People can behave in a way that is consistent with Bayesian models of cognition, despite the fact that performing exact Bayesian inference is computationally challenging. What algorithms could people be using to make this possible? We show that a simple sequential algorithm "Win-Stay, Lose-Sample", inspired by the Win-Stay, Lose-Shift (WSLS) principle, can be used to approximate Bayesian inference. We investigate the behavior of adults and preschoolers on two causal learning tasks to test whether people might use a similar algorithm. These studies use a "mini-microgenetic method", investigating how people sequentially update their beliefs as they encounter new evidence. Experiment 1 investigates a deterministic causal learning scenario and Experiments 2 and 3 examine how people make inferences in a stochastic scenario. The behavior of adults and preschoolers in these experiments is consistent with our Bayesian version of the WSLS principle. This algorithm provides both a practical method for performing Bayesian inference and a new way to understand people's judgments. Copyright © 2014 Elsevier Inc. All rights reserved.
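A toy implementation of the Win-Stay, Lose-Sample idea (the two-hypothesis coin-flip setup is illustrative only, not the paper's causal learning tasks):

```python
# Win-Stay, Lose-Sample: keep the current hypothesis with probability equal
# to the likelihood of the new observation under it; otherwise resample a
# hypothesis from the current posterior.
import numpy as np

rng = np.random.default_rng(5)
lik = {"A": 0.8, "B": 0.2}          # P(heads | hypothesis)
data = rng.random(30) < 0.8         # observations generated under "A"

log_post = {"A": 0.0, "B": 0.0}     # uniform prior (log scale)
current = "B"
for heads in data:
    for h in lik:                   # exact posterior update, for resampling
        log_post[h] += np.log(lik[h] if heads else 1 - lik[h])
    p_obs = lik[current] if heads else 1 - lik[current]
    if rng.random() > p_obs:        # "lose": resample from the posterior
        w = np.exp(np.array([log_post["A"], log_post["B"]]))
        current = rng.choice(["A", "B"], p=w / w.sum())
print("final hypothesis:", current)
```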
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
NASA Astrophysics Data System (ADS)
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends on a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are treated as random variables. We develop both the Bayesian inference and the updating rules, which are used to build a probabilistic forecasting model. Applying the framework to the Basel 2006 fluid-induced seismicity case study, we show that the hierarchical Bayesian model offers a suitable framework for coherently encoding both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
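A minimal sketch of the model class described above: a nonhomogeneous Poisson process whose rate tracks the injection rate, simulated by thinning (the injection profile and rate constants are hypothetical):

```python
# Simulate a nonhomogeneous Poisson process with rate
# lambda(t) = mu + k * injection_rate(t) via Ogata-style thinning.
import numpy as np

rng = np.random.default_rng(11)
mu = 0.05                                  # background events/day (assumed)
k = 0.002                                  # events per m^3 injected (assumed)

def injection_rate(t):                     # m^3/day: ramp up, then shut-in
    return 500.0 * t / 5.0 if t < 5.0 else 0.0

rate = lambda t: mu + k * injection_rate(t)
rate_max = mu + k * 500.0                  # upper bound on the rate

events, t = [], 0.0
while t < 10.0:
    t += rng.exponential(1.0 / rate_max)   # candidate event time
    if t < 10.0 and rng.random() < rate(t) / rate_max:
        events.append(t)                   # accept with probability rate/max
print(f"{len(events)} events; last at day {events[-1]:.2f}" if events else "none")
```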
Kurushima, J. D.; Lipinski, M. J.; Gandolfi, B.; Froenicke, L.; Grahn, J. C.; Grahn, R. A.; Lyons, L. A.
2012-01-01
Both cat breeders and the lay public have an interest in the origins of their pets: not only the genetic identity of purebred individuals, but also the historical origins of common household cats. The cat fancy is a relatively new institution, with over 85% of its 40–50 breeds arising only in the past 75 years, primarily through selection on single-gene aesthetic traits. The short yet intense cat breed history poses a significant challenge to the development of a genetic marker-based breed identification strategy. Using different breed assignment strategies and methods, 477 cats representing 29 fancy breeds were analysed with 38 short tandem repeats and 148 intergenic and five phenotypic single nucleotide polymorphisms. Results suggest the frequentist method of Paetkau (accuracy: single nucleotide polymorphisms = 0.78, short tandem repeats = 0.88) surpasses the Bayesian method of Rannala and Mountain (single nucleotide polymorphisms = 0.56, short tandem repeats = 0.83) for accurate assignment of individuals to the correct breed. Additionally, a post-assignment verification step with the five phenotypic single nucleotide polymorphisms accurately identified between 0.31 and 0.58 of the mis-assigned individuals, raising the sensitivity of assignment with the frequentist method to 0.89 and 0.92 for single nucleotide polymorphisms and short tandem repeats, respectively. This study provides a novel multi-step assignment strategy and suggests that, despite their short breed history and breed family groupings, a majority of cats can be assigned to their proper breed or population of origin, i.e. race. PMID:23171373
Bayesian selective response-adaptive design using the historical control.
Kim, Mi-Ok; Harun, Nusrat; Liu, Chunyan; Khoury, Jane C; Broderick, Joseph P
2018-06-13
High-quality historical control data, if incorporated, may reduce sample size, trial cost, and duration. A too-optimistic use of such data, however, may introduce bias under prior-data conflict. Motivated by well-publicized two-arm comparative trials in stroke, we propose a Bayesian design that both adaptively incorporates historical control data and selectively adapts the treatment allocation ratios within an ongoing trial in response to the relative treatment effects. The proposed design differs from existing designs that borrow from historical controls: rather than blindly reducing the number of subjects assigned to the control arm, it does so adaptively, only if evaluation of the accumulated current trial data combined with the historical controls suggests the superiority of the intervention arm. We used the effective historical sample size approach to quantify the information borrowed on the control arm, and modified the treatment allocation rules of the doubly adaptive biased coin design to incorporate this quantity. The modified allocation rules were then implemented in a Bayesian framework with commensurate priors to address prior-data conflict. Trials under the proposed design were also more frequently concluded earlier in line with the underlying truth, reducing trial cost and duration, and yielded parameter estimates with smaller standard errors. © 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons, Ltd.
Ting, Chih-Chung; Yu, Chia-Chen; Maloney, Laurence T.
2015-01-01
In Bayesian decision theory, knowledge about the probabilities of possible outcomes is captured by a prior distribution and a likelihood function. The prior reflects past knowledge and the likelihood summarizes current sensory information. The two combined (integrated) form a posterior distribution that allows estimation of the probability of different possible outcomes. In this study, we investigated the neural mechanisms underlying Bayesian integration using a novel lottery decision task in which both prior knowledge and likelihood information about reward probability were systematically manipulated on a trial-by-trial basis. Consistent with Bayesian integration, as sample size increased, subjects tended to weigh likelihood information more compared with prior information. Using fMRI in humans, we found that the medial prefrontal cortex (mPFC) correlated with the mean of the posterior distribution, a statistic that reflects the integration of prior knowledge and likelihood of reward probability. Subsequent analysis revealed that both prior and likelihood information were represented in mPFC and that the neural representations of prior and likelihood in mPFC reflected changes in the behaviorally estimated weights assigned to these different sources of information in response to changes in the environment. Together, these results establish the role of mPFC in prior-likelihood integration and highlight its involvement in representing and integrating these distinct sources of information. PMID:25632152
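The behavioral signature described above follows directly from conjugate updating: as sample size grows, the posterior mean moves from the prior toward the likelihood. A minimal sketch with hypothetical numbers:

```python
# Beta prior over reward probability combined with a binomial sample: the
# posterior mean shifts from the prior mean toward the observed rate as the
# sample size (likelihood information) increases.
a0, b0 = 6, 4                  # prior: reward probability around 0.6
sample_rate = 0.2              # observed fraction of "reward" outcomes

for n in (2, 10, 50):
    k = sample_rate * n
    post_mean = (a0 + k) / (a0 + b0 + n)
    print(f"n = {n:2d}: posterior mean = {post_mean:.3f}")
```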
Vilar, M J; Ranta, J; Virtanen, S; Korkeala, H
2015-01-01
Bayesian analysis was used to estimate the true prevalence of enteropathogenic Yersinia at the pig and herd levels from serum samples collected on Finnish pig farms. The sensitivity and specificity of the commercially available ELISA used for antibody detection against enteropathogenic Yersinia were also estimated. The Bayesian analysis was performed in two steps: the first step estimated the prior true prevalence of enteropathogenic Yersinia using data obtained from a systematic review of the literature; in the second step, the apparent prevalence (cross-sectional study data), the prior true prevalence (from the first step), and the estimated sensitivity and specificity of the diagnostic methods were used to build the Bayesian model. The true prevalence of Yersinia in slaughter-age pigs was 67.5% (95% PI 63.2-70.9), and in sows 74.0% (95% PI 57.3-82.4). The estimated sensitivity and specificity of the ELISA were 79.5% and 96.9%, respectively.
A Hierarchical Bayesian Model for Calibrating Estimates of Species Divergence Times
Heath, Tracy A.
2012-01-01
In Bayesian divergence time estimation methods, incorporating calibrating information from the fossil record is commonly done by assigning prior densities to ancestral nodes in the tree. Calibration prior densities are typically parametric distributions offset by minimum age estimates provided by the fossil record. Specification of the parameters of calibration densities requires the user to quantify his or her prior knowledge of the age of the ancestral node relative to the age of its calibrating fossil. The values of these parameters can, potentially, result in biased estimates of node ages if they lead to overly informative prior distributions. Accordingly, determining parameter values that lead to adequate prior densities is not straightforward. In this study, I present a hierarchical Bayesian model for calibrating divergence time analyses with multiple fossil age constraints. This approach applies a Dirichlet process prior as a hyperprior on the parameters of calibration prior densities. Specifically, this model assumes that the rate parameters of exponential prior distributions on calibrated nodes are distributed according to a Dirichlet process, whereby the rate parameters are clustered into distinct parameter categories. Both simulated and biological data are analyzed to evaluate the performance of the Dirichlet process hyperprior. Compared with fixed exponential prior densities, the hierarchical Bayesian approach results in more accurate and precise estimates of internal node ages. When this hyperprior is applied using Markov chain Monte Carlo methods, the ages of calibrated nodes are sampled from mixtures of exponential distributions and uncertainty in the values of calibration density parameters is taken into account. PMID:22334343
Constructing a Bayesian network model for improving safety behavior of employees at workplaces.
Mohammadfam, Iraj; Ghasemi, Fakhradin; Kalatpour, Omid; Moghimbeigi, Abbas
2017-01-01
Unsafe behavior increases the risk of accidents at workplaces and needs to be managed properly. The aim of the present study was to provide a model for managing and improving the safety behavior of employees using the Bayesian network approach. The study was conducted in several power plant construction projects in Iran. The data were collected using a questionnaire covering nine factors: management commitment, supporting environment, safety management system, employees' participation, safety knowledge, safety attitude, motivation, resource allocation, and work pressure. To measure the score a respondent assigned to each factor, a measurement model was constructed for each of them. The Bayesian network was constructed using experts' opinions and Dempster-Shafer theory. Using belief updating, the best intervention strategies for improving safety behavior were also selected. The results of the present study demonstrated that the majority of employees do not tend to consider safety rules, regulations, procedures and norms in their behavior at the workplace. Safety attitude, safety knowledge, and a supporting environment were the best predictors of safety behavior. Moreover, it was determined that immediate improvement of the supporting environment and employee participation is the best strategy for reaching a high proportion of safety behavior at the workplace. The lack of a comprehensive model that can be used to explain safety behavior was one of the most problematic issues of the study. Furthermore, it can be concluded that belief updating is a unique feature of Bayesian networks that is very useful in comparing various intervention strategies and selecting the best one from them. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ducrot, Virginie; Billoir, Elise; Péry, Alexandre R R; Garric, Jeanne; Charles, Sandrine
2010-05-01
Effects of zinc were studied in the freshwater worm Branchiura sowerbyi using partial and full life-cycle tests. Only newborn and juveniles were sensitive to zinc, displaying effects on survival, growth, and age at first brood at environmentally relevant concentrations. Threshold effect models were proposed to assess toxic effects on individuals. They were fitted to life-cycle test data using Bayesian inference and adequately described life-history trait data in exposed organisms. The daily asymptotic growth rate of theoretical populations was then simulated with a matrix population model, based upon individual-level outputs. Population-level outputs were in accordance with existing literature for controls. Working in a Bayesian framework allowed incorporating parameter uncertainty in the simulation of the population-level response to zinc exposure, thus increasing the relevance of test results in the context of ecological risk assessment.
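A minimal sketch of the population-level step: the asymptotic growth rate of a stage-structured projection matrix is its dominant eigenvalue (the vital rates below are hypothetical, not the fitted B. sowerbyi values):

```python
# Daily asymptotic growth rate of a Leslie-type matrix population model as
# the dominant eigenvalue of the projection matrix.
import numpy as np

# stages: newborn, juvenile, adult; the matrix projects one day ahead
A = np.array([[0.00, 0.00, 0.30],    # fecundity of adults
              [0.90, 0.85, 0.00],    # survival/maturation (toxicant-sensitive)
              [0.00, 0.10, 0.98]])   # maturation and adult survival

lam = max(abs(np.linalg.eigvals(A)))
print(f"asymptotic daily growth rate = {lam:.4f}")
```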
Assessing noninferiority in a three-arm trial using the Bayesian approach.
Ghosh, Pulak; Nathoo, Farouk; Gönen, Mithat; Tiwari, Ram C
2011-07-10
Non-inferiority trials, which aim to demonstrate that a test product is not worse than a competitor by more than a pre-specified small amount, are of great importance to the pharmaceutical community. As a result, methodology for designing and analyzing such trials is required, and developing new methods for such analysis is an important area of statistical research. The three-arm trial consists of a placebo, a reference and an experimental treatment, and simultaneously tests the superiority of the reference over the placebo along with comparing this reference to an experimental treatment. In this paper, we consider the analysis of non-inferiority trials using Bayesian methods which incorporate both parametric as well as semi-parametric models. The resulting testing approach is both flexible and robust. The benefit of the proposed Bayesian methods is assessed via simulation, based on a study examining home-based blood pressure interventions. Copyright © 2011 John Wiley & Sons, Ltd.
Bayesian median regression for temporal gene expression data
NASA Astrophysics Data System (ADS)
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T2-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
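A minimal non-Bayesian analogue of the median-regression idea above: least absolute deviations on a polynomial in time, the point-estimate counterpart of the model's asymmetric-Laplace likelihood at quantile 0.5 (the data are simulated placeholders):

```python
# Median regression of expression on a polynomial in time, fit by
# minimizing the sum of absolute residuals (robust to heavy-tailed noise).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
t = np.tile(np.arange(5.0), 6)                            # time points
y = 1.0 + 0.5*t - 0.05*t**2 + rng.standard_t(3, t.size)   # heavy-tailed noise

X = np.column_stack([np.ones_like(t), t, t**2])           # polynomial design

def lad_loss(beta):
    return np.abs(y - X @ beta).sum()                     # L1 -> conditional median

fit = minimize(lad_loss, x0=np.zeros(3), method="Nelder-Mead")
print("median-regression coefficients:", np.round(fit.x, 3))
```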
Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng
2017-05-10
Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murakami, Haruko; Chen, X.; Hahn, Melanie S.
2010-10-21
This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are its ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with the depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using actual data that were not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.
Substantial advantage of a combined Bayesian and genotyping approach in testosterone doping tests.
Schulze, Jenny Jakobsson; Lundmark, Jonas; Garle, Mats; Ekström, Lena; Sottas, Pierre-Edouard; Rane, Anders
2009-03-01
Testosterone abuse is conventionally assessed by the urinary testosterone/epitestosterone (T/E) ratio, levels above 4.0 being considered suspicious. A deletion polymorphism in the gene coding for UGT2B17 is strongly associated with reduced testosterone glucuronide (TG) levels in urine. Many of the individuals devoid of the gene would not reach a T/E ratio of 4.0 after testosterone intake. Future test programs will most likely shift from population-based to individual-based T/E cut-off ratios using Bayesian inference. A longitudinal analysis is dependent on an individual's true negative baseline T/E ratio. The aim was to investigate whether it is possible to increase the sensitivity and specificity of the T/E test by adding UGT2B17 genotype information in a Bayesian framework. A single intramuscular dose of 500 mg testosterone enanthate was given to 55 healthy male volunteers with either two, one or no allele (ins/ins, ins/del or del/del) of the UGT2B17 gene. Urinary excretion of TG and the T/E ratio were measured over 15 days. A Bayesian analysis was conducted to calculate the individual T/E cut-off ratio. When the genotype information was added, the program returned lower individual cut-off ratios in all del/del subjects, increasing the sensitivity of the test considerably. It will be difficult, if not impossible, to discriminate between a true negative baseline T/E value and a false negative one without knowledge of the UGT2B17 genotype. UGT2B17 genotype information is crucial, both for deciding which initial cut-off ratio to use for an individual and for increasing the sensitivity of the Bayesian analysis.
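As a minimal illustration of why the genotype matters in such a Bayesian framework, the conjugate-normal sketch below gives each genotype its own prior on the log baseline T/E and derives an individual upper predictive limit as the cut-off. All numerical values (prior means, variances, the z quantile) are hypothetical, and the actual adaptive model used in doping control is more elaborate:

    import numpy as np

    # Hypothetical prior means for log(T/E) by UGT2B17 genotype; del/del
    # carriers excrete far less TG, so their baseline prior sits much lower.
    PRIOR_MEAN = {"ins/ins": np.log(1.3), "ins/del": np.log(0.8),
                  "del/del": np.log(0.1)}
    PRIOR_VAR, OBS_VAR = 0.25, 0.10    # illustrative variances, log scale

    def individual_cutoff(baseline_ratios, genotype, z=3.09):
        """Conjugate normal update of an athlete's log(T/E) baseline, then
        an upper predictive limit (z = 3.09, ~99.9th percentile) as cut-off."""
        x = np.log(np.asarray(baseline_ratios))
        n = x.size
        post_var = 1.0 / (1.0 / PRIOR_VAR + n / OBS_VAR)
        post_mean = post_var * (PRIOR_MEAN[genotype] / PRIOR_VAR
                                + x.sum() / OBS_VAR)
        return np.exp(post_mean + z * np.sqrt(post_var + OBS_VAR))

    # Identical measured baselines, very different cut-offs once genotype is known:
    for g in ("ins/ins", "del/del"):
        print(g, round(individual_cutoff([0.4, 0.5, 0.45], g), 2))

The del/del prior pulls the posterior baseline down, so the same observed values yield a markedly lower individual cut-off, mirroring the gain in sensitivity reported above.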
Bayesian methods including nonrandomized study data increased the efficiency of postlaunch RCTs.
Schmidt, Amand F; Klugkist, Irene; Klungel, Olaf H; Nielen, Mirjam; de Boer, Anthonius; Hoes, Arno W; Groenwold, Rolf H H
2015-04-01
Findings from nonrandomized studies on safety or efficacy of treatment in patient subgroups may trigger postlaunch randomized clinical trials (RCTs). In the analysis of such RCTs, results from nonrandomized studies are typically ignored. This study explores the trade-off between bias and power of Bayesian RCT analysis incorporating information from nonrandomized studies. A simulation study was conducted to compare frequentist with Bayesian analyses using noninformative and informative priors in their ability to detect interaction effects. In simulated subgroups, the effect of a hypothetical treatment differed between subgroups (odds ratio 1.00 vs. 2.33). Simulations varied in sample size, proportions of the subgroups, and specification of the priors. As expected, the results for the informative Bayesian analyses were more biased than those from the noninformative Bayesian analysis or frequentist analysis. However, because of a reduction in posterior variance, informative Bayesian analyses were generally more powerful to detect an effect. In scenarios where the informative priors were in the opposite direction of the RCT data, type 1 error rates could be 100% and power 0%. Bayesian methods incorporating data from nonrandomized studies can meaningfully increase power of interaction tests in postlaunch RCTs. Copyright © 2015 Elsevier Inc. All rights reserved.
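The variance-reduction mechanism is easy to demonstrate with a conjugate toy model. Below, the same hypothetical subgroup data are analyzed with a noninformative and an informative Beta prior (the pseudo-counts standing in for nonrandomized evidence); the informative posterior interval for the odds ratio is visibly narrower. This is a sketch of the principle only, not the simulation design of the study:

    import numpy as np

    def posterior_or_ci(events, n, prior=(1, 1), draws=100_000, seed=1):
        """Posterior draws of the odds ratio (subgroup 1 vs subgroup 0)
        under independent Beta(a, b) priors; prior acts as pseudo-counts."""
        rng = np.random.default_rng(seed)
        p = [rng.beta(prior[0] + e, prior[1] + m - e, draws)
             for e, m in zip(events, n)]
        odds = [q / (1 - q) for q in p]
        return np.percentile(odds[1] / odds[0], [2.5, 50, 97.5])

    events, n = [20, 35], [100, 100]       # hypothetical subgroup counts
    print("noninformative:", posterior_or_ci(events, n, prior=(1, 1)))
    print("informative:   ", posterior_or_ci(events, n, prior=(8, 24)))

As the abstract warns, the same pseudo-counts pointed in the wrong direction would bias the estimate rather than sharpen it.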
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.
2012-06-01
Tracer testing under natural or forced gradient flow holds the potential to provide useful information for characterizing subsurface properties, through monitoring, modeling and interpretation of the tracer plume migration in an aquifer. Non-reactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter (EBF) profiling. A Bayesian data assimilation technique, the method of anchored distributions (MAD) [Rubin et al., 2010], was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of themore » Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using the constant-rate injection tests and the EBF data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively-parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field data shows that the hydrogeological model, when conditioned on the tracer test data, can reproduce the tracer transport behavior better than the field characterized without the tracer test data. This study successfully demonstrates that MAD can sequentially assimilate multi-scale multi-type field data through a consistent Bayesian framework.« less
ERIC Educational Resources Information Center
Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S.
2013-01-01
This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Hierarchical Bayesian Models of Subtask Learning
ERIC Educational Resources Information Center
Anglim, Jeromy; Wynton, Sarah K. A.
2015-01-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
Incremental Bayesian Category Learning from Natural Language
ERIC Educational Resources Information Center
Frermann, Lea; Lapata, Mirella
2016-01-01
Models of category learning have been extensively studied in cognitive science and primarily tested on perceptual abstractions or artificial stimuli. In this paper, we focus on categories acquired from natural language stimuli, that is, words (e.g., "chair" is a member of the furniture category). We present a Bayesian model that, unlike…
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.
Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J
2015-07-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance than other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution, and both are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied the different methods to two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.
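The central quantity is easy to state: with marginal z-scores z and LD matrix S, the Bayes factor for a causal configuration compares z ~ N(0, S) against z ~ N(0, S + S_c W S_c'). The sketch below implements that comparison for a toy three-SNP region; the prior variance and its parameterization are illustrative assumptions rather than the defaults of the CAVIARBF software:

    import numpy as np
    from scipy.stats import multivariate_normal

    def log10_bf(z, sigma, causal, prior_var=25.0):
        """log10 Bayes factor for a causal configuration from marginal
        z-scores and the SNP correlation (LD) matrix sigma:
        H0: z ~ N(0, sigma); H1: z ~ N(0, sigma + prior_var * s_c s_c')."""
        sc = sigma[:, causal]
        cov_alt = sigma + prior_var * sc @ sc.T      # W = prior_var * I
        return (multivariate_normal.logpdf(z, cov=cov_alt)
                - multivariate_normal.logpdf(z, cov=sigma)) / np.log(10)

    # Toy region: 3 SNPs in moderate LD, SNP 0 strongly associated.
    sigma = np.array([[1.0, 0.6, 0.3],
                      [0.6, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])
    z = np.array([4.5, 3.0, 1.2])
    for cfg in ([0], [1], [2]):
        print(cfg, round(log10_bf(z, sigma, cfg), 2))

Because only z and sigma enter, no individual-level genotypes are needed, which is the practical advantage claimed over BIMBAM.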
Verifying the geographic origin of mahogany (Swietenia macrophylla King) with DNA-fingerprints.
Degen, B; Ward, S E; Lemes, M R; Navarro, C; Cavers, S; Sebbenn, A M
2013-01-01
Illegal logging is one of the main causes of ongoing worldwide deforestation and needs to be eradicated. The trade in illegal timber and wood products creates market disadvantages for products from sustainable forestry. Although various measures have been established to counter illegal logging and the subsequent trade, there is a lack of practical mechanisms for identifying the origin of timber and wood products. In this study, six nuclear microsatellites were used to generate DNA fingerprints for a genetic reference database characterising the populations of origin of a large set of mahogany (Swietenia macrophylla King, Meliaceae) samples. For the database, leaves and/or cambium from 1971 mahogany trees sampled in 31 stands from Mexico to Bolivia were genotyped. A total of 145 different alleles were found, showing strong genetic differentiation (δ(Gregorious)=0.52, F(ST)=0.18, G(ST(Hedrick))=0.65) and clear correlation between genetic and spatial distances among stands (r=0.82, P<0.05). We used the genetic reference database and Bayesian assignment testing to determine the geographic origins of two sets of mahogany wood samples, based on their multilocus genotypes. In both cases the wood samples were assigned to the correct country of origin. We discuss the overall applicability of this methodology to tropical timber trading. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
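The assignment step itself is conceptually simple: score each candidate population by the likelihood of the wood sample's multilocus genotype under that population's allele frequencies and pick the best posterior. A minimal sketch, with entirely hypothetical frequencies and Hardy-Weinberg assumed within populations (the study's Bayesian assignment additionally smooths the frequency estimates):

    import numpy as np

    # Hypothetical reference allele frequencies: freqs[pop][locus][allele].
    freqs = {
        "Mexico":  [np.array([0.7, 0.2, 0.1]), np.array([0.5, 0.5])],
        "Bolivia": [np.array([0.1, 0.3, 0.6]), np.array([0.9, 0.1])],
    }

    def assign(genotype, freqs, eps=1e-3):
        """Log-likelihood of a multilocus genotype in each reference
        population; eps guards against alleles unseen in the reference."""
        out = {}
        for pop, loci in freqs.items():
            ll = 0.0
            for (a1, a2), f in zip(genotype, loci):
                p1, p2 = max(f[a1], eps), max(f[a2], eps)
                ll += np.log((2 - (a1 == a2)) * p1 * p2)   # HW genotype prob
            out[pop] = ll
        return max(out, key=out.get), out

    wood_sample = [(0, 0), (1, 1)]    # allele indices at the two loci
    print(assign(wood_sample, freqs))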
Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Bellato, Cláudia M; Motilal, Lambert; Zhang, Dapeng
2014-01-15
Cacao (Theobroma cacao L.), the source of cocoa, is an economically important tropical crop. One problem with the premium cacao market is contamination with off-types adulterating raw premium material. Accurate determination of the genetic identity of single cacao beans is essential for ensuring cocoa authentication. Using nanofluidic single nucleotide polymorphism (SNP) genotyping with 48 SNP markers, we generated SNP fingerprints for small quantities of DNA extracted from the seed coat of single cacao beans. On the basis of the SNP profiles, we identified an assumed adulterant variety, which was unambiguously distinguished from the authentic beans by multilocus matching. Assignment tests based on both Bayesian clustering analysis and allele frequency clearly separated all 30 authentic samples from the non-authentic samples. Distance-based principle coordinate analysis further supported these results. The nanofluidic SNP protocol, together with forensic statistical tools, is sufficiently robust to establish authentication and to verify gourmet cacao varieties. This method shows significant potential for practical application.
Shankar, Vijay; Reo, Nicholas V; Paliy, Oleg
2015-12-09
We previously showed that stool samples of pre-adolescent and adolescent US children diagnosed with diarrhea-predominant IBS (IBS-D) had different compositions of microbiota and metabolites compared to healthy age-matched controls. Here we explored whether the observed fecal microbiota and metabolite differences between these two adolescent populations can be used to discriminate between IBS and health. We constructed individual microbiota- and metabolite-based sample classification models based on partial least squares multivariate analysis and then applied a Bayesian approach to integrate the individual models into a single classifier. The resulting combined classification achieved 84% accuracy of correct sample group assignment and 86% prediction accuracy for IBS-D in cross-validation tests. The performance of the cumulative classification model was further validated by de novo analysis of stool samples from a small independent IBS-D cohort. High-throughput microbial and metabolite profiling of subject stool samples can be used to facilitate IBS diagnosis.
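The integration step can be sketched as naive-Bayes fusion: each model's posterior probability is converted back to a likelihood ratio and the ratios are multiplied, assuming the microbiota- and metabolite-based calls are conditionally independent. This is a schematic of the combination principle, not the paper's exact weighting:

    def combine(p_models, prior=0.5):
        """Fuse independent classifiers' posterior probabilities for the
        same binary call (IBS-D vs healthy) via their likelihood ratios."""
        prior_odds = prior / (1 - prior)
        odds = prior_odds
        for p in p_models:
            odds *= (p / (1 - p)) / prior_odds   # model's likelihood ratio
        return odds / (1 + odds)

    # Two moderately confident models agree -> a more confident combined call.
    print(combine([0.7, 0.8]))   # ~0.90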
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
BEASTling: A software tool for linguistic phylogenetics using BEAST 2.
Maurits, Luke; Forkel, Robert; Kaiping, Gereon A; Atkinson, Quentin D
2017-01-01
We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses. Support for the emerging cross-linguistic linked data format (CLDF) permits easy incorporation of data published in cross-linguistic linked databases into analyses. BEASTling is intended to make the power of Bayesian analysis more accessible to historical linguists without strong programming backgrounds, in the hopes of encouraging communication and collaboration between those developing computational models of language evolution (who are typically not linguists) and relevant domain experts.
Umek, Lan; Fonseca, Elza; Drumonde-Neves, João; Dequin, Sylvie; Zupan, Blaz; Schuller, Dorit
2013-01-01
Saccharomyces cerevisiae strains from diverse natural habitats harbour a vast amount of phenotypic diversity, driven by interactions between yeast and the respective environment. In grape juice fermentations, strains are exposed to a wide array of biotic and abiotic stressors, which may lead to strain selection and generate naturally arising strain diversity. Certain phenotypes are of particular interest for the winemaking industry and could be identified by screening a large number of different strains. The objective of the present work was to use data mining approaches to identify those phenotypic tests that are most useful for predicting a strain's potential for winemaking. We assembled an S. cerevisiae collection comprising 172 strains of worldwide geographical origins or technological applications. Their phenotypes were screened by considering 30 physiological traits that are important from an oenological point of view. Growth in the presence of potassium bisulphite, growth at 40°C, and resistance to ethanol contributed most to strain variability, as shown by principal component analysis. In the hierarchical clustering of phenotypic profiles, the strains isolated from the same wines and vineyards were scattered throughout all clusters, whereas commercial winemaking strains tended to co-cluster. The Mann-Whitney test revealed significant associations between phenotypic results and a strain's technological application or origin. A naïve Bayesian classifier identified 3 of the 30 phenotypic tests, growth in iprodion (0.05 mg/mL), cycloheximide (0.1 µg/mL) and potassium bisulphite (150 mg/mL), as providing the most information for the assignment of a strain to the group of commercial strains. The probability of a strain being assigned to this group was 27% using the entire phenotypic profile and increased to 95% when only results from the three tests were considered. These results show the usefulness of computational approaches for simplifying strain selection procedures. PMID:23874393
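The three-test result has a compact naive Bayes reading: with conditional independence given the class, three binary outcomes are enough to move a 27% base rate above 90%. A sketch with hypothetical conditional probabilities (only the 27% prior is taken from the abstract):

    import numpy as np

    # Hypothetical P(test positive | class) for the three informative tests
    # (growth in iprodion, cycloheximide, potassium bisulphite).
    p_pos = {"commercial": np.array([0.90, 0.80, 0.85]),
             "other":      np.array([0.30, 0.40, 0.25])}
    prior = {"commercial": 0.27, "other": 0.73}

    def naive_bayes(results):
        """Posterior P(class | binary test results) assuming the tests are
        conditionally independent given the class."""
        post = {}
        for c in p_pos:
            lik = np.where(np.asarray(results), p_pos[c], 1 - p_pos[c]).prod()
            post[c] = prior[c] * lik
        z = sum(post.values())
        return {c: v / z for c, v in post.items()}

    print(naive_bayes([1, 1, 1]))   # all three tests positive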
Harrison, Jay M; Breeze, Matthew L; Harrigan, George G
2011-08-01
Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.
True versus Apparent Malaria Infection Prevalence: The Contribution of a Bayesian Approach
Claes, Filip; Van Hong, Nguyen; Torres, Kathy; Mao, Sokny; Van den Eede, Peter; Thi Thinh, Ta; Gamboa, Dioni; Sochantha, Tho; Thang, Ngo Duc; Coosemans, Marc; Büscher, Philippe; D'Alessandro, Umberto; Berkvens, Dirk; Erhart, Annette
2011-01-01
Aims: To present a new approach for estimating the “true prevalence” of malaria and apply it to datasets from Peru, Vietnam, and Cambodia. Methods: Bayesian models were developed for estimating both the malaria prevalence using different diagnostic tests (microscopy, PCR & ELISA), without the need for a gold standard, and the tests' characteristics. Several sources of information, i.e. data, expert opinions and other sources of knowledge, can be integrated into the model. This approach, which yields an optimal and harmonized estimate of malaria infection prevalence with no conflict between the different sources of information, was tested on data from Peru, Vietnam and Cambodia. Results: Malaria sero-prevalence was relatively low in all sites, with ELISA showing the highest estimates. The sensitivities of microscopy and ELISA were statistically lower in Vietnam than in the other sites. Similarly, the specificities of microscopy, ELISA and PCR were significantly lower in Vietnam than in the other sites. In Vietnam and Peru, microscopy was closer to the “true” estimate than the other two tests, while, as expected, ELISA, with its lower specificity, usually overestimated the prevalence. Conclusions: Bayesian methods are useful for analyzing prevalence results when no gold standard diagnostic test is available. Though some results are expected, e.g. PCR being more sensitive than microscopy, a standardized and context-independent quantification of the diagnostic tests' characteristics (sensitivity and specificity) and the underlying malaria prevalence may be useful for comparing different sites. Indeed, the use of a single diagnostic technique could strongly bias the prevalence estimation. This limitation can be circumvented by using a Bayesian framework taking into account the imperfect characteristics of the currently available diagnostic tests. As discussed in the paper, this approach may further support global malaria burden estimation initiatives. PMID:21364745
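The heart of such models is the relation between apparent and true prevalence, p_obs = pi*Se + (1 - pi)*(1 - Sp). The sketch below inverts that relation for a single test via a grid posterior on pi, with Beta priors on sensitivity and specificity integrated by Monte Carlo; counts and priors are invented, and the paper's model handles several tests jointly:

    import numpy as np
    from scipy import stats

    def true_prevalence_posterior(pos, n, se_prior=(20, 4), sp_prior=(30, 3),
                                  grid=200, draws=2000, seed=0):
        """Grid posterior of true prevalence pi when observed positives are
        Binomial(n, pi*Se + (1-pi)*(1-Sp)); Beta priors on Se and Sp are
        integrated out by averaging the likelihood over prior draws."""
        rng = np.random.default_rng(seed)
        pi = np.linspace(0.001, 0.999, grid)
        se = rng.beta(*se_prior, draws)
        sp = rng.beta(*sp_prior, draws)
        p_obs = pi[:, None] * se + (1 - pi[:, None]) * (1 - sp)
        like = stats.binom.pmf(pos, n, p_obs).mean(axis=1)
        post = like / like.sum()
        return pi, post

    pi, post = true_prevalence_posterior(pos=42, n=300)
    print("posterior mean prevalence:", round(float((pi * post).sum()), 3))

Note how an imperfect specificity lets the model attribute part of the 14% apparent prevalence to false positives, so the posterior for pi concentrates below the raw proportion.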
CytoBayesJ: software tools for Bayesian analysis of cytogenetic radiation dosimetry data.
Ainsbury, Elizabeth A; Vinnikov, Volodymyr; Puig, Pedro; Maznyk, Nataliya; Rothkamm, Kai; Lloyd, David C
2013-08-30
A number of authors have suggested that a Bayesian approach may be most appropriate for analysis of cytogenetic radiation dosimetry data. In the Bayesian framework, probability of an event is described in terms of previous expectations and uncertainty. Previously existing, or prior, information is used in combination with experimental results to infer probabilities or the likelihood that a hypothesis is true. It has been shown that the Bayesian approach increases both the accuracy and quality assurance of radiation dose estimates. New software entitled CytoBayesJ has been developed with the aim of bringing Bayesian analysis to cytogenetic biodosimetry laboratory practice. CytoBayesJ takes a number of Bayesian or 'Bayesian like' methods that have been proposed in the literature and presents them to the user in the form of simple user-friendly tools, including testing for the most appropriate model for distribution of chromosome aberrations and calculations of posterior probability distributions. The individual tools are described in detail and relevant examples of the use of the methods and the corresponding CytoBayesJ software tools are given. In this way, the suitability of the Bayesian approach to biological radiation dosimetry is highlighted and its wider application encouraged by providing a user-friendly software interface and manual in English and Russian. Copyright © 2013 Elsevier B.V. All rights reserved.
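As a flavor of the posterior calculations such a tool exposes, the sketch below computes a grid posterior for absorbed dose from a dicentric count, assuming Poisson aberrations with a linear-quadratic yield curve and a flat dose prior. The calibration coefficients are invented and would normally come from a laboratory's own curve:

    import numpy as np
    from scipy.stats import poisson

    def dose_posterior(k, cells, c=0.001, alpha=0.02, beta=0.06, d_max=6.0):
        """Grid posterior over dose D (Gy) given k dicentrics in `cells`
        scored cells, with Poisson counts of mean
        cells * (c + alpha*D + beta*D^2) and a flat prior on D."""
        d = np.linspace(0.0, d_max, 1201)
        lam = cells * (c + alpha * d + beta * d**2)
        post = poisson.pmf(k, lam)
        post /= post.sum()
        return d, post

    d, post = dose_posterior(k=25, cells=500)
    print("posterior mean dose: %.2f Gy" % float((d * post).sum()))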
A Bayesian Approach to the Paleomagnetic Conglomerate Test
NASA Astrophysics Data System (ADS)
Heslop, David; Roberts, Andrew P.
2018-02-01
The conglomerate test has served the paleomagnetic community for over 60 years as a means to detect remagnetizations. The test states that if a suite of clasts within a bed have uniformly random paleomagnetic directions, then the conglomerate cannot have experienced a pervasive event that remagnetized the clasts in the same direction. The current form of the conglomerate test is based on null hypothesis testing, which results in a binary "pass" (uniformly random directions) or "fail" (nonrandom directions) outcome. We have recast the conglomerate test in a Bayesian framework with the aim of providing more information concerning the level of support a given data set provides for a hypothesis of uniformly random paleomagnetic directions. Using this approach, we place the conglomerate test in a fully probabilistic framework that allows for inconclusive results when insufficient information is available to draw firm conclusions concerning the randomness or nonrandomness of directions. With our method, sample sets larger than those typically employed in paleomagnetism may be required to achieve strong support for a hypothesis of random directions. Given the potentially detrimental effect of unrecognized remagnetizations on paleomagnetic reconstructions, it is important to provide a means to draw statistically robust data-driven inferences. Our Bayesian analysis provides a means to do this for the conglomerate test.
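One way to realize such a test is to compare the marginal likelihood of the clast directions under a uniform distribution on the sphere against a Fisher distribution with unknown mean and concentration. The sketch below does this for unit vectors; the flat prior on the concentration parameter is an assumption of this sketch, not necessarily the prior chosen by the authors:

    import numpy as np

    def log_sinh(x):
        """Numerically stable log(sinh(x)) for x > 0."""
        return x + np.log1p(-np.exp(-2.0 * x)) - np.log(2.0)

    def bf10_clustered(dirs, kappa_max=50.0, n_grid=4000):
        """Bayes factor (Fisher-clustered vs uniformly random) for unit
        vectors `dirs` (n x 3). The unknown mean direction integrates out
        analytically, leaving
        BF10 = E_kappa[(kappa/sinh kappa)^n * sinh(kappa*R)/(kappa*R)]
        with R = |sum of dirs| and a flat prior on kappa in (0, kappa_max]."""
        n = len(dirs)
        R = np.linalg.norm(np.sum(dirs, axis=0))
        kappa = np.linspace(1e-4, kappa_max, n_grid)
        log_lr = (n * (np.log(kappa) - log_sinh(kappa))
                  + log_sinh(kappa * R) - np.log(kappa * R))
        m = log_lr.max()
        return float(np.exp(m) * np.mean(np.exp(log_lr - m)))

    rng = np.random.default_rng(0)
    v = rng.normal(size=(30, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)   # 30 near-random clasts
    print("BF10 for nonrandomness:", round(bf10_clustered(v), 3))

Because the output is a Bayes factor rather than a pass/fail verdict, intermediate values naturally express the inconclusive outcome the authors argue for.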
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
ERIC Educational Resources Information Center
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graves, Todd L; Hamada, Michael S
2008-01-01
Good estimates of the reliability of a system make use of test data and expert knowledge at all available levels. Furthermore, by integrating all these information sources, one can determine how best to allocate scarce testing resources to reduce uncertainty. Both of these goals are facilitated by modern Bayesian computational methods. We apply these tools to examples that were previously solvable only through the use of ingenious approximations, and use genetic algorithms to guide resource allocation.
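A minimal version of the first goal, combining component-level test data into a system-level posterior, fits in a few lines for a series system with beta-binomial components; the counts are hypothetical and the paper's models cover far richer data structures:

    import numpy as np

    # Hypothetical component test data: (successes, trials) per component.
    tests = [(48, 50), (29, 30), (95, 100)]

    def series_reliability(tests, draws=100_000, prior=(1, 1), seed=0):
        """Monte Carlo posterior of a series system's reliability: draw each
        component's reliability from its Beta posterior and multiply."""
        rng = np.random.default_rng(seed)
        r = np.ones(draws)
        for s, n in tests:
            r *= rng.beta(prior[0] + s, prior[1] + n - s, draws)
        return r

    r = series_reliability(tests)
    print("mean %.3f, 90%% interval (%.3f, %.3f)"
          % (r.mean(), *np.percentile(r, [5, 95])))

Resource allocation then amounts to asking which component's additional trials shrink this posterior the most, the question the genetic algorithm searches over.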
Integration of individual and social information for decision-making in groups of different sizes
Goïame, Sidney; O'Connor, David A.; Dreher, Jean-Claude
2017-01-01
When making judgments in a group, individuals often revise their initial beliefs about the best judgment to make given what others believe. Despite the ubiquity of this phenomenon, we know little about how the brain updates beliefs when integrating personal judgments (individual information) with those of others (social information). Here, we investigated the neurocomputational mechanisms of how we adapt our judgments to those made by groups of different sizes, in the context of jury decisions for a criminal. By testing different theoretical models, we showed that a social Bayesian inference model captured changes in judgments better than 2 other models. Our results showed that participants updated their beliefs by appropriately weighting individual and social sources of information according to their respective credibility. When investigating 2 fundamental computations of Bayesian inference, belief updates and credibility estimates of social information, we found that the dorsal anterior cingulate cortex (dACC) computed the level of belief updates, while the bilateral frontopolar cortex (FPC) was more engaged in individuals who assigned a greater credibility to the judgments of a larger group. Moreover, increased functional connectivity between these 2 brain regions reflected a greater influence of group size on the relative credibility of social information. These results provide a mechanistic understanding of the computational roles of the FPC-dACC network in steering judgment adaptation to a group’s opinion. Taken together, these findings provide a computational account of how the human brain integrates individual and social information for decision-making in groups. PMID:28658252
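For Gaussian beliefs, the social Bayesian update the model describes reduces to a precision-weighted average in which the group's precision scales with its size. A minimal sketch with illustrative numbers (treating the group judgment as the mean of group_size independent opinions):

    import numpy as np

    def updated_judgment(own, own_sd, group, group_sd, group_size):
        """Precision-weighted combination of an individual judgment with a
        group judgment whose credibility grows with group size."""
        w_own = 1.0 / own_sd**2
        w_grp = group_size / group_sd**2
        return (w_own * own + w_grp * group) / (w_own + w_grp)

    for n in (2, 4, 8):   # larger groups pull the update further
        print(n, round(updated_judgment(own=5.0, own_sd=1.0,
                                        group=8.0, group_sd=2.0,
                                        group_size=n), 2))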
Godde, Kanya
2017-01-01
The aim of this study is to examine how well different informative priors model age-at-death in Bayesian statistics, which will shed light on how the skeleton ages, particularly at the sacroiliac joint. Data from four samples were compared for their performance as informative priors for auricular surface age-at-death estimation: (1) American population from US Census data; (2) county data from the US Census data; (3) a local cemetery; and (4) a skeletal collection. The skeletal collection and cemetery are located within the county that was sampled. A Gompertz model was applied to compare survivorship across the four samples. Transition analysis parameters, coupled with the generated Gompertz parameters, were input into Bayes' theorem to generate highest posterior density ranges from posterior density functions. Transition analysis describes the age at which an individual transitions from one age phase to another. The result is age ranges that should describe the chronological age of 90% of the individuals who fall in a particular phase. Cumulative binomial tests indicate that the method captured chronological age within the assigned biological phase less than 90% of the time, despite wide age ranges at older ages. The samples performed similarly overall, despite small differences in survivorship. Collectively, these results show that as we age, the senescence pattern becomes more variable. More local samples performed better at describing the aging process than more general samples, which implies practitioners need to consider sample selection when using the literature to diagnose and work with patients with sacroiliac joint pain.
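The Bayes' theorem step the study describes can be sketched directly: a Gompertz survivorship model supplies the prior over age, transition analysis supplies P(phase | age), and their normalized product is the posterior from which highest-density age ranges are read off. All parameters below are invented for illustration:

    import numpy as np
    from scipy.stats import norm

    def gompertz_pdf(age, a=0.02, b=0.07):
        """Gompertz mortality density with hazard a * exp(b * age)."""
        return a * np.exp(b * age) * np.exp(-a / b * (np.exp(b * age) - 1.0))

    def phase_prob(age, mu=(35.0, 55.0), sd=8.0):
        """Toy transition analysis: probability the first transition (mean
        age mu[0]) has occurred but the second (mu[1]) has not."""
        return norm.cdf(age, mu[0], sd) - norm.cdf(age, mu[1], sd)

    ages = np.linspace(15, 100, 2000)
    post = gompertz_pdf(ages) * phase_prob(ages)   # Bayes' theorem, unnormalized
    post /= post.sum()
    cdf = np.cumsum(post)
    lo = ages[np.searchsorted(cdf, 0.05)]
    hi = ages[np.searchsorted(cdf, 0.95)]
    print("central 90%% posterior age range: %.1f-%.1f years" % (lo, hi))

Swapping in a more local survivorship curve changes gompertz_pdf and hence the posterior range, which is the sample-selection effect the study measures.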
NASA Astrophysics Data System (ADS)
Olson, R.; An, S. I.
2016-12-01
Atlantic Meridional Overturning Circulation (AMOC) in the ocean might slow down in the future, which can lead to a host of climatic effects in North Atlantic and throughout the world. Despite improvements in climate models and availability of new observations, AMOC projections remain uncertain. Here we constrain CMIP5 multi-model ensemble output with observations of a recently developed AMOC index to provide improved Bayesian predictions of future AMOC. Specifically, we first calculate yearly AMOC index loosely based on Rahmstorf et al. (2015) for years 1880—2004 for both observations, and the CMIP5 models for which relevant output is available. We then assign a weight to each model based on a Bayesian Model Averaging method that accounts for differential model skill in terms of both mean state and variability. We include the temporal autocorrelation in climate model errors, and account for the uncertainty in the parameters of our statistical model. We use the weights to provide future weighted projections of AMOC, and compare them to un-weighted ones. Our projections use bootstrapping to account for uncertainty in internal AMOC variability. We also perform spectral and other statistical analyses to show that AMOC index variability, both in models and in observations, is consistent with red noise. Our results improve on and complement previous work by using a new ensemble of climate models, a different observational metric, and an improved Bayesian weighting method that accounts for differential model skill at reproducing internal variability. Reference: Rahmstorf, S., Box, J. E., Feulner, G., Mann, M. E., Robinson, A., Rutherford, S., & Schaffernicht, E. J. (2015). Exceptional twentieth-century slowdown in atlantic ocean overturning circulation. Nature Climate Change, 5(5), 475-480. doi:10.1038/nclimate2554
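The weighting step can be reduced to its Bayesian core: each model's weight is proportional to the likelihood of the observed index series given that model. The sketch below uses iid Gaussian errors, whereas the study's statistical model includes temporal autocorrelation and parameter uncertainty; the series and variances are synthetic:

    import numpy as np

    def bma_weights(obs, models, sigma=0.1):
        """Bayesian-model-averaging weights proportional to each model's
        Gaussian likelihood of the observed series (iid errors here)."""
        obs = np.asarray(obs)
        ll = np.array([-0.5 * np.sum((obs - m) ** 2) / sigma**2 for m in models])
        w = np.exp(ll - ll.max())
        return w / w.sum()

    rng = np.random.default_rng(3)
    truth = np.sin(np.linspace(0, 3, 50))            # stand-in AMOC index
    models = [truth + rng.normal(0, s, 50) for s in (0.05, 0.2, 0.5)]
    w = bma_weights(truth, models)
    projection = sum(wi * m[-1] for wi, m in zip(w, models))  # weighted projection
    print(np.round(w, 3), round(float(projection), 3))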
Borchani, Hanen; Bielza, Concha; Martínez-Martín, Pablo; Larrañaga, Pedro
2012-12-01
Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learning MBCs from data. Basically, it consists of determining the Markov blanket around each class variable using the HITON algorithm, then specifying the directionality over the MBC subgraphs. Our approach is applied to the prediction problem of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39) in order to estimate the health-related quality of life of Parkinson's patients. Fivefold cross-validation experiments were carried out on randomly generated synthetic data sets, Yeast data set, as well as on a real-world Parkinson's disease data set containing 488 patients. The experimental study, including comparison with additional Bayesian network-based approaches, back propagation for multi-label learning, multi-label k-nearest neighbor, multinomial logistic regression, ordinary least squares, and censored least absolute deviations, shows encouraging results in terms of predictive accuracy as well as the identification of dependence relationships among class and feature variables. Copyright © 2012 Elsevier Inc. All rights reserved.
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.
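The Savage-Dickey device mentioned here is worth spelling out: for a nested point null H0: theta = theta0, the Bayes factor BF01 is simply the posterior density at theta0 divided by the prior density at theta0. A conjugate-normal sketch (far simpler than the paper's semiparametric setting):

    import numpy as np
    from scipy.stats import norm

    def savage_dickey_bf01(x, prior_mean=0.0, prior_sd=1.0,
                           obs_sd=1.0, theta0=0.0):
        """BF01 for H0: theta = theta0 in a normal model with known
        observation SD: posterior over prior density at theta0."""
        n = len(x)
        post_var = 1.0 / (1.0 / prior_sd**2 + n / obs_sd**2)
        post_mean = post_var * (prior_mean / prior_sd**2
                                + np.sum(x) / obs_sd**2)
        return (norm.pdf(theta0, post_mean, np.sqrt(post_var))
                / norm.pdf(theta0, prior_mean, prior_sd))

    rng = np.random.default_rng(7)
    print(savage_dickey_bf01(rng.normal(0.0, 1.0, 50)))  # favors H0 (BF01 > 1)
    print(savage_dickey_bf01(rng.normal(0.8, 1.0, 50)))  # favors H1 (BF01 << 1)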
Can Bayesian Theories of Autism Spectrum Disorder Help Improve Clinical Practice?
Haker, Helene; Schneebeli, Maya; Stephan, Klaas Enno
2016-01-01
Diagnosis and individualized treatment of autism spectrum disorder (ASD) represent major problems for contemporary psychiatry. Tackling these problems requires guidance by a pathophysiological theory. In this paper, we consider recent theories that re-conceptualize ASD from a "Bayesian brain" perspective, which posit that the core abnormality of ASD resides in perceptual aberrations due to a disbalance in the precision of prediction errors (sensory noise) relative to the precision of predictions (prior beliefs). This results in percepts that are dominated by sensory inputs and less guided by top-down regularization and shifts the perceptual focus to detailed aspects of the environment with difficulties in extracting meaning. While these Bayesian theories have inspired ongoing empirical studies, their clinical implications have not yet been carved out. Here, we consider how this Bayesian perspective on disease mechanisms in ASD might contribute to improving clinical care for affected individuals. Specifically, we describe a computational strategy, based on generative (e.g., hierarchical Bayesian) models of behavioral and functional neuroimaging data, for establishing diagnostic tests. These tests could provide estimates of specific cognitive processes underlying ASD and delineate pathophysiological mechanisms with concrete treatment targets. Written with a clinical audience in mind, this article outlines how the development of computational diagnostics applicable to behavioral and functional neuroimaging data in routine clinical practice could not only fundamentally alter our concept of ASD but eventually also transform the clinical management of this disorder.
Bayesian learning and the psychology of rule induction
Endress, Ansgar D.
2014-01-01
In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to spell out the underlying assumptions, and to confront them with the empirical results Frank and Tenenbaum (2011) propose to simulate, as well as with novel experiments. While rule-learning is arguably well suited to rational Bayesian approaches, I show that their models are neither psychologically plausible nor ideal observer models. Further, I show that their central assumption is unfounded: humans do not always preferentially learn more specific rules, but, at least in some situations, those rules that happen to be more salient. Even when granting the unsupported assumptions, I show that all of the experiments modeled by Frank and Tenenbaum (2011) either contradict their models, or have a large number of more plausible interpretations. I provide an alternative account of the experimental data based on simple psychological mechanisms, and show that this account both describes the data better, and is easier to falsify. I conclude that, despite the recent surge in Bayesian models of cognitive phenomena, psychological phenomena are best understood by developing and testing psychological theories rather than models that can be fit to virtually any data. PMID:23454791
Bayesian inference of a historical bottleneck in a heavily exploited marine mammal.
Hoffman, J I; Grant, S M; Forcada, J; Phillips, C D
2011-10-01
Emerging Bayesian analytical approaches offer increasingly sophisticated means of reconstructing historical population dynamics from genetic data, but have been little applied to scenarios involving demographic bottlenecks. Consequently, we analysed a large mitochondrial and microsatellite dataset from the Antarctic fur seal Arctocephalus gazella, a species subjected to one of the most extreme examples of uncontrolled exploitation in history when it was reduced to the brink of extinction by the sealing industry during the late eighteenth and nineteenth centuries. Classical bottleneck tests, which exploit the fact that rare alleles are rapidly lost during demographic reduction, yielded ambiguous results. In contrast, a strong signal of recent demographic decline was detected using both Bayesian skyline plots and Approximate Bayesian Computation, the latter also allowing derivation of posterior parameter estimates that were remarkably consistent with historical observations. This was achieved using only contemporary samples, further emphasizing the potential of Bayesian approaches to address important problems in conservation and evolutionary biology. © 2011 Blackwell Publishing Ltd.
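Approximate Bayesian Computation itself has a very small core: draw parameters from the prior, simulate data, and keep draws whose summary statistics land near the observed ones. The toy below infers bottleneck severity from a single heterozygosity summary with an invented 'simulator'; the paper's analysis uses coalescent simulations and several summaries:

    import numpy as np

    def abc_bottleneck(obs_het, n_sims=200_000, tol=0.005, seed=0):
        """ABC rejection: draw the post-bottleneck population fraction from
        a log-uniform prior, map it to retained heterozygosity with a toy
        simulator, and keep draws close to the observed summary."""
        rng = np.random.default_rng(seed)
        frac = 10 ** rng.uniform(-4, 0, n_sims)                   # prior draws
        sim_het = 0.8 * frac**0.15 + rng.normal(0, 0.02, n_sims)  # toy simulator
        kept = frac[np.abs(sim_het - obs_het) < tol]              # rejection
        return kept.size, np.percentile(kept, [2.5, 50, 97.5])

    print(abc_bottleneck(obs_het=0.55))

The retained draws approximate the posterior of the bottleneck parameter, which is how posterior estimates comparable to historical sealing records can come from contemporary samples alone.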
Number-Knower Levels in Young Children: Insights from Bayesian Modeling
ERIC Educational Resources Information Center
Lee, Michael D.; Sarnecka, Barbara W.
2011-01-01
Lee and Sarnecka (2010) developed a Bayesian model of young children's behavior on the Give-N test of number knowledge. This paper presents two new extensions of the model, and applies the model to new data. In the first extension, the model is used to evaluate competing theories about the conceptual knowledge underlying children's behavior. One,…
B.G. Marcot; J.D. Steventon; G.D. Sutherland; R.K. McCann
2006-01-01
We provide practical guidelines for developing, testing, and revising Bayesian belief networks (BBNs). Primary steps in this process include creating influence diagrams of the hypothesized "causal web" of key factors affecting a species or ecological outcome of interest; developing a first, alpha-level BBN model from the influence diagram; revising the model...
ERIC Educational Resources Information Center
Leventhal, Brian C.; Stone, Clement A.
2018-01-01
Interest in Bayesian analysis of item response theory (IRT) models has grown tremendously due to the appeal of the paradigm among psychometricians, advantages of these methods when analyzing complex models, and availability of general-purpose software. Possible models include models which reflect multidimensionality due to designed test structure,…
ERIC Educational Resources Information Center
Tsiouris, John; Mann, Rachel; Patti, Paul; Sturmey, Peter
2004-01-01
Clinicians need to know the likelihood of a condition given a positive or negative diagnostic test. In this study a Bayesian analysis of the Clinical Behavior Checklist for Persons with Intellectual Disabilities (CBCPID) to predict depression in people with intellectual disability was conducted. The CBCPID was administered to 92 adults with…
Approximate string matching algorithms for limited-vocabulary OCR output correction
NASA Astrophysics Data System (ADS)
Lasko, Thomas A.; Hauser, Susan E.
2000-12-01
Five methods for matching words mistranslated by optical character recognition to their most likely match in a reference dictionary were tested on data from the archives of the National Library of Medicine. The methods, including an adaptation of the cross correlation algorithm, the generic edit distance algorithm, the edit distance algorithm with a probabilistic substitution matrix, Bayesian analysis, and Bayesian analysis on an actively thinned reference dictionary were implemented and their accuracy rates compared. Of the five, the Bayesian algorithm produced the most correct matches (87%), and had the advantage of producing scores that have a useful and practical interpretation.
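The Bayesian scoring amounts to ranking candidate words by P(word) x P(observed | word) under a character-confusion channel model. A minimal same-length sketch with invented confusion probabilities and word priors (the study's implementation handled more general mismatches as well):

    import math

    # Toy channel model: P(OCR outputs o | true character t).
    P_CONFUSE = {("e", "c"): 0.04, ("l", "1"): 0.06, ("o", "0"): 0.05}
    P_CORRECT = 0.95

    def log_channel(true_word, observed):
        """log P(observed | true_word) with per-character independence;
        only same-length candidates are scored in this sketch."""
        if len(true_word) != len(observed):
            return float("-inf")
        lp = 0.0
        for t, o in zip(true_word, observed):
            lp += math.log(P_CORRECT if t == o
                           else P_CONFUSE.get((t, o), 1e-4))
        return lp

    def best_match(observed, dictionary):
        """Bayesian correction: argmax of P(word) * P(observed | word)."""
        scores = {w: math.log(p) + log_channel(w, observed)
                  for w, p in dictionary.items()}
        return max(scores, key=scores.get)

    lexicon = {"lesion": 2e-4, "legion": 5e-5, "lotion": 1e-4}  # toy priors
    print(best_match("lcsion", lexicon))   # 'e'->'c' confusion -> 'lesion'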
CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data.
Fidaner, Işık Barış; Cankorur-Cetinkaya, Ayca; Dikicioglu, Duygu; Kirdar, Betul; Cemgil, Ali Taylan; Oliver, Stephen G
2016-02-01
Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2 applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications. The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG. Contact: sgo24@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Wang, Shi-Heng; Chen, Wei J; Tsai, Yu-Chin; Huang, Yung-Hsiang; Hwu, Hai-Gwo; Hsiao, Chuhsing K
2013-01-01
The copy number variation (CNV) is a type of genetic variation in the genome. It is measured based on signal intensity measures and can be assessed repeatedly to reduce the uncertainty in PCR-based typing. Studies have shown that CNVs may lead to phenotypic variation and modification of disease expression. Various challenges exist, however, in the exploration of CNV-disease association. Here we construct latent variables to infer the discrete CNV values and to estimate the probability of mutations. In addition, we propose to pool rare variants to increase the statistical power and we conduct family studies to mitigate the computational burden in determining the composition of CNVs on each chromosome. To explore in a stochastic sense the association between the collapsing CNV variants and disease status, we utilize a Bayesian hierarchical model incorporating the mutation parameters. This model assigns integers in a probabilistic sense to the quantitatively measured copy numbers, and is able to test simultaneously the association for all variants of interest in a regression framework. This integrative model can account for the uncertainty in copy number assignment and differentiate if the variation was de novo or inherited on the basis of posterior probabilities. For family studies, this model can accommodate the dependence within family members and among repeated CNV data. Moreover, the Mendelian rule can be assumed under this model and yet the genetic variation, including de novo and inherited variation, can still be included and quantified directly for each individual. Finally, simulation studies show that this model has high true positive and low false positive rates in the detection of de novo mutation.
Bayesian randomized clinical trials: From fixed to adaptive design.
Yin, Guosheng; Lam, Chi Kin; Shi, Haolun
2017-08-01
Randomized controlled studies are the gold standard for phase III clinical trials. Using α-spending functions to control the overall type I error rate, group sequential methods are well established and have been dominating phase III studies. Bayesian randomized design, on the other hand, can be viewed as a complement instead of competitive approach to the frequentist methods. For the fixed Bayesian design, the hypothesis testing can be cast in the posterior probability or Bayes factor framework, which has a direct link to the frequentist type I error rate. Bayesian group sequential design relies upon Bayesian decision-theoretic approaches based on backward induction, which is often computationally intensive. Compared with the frequentist approaches, Bayesian methods have several advantages. The posterior predictive probability serves as a useful and convenient tool for trial monitoring, and can be updated at any time as the data accrue during the trial. The Bayesian decision-theoretic framework possesses a direct link to the decision making in the practical setting, and can be modeled more realistically to reflect the actual cost-benefit analysis during the drug development process. Other merits include the possibility of hierarchical modeling and the use of informative priors, which would lead to a more comprehensive utilization of information from both historical and longitudinal data. From fixed to adaptive design, we focus on Bayesian randomized controlled clinical trials and make extensive comparisons with frequentist counterparts through numerical studies. Copyright © 2017 Elsevier Inc. All rights reserved.
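The posterior predictive monitoring tool mentioned here has a compact beta-binomial form: given interim data, compute the probability that the completed trial will clear its posterior success criterion. A sketch with invented thresholds:

    from scipy.stats import beta, betabinom

    def predictive_prob_success(x, n, n_max, p0=0.3, post_thresh=0.95,
                                a=1, b=1):
        """Probability that, once all n_max patients are observed, the
        posterior P(rate > p0) exceeds post_thresh, given x/n interim
        responses; future responses are beta-binomial."""
        m = n_max - n
        pp = 0.0
        for y in range(m + 1):                     # possible future successes
            final_post = beta(a + x + y, b + n_max - x - y)
            if final_post.sf(p0) > post_thresh:    # trial would succeed
                pp += betabinom(m, a + x, b + n - x).pmf(y)
        return pp

    print(round(predictive_prob_success(x=14, n=30, n_max=60), 3))

Because the quantity is updated whenever data accrue, it supports the continuous monitoring described above in a way fixed-sample frequentist rules do not.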
A systematic review of Bayesian articles in psychology: The last 25 years.
van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah
2017-06-01
Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model
Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.
2014-01-01
Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689
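The equivalence is easiest to see in one dimension: accumulating the log-likelihood ratio of i.i.d. Gaussian samples to a bound is a random walk with drift, i.e. a discrete drift-diffusion process. A minimal simulation (parameters illustrative):

    import numpy as np

    def bayesian_accumulator(mu=0.2, sigma=1.0, bound=3.0, seed=0):
        """Accumulate the log-likelihood ratio of samples from
        N(+mu, sigma) vs N(-mu, sigma) until a bound is crossed; each
        increment is (2*mu/sigma**2) * x_t, a drift-diffusion step."""
        rng = np.random.default_rng(seed)
        llr, t = 0.0, 0
        while abs(llr) < bound:
            x = rng.normal(mu, sigma)        # true state: positive drift
            llr += 2 * mu * x / sigma**2     # Bayesian evidence increment
            t += 1
        return ("correct" if llr > 0 else "error"), t

    print([bayesian_accumulator(seed=s) for s in range(5)])

The mapping between bound, drift, and noise here is exactly the parameter correspondence the authors derive between the two model families.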
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that systems are described and explained as simply functioning or failed. In many real situations, the failures may arise from many causes, depending upon the age and the environment of the system and its components. Another problem in reliability theory is estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation analyses allow us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we investigate the application of Bayesian estimation analyses to competing-risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample sizes. The simulation data are analyzed using both Bayesian and maximum likelihood analyses. The simulation results show that changing the true value of one parameter relative to another shifts the standard deviation in the opposite direction. Given perfect information on the prior distribution, the Bayesian estimation methods outperform maximum likelihood. The sensitivity analyses show some sensitivity to shifts in the prior locations, and they also show that the Bayesian analysis is robust within the range between the true value and the maximum likelihood estimate.
Adaptive Randomization of Neratinib in Early Breast Cancer.
Park, John W; Liu, Minetta C; Yee, Douglas; Yau, Christina; van 't Veer, Laura J; Symmans, W Fraser; Paoloni, Melissa; Perlmutter, Jane; Hylton, Nola M; Hogarth, Michael; DeMichele, Angela; Buxton, Meredith B; Chien, A Jo; Wallace, Anne M; Boughey, Judy C; Haddad, Tufia C; Chui, Stephen Y; Kemmer, Kathleen A; Kaplan, Henry G; Isaacs, Claudine; Nanda, Rita; Tripathy, Debasish; Albain, Kathy S; Edmiston, Kirsten K; Elias, Anthony D; Northfelt, Donald W; Pusztai, Lajos; Moulder, Stacy L; Lang, Julie E; Viscusi, Rebecca K; Euhus, David M; Haley, Barbara B; Khan, Qamar J; Wood, William C; Melisko, Michelle; Schwab, Richard; Helsten, Teresa; Lyandres, Julia; Davis, Sarah E; Hirst, Gillian L; Sanil, Ashish; Esserman, Laura J; Berry, Donald A
2016-07-07
The heterogeneity of breast cancer makes identifying effective therapies challenging. The I-SPY 2 trial, a multicenter, adaptive phase 2 trial of neoadjuvant therapy for high-risk clinical stage II or III breast cancer, evaluated multiple new agents added to standard chemotherapy to assess the effects on rates of pathological complete response (i.e., absence of residual cancer in the breast or lymph nodes at the time of surgery). We used adaptive randomization to compare standard neoadjuvant chemotherapy plus the tyrosine kinase inhibitor neratinib with control. Eligible women were categorized according to eight biomarker subtypes on the basis of human epidermal growth factor receptor 2 (HER2) status, hormone-receptor status, and risk according to a 70-gene profile. Neratinib was evaluated against control with regard to 10 biomarker signatures (prospectively defined combinations of subtypes). The primary end point was pathological complete response. Volume changes on serial magnetic resonance imaging were used to assess the likelihood of such a response in each patient. Adaptive assignment to experimental groups within each disease subtype was based on Bayesian probabilities of the superiority of the treatment over control. Enrollment in the experimental group was stopped when the 85% Bayesian predictive probability of success in a confirmatory phase 3 trial of neoadjuvant therapy reached a prespecified threshold for any biomarker signature ("graduation"). Enrollment was stopped for futility if the probability fell to below 10% for every biomarker signature. Neratinib reached the prespecified efficacy threshold with regard to the HER2-positive, hormone-receptor-negative signature. Among patients with HER2-positive, hormone-receptor-negative cancer, the mean estimated rate of pathological complete response was 56% (95% Bayesian probability interval [PI], 37 to 73%) among 115 patients in the neratinib group, as compared with 33% among 78 controls (95% PI, 11 to 54%). The final predictive probability of success in phase 3 testing was 79%. Neratinib added to standard therapy was highly likely to result in higher rates of pathological complete response than standard chemotherapy with trastuzumab among patients with HER2-positive, hormone-receptor-negative breast cancer. (Funded by QuantumLeap Healthcare Collaborative and others; I-SPY 2 TRIAL ClinicalTrials.gov number, NCT01042379.).
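The adaptive-assignment engine of such designs can be sketched in a few lines: sample each arm's response rate from its Beta posterior and randomize the next patient in proportion to the probability that each arm is currently best. The counts below are hypothetical, and the trial's actual algorithm works within biomarker subtypes and uses longitudinal imaging-based modeling:

    import numpy as np

    def arm_assignment_probs(successes, failures, draws=50_000, seed=0):
        """Adaptive randomization: P(assign next patient to each arm),
        proportional to the posterior probability that the arm has the
        highest response rate (independent Beta-Bernoulli arms)."""
        rng = np.random.default_rng(seed)
        samples = np.column_stack([rng.beta(1 + s, 1 + f, draws)
                                   for s, f in zip(successes, failures)])
        best = np.bincount(samples.argmax(axis=1),
                           minlength=samples.shape[1])
        return best / draws

    print(arm_assignment_probs(successes=[12, 20], failures=[18, 15]))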
Bayesian module identification from multiple noisy networks.
Zamani Dadaneh, Siamak; Qian, Xiaoning
2016-12-01
Module identification has been studied extensively in order to gain deeper understanding of complex systems, such as social and biological networks. Modules are often defined as groups of vertices in these networks that are topologically cohesive, with similar interaction patterns with the rest of the vertices. Most existing module identification algorithms assume that the given networks are faithfully measured without errors. However, in many real-world applications, for example when analyzing protein-protein interaction networks from high-throughput profiling techniques, there is significant noise, with both false positive and missing links between vertices. In this paper, we propose a new model for more robust module identification that takes advantage of multiple observed networks with significant noise, so that signals in multiple networks can be strengthened and solution quality improved by combining information from various sources. We adopt a hierarchical Bayesian model to integrate multiple noisy snapshots that capture the underlying modular structure of the networks under study. By introducing a latent root assignment matrix, related to the instantaneous module assignments in all the observed networks, the model captures the underlying modular structure and combines information across multiple networks; an efficient variational Bayes algorithm can then be derived to accurately and robustly identify the underlying modules from multiple noisy networks. Experiments on synthetic and protein-protein interaction data sets show that our proposed model enhances both the accuracy and resolution in detecting cohesive modules and is less vulnerable to noise in the observed data. In addition, it shows higher power in predicting missing edges compared to individual-network methods.
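The core intuition, pooling evidence for each edge across several noisy snapshots, can be sketched with a simple per-edge Bayes update, assuming known false-positive and false-negative rates. This is a toy stand-in (the fp, fn, and prior values below are invented), not the paper's hierarchical variational model, which infers module structure jointly.

```python
import numpy as np

def pooled_edge_posterior(snapshots, prior=0.1, fp=0.05, fn=0.2):
    """Posterior probability of each true edge given K noisy adjacency
    snapshots, assuming known false-positive (fp) and false-negative (fn)
    rates and independent observation errors."""
    snapshots = np.asarray(snapshots, dtype=float)
    k = snapshots.shape[0]
    ones = snapshots.sum(axis=0)           # times each edge was observed
    zeros = k - ones
    like_edge = (1 - fn) ** ones * fn ** zeros   # P(pattern | edge exists)
    like_none = fp ** ones * (1 - fp) ** zeros   # P(pattern | no edge)
    return prior * like_edge / (prior * like_edge + (1 - prior) * like_none)

# Toy usage: three noisy 4-node snapshots of the same underlying graph
A = [[[0,1,1,0],[1,0,1,0],[1,1,0,0],[0,0,0,0]],
     [[0,1,0,0],[1,0,1,1],[0,1,0,0],[0,1,0,0]],
     [[0,1,1,0],[1,0,0,0],[1,0,0,0],[0,0,0,0]]]
print(np.round(pooled_edge_posterior(A), 2))
```

Edges seen in all snapshots get posterior probabilities near 1, while edges seen only once are heavily discounted; a standard clustering of the denoised graph would then recover modules.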
A two step Bayesian approach for genomic prediction of breeding values.
Shariati, Mohammad M; Sørensen, Peter; Janss, Luc
2012-05-21
In genomic models that assign an individual variance to each marker, the contribution of one marker to the posterior distribution of the marker variance is only one degree of freedom (df), which introduces many variance parameters with only little information per variance parameter. A better alternative could be to form clusters of markers with similar effects, where markers in a cluster have a common variance. The influence of each marker group of size p on the posterior distribution of the marker variances will then be p df. The simulated data from the 15th QTL-MAS workshop were analyzed such that SNP markers were ranked based on their effects and markers with similar estimated effects were grouped together. In step 1, all markers with minor allele frequency above 0.01 were included in a SNP-BLUP prediction model. In step 2, markers were ranked based on their estimated variance on the trait in step 1, and groups of 150 markers were each assigned a common variance. In further analyses, subsets of the 1500 and 450 markers with the largest effects in step 2 were kept in the prediction model. Grouping markers outperformed the SNP-BLUP model in terms of accuracy of predicted breeding values. However, the accuracies of predicted breeding values were lower than those of Bayesian methods with marker-specific variances. Grouping markers is less flexible than allowing each marker a specific variance, but grouping increases the power to estimate marker variances. Prior knowledge of the genetic architecture of the trait is necessary for clustering markers and for appropriate prior parameterization.
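The two-step logic can be sketched with ridge regression as the SNP-BLUP analogue: fit once with a common shrinkage, rank markers by their squared estimated effects, then refit with one shrinkage parameter per block of 150 markers. This is a simplified stand-in with simulated genotypes and ad hoc shrinkage values, not the authors' Bayesian implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, group_size = 500, 900, 150
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)     # SNP genotypes (0/1/2)
beta = np.zeros(m); beta[:30] = rng.normal(0, 0.5, 30)  # a few true QTL
y = X @ beta + rng.normal(0, 1.0, n)

def ridge(X, y, d):
    """Solve (X'X + diag(d)) b = X'y; d holds per-marker shrinkage."""
    return np.linalg.solve(X.T @ X + np.diag(d), X.T @ y)

# Step 1: SNP-BLUP analogue with one common shrinkage value
b1 = ridge(X, y, np.full(m, 50.0))

# Step 2: rank markers by estimated effect, give each block of 150 its own
# shrinkage, inversely proportional to the block's mean squared effect
order = np.argsort(-b1 ** 2)
d2 = np.empty(m)
for g in range(0, m, group_size):
    idx = order[g:g + group_size]
    d2[idx] = 1.0 / (np.mean(b1[idx] ** 2) + 1e-6)
b2 = ridge(X, y, d2)

print("corr(step1, true):", round(np.corrcoef(X @ b1, X @ beta)[0, 1], 3))
print("corr(step2, true):", round(np.corrcoef(X @ b2, X @ beta)[0, 1], 3))
```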
Storey, Rebecca
2007-01-01
Comparison of different adult age estimation methods on the same skeletal sample with unknown ages could forward paleodemographic inference, while researchers sort out various controversies. The original aging method for the auricular surface (Lovejoy et al., 1985a) assigned an age estimation based on several separate characteristics. Researchers have found this original method hard to apply. It is usually forgotten that before assigning an age, there was a seriation, an ordering of all available individuals from youngest to oldest. Thus, age estimation reflected the place of an individual within its sample. A recent article (Buckberry and Chamberlain, 2002) proposed a revised method that scores these various characteristics into age stages, which can then be used with a Bayesian method to estimate an adult age distribution for the sample. Both methods were applied to the adult auricular surfaces of a Pre-Columbian Maya skeletal population from Copan, Honduras, and resulted in age distributions with significant numbers of older adults. However, contrary to the usual paleodemographic distribution, one Bayesian estimation based on uniform prior probabilities yielded a population with 57% of the ages at death over 65, while another based on a high-mortality life table still had 12% of the individuals aged over 75 years. The seriation method yielded an age distribution more similar to that known from preindustrial historical situations, without excessive longevity of adults. Paleodemography must still wrestle with its elusive goal of accurate adult age estimation from skeletons, a necessary base for demographic study of past populations. (c) 2006 Wiley-Liss, Inc.
Roelandt, S; Van der Stede, Y; Czaplicki, G; Van Loo, H; Van Driessche, E; Dewulf, J; Hooyberghs, J; Faes, C
2015-06-06
Currently, there are no perfect reference tests for the in vivo detection of Neospora caninum infection. Two commercial N caninum ELISA tests are currently used in Belgium for bovine sera (TEST A and TEST B). The goal of this study is to evaluate these tests, used at their current cut-offs, with a no gold standard approach, for the test purposes of (1) demonstration of freedom from infection at purchase and (2) diagnosis in aborting cattle. Sera of two study populations, an Abortion population (n=196) and a Purchase population (n=514), were selected and tested with both ELISAs. Test results were entered in a Bayesian model with informative priors on population prevalences only (Scenario 1). As sensitivity analysis, two more models were used: one with informative priors on test diagnostic accuracy (Scenario 2) and one with all priors uninformative (Scenario 3). The accuracy parameters were estimated from the first model: diagnostic sensitivity (Test A: 93.54 per cent; Test B: 86.99 per cent) and specificity (Test A: 90.22 per cent; Test B: 90.15 per cent) were high and comparable (Bayesian P values >0.05). Based on predictive values in the two study populations, both tests were fit for purpose, despite an expected false negative fraction of ±0.5 per cent in the Purchase population and ±5 per cent in the Abortion population. In addition, a false positive fraction of ±3 per cent in the overall Purchase population and ±4 per cent in the overall Abortion population was found. British Veterinary Association.
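No gold standard evaluation of two tests in two populations is the classic Hui-Walter latent-class setting, which can be fit with a short Gibbs sampler. The sketch below uses simulated sera and illustrative Beta priors, not the study's data or its exact prior specification.

```python
import numpy as np

rng = np.random.default_rng(3)

def hui_walter_gibbs(results, pops, iters=4000, burn=1000):
    """Two-test, two-population latent-class model without a gold standard.
    results: (n, 2) array of 0/1 test outcomes; pops: population index (0/1).
    Beta(1,1) priors on prevalences; mildly informative Beta(2,1) priors on
    sensitivities and specificities (illustrative choices)."""
    n = len(pops)
    prev = np.array([0.5, 0.5]); se = np.array([0.9, 0.9]); sp = np.array([0.9, 0.9])
    draws = []
    for it in range(iters):
        # 1. Sample each animal's latent infection status
        p_pos = prev[pops] * np.prod(np.where(results == 1, se, 1 - se), axis=1)
        p_neg = (1 - prev[pops]) * np.prod(np.where(results == 1, 1 - sp, sp), axis=1)
        z = rng.random(n) < p_pos / (p_pos + p_neg)
        # 2. Conjugate Beta updates for prevalences and test accuracies
        for p in (0, 1):
            m = pops == p
            prev[p] = rng.beta(1 + z[m].sum(), 1 + (~z[m]).sum())
        for t in (0, 1):
            se[t] = rng.beta(2 + results[z, t].sum(), 1 + (1 - results[z, t]).sum())
            sp[t] = rng.beta(2 + (1 - results[~z, t]).sum(), 1 + results[~z, t].sum())
        if it >= burn:
            draws.append(np.concatenate([prev, se, sp]))
    return np.mean(draws, axis=0)  # prev0, prev1, se1, se2, sp1, sp2

# Simulated sera: two populations with different true prevalences
n = 600
pops = (rng.random(n) < 0.5).astype(int)
status = rng.random(n) < np.where(pops == 0, 0.1, 0.4)
tests = np.column_stack(
    [np.where(status, rng.random(n) < 0.93, rng.random(n) > 0.90) for _ in range(2)]
).astype(int)
print(np.round(hui_walter_gibbs(tests, pops), 3))
```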
ERIC Educational Resources Information Center
Vrieze, Scott I.
2012-01-01
This article reviews the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in model selection and the appraisal of psychological theory. The focus is on latent variable models, given their growing use in theory testing and construction. Theoretical statistical results in regression are discussed, and more important…
ERIC Educational Resources Information Center
Kessler, Lawrence M.
2013-01-01
In this paper I propose Bayesian estimation of a nonlinear panel data model with a fractional dependent variable (bounded between 0 and 1). Specifically, I estimate a panel data fractional probit model which takes into account the bounded nature of the fractional response variable. I outline estimation under the assumption of strict exogeneity as…
Howard B. Stauffer; Cynthia J. Zabel; Jeffrey R. Dunk
2005-01-01
We compared a set of competing logistic regression habitat selection models for Northern Spotted Owls (Strix occidentalis caurina) in California. The habitat selection models were estimated, compared, evaluated, and tested using multiple sample datasets collected on federal forestlands in northern California. We used Bayesian methods in interpreting...
A Test of Bayesian Observer Models of Processing in the Eriksen Flanker Task
ERIC Educational Resources Information Center
White, Corey N.; Brown, Scott; Ratcliff, Roger
2012-01-01
Two Bayesian observer models were recently proposed to account for data from the Eriksen flanker task, in which flanking items interfere with processing of a central target. One model assumes that interference stems from a perceptual bias to process nearby items as if they are compatible, and the other assumes that the interference is due to…
ERIC Educational Resources Information Center
Kelava, Augustin; Nagengast, Benjamin
2012-01-01
Structural equation models with interaction and quadratic effects have become a standard tool for testing nonlinear hypotheses in the social sciences. Most of the current approaches assume normally distributed latent predictor variables. In this article, we present a Bayesian model for the estimation of latent nonlinear effects when the latent…
Bayesian adaptive phase II screening design for combination trials.
Cai, Chunyan; Yuan, Ying; Johnson, Valen E
2013-01-01
Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Simulation studies show that the proposed design substantially outperforms the conventional multiarm balanced factorial trial design, yielding a significantly higher probability of selecting the best treatment while allocating substantially more patients to efficacious treatments. The proposed design is most appropriate for trials combining multiple agents and screening for the efficacious combination to be further investigated. Our design allocates more patients to better treatments while providing higher power to identify the best treatment at the end of the trial.
Dale, Julia; Price, Erin P; Hornstra, Heidie; Busch, Joseph D; Mayo, Mark; Godoy, Daniel; Wuthiekanun, Vanaporn; Baker, Anthony; Foster, Jeffrey T; Wagner, David M; Tuanyok, Apichai; Warner, Jeffrey; Spratt, Brian G; Peacock, Sharon J; Currie, Bart J; Keim, Paul; Pearson, Talima
2011-12-01
Rapid assignment of bacterial pathogens into predefined populations is an important first step for epidemiological tracking. For clonal species, a single allele can theoretically define a population. For non-clonal species such as Burkholderia pseudomallei, however, shared allelic states between distantly related isolates make it more difficult to identify population defining characteristics. Two distinct B. pseudomallei populations have been previously identified using multilocus sequence typing (MLST). These populations correlate with the major foci of endemicity (Australia and Southeast Asia). Here, we use multiple Bayesian approaches to evaluate the compositional robustness of these populations, and provide assignment results for MLST sequence types (STs). Our goal was to provide a reference for assigning STs to an established population without the need for further computational analyses. We also provide allele frequency results for each population to enable estimation of population assignment even when novel STs are discovered. The ability for humans and potentially contaminated goods to move rapidly across the globe complicates the task of identifying the source of an infection or outbreak. Population genetic dynamics of B. pseudomallei are particularly complicated relative to other bacterial pathogens, but the work here provides the ability for broad scale population assignment. As there is currently no independent empirical measure of successful population assignment, we provide comprehensive analytical details of our comparisons to enable the reader to evaluate the robustness of population designations and assignments as they pertain to individual research questions. Finer scale subdivision and verification of current population compositions will likely be possible with genotyping data that more comprehensively samples the genome. The approach used here may be valuable for other non-clonal pathogens that lack simple group-defining genetic characteristics and provides a rapid reference for epidemiologists wishing to track the origin of infection without the need to compile population data and learn population assignment algorithms.
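Assignment from allele frequencies, the quantity the authors publish for novel STs, reduces to multiplying per-locus frequencies under each candidate population and normalizing. The snippet below is a toy illustration with invented loci and frequencies; the study's Bayesian analyses are more sophisticated but report exactly this kind of population membership probability.

```python
import numpy as np

# Illustrative allele frequencies per locus for two reference populations;
# real values would come from the published MLST frequency tables.
freqs = {
    "Australia":      [{1: 0.7, 2: 0.3}, {1: 0.6, 3: 0.4}, {2: 0.8, 4: 0.2}],
    "Southeast Asia": [{1: 0.2, 2: 0.8}, {1: 0.1, 3: 0.9}, {2: 0.3, 4: 0.7}],
}

def assign(st_alleles, freqs, prior=0.5, eps=1e-3):
    """Posterior probability of each population for one sequence type,
    multiplying per-locus allele frequencies (eps handles unseen alleles)."""
    post = {}
    for pop, loci in freqs.items():
        like = np.prod([locus.get(a, eps) for locus, a in zip(loci, st_alleles)])
        post[pop] = prior * like
    total = sum(post.values())
    return {pop: p / total for pop, p in post.items()}

print(assign((2, 3, 4), freqs))   # ST with alleles 2, 3, 4 across three loci
```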
Yu, Rongjie; Abdel-Aty, Mohamed
2013-07-01
The Bayesian inference method has been frequently adopted to develop safety performance functions. One advantage of Bayesian inference is that prior information for the independent variables can be included in the inference procedure. However, few studies have discussed how to formulate informative priors for the independent variables or evaluated the effects of incorporating informative priors in developing safety performance functions. This paper addresses this deficiency by introducing four approaches to developing informative priors for the independent variables based on historical data and expert experience. Merits of these informative priors have been tested along with two types of Bayesian hierarchical models (Poisson-gamma and Poisson-lognormal models). Deviance information criterion (DIC), R-square values, and coefficients of variation for the estimates were utilized as evaluation measures to select the best model(s). Comparison across the models indicated that the Poisson-gamma model is superior, with a better model fit, and is much more robust with the informative priors. Moreover, the two-stage Bayesian updating informative priors provided the best goodness-of-fit and coefficient estimation accuracy. Furthermore, informative priors for the inverse dispersion parameter have also been introduced and tested. The effects of the different types of informative priors on model estimation and goodness-of-fit have been compared and summarized. Finally, based on the results, recommendations for future research topics and study applications have been made. Copyright © 2013 Elsevier Ltd. All rights reserved.
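For the Poisson-gamma case, the effect of an informative prior has a closed conjugate form: with a Gamma(a, b) prior on the crash rate and observed counts y, the posterior is Gamma(a + Σy, b + n). The sketch below uses invented counts and prior values to show how an informative prior (e.g. built from historical data) tightens the estimate, giving a smaller coefficient of variation than a vague prior.

```python
import numpy as np

crashes = np.array([3, 5, 2, 4, 6, 1, 3])   # illustrative site-year crash counts
n, total = len(crashes), crashes.sum()

# Gamma(a, b) prior on the Poisson rate; posterior is Gamma(a + total, b + n)
priors = {"vague": (0.01, 0.01),
          "informative (historical mean 2.0)": (20.0, 10.0)}

for name, (a, b) in priors.items():
    post_mean = (a + total) / (b + n)
    post_cv = 1.0 / np.sqrt(a + total)   # coefficient of variation of a gamma
    print(f"{name:35s} posterior mean = {post_mean:.2f}, CV = {post_cv:.3f}")
```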
Brase, Gary L.; Hill, W. Trey
2015-01-01
Bayesian reasoning, defined here as the updating of a posterior probability following new information, has historically been problematic for humans. Classic psychology experiments have tested human Bayesian reasoning through the use of word problems and have evaluated each participant’s performance against the normatively correct answer provided by Bayes’ theorem. The standard finding is of generally poor performance. Over the past two decades, though, progress has been made on how to improve Bayesian reasoning. Most notably, research has demonstrated that the use of frequencies in a natural sampling framework—as opposed to single-event probabilities—can improve participants’ Bayesian estimates. Furthermore, pictorial aids and certain individual difference factors also can play significant roles in Bayesian reasoning success. The mechanics of how to build tasks which show these improvements is not under much debate. The explanations for why naturally sampled frequencies and pictures help Bayesian reasoning remain hotly contested, however, with many researchers falling into ingrained “camps” organized around two dominant theoretical perspectives. The present paper evaluates the merits of these theoretical perspectives, including the weight of empirical evidence, theoretical coherence, and predictive power. By these criteria, the ecological rationality approach is clearly better than the heuristics and biases view. Progress in the study of Bayesian reasoning will depend on continued research that honestly, vigorously, and consistently engages across these different theoretical accounts rather than staying “siloed” within one particular perspective. The process of science requires an understanding of competing points of view, with the ultimate goal being integration. PMID:25873904
A Bayesian pick-the-winner design in a randomized phase II clinical trial.
Chen, Dung-Tsa; Huang, Po-Yu; Lin, Hui-Yi; Chiappori, Alberto A; Gabrilovich, Dmitry I; Haura, Eric B; Antonia, Scott J; Gray, Jhanelle E
2017-10-24
Many phase II clinical trials evaluate unique experimental drugs/combinations through multi-arm designs to expedite the screening process (early termination of ineffective drugs) and to identify the most effective drug (pick the winner) to warrant a phase III trial. Various statistical approaches have been developed for the pick-the-winner design but have been criticized for lack of objective comparison among the drug agents. We developed a Bayesian pick-the-winner design by integrating a Bayesian posterior probability with the Simon two-stage design in a randomized two-arm clinical trial. The Bayesian posterior probability, as the rule to pick the winner, is defined as the probability that the response rate in one arm is higher than in the other arm. The posterior probability aims to determine the winner when both arms pass the second stage of the Simon two-stage design. When both arms are competitive (i.e., both passing the second stage), the Bayesian posterior probability performs better than the Fisher exact test at correctly identifying the winner in the simulation study. In comparison to a standard two-arm randomized design, the Bayesian pick-the-winner design has higher power to determine a clear winner. In application to two studies, the approach is able to perform statistical comparison of the two treatment arms and provides a winner probability (Bayesian posterior probability) to statistically justify the winning arm. We developed an integrated design that combines the Bayesian posterior probability, the Simon two-stage design, and randomization in a unique setting. It gives objective comparisons between the arms to determine the winner.
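The winner rule, the posterior probability that one arm's response rate exceeds the other's, has a one-line Monte Carlo form under Beta posteriors. The counts below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def prob_superior(r_a, n_a, r_b, n_b, sims=100000):
    """Monte Carlo estimate of P(p_A > p_B | data) under Beta(1,1) priors."""
    pa = rng.beta(1 + r_a, 1 + n_a - r_a, sims)
    pb = rng.beta(1 + r_b, 1 + n_b - r_b, sims)
    return np.mean(pa > pb)

# Illustrative counts for two arms that both passed Simon's second stage
print(round(prob_superior(18, 40, 12, 40), 3))   # winner probability for arm A
```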
vonHoldt, Bridgett M; Stahler, Daniel R; Bangs, Edward E; Smith, Douglas W; Jimenez, Mike D; Mack, Curt M; Niemeyer, Carter C; Pollinger, John P; Wayne, Robert K
2010-10-01
The successful re-introduction of grey wolves to the western United States is an impressive accomplishment for conservation science. However, the degree to which subpopulations are genetically structured and connected, along with the preservation of genetic variation, is an important concern for the continued viability of the metapopulation. We analysed DNA samples from 555 Northern Rocky Mountain wolves from the three recovery areas (Greater Yellowstone Area, Montana, and Idaho), including all 66 re-introduced founders, for variation in 26 microsatellite loci over the initial 10-year recovery period (1995-2004). The population maintained high levels of variation (H_O = 0.64-0.72; allelic diversity k = 7.0-10.3) with low levels of inbreeding (F_IS < 0.03), and throughout this period the population expanded rapidly (n_1995 = 101; n_2004 = 846). Individual-based Bayesian analyses revealed significant population genetic structure and identified three subpopulations coinciding with designated recovery areas. Population assignment and migrant detection were difficult because of the presence of related founders among different recovery areas and required a novel approach to determine genetically effective migration and admixture. However, by combining assignment tests, private alleles, sibship reconstruction, and field observations, we detected genetically effective dispersal among the three recovery areas. Successful conservation of Northern Rocky Mountain wolves will rely on management decisions that promote natural dispersal dynamics and minimize anthropogenic factors that reduce genetic connectivity. © 2010 Blackwell Publishing Ltd.
Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.
Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu
2018-05-01
Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes, in a principled way, speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, which contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the-art diarization algorithms.
NASA Astrophysics Data System (ADS)
Zhang, Haibin; Johnson, Shannon B.; Flores, Vanessa R.; Vrijenhoek, Robert C.
2015-11-01
We describe a broad zone of intergradation between genetically differentiated northern and southern lineages of the hydrothermal vent tubeworm Tevnia jerichonana. DNA sequences from four genes, nuclear HSP and ATPsα, and mitochondrial COI and Cytb, were examined in samples from eastern Pacific vent localities between 13°N and 38°S latitude. Allelic frequencies at these loci exhibited concordant latitudinal clines, and genetic differentiation (pairwise Φ_ST values) increased with geographical distances between sample localities. Though this pattern of differentiation suggested isolation-by-distance (IBD), it appeared to result from hierarchical population structure. Genotypic assignment tests identified two population clusters comprising samples from the northern East Pacific Rise (NEPR: 9-13°N) and an extension of the Pacific-Antarctic Ridge (PAR: 31-32°S), with a zone of intergradation along the southern East Pacific Rise (SEPR: 7-17°S). The overall degrees of DNA sequence divergence between the NEPR and PAR populations were slight and not indicative of lengthy isolation. Bayesian assignment methods suggested that the SEPR populations constitute intergrades that connect the NEPR and PAR populations. Though it typically is difficult to distinguish between primary and secondary intergradation, our results were consistent with parallel studies of vent-restricted species that suggest a high degree of demographic instability along the superfast-spreading SEPR axis. Frequent local extinctions and immigration from NEPR and PAR refugia probably shaped the observed pattern of intergradation.
Krypotos, Angelos-Miltiadis; Klugkist, Irene; Engelhard, Iris M.
2017-01-01
Threat conditioning procedures have allowed the experimental investigation of the pathogenesis of Post-Traumatic Stress Disorder. The findings of these procedures have also provided stable foundations for the development of relevant intervention programs (e.g. exposure therapy). Statistical inference of threat conditioning procedures is commonly based on p-values and Null Hypothesis Significance Testing (NHST). Nowadays, however, there is a growing concern about this statistical approach, as many scientists point to the various limitations of p-values and NHST. As an alternative, the use of Bayes factors and Bayesian hypothesis testing has been suggested. In this article, we apply this statistical approach to threat conditioning data. In order to enable the easy computation of Bayes factors for threat conditioning data we present a new R package named condir, which can be used either via the R console or via a Shiny application. This article provides both a non-technical introduction to Bayesian analysis for researchers using the threat conditioning paradigm, and the necessary tools for computing Bayes factors easily. PMID:29038683
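condir itself computes default Bayes factors for conditioning contrasts (e.g. Bayesian t-tests); a rough sense of the quantity it reports can be had from the BIC approximation BF10 ≈ exp((BIC0 - BIC1)/2) for a two-group mean comparison. The sketch below uses simulated conditioned-response data and is an approximation for illustration, not condir's implementation.

```python
import numpy as np

def bf10_bic(x, y):
    """Approximate Bayes factor (alternative vs null) for a two-group mean
    difference, via the BIC approximation BF10 ≈ exp((BIC0 - BIC1) / 2)."""
    data = np.concatenate([x, y]); n = len(data)
    rss0 = np.sum((data - data.mean()) ** 2)                  # one common mean
    rss1 = np.sum((x - x.mean()) ** 2) + np.sum((y - y.mean()) ** 2)
    bic0 = n * np.log(rss0 / n) + 1 * np.log(n)
    bic1 = n * np.log(rss1 / n) + 2 * np.log(n)
    return np.exp((bic0 - bic1) / 2)

rng = np.random.default_rng(5)
cs_plus = rng.normal(0.6, 1.0, 40)    # simulated responses to the CS+
cs_minus = rng.normal(0.0, 1.0, 40)   # simulated responses to the CS-
print(round(bf10_bic(cs_plus, cs_minus), 2))
```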
Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing
2016-01-01
A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to obtain weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitive evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, the good SPs that have high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006
A Bayesian Assessment of Seismic Semi-Periodicity Forecasts
NASA Astrophysics Data System (ADS)
Nava, F.; Quinteros, C.; Glowacka, E.; Frez, J.
2016-01-01
Among the schemes for earthquake forecasting, the search for semi-periodicity in the occurrence of large earthquakes in a given seismogenic region plays an important role. When considering earthquake forecasts based on semi-periodic sequence identification, the Bayesian formalism is a useful tool for: (1) assessing how well a given earthquake satisfies a previously made forecast; (2) re-evaluating the semi-periodic sequence probability; and (3) testing other prior estimations of the sequence probability. A comparison of Bayesian estimates with updated estimates of semi-periodic sequences that incorporate new data not used in the original estimates shows extremely good agreement, indicating that: (1) the probability that a semi-periodic sequence is not due to chance is an appropriate prior sequence probability estimate; and (2) the Bayesian formalism does a very good job of estimating corrected semi-periodicity probabilities, using slightly less data than that used for updated estimates. The Bayesian approach is exemplified explicitly by its application to the Parkfield semi-periodic forecast, and results are given for its application to other forecasts in Japan and Venezuela.
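Point (2), re-evaluating the sequence probability, is a scalar application of Bayes' theorem: the prior probability that the semi-periodic sequence is real is updated on whether the forecast event occurred in its predicted window. A minimal sketch with invented probabilities:

```python
def update_sequence_probability(prior, p_hit_if_real, p_hit_by_chance, occurred):
    """Posterior probability that the semi-periodic sequence is real,
    after observing whether the forecast event occurred in its window."""
    if occurred:
        num, alt = prior * p_hit_if_real, (1 - prior) * p_hit_by_chance
    else:
        num, alt = prior * (1 - p_hit_if_real), (1 - prior) * (1 - p_hit_by_chance)
    return num / (num + alt)

# Illustrative numbers: sequence probability 0.8; the forecast window captures
# the event with probability 0.7 if the sequence is real, 0.2 by chance alone.
print(round(update_sequence_probability(0.8, 0.7, 0.2, occurred=True), 3))
```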
Vogt, Martin; Bajorath, Jürgen
2008-01-01
Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimations of seismic sources and structures in Northeast Asia. We use Bayesian inversion methods, which enable statistical estimation of models and their uncertainties based on the information in the data. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Reliable estimation of model parameters and their uncertainties is thus possible, avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data from the North Korean nuclear explosion tests. Through the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.
ERIC Educational Resources Information Center
Page, Robert; Satake, Eiki
2017-01-01
While interest in Bayesian statistics has been growing in statistics education, the treatment of the topic is still inadequate in both textbooks and the classroom. Because so many fields of study lead to careers that involve a decision-making process requiring an understanding of Bayesian methods, it is becoming increasingly clear that Bayesian…
A Bayesian truth serum for subjective data.
Prelec, Drazen
2004-10-15
Subjective judgments, an essential information source for science and policy, are problematic because there are no public criteria for assessing judgmental truthfulness. I present a scoring method for eliciting truthful subjective data in situations where objective truth is unknowable. The method assigns high scores not to the most common answers but to the answers that are more common than collectively predicted, with predictions drawn from the same population. This simple adjustment in the scoring criterion removes all bias in favor of consensus: Truthful answers maximize expected score even for respondents who believe that their answer represents a minority view.
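Prelec's scoring rule can be written compactly for a binary question: each respondent earns an information score, the log of their answer's actual frequency over its geometric-mean predicted frequency, plus a prediction score for how well they forecast the empirical answer distribution. The sketch below implements that formula in a simplified binary setting with invented responses.

```python
import numpy as np

def bts_scores(answers, predictions, alpha=1.0):
    """Bayesian truth serum for a binary question.
    answers: (n,) array of 0/1 endorsements; predictions: (n,) array of each
    respondent's predicted fraction of 'yes' answers in the population."""
    answers = np.asarray(answers)
    predictions = np.clip(predictions, 1e-6, 1 - 1e-6)
    xbar = np.clip([1 - answers.mean(), answers.mean()], 1e-6, 1 - 1e-6)  # actual frequencies
    log_ybar = np.array([np.mean(np.log(1 - predictions)),                # log geometric means
                         np.mean(np.log(predictions))])
    info = np.log(xbar[answers]) - log_ybar[answers]                      # information score
    pred_k = np.column_stack([1 - predictions, predictions])
    pred = alpha * (xbar * (np.log(pred_k) - np.log(xbar))).sum(axis=1)   # prediction score
    return info + pred

answers = [1, 1, 0, 1, 0, 1]                  # endorsements
predictions = [0.5, 0.6, 0.7, 0.4, 0.8, 0.5]  # predicted population 'yes' rates
print(np.round(bts_scores(answers, predictions), 3))
```

Answers that are more common than collectively predicted earn positive information scores, which is exactly the consensus-bias correction the abstract describes.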
Bayesian parameter estimation for chiral effective field theory
NASA Astrophysics Data System (ADS)
Wesolowski, Sarah; Furnstahl, Richard; Phillips, Daniel; Klco, Natalie
2016-09-01
The low-energy constants (LECs) of a chiral effective field theory (EFT) interaction in the two-body sector are fit to observable data using a Bayesian parameter estimation framework. By using Bayesian prior probability distributions (pdfs), we quantify relevant physical expectations such as LEC naturalness and include them in the parameter estimation procedure. The final result is a posterior pdf for the LECs, which can be used to propagate uncertainty resulting from the fit to data to the final observable predictions. The posterior pdf also allows an empirical test of operator redundancy and other features of the potential. We compare results of our framework with other fitting procedures, interpreting the underlying assumptions in Bayesian probabilistic language. We also compare results from fitting all partial waves of the interaction simultaneously to cross section data compared to fitting to extracted phase shifts, appropriately accounting for correlations in the data. Supported in part by the NSF and DOE.
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
DOE Office of Scientific and Technical Information (OSTI.GOV)
Warren, Harry P.; Byers, Jeff M.; Crump, Nicholas A.
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Cipoli, Daniel E; Martinez, Edson Z; Castro, Margaret de; Moreira, Ayrton C
2012-12-01
To estimate the pretest probability of Cushing's syndrome (CS) diagnosis by a Bayesian approach using intuitive clinical judgment. Physicians were requested, in seven endocrinology meetings, to answer three questions: "Based on your personal expertise, after obtaining clinical history and physical examination, without using laboratory tests, what is your probability of diagnosing Cushing's syndrome?"; "For how long have you been practicing endocrinology?"; and "Where do you work?". A Bayesian beta regression, implemented in the WinBUGS software, was employed. We obtained 294 questionnaires. The mean pretest probability of CS diagnosis was 51.6% (95%CI: 48.7-54.3). The probability was directly related to experience in endocrinology, but not to place of work. The pretest probability of CS diagnosis was estimated using a Bayesian methodology. Although pretest likelihood can be context-dependent, experience based on years of practice may help the practitioner to diagnose CS.
Bayesian analysis of CCDM models
NASA Astrophysics Data System (ADS)
Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of the Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow models to be compared on goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH0 model, can be discarded by the current analysis. Three other scenarios are discarded either because of poor fit or because of an excess of free parameters. A method of increasing Bayesian evidence through reparameterization, in order to reduce parameter degeneracy, is also developed.
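Two of the criteria used here have simple closed forms given the best-fit χ² and the parameter count: AIC = χ²_min + 2k and BIC = χ²_min + k ln N (Bayesian Evidence requires an integral over the parameter space and has no such shortcut). The numbers below are invented for illustration, not the paper's fits.

```python
import numpy as np

def aic(chi2_min, k):        # Akaike information criterion
    return chi2_min + 2 * k

def bic(chi2_min, k, n):     # Bayesian information criterion
    return chi2_min + k * np.log(n)

# Hypothetical fits to a supernova sample of n = 580 points (invented numbers)
n = 580
models = {"Model A": (562.2, 2), "Model B": (561.8, 2), "Model C": (563.0, 3)}
for name, (chi2, k) in models.items():
    print(f"{name:8s} AIC = {aic(chi2, k):.1f}  BIC = {bic(chi2, k, n):.1f}")
```

Lower values are preferred; the ln N term makes BIC penalize the extra parameter of Model C more harshly than AIC does.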
Refining value-at-risk estimates using a Bayesian Markov-switching GJR-GARCH copula-EVT model.
Sampid, Marius Galabe; Hasim, Haslifah M; Dai, Hongsheng
2018-01-01
In this paper, we propose a model for forecasting Value-at-Risk (VaR) using a Bayesian Markov-switching GJR-GARCH(1,1) model with skewed Student's t innovations, copula functions and extreme value theory (EVT). A Bayesian Markov-switching GJR-GARCH(1,1) model, which identifies non-constant volatility over time and allows the GARCH parameters to vary following a Markov process, is combined with copula functions and EVT to formulate the Bayesian Markov-switching GJR-GARCH(1,1) copula-EVT VaR model, which is then used to forecast the level of risk on financial asset returns. We further propose a new method for threshold selection in EVT analysis, which we term the hybrid method. Empirical and back-testing results show that the proposed VaR models capture VaR reasonably well in periods of calm and in periods of crisis.
Godde, Kanya
2017-01-01
The aim of this study is to examine how well different informative priors model age-at-death in Bayesian statistics, which will shed light on how the skeleton ages, particularly at the sacroiliac joint. Data from four samples were compared for their performance as informative priors for auricular surface age-at-death estimation: (1) the American population from US Census data; (2) county data from the US Census; (3) a local cemetery; and (4) a skeletal collection. The skeletal collection and cemetery are located within the county that was sampled. A Gompertz model was applied to compare survivorship across the four samples. Transition analysis parameters, coupled with the generated Gompertz parameters, were input into Bayes' theorem to generate highest posterior density ranges from posterior density functions. Transition analysis describes the age at which an individual transitions from one age phase to another. The result is age ranges that should describe the chronological age of 90% of the individuals who fall in a particular phase. Cumulative binomial tests indicate the method performed below the expected 90% level at capturing chronological age as assigned to a biological phase, despite wide age ranges at older ages. The samples performed similarly overall, despite small differences in survivorship. Collectively, these results show that as we age, the senescence pattern becomes more variable. More local samples performed better at describing the aging process than more general samples, which implies that practitioners need to consider sample selection when using the literature to diagnose and work with patients with sacroiliac joint pain. PMID:29546217
Gaubert, Philippe; Patel, Riddhi P; Veron, Géraldine; Goodman, Steven M; Willsch, Maraike; Vasconcelos, Raquel; Lourenço, André; Sigaud, Marie; Justy, Fabienne; Joshi, Bheem Dutt; Fickel, Jörns; Wilting, Andreas
2017-05-01
The biogeographic dynamics affecting the Indian subcontinent, East and Southeast Asia during the Plio-Pleistocene has generated complex biodiversity patterns. We assessed the molecular biogeography of the small Indian civet (Viverricula indica) through mitogenome and cytochrome b + control region sequencing of 89 historical and modern samples to (1) establish a time-calibrated phylogeography across the species' native range and (2) test introduction scenarios to western Indian Ocean islands. Bayesian phylogenetic analyses identified 3 geographic lineages (East Asia, sister-group to Southeast Asia and the Indian subcontinent + northern Indochina) diverging 3.2-2.3 million years ago (Mya), with no clear signature of past demographic expansion. Within Southeast Asia, Balinese populations separated from the rest 2.6-1.3 Mya. Western Indian Ocean populations were assigned to the Indian subcontinent + northern Indochina lineage and had the lowest mitochondrial diversity. Approximate Bayesian computation did not distinguish between single versus multiple introduction scenarios. The early diversification of the small Indian civet was likely shaped by humid periods in the Late Pliocene-Early Pleistocene that created evergreen rainforest barriers, generating areas of intra-specific endemism in the Indian subcontinent, East, and Southeast Asia. Later, Pleistocene dispersals through drier conditions in South and Southeast Asia were likely, giving rise to the species' current natural distribution. Our molecular data supported the delineation of only 4 subspecies in V. indica, including an endemic Balinese lineage. Our study also highlighted the influence of pre-first-millennium AD introductions to western Indian Ocean islands, with Indian and/or Arab traders probably introducing the species for its civet oil. © The American Genetic Association 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Adaptive sequential Bayesian classification using Page's test
NASA Astrophysics Data System (ADS)
Lynch, Robert S., Jr.; Willett, Peter K.
2002-03-01
In this paper, the previously introduced Mean-Field Bayesian Data Reduction Algorithm is extended for adaptive sequential hypothesis testing utilizing Page's test. In general, Page's test is well understood as a method of detecting a permanent change in distribution associated with a sequence of observations. However, the relationship between detecting a change in distribution utilizing Page's test and classification with feature fusion is not well understood. Thus, the contribution of this work is the development of a method of classifying an unlabeled vector of fused features (i.e., detecting a change to an active statistical state) as quickly as possible given an acceptable mean time between false alerts. In this case, the developed classification test can be thought of as equivalent to performing a sequential probability ratio test repeatedly until a class is decided, with the lower log-threshold of each test being set to zero and the upper log-threshold being determined by the expected distance between false alerts. It is of interest to estimate the delay (or related stopping time) to a classification decision (the number of time samples it takes to classify the target), and the mean time between false alerts, as a function of feature selection and fusion by the Mean-Field Bayesian Data Reduction Algorithm. Results are demonstrated by plotting the delay to declaring the target class versus the mean time between false alerts, and are shown using both different numbers of simulated training data and different numbers of relevant features for each class.
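Page's test itself is a CUSUM of log-likelihood ratios, reset at zero and compared against an upper threshold h chosen from the acceptable mean time between false alerts. The sketch below is a minimal version for a Gaussian mean shift with invented parameters, not the Mean-Field Bayesian Data Reduction Algorithm itself.

```python
import numpy as np

def page_test(samples, mu0=0.0, mu1=1.0, sigma=1.0, h=5.0):
    """Page's CUSUM test for a shift from N(mu0, sigma) to N(mu1, sigma).
    Returns the sample index at which the statistic first crosses h, or None."""
    s = 0.0
    for i, x in enumerate(samples):
        llr = (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2  # log-likelihood ratio
        s = max(0.0, s + llr)        # reflect at zero, per Page's recursion
        if s >= h:                   # h trades detection delay vs false-alert rate
            return i
    return None

rng = np.random.default_rng(6)
data = np.concatenate([rng.normal(0, 1, 200), rng.normal(1, 1, 50)])
print("alarm at sample:", page_test(data))    # the change actually occurs at 200
```

Raising h lengthens the mean time between false alerts at the cost of a longer detection delay, which is exactly the trade-off plotted in the paper.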
Bayesian modelling of lung function data from multiple-breath washout tests.
Mahar, Robert K; Carlin, John B; Ranganathan, Sarath; Ponsonby, Anne-Louise; Vuillermin, Peter; Vukcevic, Damjan
2018-05-30
Paediatric respiratory researchers have widely adopted the multiple-breath washout (MBW) test because it allows assessment of lung function in unsedated infants and is well suited to longitudinal studies of lung development and disease. However, a substantial proportion of MBW tests in infants fail current acceptability criteria. We hypothesised that a model-based approach to analysing the data, in place of traditional simple empirical summaries, would enable more efficient use of these tests. We therefore developed a novel statistical model for infant MBW data and applied it to 1197 tests from 432 individuals from a large birth cohort study. We focus on Bayesian estimation of the lung clearance index, the most commonly used summary of lung function from MBW tests. Our results show that the model provides an excellent fit to the data and shed further light on statistical properties of the standard empirical approach. Furthermore, the modelling approach enables the lung clearance index to be estimated by using tests with different degrees of completeness, something not possible with the standard approach. Our model therefore allows previously unused data to be used rather than discarded, as well as routine use of shorter tests without significant loss of precision. Beyond our specific application, our work illustrates a number of important aspects of Bayesian modelling in practice, such as the importance of hierarchical specifications to account for repeated measurements and the value of model checking via posterior predictive distributions. Copyright © 2018 John Wiley & Sons, Ltd.
Zhang, Jingyang; Chaloner, Kathryn; McLinden, James H.; Stapleton, Jack T.
2013-01-01
Reconciling two quantitative ELISA tests for an antibody to an RNA virus, in a situation without a gold standard and where false negatives may occur, is the motivation for this work. False negatives occur when access of the antibody to the binding site is blocked. Based on the mechanism of the assay, a mixture of four bivariate normal distributions is proposed with the mixture probabilities depending on a two-stage latent variable model including the prevalence of the antibody in the population and the probabilities of blocking on each test. There is prior information on the prevalence of the antibody, and also on the probability of false negatives, and so a Bayesian analysis is used. The dependence between the two tests is modeled to be consistent with the biological mechanism. Bayesian decision theory is utilized for classification. The proposed method is applied to the motivating data set to classify the data into two groups: those with and those without the antibody. Simulation studies describe the properties of the estimation and the classification. Sensitivity to the choice of the prior distribution is also addressed by simulation. The same model with two levels of latent variables is applicable in other testing procedures such as quantitative polymerase chain reaction tests where false negatives occur when there is a mutation in the primer sequence. PMID:23592433
Bayes in biological anthropology.
Konigsberg, Lyle W; Frankenberg, Susan R
2013-12-01
In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as a primer on Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end, it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.
Gottschling, Marc; Soehner, Sylvia; Zinssmeister, Carmen; John, Uwe; Plötner, Jörg; Schweikert, Michael; Aligizaki, Katerina; Elbrächter, Malte
2012-01-01
The phylogenetic relationships of the Dinophyceae (Alveolata) are not sufficiently resolved at present. The Thoracosphaeraceae (Peridiniales) are the only group of the Alveolata that include members with calcareous coccoid stages; this trait is considered apomorphic. Although the coccoid stage apparently is not calcareous, Bysmatrum has been assigned to the Thoracosphaeraceae based on thecal morphology. We tested the monophyly of the Thoracosphaeraceae using large sets of ribosomal RNA sequence data of the Alveolata including the Dinophyceae. Phylogenetic analyses were performed using Maximum Likelihood and Bayesian approaches. The Thoracosphaeraceae were monophyletic, but included also a number of non-calcareous dinophytes (such as Pentapharsodinium and Pfiesteria) and even parasites (such as Duboscquodinium and Tintinnophagus). Bysmatrum had an isolated and uncertain phylogenetic position outside the Thoracosphaeraceae. The phylogenetic relationships among calcareous dinophytes appear complex, and the assumption of the single origin of the potential to produce calcareous structures is challenged. The application of concatenated ribosomal RNA sequence data may prove promising for phylogenetic reconstructions of the Dinophyceae in future. Copyright © 2011 Elsevier GmbH. All rights reserved.
NASA Astrophysics Data System (ADS)
Zonta, Daniele; Pozzi, Matteo; Wu, Huayong; Inaudi, Daniele
2008-03-01
This paper introduces a concept of smart structural elements for the real-time condition monitoring of bridges. These are prefabricated reinforced concrete elements embedding a permanent sensing system and capable of self-diagnosis when in operation. The real-time assessment is automatically controlled by a numerical algorithm founded on Bayesian logic: the method assigns a probability to each possible damage scenario, and estimates the statistical distribution of the damage parameters involved (such as location and extent). To verify the effectiveness of the technology, we produced and tested in the laboratory a reduced-scale smart beam prototype. The specimen is 3.8 m long with a 0.3 m by 0.5 m cross-section, and was prestressed using a Dywidag bar in such a way as to control the preload level. The sensor system includes a multiplexed version of SOFO interferometric sensors mounted on a composite bar, along with a number of traditional metal-foil strain gauges. The method allowed clear recognition of increasing fault states, simulated on the beam by gradually reducing the prestress level.
Genetic structure of cougar populations across the Wyoming basin: Metapopulation or megapopulation
Anderson, C.R.; Lindzey, F.G.; McDonald, D.B.
2004-01-01
We examined the genetic structure of 5 Wyoming cougar (Puma concolor) populations surrounding the Wyoming Basin, as well as a population from southwestern Colorado. Using 9 microsatellite DNA loci, observed heterozygosity was similar among populations (H_O = 0.49-0.59) and intermediate to that of other large carnivores. Estimates of genetic structure (F_ST = 0.028, R_ST = 0.029) and of the number of migrants per generation (Nm) suggested high gene flow. Nm was lowest between distant populations and highest among adjacent populations. Examination of these data, plus Mantel test results for genetic versus geographic distance (P ≤ 0.01), suggested both isolation by distance and an effect of the habitat matrix. Bayesian assignment to population based on individual genotypes showed that cougars in this region were best described as a single panmictic population. Total effective population size for cougars in this region ranged from 1,797 to 4,532 depending on the mutation model and analytical method used. Based on measures of gene flow, extinction risk in the near future appears low. We found no support for the existence of metapopulation structure among cougars in this region.
NASA Astrophysics Data System (ADS)
Mai, Ana C. G.; Miño, Carolina I.; Marins, Luis F. F.; Monteiro-Neto, Cassiano; Miranda, Laura; Schwingel, Paulo R.; Lemos, Valéria M.; Gonzalez-Castro, Mariano; Castello, Jorge P.; Vieira, João P.
2014-08-01
The mullet Mugil liza is distributed along the Atlantic coast of South America, from Argentina to Venezuela, and it is heavily exploited in Brazil. We assessed patterns of distribution of neutral nuclear genetic variation in 250 samples from the Brazilian states of Rio de Janeiro, São Paulo, Santa Catarina and Rio Grande do Sul (latitudinal range of 23-31°S) and from Buenos Aires Province in Argentina (36°S). Nine microsatellite loci revealed 131 total alleles, 3-23 alleles per locus, H_E = 0.69 and H_O = 0.67. Significant genetic differentiation was observed between Rio de Janeiro samples (23°S) and those from all other locations, as indicated by F_ST, hierarchical analyses of genetic structure, Bayesian cluster analyses and assignment tests. The presence of two different demographic clusters better explains the allelic diversity observed in mullets from the southernmost portion of the Atlantic coast of Brazil and from Argentina. This may be taken into account when designing fisheries management plans involving Brazilian, Uruguayan and Argentinean M. liza populations.
A Smartphone-Based Driver Safety Monitoring System Using Data Fusion
Lee, Boon-Giin; Chung, Wan-Young
2012-01-01
This paper proposes a method for monitoring driver safety levels using a data fusion approach based on several discrete data types: eye features, bio-signal variation, in-vehicle temperature, and vehicle speed. The driver safety monitoring system was developed in practice in the form of an application for an Android-based smartphone device, where measuring safety-related data requires no extra monetary expenditure or equipment. Moreover, the system provides high resolution and flexibility. The safety monitoring process involves the fusion of attributes gathered from different sensors, including video, electrocardiography, photoplethysmography, temperature, and a three-axis accelerometer, that are assigned as input variables to an inference analysis framework. A Fuzzy Bayesian framework is designed to indicate the driver's capability level and is updated continuously in real time. The sensory data are transmitted via Bluetooth communication to the smartphone device. A fake incoming call warning service alerts the driver if his or her safety level is suspiciously compromised. Realistic testing of the system demonstrates the practical benefits of multiple features and their fusion in providing more authentic and effective driver safety monitoring. PMID:23247416
TANDI: threat assessment of network data and information
NASA Astrophysics Data System (ADS)
Holsopple, Jared; Yang, Shanchieh Jay; Sudit, Moises
2006-04-01
Current practice for combating cyber attacks typically uses Intrusion Detection Sensors (IDSs) to passively detect and block multi-stage attacks. This work leverages Level-2 fusion that correlates IDS alerts belonging to the same attacker, and proposes a threat assessment algorithm to predict potential future attacker actions. The algorithm, TANDI, reduces the problem complexity by separating the models of the attacker's capability and opportunity, and fuses the two to determine the attacker's intent. Unlike traditional Bayesian-based approaches, which require assigning a large number of edge probabilities, the proposed Level-3 fusion procedure uses only 4 parameters. TANDI has been implemented and tested with randomly created attack sequences. The results demonstrate that TANDI predicts future attack actions accurately as long as the attack is not part of a coordinated attack and contains no insider threats. In the presence of abnormal attack events, TANDI will alarm the network analyst for further analysis. The attempt to evaluate a threat assessment algorithm via simulation is the first in the literature, and shall open up a new avenue in the area of high level fusion.
Mushet, David M.; Euliss, Ned H.; Chen, Yongjiu; Stockwell, Craig A.
2013-01-01
In contrast to most local amphibian populations, northeastern populations of the Northern Leopard Frog (Lithobates pipiens) have displayed uncharacteristically high levels of genetic diversity that have been attributed to large, stable populations. However, this widely distributed species also occurs in areas known for great climatic fluctuations that should be reflected in corresponding fluctuations in population sizes and reduced genetic diversity. To test our hypothesis that Northern Leopard Frog genetic diversity would be reduced in areas subjected to significant climate variability, we examined the genetic diversity of L. pipiens collected from 12 sites within the Prairie Pothole Region of North Dakota. Despite the region's fluctuating climate that includes periods of recurring drought and deluge, we found unexpectedly high levels of genetic diversity approaching that of northeastern populations. Further, genetic structure at a landscape scale was strikingly homogeneous; genetic differentiation estimates (Dest) averaged 0.10 (SD = 0.036) across the six microsatellite loci we studied, and two Bayesian assignment tests (STRUCTURE and BAPS) failed to reveal the development of significant population structure across the 68 km breadth of our study area. These results suggest that L. pipiens in the Prairie Pothole Region consists of a large, panmictic population capable of maintaining high genetic diversity in the face of marked climate variability.
Bayesian non-parametric inference for stochastic epidemic models using Gaussian Processes.
Xu, Xiaoguang; Kypraios, Theodore; O'Neill, Philip D
2016-10-01
This paper considers novel Bayesian non-parametric methods for stochastic epidemic models. Many standard modeling and data analysis methods use underlying assumptions (e.g. concerning the rate at which new cases of disease will occur) which are rarely challenged or tested in practice. To relax these assumptions, we develop a Bayesian non-parametric approach using Gaussian Processes, specifically to estimate the infection process. The methods are illustrated with both simulated and real data sets, the former illustrating that the methods can recover the true infection process quite well in practice, and the latter illustrating that the methods can be successfully applied in different settings. © The Author 2016. Published by Oxford University Press.
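As a hedged sketch of the general idea, the toy below places a Gaussian-process prior over a time-varying rate and fits it to synthetic case counts with scikit-learn. The paper's method uses a proper epidemic-model likelihood with MCMC; this stand-in uses a Gaussian likelihood, and the data and kernel settings are assumptions made for the example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

np.random.seed(0)
t = np.linspace(0, 30, 31).reshape(-1, 1)               # days since outbreak start
true_rate = 5 * np.exp(-0.5 * ((t - 12) / 5) ** 2)      # assumed bell-shaped rate
y = np.random.poisson(true_rate).ravel().astype(float)  # synthetic daily counts

# GP prior over the latent infection rate; WhiteKernel absorbs observation noise.
kernel = 1.0 * RBF(length_scale=5.0) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(t, y)

rate_mean, rate_sd = gp.predict(t, return_std=True)     # smoothed rate and uncertainty
```

The appeal of the non-parametric prior is visible even in this toy: no functional form for the rate is assumed in advance, only its smoothness.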
Cross-view gait recognition using joint Bayesian
NASA Astrophysics Data System (ADS)
Li, Chao; Sun, Shouqian; Chen, Xiaoyu; Min, Xin
2017-07-01
Human gait, as a soft biometric, helps to recognize people by walking. To further improve the recognition performance under cross-view conditions, we propose using Joint Bayesian analysis to model the view variance. We evaluated our proposed method on the OU-ISIR Large Population (OULP) dataset, which makes our results statistically reliable. As a result, we confirmed that our proposed method significantly outperforms state-of-the-art approaches for both identification and verification tasks. Finally, a sensitivity analysis on the number of training subjects was conducted; we find that Joint Bayesian achieves competitive results even with a small subset of training subjects (100 subjects). For further comparison, experimental results, learning models, and test codes are available.
NASA Astrophysics Data System (ADS)
Kiyan, Duygu; Rath, Volker; Delhaye, Robert
2017-04-01
The frequency- and time-domain airborne electromagnetic (AEM) data collected under the Tellus projects of the Geological Survey of Ireland (GSI) represent a wealth of information on the multi-dimensional electrical structure of Ireland's near-surface. Our project, funded by GSI under the framework of their Short Call Research Programme, aims to develop and implement inverse techniques based on various Bayesian methods for these densely sampled data. We have developed a highly flexible toolbox in the Python language for the one-dimensional inversion of AEM data along the flight lines. The computational core is an adapted frequency- and time-domain forward modelling engine derived from the well-tested open-source code AirBeo, which was developed by CSIRO (Australia) and the AMIRA consortium. Three different inversion methods have been implemented: (i) Tikhonov-type inversion including optimal regularisation methods (Aster et al., 2012; Zhdanov, 2015), (ii) Bayesian MAP inversion in parameter and data space (e.g. Tarantola, 2005), and (iii) full Bayesian inversion with Markov chain Monte Carlo (Sambridge and Mosegaard, 2002; Mosegaard and Sambridge, 2002), all including different forms of spatial constraints. The methods have been tested on synthetic and field data. This contribution will introduce the toolbox and present case studies on the AEM data from the Tellus projects.
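To make method (i) concrete, here is a minimal, hedged sketch of Tikhonov-type inversion for a linearized problem. The operator G, the first-difference smoother, and all data below are illustrative assumptions; the actual toolbox wraps AirBeo's nonlinear AEM forward modelling.

```python
import numpy as np

def tikhonov_invert(G, d, lam):
    """Solve min ||G m - d||^2 + lam ||L m||^2 with a roughening matrix L."""
    n = G.shape[1]
    L = np.eye(n) - np.eye(n, k=1)        # first-difference smoothness penalty
    A = G.T @ G + lam * (L.T @ L)
    return np.linalg.solve(A, G.T @ d)

# Illustrative linear forward operator and noisy synthetic data.
rng = np.random.default_rng(0)
G = rng.normal(size=(40, 20))
m_true = np.cumsum(rng.normal(size=20))    # a smooth "conductivity" profile
d = G @ m_true + 0.1 * rng.normal(size=40)

m_est = tikhonov_invert(G, d, lam=1.0)     # lam trades data fit against smoothness
```

Choosing lam is where the "optimal regularisation methods" cited above come in; the Bayesian methods (ii) and (iii) replace this single trade-off parameter with explicit priors.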
Bayesian adaptive phase II screening design for combination trials
Cai, Chunyan; Yuan, Ying; Johnson, Valen E
2013-01-01
Background Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Methods Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Results Simulation studies show that the proposed design substantially outperforms the conventional multiarm balanced factorial trial design. The proposed design yields a significantly higher probability of selecting the best treatment while allocating substantially more patients to efficacious treatments. Limitations The proposed design is most appropriate for trials that combine multiple agents and screen for efficacious combinations to be investigated further. Conclusions The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while providing higher power to identify the best treatment at the end of the trial. PMID:23359875
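A hedged toy version of posterior-guided allocation, with binary responses and Beta priors (a Thompson-sampling flavour): the actual design allocates via posterior probabilities of its formal superiority hypotheses, so treat this only as the general mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)
successes = np.array([3, 5, 2])    # illustrative responses per combination
failures  = np.array([7, 5, 8])

# Posterior draws of each combination's response rate under Beta(1, 1) priors.
draws = rng.beta(1 + successes, 1 + failures, size=(10_000, 3))

# P(each arm is best) = fraction of joint posterior draws in which it wins.
p_best = np.bincount(draws.argmax(axis=1), minlength=3) / draws.shape[0]

next_arm = rng.choice(3, p=p_best)  # randomize the next patient by P(arm is best)
print(p_best, "-> allocate patient to arm", next_arm)
```

The same posterior quantities that drive allocation can also be used at the end of the trial to declare a winning combination.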
Evaluating impacts using a BACI design, ratios, and a Bayesian approach with a focus on restoration.
Conner, Mary M; Saunders, W Carl; Bouwes, Nicolaas; Jordan, Chris
2015-10-01
Before-after-control-impact (BACI) designs are an effective method to evaluate natural and human-induced perturbations on ecological variables when treatment sites cannot be randomly chosen. While effect sizes of interest can be tested with frequentist methods, Bayesian Markov chain Monte Carlo (MCMC) sampling methods allow probabilities of effect sizes, such as a ≥20% increase in density after restoration, to be estimated directly. Although BACI and Bayesian methods are widely used for assessing natural and human-induced impacts in field experiments, the application of hierarchical Bayesian modeling with MCMC sampling to BACI designs is less common. Here, we combine these approaches and extend the typical presentation of results with an easy-to-interpret ratio, which provides an answer to the main study question: "How much impact did a management action or natural perturbation have?" As an example of this approach, we evaluate the impact of a restoration project, which implemented beaver dam analogs, on survival and density of juvenile steelhead. Results indicated the probabilities of a ≥30% increase were high for survival and density after the dams were installed, 0.88 and 0.99, respectively, while probabilities for a higher increase of ≥50% were variable, 0.17 and 0.82, respectively. This approach demonstrates a useful extension of Bayesian methods that can easily be generalized to other study designs, from simple (e.g., single-factor ANOVA, paired t test) to more complicated block designs (e.g., crossover, split-plot). This approach is valuable for estimating the probabilities of restoration impacts or other management actions.
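The reported probabilities are simple posterior tail fractions of the after/before ratio. A minimal sketch, assuming we already have MCMC draws of pre- and post-restoration density (the lognormal draws below are placeholders for real MCMC output from the hierarchical model):

```python
import numpy as np

np.random.seed(0)
# Stand-ins for paired MCMC draws of density before and after restoration.
density_before = np.random.lognormal(mean=1.0, sigma=0.2, size=20_000)
density_after  = np.random.lognormal(mean=1.3, sigma=0.2, size=20_000)

# The easy-to-interpret ratio: posterior of (after / before).
ratio = density_after / density_before
print("P(increase >= 30%):", np.mean(ratio >= 1.3))
print("P(increase >= 50%):", np.mean(ratio >= 1.5))
```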
Bohling, Justin H; Waits, Lisette P
2011-05-01
Predicting spatial patterns of hybridization is important for evolutionary and conservation biology, yet it is hampered by poor understanding of how hybridizing species interact. This is especially pertinent in contact zones where hybridizing populations are sympatric. In this study, we examined the extent of red wolf (Canis rufus) colonization and introgression where the species contacts a coyote (C. latrans) population in North Carolina, USA. We surveyed 22,000 km2 in the winter of 2008 for scat and identified individual canids through genetic analysis. Of 614 collected scats, 250 were assigned to canids by mitochondrial DNA (mtDNA) sequencing. Canid samples were genotyped at 6-17 microsatellite loci (nDNA) and assigned to species using three admixture criteria implemented in two Bayesian clustering programs. We genotyped 82 individuals but none were identified as red wolves. Two individuals had red wolf mtDNA but no significant red wolf nDNA ancestry. One individual possessed significant red wolf nDNA ancestry (approximately 30%) using all criteria, although seven other individuals showed evidence of red wolf ancestry (11-21%) using the relaxed criterion. Overall, seven individuals were classified as hybrids using the conservative criteria and 37 using the relaxed criterion. We found evidence of dog (C. familiaris) and gray wolf (C. lupus) introgression into the coyote population. We compared the performance of the different methods and criteria by analyzing known red wolves and hybrids. These results suggest that red wolf colonization and introgression in North Carolina are minimal and provide insights into the utility of Bayesian clustering methods to detect hybridization. © 2011 Blackwell Publishing Ltd.
Kimble, Steven J. A.; Rhodes Jr., O. E.; Williams, Rod N.
2014-01-01
Rangewide studies of genetic parameters can elucidate patterns and processes that operate only over large geographic scales. Herein, we present a rangewide population genetic assessment of the eastern box turtle Terrapene c. carolina, a species that is in steep decline across its range. To inform conservation planning for this species, we address the hypothesis that disruptions to demographic and movement parameters associated with the decline of the eastern box turtle have resulted in distinctive genetic signatures in the form of low genetic diversity, high population structuring, and decreased gene flow. We used microsatellite genotype data from 799 individuals from across the species range to perform two Bayesian population assignment approaches, two methods for comparing historical and contemporary migration among populations, an evaluation of isolation by distance, and a method for detecting barriers to gene flow. Both Bayesian methods of population assignment indicated that there are two populations rangewide, both of which have maintained high levels of genetic diversity (HO = 0.756). Evidence of isolation by distance was detected in this species at a spatial scale of 300-500 km, and the Appalachian Mountains were identified as the primary barrier to gene flow across the species range. We also found evidence for historical but not contemporary migration between populations. Our prediction of many, highly structured populations across the range was not supported. This may point to cryptic contemporary gene flow, which might in turn be explained by the presence of rare transients in populations. However, these data may be influenced by historical signatures of genetic connectivity because individuals of this species can be long-lived. PMID:24647580
Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan
2015-01-01
Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes, which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to, or in many cases superior to, the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.
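For context, the conventional f2 similarity factor that the F2 parameter extends can be computed as follows (this is the standard formula; the two profiles are made-up numbers):

```python
import numpy as np

def f2(reference, test):
    """f2 similarity factor for two dissolution profiles
    (percent dissolved at matched time points)."""
    reference, test = np.asarray(reference), np.asarray(test)
    msd = np.mean((reference - test) ** 2)      # mean squared difference
    return 50.0 * np.log10(100.0 / np.sqrt(1.0 + msd))

ref  = [22, 41, 60, 75, 85, 91]   # illustrative profiles
test = [20, 38, 57, 73, 84, 90]
print(round(f2(ref, test), 1))    # f2 >= 50 is conventionally read as "similar"
```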
Boehm, Udo; Steingroever, Helen; Wagenmakers, Eric-Jan
2018-06-01
Quantitative models that represent different cognitive variables in terms of model parameters are an important tool in the advancement of cognitive science. To evaluate such models, their parameters are typically tested for relationships with behavioral and physiological variables that are thought to reflect specific cognitive processes. However, many models do not come equipped with the statistical framework needed to relate model parameters to covariates. Instead, researchers often revert to classifying participants into groups depending on their values on the covariates, and subsequently comparing the estimated model parameters between these groups. Here we develop a comprehensive solution to the covariate problem in the form of a Bayesian regression framework. Our framework can be easily added to existing cognitive models and allows researchers to quantify the evidential support for relationships between covariates and model parameters using Bayes factors. Moreover, we present a simulation study that demonstrates the superiority of the Bayesian regression framework to the conventional classification-based approach.
Bayesian convolutional neural network based MRI brain extraction on nonhuman primates.
Zhao, Gengyan; Liu, Fang; Oler, Jonathan A; Meyerand, Mary E; Kalin, Ned H; Birn, Rasmus M
2018-07-15
Brain extraction or skull stripping of magnetic resonance images (MRI) is an essential step in neuroimaging studies, the accuracy of which can severely affect subsequent image processing procedures. Current automatic brain extraction methods demonstrate good results on human brains, but are often far from satisfactory on nonhuman primates, which are a necessary part of neuroscience research. To overcome the challenges of brain extraction in nonhuman primates, we propose a fully-automated brain extraction pipeline combining deep Bayesian convolutional neural network (CNN) and fully connected three-dimensional (3D) conditional random field (CRF). The deep Bayesian CNN, Bayesian SegNet, is used as the core segmentation engine. As a probabilistic network, it is not only able to perform accurate high-resolution pixel-wise brain segmentation, but also capable of measuring the model uncertainty by Monte Carlo sampling with dropout in the testing stage. Then, fully connected 3D CRF is used to refine the probability result from Bayesian SegNet in the whole 3D context of the brain volume. The proposed method was evaluated with a manually brain-extracted dataset comprising T1w images of 100 nonhuman primates. Our method outperforms six popular publicly available brain extraction packages and three well-established deep learning based methods with a mean Dice coefficient of 0.985 and a mean average symmetric surface distance of 0.220 mm. A better performance against all the compared methods was verified by statistical tests (all p-values < 10^-4, two-sided, Bonferroni corrected). The maximum uncertainty of the model on nonhuman primate brain extraction has a mean value of 0.116 across all the 100 subjects. The behavior of the uncertainty was also studied, which shows the uncertainty increases as the training set size decreases, the number of inconsistent labels in the training set increases, or the inconsistency between the training set and the testing set increases. Copyright © 2018 Elsevier Inc. All rights reserved.
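A minimal sketch of the Monte Carlo dropout idea used here to measure model uncertainty (illustrative PyTorch; the paper's engine is a full Bayesian SegNet encoder-decoder, whereas the tiny network below is an assumption made for brevity):

```python
import torch
import torch.nn as nn

# Tiny stand-in segmentation network with a dropout layer.
model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.Dropout2d(0.5),
                      nn.Conv2d(8, 1, 3, padding=1))

def mc_dropout_predict(model, x, n_samples=20):
    """Keep dropout stochastic at test time and average repeated passes."""
    model.train()                      # train mode keeps Dropout2d active
    with torch.no_grad():
        samples = torch.stack([torch.sigmoid(model(x))
                               for _ in range(n_samples)])
    return samples.mean(0), samples.var(0)   # mean mask, per-pixel uncertainty

mean_mask, uncertainty = mc_dropout_predict(model, torch.randn(1, 1, 64, 64))
```

The variance map plays the role of the per-voxel uncertainty whose behavior the abstract analyzes as a function of training-set size and label consistency.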
Bayesian meta-analysis of Cronbach's coefficient alpha to evaluate informative hypotheses.
Okada, Kensuke
2015-12-01
This paper proposes a new method to evaluate informative hypotheses for meta-analysis of Cronbach's coefficient alpha using a Bayesian approach. The coefficient alpha is one of the most widely used reliability indices. In meta-analyses of reliability, researchers typically form specific informative hypotheses beforehand, such as 'alpha of this test is greater than 0.8' or 'alpha of one form of a test is greater than the others.' The proposed method enables direct evaluation of these informative hypotheses. To this end, a Bayes factor is calculated to evaluate the informative hypothesis against its complement. It allows researchers to summarize the evidence provided by previous studies in favor of their informative hypothesis. The proposed approach can be seen as a natural extension of the Bayesian meta-analysis of coefficient alpha recently proposed in this journal (Brannick and Zhang, 2013). The proposed method is illustrated through two meta-analyses of real data that evaluate different kinds of informative hypotheses on superpopulation: one is that alpha of a particular test is above the criterion value, and the other is that alphas among different test versions have ordered relationships. Informative hypotheses are supported from the data in both cases, suggesting that the proposed approach is promising for application. Copyright © 2015 John Wiley & Sons, Ltd.
Validating module network learning algorithms using simulated data.
Michoel, Tom; Maere, Steven; Bonnet, Eric; Joshi, Anagha; Saeys, Yvan; Van den Bulcke, Tim; Van Leemput, Koenraad; van Remortel, Piet; Kuiper, Martin; Marchal, Kathleen; Van de Peer, Yves
2007-05-03
In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Despite the demonstrated success of such algorithms in uncovering biologically relevant regulatory relations, further developments in the area are hampered by a lack of tools to compare the performance of alternative module network learning strategies. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. We show that data simulators such as SynTReN are very well suited for the purpose of developing, testing and improving module network algorithms. We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.
Farid, Ahmed; Abdel-Aty, Mohamed; Lee, Jaeyoung; Eluru, Naveen
2017-09-01
Safety performance functions (SPFs) are essential tools for highway agencies to predict crashes, identify hotspots and assess safety countermeasures. In the Highway Safety Manual (HSM), a variety of SPFs are provided for different types of roadway facilities, crash types and severity levels. Agencies lacking the necessary resources to develop their own localized SPFs may opt to apply the HSM's SPFs to their jurisdictions. Yet, municipalities that want to develop and maintain their regional SPFs might encounter the issue of small sample bias. Bayesian inference addresses this issue by combining the current data with prior information to achieve reliable results. It follows that the essence of Bayesian statistics is the application of informative priors, obtained from other SPFs or experts' experience. In this study, we investigate the applicability of informative priors for Bayesian negative binomial SPFs for rural divided multilane highway segments in Florida and California. An SPF with non-informative priors is developed for each state and its parameters' distributions are assigned to the other state's SPF as informative priors. The performance of the SPFs is evaluated by applying each state's SPFs to the other state. The analysis is conducted for both total (KABCO) and severe (KAB) crashes. As per the results, applying one state's SPF with informative priors, which are the other state's SPF independent variable estimates, to the latter state's conditions yields better goodness of fit (GOF) values than applying the former state's SPF with non-informative priors to the conditions of the latter state. This holds for both total and severe crash SPFs. Hence, for localities that prefer not to develop their own localized SPFs and instead adopt SPFs from elsewhere to cut down on resources, the application of informative priors is shown to facilitate the process. Copyright © 2017 National Safety Council and Elsevier Ltd. All rights reserved.
Learning oncogenetic networks by reducing to mixed integer linear programming.
Shahrabi Farahani, Hossein; Lagergren, Jens
2013-01-01
Cancer can be a result of accumulation of different types of genetic mutations such as copy number aberrations. The data from tumors are cross-sectional and do not contain the temporal order of the genetic events. Finding the order in which the genetic events have occurred and the progression pathways is of vital importance in understanding the disease. In order to model cancer progression, we propose Progression Networks, a special case of Bayesian networks, that are tailored to model disease progression. Progression networks have similarities with Conjunctive Bayesian Networks (CBNs) [1], a variation of Bayesian networks also proposed for modeling disease progression. We also describe a learning algorithm for learning Bayesian networks in general and progression networks in particular. We reduce the hard problem of learning Bayesian and progression networks to Mixed Integer Linear Programming (MILP). MILP is a Non-deterministic Polynomial-time complete (NP-complete) problem for which very good heuristics exist. We tested our algorithm on synthetic and real cytogenetic data from renal cell carcinoma. We also compared our learned progression networks with the networks proposed in earlier publications. The software is available on the website https://bitbucket.org/farahani/diprog.
Kalil, Andre C; Sun, Junfeng
2014-10-01
Objective: To review Bayesian methodology and its utility to clinical decision making and research in the critical care field. Data sources: Clinical, epidemiological, and biostatistical studies on Bayesian methods in PubMed and Embase from their inception to December 2013. Bayesian methods have been extensively used by a wide range of scientific fields, including astronomy, engineering, chemistry, genetics, physics, geology, paleontology, climatology, cryptography, linguistics, ecology, and computational sciences. The application of medical knowledge in clinical research is analogous to the application of medical knowledge in clinical practice. Bedside physicians have to make most diagnostic and treatment decisions on critically ill patients every day without clear-cut evidence-based medicine (more subjective than objective evidence). Similarly, clinical researchers have to make most decisions about trial design with limited available data. Bayesian methodology allows both subjective and objective aspects of knowledge to be formally measured and transparently incorporated into the design, execution, and interpretation of clinical trials. In addition, various degrees of knowledge and several hypotheses can be tested at the same time in a single clinical trial without the risk of multiplicity. Notably, the Bayesian technology is naturally suited for the interpretation of clinical trial findings for the individualized care of critically ill patients and for the optimization of public health policies. We propose that the application of the versatile Bayesian methodology in conjunction with conventional statistical methods is not only ripe for actual use in critical care clinical research but is also a necessary step to maximize the performance of clinical trials and their translation to the practice of critical care medicine.
Bayesian Nonparametric Prediction and Statistical Inference
1989-09-07
Kadane, J. (1980), "Bayesian decision theory and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds., Academic Press; "...the random model and weighted least squares regression," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds., Academic Press, 197-217. On the other hand, H. Jeffreys's theory of hypothesis testing covers the most important situations in which the prior is not diffuse.
Palmprint identification using FRIT
NASA Astrophysics Data System (ADS)
Kisku, D. R.; Rattani, A.; Gupta, P.; Hwang, C. J.; Sing, J. K.
2011-06-01
This paper proposes a palmprint identification system using the Finite Ridgelet Transform (FRIT) and a Bayesian classifier. FRIT is applied to the region of interest (ROI) extracted from the palmprint image to obtain a set of distinctive features. These features are then classified with the help of a Bayesian classifier. The proposed system has been tested on the CASIA and IIT Kanpur palmprint databases. The experimental results reveal better performance compared to well-known systems.
Suggestions for presenting the results of data analyses
Anderson, David R.; Link, William A.; Johnson, Douglas H.; Burnham, Kenneth P.
2001-01-01
We give suggestions for the presentation of research results from frequentist, information-theoretic, and Bayesian analysis paradigms, followed by several general suggestions. The information-theoretic and Bayesian methods offer alternative approaches to data analysis and inference compared to traditionally used methods. Guidance is lacking on the presentation of results under these alternative procedures and on nontesting aspects of classical frequentist methods of statistical analysis. Null hypothesis testing has come under intense criticism. We recommend less reporting of the results of statistical tests of null hypotheses in cases where the null is surely false anyway, or where the null hypothesis is of little interest to science or management.
A Bayesian framework to estimate diversification rates and their variation through time and space
2011-01-01
Background Patterns of species diversity are the result of speciation and extinction processes, and molecular phylogenetic data can provide valuable information to derive their variability through time and across clades. Bayesian Markov chain Monte Carlo methods offer a promising framework to incorporate phylogenetic uncertainty when estimating rates of diversification. Results We introduce a new approach to estimate diversification rates in a Bayesian framework over a distribution of trees under various constant and variable rate birth-death and pure-birth models, and test it on simulated phylogenies. Furthermore, speciation and extinction rates and their posterior credibility intervals can be estimated while accounting for non-random taxon sampling. The framework is particularly suitable for hypothesis testing using Bayes factors, as we demonstrate analyzing dated phylogenies of Chondrostoma (Cyprinidae) and Lupinus (Fabaceae). In addition, we develop a model that extends the rate estimation to a meta-analysis framework in which different data sets are combined in a single analysis to detect general temporal and spatial trends in diversification. Conclusions Our approach provides a flexible framework for the estimation of diversification parameters and hypothesis testing while simultaneously accounting for uncertainties in the divergence times and incomplete taxon sampling. PMID:22013891
NASA Astrophysics Data System (ADS)
Le Bras, Ronan; Kushida, Noriyuki; Mialle, Pierrick; Tomuta, Elena; Arora, Nimar
2017-04-01
The Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) has been developing a Bayesian method and software to perform the key step of automatic association of seismological, hydroacoustic, and infrasound (SHI) parametric data. In our preliminary testing at the CTBTO, NET-VISA shows much better performance than its currently operating automatic association module, with the rate of automatic events matching analyst-reviewed events increased by 10%, signifying that the percentage of missed events is lowered by 40%. Initial tests involving analysts also showed that the new software will complete the automatic bulletins of the CTBTO by adding previously missed events. Because products by the CTBTO are also widely distributed to its member States as well as throughout the seismological community, the introduction of a new technology must be carried out carefully, and the first step of operational integration is to use NET-VISA results within the interactive analysts' software so that the analysts can check the robustness of the Bayesian approach. We report on the latest results, both on progress in automatic processing and on the initial introduction of NET-VISA results into the analyst review process.
Pariset, L; Mariotti, M; Nardone, A; Soysal, M I; Ozkan, E; Williams, J L; Dunner, S; Leveziel, H; Maróti-Agóts, A; Bodò, I; Valentini, A
2010-12-01
Italian Maremmana, Turkish Grey and Hungarian Grey breeds belong to the same Podolic group of cattle, have a similar conformation and recently experienced a similar demographic reduction. The aim of this study was to assess the relationship among the analysed Podolic breeds and to verify whether their genetic state reflects their history. To do so, approximately 100 single nucleotide polymorphisms (SNPs) were genotyped in individuals belonging to these breeds and compared to genotypes of individuals of two Italian beef breeds, Marchigiana and Piemontese, which underwent different selection and migration histories. Population genetic parameters such as allelic frequencies and heterozygosity values were assessed, genetic distances were calculated, and assignment tests were performed to evaluate the possibility of recent admixture between the populations. The data show that the physical similarity among the Podolic breeds examined, and particularly between Hungarian Grey and Maremmana cattle that experienced admixture in the recent past, is mainly morphological. The assignment of individuals from genotype data was achieved using Bayesian inference, confirming that the set of chosen SNPs is able to distinguish among the breeds and that the breeds are genetically distinct. Individuals of the Turkish Grey breed were clearly assigned to their breed of origin for all clustering alternatives, showing that this breed can be differentiated from the others on the basis of allelic frequencies. Remarkably, in the Turkish Grey, differences were observed between the population of Enez district, where in situ conservation studies are practised, and that of Bandirma district of Balikesir, where ex situ conservation studies are practised outside the original raising area. In conclusion, this study demonstrates that molecular data can be used to reveal an unbiased view of past events and provide the basis for a rational exploitation of livestock, suggesting appropriate cross-breeding plans based on genetic distance or breeding strategies that include the population structure. © 2010 Blackwell Verlag GmbH.
Matthews, Luke J.; Tehrani, Jamie J.; Jordan, Fiona M.; Collard, Mark; Nunn, Charles L.
2011-01-01
Background Archaeologists and anthropologists have long recognized that different cultural complexes may have distinct descent histories, but they have lacked analytical techniques capable of easily identifying such incongruence. Here, we show how Bayesian phylogenetic analysis can be used to identify incongruent cultural histories. We employ the approach to investigate Iranian tribal textile traditions. Methods We used Bayes factor comparisons in a phylogenetic framework to test two models of cultural evolution: the hierarchically integrated system hypothesis and the multiple coherent units hypothesis. In the hierarchically integrated system hypothesis, a core tradition of characters evolves through descent with modification and characters peripheral to the core are exchanged among contemporaneous populations. In the multiple coherent units hypothesis, a core tradition does not exist. Rather, there are several cultural units consisting of sets of characters that have different histories of descent. Results For the Iranian textiles, the Bayesian phylogenetic analyses supported the multiple coherent units hypothesis over the hierarchically integrated system hypothesis. Our analyses suggest that pile-weave designs represent a distinct cultural unit that has a different phylogenetic history compared to other textile characters. Conclusions The results from the Iranian textiles are consistent with the available ethnographic evidence, which suggests that the commercial rug market has influenced pile-rug designs but not the techniques or designs incorporated in the other textiles produced by the tribes. We anticipate that Bayesian phylogenetic tests for inferring cultural units will be of great value for researchers interested in studying the evolution of cultural traits including language, behavior, and material culture. PMID:21559083
Probabilistic Model for Untargeted Peak Detection in LC-MS Using Bayesian Statistics.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2015-07-21
We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography-mass spectrometry (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a chromatogram are affected by a chromatographic peak and which ones are only affected by noise. The use of probabilities contrasts with the traditional approach, in which a binary answer is given based on a threshold. By contrast, with the Bayesian peak detection presented here, the values of probability can be further propagated into other preprocessing steps, which will increase (or decrease) the importance of chromatographic regions in the final results. The present work is based on the use of the statistical theory of component overlap of Davis and Giddings (Davis, J. M.; Giddings, J. Anal. Chem. 1983, 55, 418-424) as the prior probability in the Bayesian formulation. The algorithm was tested on LC-MS Orbitrap data and was able to successfully distinguish chemical noise from actual peaks without any data preprocessing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of six local lung patterns: normal lung and five regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and an SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with an SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large numbers of images, as is the case in a multicenter trial with different scanners.
Borchani, Hanen; Bielza, Concha; Toro, Carlos; Larrañaga, Pedro
2013-03-01
Our aim is to use multi-dimensional Bayesian network classifiers in order to predict the human immunodeficiency virus type 1 (HIV-1) reverse transcriptase and protease inhibitors given an input set of respective resistance mutations that an HIV patient carries. Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models especially designed to solve multi-dimensional classification problems, where each input instance in the data set has to be assigned simultaneously to multiple output class variables that are not necessarily binary. In this paper, we introduce a new method, named MB-MBC, for learning MBCs from data by determining the Markov blanket around each class variable using the HITON algorithm. Our method is applied to both reverse transcriptase and protease data sets obtained from the Stanford HIV-1 database. Regarding the prediction of antiretroviral combination therapies, the experimental study shows promising results in terms of classification accuracy compared with state-of-the-art MBC learning algorithms. For reverse transcriptase inhibitors, we get 71% and 11% in mean and global accuracy, respectively; while for protease inhibitors, we get more than 84% and 31% in mean and global accuracy, respectively. In addition, the analysis of MBC graphical structures lets us gain insight into both known and novel interactions between reverse transcriptase and protease inhibitors and their respective resistance mutations. MB-MBC algorithm is a valuable tool to analyze the HIV-1 reverse transcriptase and protease inhibitors prediction problem and to discover interactions within and between these two classes of inhibitors. Copyright © 2012 Elsevier B.V. All rights reserved.
Clearing Unexploded Ordnance: Bayesian Methodology for Assessing Success
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, K K.
2005-10-30
The Department of Defense has many Formerly Used Defense Sites (FUDS) that are slated for transfer for public use. Some sites have unexploded ordnance (UXO) that must be cleared prior to any land transfers. Sites are characterized using geophysical sensing devices and locations are identified where possible UXO may be located. In practice, based on the analysis of the geophysical surveys, a dig list of N suspect locations is created for a site that is possibly contaminated with UXO. The suspect locations on the dig list are often assigned to K bins ranging from "most likely to contain UXO" to "least likely to be UXO" based on signal discrimination techniques and expert judgment. Usually all dig list locations are sampled to determine if UXO is present before the site is declared free of UXO. While this method is 100% certain to ensure no UXO remains in the locations identified by the signal discrimination and expert judgment, it is very costly. This paper proposes a statistical Bayesian methodology that may result in digging less than 100% of the suspect locations to reach a pre-defined tolerable risk, where risk is defined in terms of a low probability that any UXO remains in the unsampled dig list locations. Two important features of a Bayesian approach are that it can account for uncertainties in model parameters and that it can handle data that become available in stages. The results from each stage of data can be used to direct the subsequent digs.
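A hedged sketch of the kind of stopping computation this implies for a single bin: with a Beta prior on the per-location UXO probability, the posterior chance that all undug locations are clean can be estimated by Monte Carlo. The prior settings and counts below are invented for illustration, not taken from the report.

```python
import numpy as np

def prob_remaining_clean(N, n, x, a=1.0, b=9.0, draws=100_000):
    """n of N flagged locations dug, x contained UXO; Beta(a, b) prior.
    Returns the posterior probability that the N - n undug sites are clean."""
    rng = np.random.default_rng(0)
    p = rng.beta(a + x, b + n - x, size=draws)   # posterior of UXO probability
    return np.mean((1.0 - p) ** (N - n))

# Staged digging: recompute after each batch and stop once the probability
# of "no UXO remains" meets the pre-defined tolerable risk.
print(prob_remaining_clean(N=200, n=150, x=2))
```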
Bayesian wavelet PCA methodology for turbomachinery damage diagnosis under uncertainty
NASA Astrophysics Data System (ADS)
Xu, Shengli; Jiang, Xiaomo; Huang, Jinzhi; Yang, Shuhua; Wang, Xiaofang
2016-12-01
Centrifugal compressors often suffer various defects such as impeller cracking, resulting in forced outage of the total plant. Damage diagnostics and condition monitoring of such turbomachinery systems have become an increasingly important and powerful tool to prevent potential failure in components and reduce unplanned forced outages and further maintenance costs, while improving the reliability, availability and maintainability of a turbomachinery system. This paper presents a probabilistic signal processing methodology for damage diagnostics using multiple time history data collected from different locations of a turbomachine, considering data uncertainty and multivariate correlation. The proposed methodology is based on the integration of three advanced state-of-the-art data mining techniques: discrete wavelet packet transform, Bayesian hypothesis testing, and probabilistic principal component analysis. The multiresolution wavelet analysis approach is employed to decompose a time series signal into different levels of wavelet coefficients. These coefficients represent multiple time-frequency resolutions of a signal. Bayesian hypothesis testing is then applied to each level of wavelet coefficients to remove possible imperfections. The posterior-odds-ratio Bayesian approach provides a direct means to assess whether there is imperfection in the decomposed coefficients, thus avoiding over-denoising. Power spectral density estimated by the Welch method is utilized to evaluate the effectiveness of the Bayesian wavelet cleansing method. Furthermore, the probabilistic principal component analysis approach is developed to reduce the dimensionality of multiple time series and to address multivariate correlation and data uncertainty for damage diagnostics. The proposed methodology and generalized framework are demonstrated with a set of sensor data collected from a real-world centrifugal compressor with impeller cracks, through both time series and contour analyses of the vibration signal and principal components.
Bug Distribution and Statistical Pattern Classification.
ERIC Educational Resources Information Center
Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.
1987-01-01
The rule space model permits measurement of cognitive skill acquisition and error diagnosis. Further discussion introduces Bayesian hypothesis testing and bug distribution. An illustration involves an artificial intelligence approach to testing fractions and arithmetic. (Author/GDC)
NASA Astrophysics Data System (ADS)
Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.; Hammond, Glenn E.; Rockhold, Mark L.; Zachara, John M.; Rubin, Yoram
2012-06-01
Tracer tests performed under natural or forced gradient flow conditions can provide useful information for characterizing subsurface properties, through monitoring, modeling, and interpretation of the tracer plume migration in an aquifer. Nonreactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter tests. A Bayesian data assimilation technique, the method of anchored distributions (MAD) (Rubin et al., 2010), was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using constant-rate injection and borehole flowmeter test data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field tracer data at the Hanford 300 Area demonstrates that inverting for spatial heterogeneity of hydraulic conductivity under transient flow conditions is challenging and more work is needed.
A Bayesian bird's eye view of ‘Replications of important results in social psychology’
Schönbrodt, Felix D.; Yao, Yuling; Gelman, Andrew; Wagenmakers, Eric-Jan
2017-01-01
We applied three Bayesian methods to reanalyse the preregistered contributions to the Social Psychology special issue ‘Replications of Important Results in Social Psychology’ (Nosek & Lakens. 2014 Registered reports: a method to increase the credibility of published results. Soc. Psychol. 45, 137–141. (doi:10.1027/1864-9335/a000192)). First, individual-experiment Bayesian parameter estimation revealed that for directed effect size measures, only three out of 44 central 95% credible intervals did not overlap with zero and fell in the expected direction. For undirected effect size measures, only four out of 59 credible intervals contained values greater than 0.10 (10% of variance explained) and only 19 intervals contained values larger than 0.05. Second, a Bayesian random-effects meta-analysis for all 38 t-tests showed that only one out of the 38 hierarchically estimated credible intervals did not overlap with zero and fell in the expected direction. Third, a Bayes factor hypothesis test was used to quantify the evidence for the null hypothesis against a default one-sided alternative. Only seven out of 60 Bayes factors indicated non-anecdotal support in favour of the alternative hypothesis (BF10>3), whereas 51 Bayes factors indicated at least some support for the null hypothesis. We hope that future analyses of replication success will embrace a more inclusive statistical approach by adopting a wider range of complementary techniques. PMID:28280547
Two-Stage Bayesian Model Averaging in Endogenous Variable Models*
Lenkoski, Alex; Eicher, Theo S.; Raftery, Adrian E.
2013-01-01
Economic modeling in the presence of endogeneity is subject to model uncertainty at both the instrument and covariate level. We propose a Two-Stage Bayesian Model Averaging (2SBMA) methodology that extends the Two-Stage Least Squares (2SLS) estimator. By constructing a Two-Stage Unit Information Prior in the endogenous variable model, we are able to efficiently combine established methods for addressing model uncertainty in regression models with the classic technique of 2SLS. To assess the validity of instruments in the 2SBMA context, we develop Bayesian tests of the identification restriction that are based on model averaged posterior predictive p-values. A simulation study showed that 2SBMA has the ability to recover structure in both the instrument and covariate set, and substantially improves the sharpness of resulting coefficient estimates in comparison to 2SLS using the full specification in an automatic fashion. Due to the increased parsimony of the 2SBMA estimate, the Bayesian Sargan test had a power of 50 percent in detecting a violation of the exogeneity assumption, while the method based on 2SLS using the full specification had negligible power. We apply our approach to the problem of development accounting, and find support not only for institutions, but also for geography and integration as development determinants, once both model uncertainty and endogeneity have been jointly addressed. PMID:24223471
A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies
Antoneli, Fernando; Passos, Fernando M.; Lopes, Luciano R.
2018-01-01
Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees and the well-known non-parametric (one-sample and two-sample) Kolmogorov-Smirnov (KS) goodness-of-fit tests. In the strict clock case, the method consists in using the one-sample KS test to directly test whether the phylogeny is clock-like, in other words, whether it follows a Poisson law. The ECD is computed from the discretized branch lengths, and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for the auto-correlation in the ensemble of trees and pseudo-replication, we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, it is observed that tree topologies with very long or very short branches lead to Poisson mixtures, and in this case we propose the use of the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. Moreover, in this second form the test can also be applied to test relaxed clock models. The use of a statistically equivalent ensemble of phylogenies to obtain the branch length ECD, instead of one consensus tree, yields a considerable reduction of the effects of small sample size and provides a gain of power. PMID:29300759
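A minimal sketch of the two-sample form of the test using SciPy; the exponential draws below are stand-ins for branch lengths harvested from clock-constrained and unconstrained Bayesian tree ensembles (thinned to near-independence, as the method recommends):

```python
import numpy as np
from scipy import stats

np.random.seed(0)
# Placeholder branch-length samples pooled from two tree ensembles.
clock_lengths = np.random.exponential(scale=0.05, size=3000)
free_lengths  = np.random.exponential(scale=0.07, size=3000)

# Two-sample KS test: a small p-value rejects the clock hypothesis,
# i.e. the two ensembles imply different branch-length distributions.
stat, p_value = stats.ks_2samp(clock_lengths, free_lengths)
print(f"KS statistic = {stat:.3f}, p = {p_value:.4f}")
```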
Saha, Sreemanti; Narang, Rahul; Deshmukh, Pradeep; Pote, Kiran; Anvikar, Anup; Narang, Pratibha
2017-01-01
The diagnostic techniques for malaria are undergoing a change depending on the availability of newer diagnostics and the annual parasite index of infection in a particular area. At the country level, guidelines are available for the selection of diagnostic tests; however, at the local level, this decision is made based on the malaria situation in the area. The tests are evaluated against a gold standard, and if that standard has limitations, it becomes difficult to compare other available tests. Bayesian latent class analysis computes its internal standard rather than using the conventional gold standard and thus helps in the comparison of various tests, including the conventional gold standard. In a cross-sectional study conducted in a tertiary care hospital setting, we evaluated smear microscopy, a rapid diagnostic test (RDT), and polymerase chain reaction (PCR) for the diagnosis of malaria using Bayesian latent class analysis. We found the magnitude of malaria to be 17.7% (95% confidence interval: 12.5%-23.9%) among the study subjects. In the present study, the sensitivity of microscopy was 63%, but it had very high specificity (99.4%). The sensitivity and specificity of RDT and PCR were high, with RDT having a marginally higher sensitivity (94% vs. 90%) and specificity (99% vs. 95%). On comparison of likelihood ratios (LRs), RDT had the highest LR for a positive test result (175) and the lowest LR for a negative test result (0.058) among the three tests. In settings like ours, conventional smear microscopy may be replaced with RDT; as we move toward elimination and facilities become available, PCR may be brought in to detect cases with lower parasitaemia.
Burroughs, N J; Pillay, D; Mutimer, D
1999-01-01
Bayesian analysis using a virus dynamics model is demonstrated to facilitate hypothesis testing of patterns in clinical time-series. Our Markov chain Monte Carlo implementation demonstrates that the viraemia time-series observed in two sets of hepatitis B patients on antiviral (lamivudine) therapy, chronic carriers and liver transplant patients, are significantly different, overcoming clinical trial design differences that question the validity of non-parametric tests. We show that lamivudine-resistant mutants grow faster in transplant patients than in chronic carriers, which probably explains the differences in emergence times and failure rates between these two sets of patients. Incorporation of dynamic models into Bayesian parameter analysis is of general applicability in medical statistics. PMID:10643081
Bayesian Tracking of Emerging Epidemics Using Ensemble Optimal Statistical Interpolation
Cobb, Loren; Krishnamurthy, Ashok; Mandel, Jan; Beezley, Jonathan D.
2014-01-01
We present a preliminary test of the Ensemble Optimal Statistical Interpolation (EnOSI) method for the statistical tracking of an emerging epidemic, with a comparison to its popular relative for Bayesian data assimilation, the Ensemble Kalman Filter (EnKF). The spatial data for this test was generated by a spatial susceptible-infectious-removed (S-I-R) epidemic model of an airborne infectious disease. Both tracking methods in this test employed Poisson rather than Gaussian noise, so as to handle epidemic data more accurately. The EnOSI and EnKF tracking methods worked well on the main body of the simulated spatial epidemic, but the EnOSI was able to detect and track a distant secondary focus of infection that the EnKF missed entirely. PMID:25113590
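For orientation, here is a textbook Gaussian ensemble Kalman filter analysis step on a toy state vector. It is only a sketch of the class of Bayesian update both trackers build on; the paper's EnOSI and EnKF variants use Poisson rather than Gaussian observation noise, which this sketch does not capture.

```python
import numpy as np

def enkf_update(X, y, H, R, rng):
    """X: (n_state, n_ens) forecast ensemble; y: (n_obs,) observation;
    H: (n_obs, n_state) observation operator; R: (n_obs, n_obs) obs covariance."""
    n_ens = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)         # ensemble anomalies
    P = A @ A.T / (n_ens - 1)                     # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
    # perturbed observations, one per ensemble member
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, n_ens).T
    return X + K @ (Y - H @ X)

rng = np.random.default_rng(1)
X = rng.normal(10.0, 2.0, size=(4, 50))  # toy infected-count state, 50 members
H = np.eye(2, 4)                          # observe the first two grid cells
R = np.eye(2) * 1.0
Xa = enkf_update(X, np.array([12.0, 9.0]), H, R, rng)
print("analysis ensemble mean:", Xa.mean(axis=1))
```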
Bayesian inference to identify parameters in viscoelasticity
NASA Astrophysics Data System (ADS)
Rappel, Hussein; Beex, Lars A. A.; Bordas, Stéphane P. A.
2017-08-01
This contribution discusses Bayesian inference (BI) as an approach to identifying parameters in viscoelasticity. The aims are: (i) to show that the prior has a substantial influence in viscoelasticity, (ii) to show that this influence decreases as the number of measurements increases, and (iii) to show how different types of experiments influence the identified parameters and their uncertainties. The standard linear solid model is the material description of interest, and the tensile experiments considered are a relaxation test, a constant strain-rate test and a creep test. The experimental data are created artificially, allowing a one-to-one comparison between the input parameters and the identified parameter values. Besides addressing the aforementioned issues, we believe that this contribution forms a comprehensible starting point for those interested in applying BI in viscoelasticity.
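A minimal sketch of such an identification for the relaxation test, assuming (beyond what the abstract states) a Gaussian noise model, flat positivity priors, and a random-walk Metropolis sampler; the parameter names E_inf, dE, and tau are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def relaxation_modulus(t, E_inf, dE, tau):
    # standard linear solid under relaxation: E(t) = E_inf + dE * exp(-t/tau)
    return E_inf + dE * np.exp(-t / tau)

# artificial data, mirroring the paper's one-to-one comparison idea
t = np.linspace(0.0, 10.0, 50)
sigma = 0.05
data = relaxation_modulus(t, 1.0, 2.0, 1.5) + rng.normal(0.0, sigma, t.size)

def log_post(theta):
    E_inf, dE, tau = theta
    if E_inf <= 0 or dE <= 0 or tau <= 0:   # flat prior on positive values
        return -np.inf
    resid = data - relaxation_modulus(t, E_inf, dE, tau)
    return -0.5 * np.sum(resid**2) / sigma**2

theta = np.array([0.5, 1.0, 1.0])           # deliberately off-target start
lp = log_post(theta)
samples = []
for _ in range(20000):
    prop = theta + rng.normal(0.0, 0.02, 3)  # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:  # Metropolis accept/reject
        theta, lp = prop, lp_prop
    samples.append(theta)

samples = np.array(samples[5000:])           # discard burn-in
print("posterior means (E_inf, dE, tau):", samples.mean(axis=0))
```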
Eberle, Jonas; Warnock, Rachel C M; Ahrens, Dirk
2016-05-05
Defining species units can be challenging, especially during the earliest stages of speciation, when phylogenetic inference and delimitation methods may be compromised by incomplete lineage sorting (ILS) or secondary gene flow. Integrative approaches to taxonomy, which combine molecular and morphological evidence, have the potential to be valuable in such cases. In this study we investigated the South African scarab beetle genus Pleophylla using data collected from 110 individuals of eight putative morphospecies. The dataset included four molecular markers (cox1, 16S, rrnL, ITS1) and morphometric data based on male genital morphology. We applied a suite of molecular and morphological approaches to species delimitation, and implemented a novel Bayesian approach in the software iBPP, which enables continuous morphological trait data and molecular data to be combined. Traditional morphology-based species assignments were supported quantitatively by morphometric analyses of the male genitalia (eigenshape analysis, CVA, LDA). While the ITS1-based delineation was also broadly congruent with the morphospecies, the cox1 data resulted in over-splitting (GMYC modelling, haplotype networks, PTP, ABGD). In the most extreme case morphospecies shared identical haplotypes, which may be attributable to ILS based on statistical tests performed using the software JML. We found the strongest support for putative morphospecies based on phylogenetic evidence using the combined approach implemented in iBPP. However, support for putative species was sensitive to the use of alternative guide trees and alternative combinations of priors on the population size (θ) and root age (τ0) parameters, especially when the analysis was based on molecular or morphological data alone. We demonstrate that continuous morphological trait data can be extremely valuable in assessing competing species-delimitation hypotheses. In particular, we show that the inclusion of morphological data in an integrative Bayesian framework can improve the resolution of inferred species units. However, we also demonstrate that this approach is extremely sensitive to the choice of guide tree and prior parameters. These should be chosen with caution, if possible on the basis of independent empirical evidence, or careful sensitivity analyses should be performed to assess the robustness of the results. Young species provide exemplars for investigating the mechanisms of speciation and for assessing the performance of tools used to delimit species on the basis of molecular and/or morphological evidence.
Case studies in Bayesian microbial risk assessments.
Kennedy, Marc C; Clough, Helen E; Turner, Joanne
2009-12-21
The quantification of uncertainty and variability is a key component of quantitative risk analysis. Recent advances in Bayesian statistics make it ideal for integrating multiple sources of information, of different types and quality, and for providing a realistic estimate of the combined uncertainty in the final risk estimates. We present two case studies related to foodborne microbial risks. In the first, we combine models to describe the sequence of events resulting in illness from consumption of milk contaminated with VTEC O157. We used Monte Carlo simulation to propagate uncertainty in some of the inputs to computer models describing the farm and pasteurisation process. The resulting simulated contamination levels were then assigned to consumption events from a dietary survey. Finally, we accounted for uncertainty in the dose-response relationship and uncertainty due to limited incidence data to derive the uncertainty about yearly incidences of illness in young children. Options for altering the risk were considered by running the model with different hypothetical policy-driven exposure scenarios. In the second case study we illustrate an efficient Bayesian sensitivity analysis for identifying the most important parameters of a complex computer code that simulated VTEC O157 prevalence within a managed dairy herd. This was carried out in two stages: first to screen out the unimportant inputs, then to perform a more detailed analysis on the remaining inputs. The method works by building a Bayesian statistical approximation to the computer code using a number of known code input/output pairs (training runs). We estimated that the expected total number of children aged 1.5-4.5 years who become ill due to VTEC O157 in milk is 8.6 per year, with a 95% uncertainty interval of (0, 11.5). The most extreme policy we considered was banning on-farm pasteurisation of milk, which reduced the estimate to 6.4 with a 95% interval of (0, 11). In the second case study the effective number of inputs was reduced from 30 to 7 in the screening stage, and just 2 inputs were found to explain 82.8% of the output variance. A combined total of 500 runs of the computer code were used. These case studies illustrate the use of Bayesian statistics to perform detailed uncertainty and sensitivity analyses, integrating multiple information sources in a way that is both rigorous and efficient.
Hu, Weiming; Tian, Guodong; Kang, Yongxin; Yuan, Chunfeng; Maybank, Stephen
2017-09-25
In this paper, a new nonparametric Bayesian model called the dual sticky hierarchical Dirichlet process hidden Markov model (HDP-HMM) is proposed for mining activities from a collection of time series data such as trajectories. All the time series data are clustered. Each cluster of time series data, corresponding to a motion pattern, is modeled by an HMM. Our model postulates a set of HMMs that share a common set of states (topics in an analogy with topic models for document processing), but have unique transition distributions. For the application to motion trajectory modeling, topics correspond to motion activities. The learnt topics are clustered into atomic activities which are assigned predicates. We propose a Bayesian inference method to decompose a given trajectory into a sequence of atomic activities. On combining the learnt sources and sinks, semantic motion regions, and the learnt sequence of atomic activities, the action represented by the trajectory can be described in natural language in as automatic a way as possible. The effectiveness of our dual sticky HDP-HMM is validated on several trajectory datasets. The effectiveness of the natural language descriptions for motions is demonstrated on the vehicle trajectories extracted from a traffic scene.
Neelon, Brian; Chang, Howard H; Ling, Qiang; Hastings, Nicole S
2016-12-01
Motivated by a study exploring spatiotemporal trends in emergency department use, we develop a class of two-part hurdle models for the analysis of zero-inflated areal count data. The models consist of two components: one for the probability of any emergency department use and one for the number of emergency department visits given use. Through a hierarchical structure, the models incorporate both patient- and region-level predictors, as well as spatially and temporally correlated random effects for each model component. The random effects are assigned multivariate conditionally autoregressive priors, which induce dependence between the components and provide spatial and temporal smoothing across adjacent spatial units and time periods, resulting in improved inferences. To accommodate potential overdispersion, we consider a range of parametric specifications for the positive counts, including truncated negative binomial and generalized Poisson distributions. We adopt a Bayesian inferential approach, and posterior computation is handled conveniently within standard Bayesian software. Our results indicate that the negative binomial and generalized Poisson hurdle models vastly outperform the Poisson hurdle model, demonstrating that overdispersed hurdle models provide a useful approach to analyzing zero-inflated spatiotemporal data. © The Author(s) 2014.
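To make the two-part structure concrete, the sketch below evaluates a hurdle log-likelihood with a zero-truncated negative binomial for the positive counts; the covariates and CAR random effects of the full model are omitted, and the parameter values are illustrative.

```python
import numpy as np
from scipy import stats

def hurdle_nb_loglik(y, pi, mu, alpha):
    """y: counts; pi: P(y > 0); mu, alpha: NB mean and dispersion."""
    n = 1.0 / alpha            # scipy's NB 'size' parameter
    p = n / (n + mu)
    zero = (y == 0)
    # zero part: log(1 - pi); positive part: log(pi) + zero-truncated NB logpmf,
    # i.e. logpmf(y) - log(1 - pmf(0))
    log_trunc = (stats.nbinom.logpmf(y, n, p)
                 - np.log1p(-stats.nbinom.pmf(0, n, p)))
    ll = np.where(zero, np.log1p(-pi), np.log(pi) + log_trunc)
    return ll.sum()

y = np.array([0, 0, 3, 1, 0, 7, 2, 0, 0, 4])
print(hurdle_nb_loglik(y, pi=0.5, mu=3.0, alpha=0.4))
```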
Bayesian modelling of uncertainties of Monte Carlo radiative-transfer simulations
NASA Astrophysics Data System (ADS)
Beaujean, Frederik; Eggers, Hans C.; Kerzendorf, Wolfgang E.
2018-07-01
One of the big challenges in astrophysics is the comparison of complex simulations to observations. As many codes do not directly generate observables (e.g. hydrodynamic simulations), the last step in the modelling process is often a radiative-transfer treatment. For this step, the community relies increasingly on Monte Carlo radiative transfer due to its ease of implementation and scalability with computing power. We consider simulations in which the number of photon packets is Poisson distributed, while the weight assigned to a single photon packet follows any distribution of choice. We show how to estimate the statistical uncertainty of the sum of weights in each bin from the output of a single radiative-transfer simulation. Our Bayesian approach produces a posterior distribution that is valid for any number of packets in a bin, even zero packets, and is easy to implement in practice. Our analytic results for a large number of packets show that we generalize existing methods that are valid only in limiting cases. The statistical problem considered here appears in identical form in a wide range of Monte Carlo simulations, including particle physics and importance sampling. It is particularly powerful for extracting information when the available data are sparse or the quantities of interest are small.
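The classical large-count limit that the paper generalizes can be checked numerically: for a compound Poisson sum, the variance of the summed weights in a bin is λE[w²], which the per-simulation sum of squared weights estimates, as in this sketch (weight distribution chosen arbitrarily).

```python
import numpy as np

rng = np.random.default_rng(7)
lam, n_rep = 200, 5000
totals, var_hats = [], []
for _ in range(n_rep):
    n = rng.poisson(lam)                             # packets in the bin
    w = rng.lognormal(mean=0.0, sigma=0.5, size=n)   # arbitrary weight law
    totals.append(w.sum())                           # bin total
    var_hats.append(np.sum(w**2))                    # classical variance estimate

print("empirical variance of bin totals:   ", np.var(totals))
print("mean of sum-of-squares estimator:   ", np.mean(var_hats))
```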
Coastal vulnerability systems-network using Fuzzy and Bayesian approaches
NASA Astrophysics Data System (ADS)
Taramelli, A.; Valentini, E.; Filipponi, F.; Nguyen Xuan, A.; Arosio, M.
2016-12-01
Marine drivers, such as storm surge in the context of sea-level rise (SLR), are threatening low-lying coastal plains. In order to deal with these disturbances, a deeper understanding of the benefits deriving from ecosystem service assessment, management and planning (e.g. the role of dune ridges in surge mitigation and climate adaptation) can enhance the resilience of coastal systems. In this frame, assessing vulnerability is a key concern of many social, ecological and institutional systems, and it involves several challenges: the definition of Essential Variables (EVs) able to synthesize the required information, the assignment of weights to each considered variable, the selection of a method for combining the relevant variables, and so on. At present it is unclear how SLR, subsidence and erosion might affect coastal subsistence resources, because of highly complex interactions and because of the subjectivity involved in weighting the many variables and their interactions within the systems. In this contribution, making the best use of many EO products, in situ data and modelling, we propose a multidimensional surge vulnerability assessment that combines geophysical and socioeconomic variables on the basis of two approaches: (1) fuzzy logic and (2) a Bayesian approach. The final goal is to provide insight into how to quantify regulating ecosystem services.
NASA Astrophysics Data System (ADS)
van Rossum, Anne C.; Lin, Hai Xiang; Dubbeldam, Johan; van der Herik, H. Jaap
2018-04-01
In machine vision, typical heuristic methods for extracting parameterized objects from raw data points are the Hough transform and RANSAC. Bayesian models carry the promise of extracting such parameterized objects optimally, given a correct definition of the model and of the type of noise at hand. One category of solvers for Bayesian models is Markov chain Monte Carlo (MCMC) methods. Naive implementations of MCMC methods suffer from slow convergence in machine vision due to the complexity of the parameter space. To address this, blocked Gibbs and split-merge samplers have been developed that assign multiple data points to clusters at once. In this paper we introduce a new split-merge sampler, the triadic split-merge sampler, which performs steps between two or three randomly chosen clusters. This has two advantages. First, it reduces the asymmetry between the split and merge steps. Second, it is able to propose a new cluster composed of data points from two different clusters. Both advantages speed up convergence, which we demonstrate on a line extraction problem. We show that the triadic split-merge sampler outperforms the conventional split-merge sampler. Although this new MCMC sampler is demonstrated in a machine vision context, its applications extend to the very general domain of statistical inference.
Identification of transmissivity fields using a Bayesian strategy and perturbative approach
NASA Astrophysics Data System (ADS)
Zanini, Andrea; Tanda, Maria Giovanna; Woodbury, Allan D.
2017-10-01
The paper deals with the crucial problem of groundwater parameter estimation, which is the basis for efficient modeling and reclamation activities. A hierarchical Bayesian approach is developed: it uses Akaike's Bayesian Information Criterion to estimate the hyperparameters (related to the chosen covariance model) and to quantify the unknown noise variance. The transmissivity identification proceeds in two steps: the first, called empirical Bayesian interpolation, uses Y* (Y = lnT) observations to interpolate Y values on a specified grid; the second, called empirical Bayesian update, improves the previous Y estimate through the addition of hydraulic head observations. The relationship between the head and lnT has been linearized through a perturbative solution of the flow equation. In order to test the proposed approach, synthetic aquifers from the literature have been considered. The aquifers in question contain a variety of boundary conditions (both Dirichlet and Neumann type) and scales of heterogeneity (σY² = 1.0 and σY² = 5.3). The estimated transmissivity fields were compared to the true ones. The joint use of Y* and head measurements improves the estimation of Y for both degrees of heterogeneity. Even though the variance of the strongly heterogeneous transmissivity field can be considered high for the application of the perturbative approach, the results show the same order of approximation as the non-linear methods proposed in the literature. The procedure allows computation of the posterior probability distribution of the target quantities and quantification of the uncertainty in the model prediction. Bayesian updating has advantages relative to both Monte Carlo (MC) and non-MC approaches: like MC methods, it computes the posterior probability distribution of the target quantities directly, and like non-MC methods it has computational times on the order of seconds.
Carreon-Martinez, Lucia B.; Walter, Ryan P.; Johnson, Timothy B.; Ludsin, Stuart A.; Heath, Daniel D.
2015-01-01
Nutrient-rich, turbid river plumes that are common to large lakes and coastal marine ecosystems have been hypothesized to benefit survival of fish during early life stages by increasing food availability and (or) reducing vulnerability to visual predators. However, evidence that river plumes truly benefit the recruitment process remains meager for both freshwater and marine fishes. Here, we use genotype assignment between juvenile and larval yellow perch (Perca flavescens) from western Lake Erie to estimate and compare recruitment to the age-0 juvenile stage for larvae residing inside the highly turbid, south-shore Maumee River plume versus those occupying the less turbid, more northerly Detroit River plume. Bayesian genotype assignment of a mixed assemblage of juvenile (age-0) yellow perch to putative larval source populations established that recruitment of larvae was higher from the turbid Maumee River plume than from the less turbid Detroit River plume during 2006 and 2007, but not in 2008. Our findings add to the growing evidence that turbid river plumes can indeed enhance survival of fish larvae to recruited life stages, and also demonstrate how novel population genetic analyses of early life stages can help identify critical processes in fish recruitment. PMID:25954968
Bayesian Methods for Determining the Importance of Effects
USDA-ARS?s Scientific Manuscript database
Criticisms have plagued the frequentist null-hypothesis significance testing (NHST) procedure since the day it was created from the Fisher Significance Test and Hypothesis Test of Jerzy Neyman and Egon Pearson. Alternatives to NHST exist in frequentist statistics, but competing methods are also avai...
Bayesian characterization of uncertainty in species interaction strengths.
Wolf, Christopher; Novak, Mark; Gitelman, Alix I
2017-06-01
Considerable effort has been devoted to the estimation of species interaction strengths. This effort has focused primarily on statistical significance testing and obtaining point estimates of parameters that contribute to interaction strength magnitudes, leaving the characterization of uncertainty associated with those estimates unconsidered. We consider a means of characterizing the uncertainty of a generalist predator's interaction strengths by formulating an observational method for estimating a predator's prey-specific per capita attack rates as a Bayesian statistical model. This formulation permits the explicit incorporation of multiple sources of uncertainty. A key insight is the informative nature of several so-called non-informative priors that have been used in modeling the sparse data typical of predator feeding surveys. We introduce to ecology a new neutral prior and provide evidence for its superior performance. We use a case study to consider the attack rates in a New Zealand intertidal whelk predator, and we illustrate not only that Bayesian point estimates can be made to correspond with those obtained by frequentist approaches, but also that estimation uncertainty as described by 95% intervals is more useful and biologically realistic using the Bayesian method. In particular, unlike in bootstrap confidence intervals, the lower bounds of the Bayesian posterior intervals for attack rates do not include zero when a predator-prey interaction is in fact observed. We conclude that the Bayesian framework provides a straightforward, probabilistic characterization of interaction strength uncertainty, enabling future considerations of both the deterministic and stochastic drivers of interaction strength and their impact on food webs.
NASA Astrophysics Data System (ADS)
Kushida, N.; Kebede, F.; Feitio, P.; Le Bras, R.
2016-12-01
The Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) has been developing and testing NET-VISA (Arora et al., 2013), a Bayesian automatic event detection and localization program, and evaluating its performance in a realistic operational mode. In our preliminary testing at the CTBTO, NET-VISA shows better performance than the currently operating automatic localization program. However, given the CTBTO's role and its international context, a new technology should be introduced cautiously when it replaces a key piece of the automatic processing. We integrated the results of NET-VISA into the Analyst Review Station, extensively used by the analysts, so that they can check the accuracy and robustness of the Bayesian approach. We expect the workload of the analysts to be reduced because of the better performance of NET-VISA in finding missed events and building a more complete set of stations than the current system, which has been operating for nearly twenty years. The results of a series of tests indicate that the expectations raised by the automatic tests, which show an overall overlap improvement of 11%, meaning that the missed-event rate is cut by 42%, hold for the integrated interactive module as well. Analysts find new events, beyond the ones analyzed through the standard procedures, that qualify for the CTBTO Reviewed Event Bulletin. Arora, N., Russell, S., and Sudderth, E., NET-VISA: Network Processing Vertically Integrated Seismic Analysis, 2013, Bull. Seismol. Soc. Am., 103, 709-729.
Analytical study to define a helicopter stability derivative extraction method, volume 1
NASA Technical Reports Server (NTRS)
Molusis, J. A.
1973-01-01
A method is developed for extracting six-degree-of-freedom stability and control derivatives from helicopter flight data. Different combinations of filtering and derivative estimation are investigated and used with a Bayesian approach for derivative identification. The combination of filtering and estimation found to yield the most accurate time-response match to flight test data is determined and applied to CH-53A and CH-54B flight data. The method found to be most accurate consists of (1) filtering the flight test data with a digital filter followed by an extended Kalman filter, (2) identifying a derivative estimate with a least-squares estimator, and (3) obtaining derivatives with the Bayesian derivative extraction method.
Development of a novel forensic STR multiplex for ancestry analysis and extended identity testing.
Phillips, Chris; Fernandez-Formoso, Luis; Gelabert-Besada, Miguel; Garcia-Magariños, Manuel; Santos, Carla; Fondevila, Manuel; Carracedo, Angel; Lareu, Maria Victoria
2013-04-01
There is growing interest in developing additional DNA typing techniques to provide better investigative leads in forensic analysis. These include inference of genetic ancestry and prediction of common physical characteristics of DNA donors. To date, forensic ancestry analysis has centered on population-divergent SNPs, but these binary loci cannot reliably detect DNA mixtures, which are common in forensic samples. Furthermore, STR genotypes, which form the principal DNA profiling system, are not routinely combined with forensic SNPs to strengthen the frequency data available for ancestry inference. We report the development of a 12-STR multiplex composed of ancestry informative marker STRs (AIM-STRs) selected from 434 tetranucleotide repeat loci. We adapted Snipper, our online Bayesian classifier for AIM-SNPs, to handle multiallelic STR data using frequency-based training sets. We assessed the ability of the 12-plex AIM-STRs to differentiate CEPH Human Genome Diversity Panel populations, plus their informativeness when combined with established forensic STRs and AIM-SNPs. We found that combining STRs and SNPs improves the success rate of ancestry assignments while providing a reliable mixture-detection system that SNP analysis alone lacks. As the 12 STRs generally show a broad range of alleles in all populations, they provide highly informative supplementary STRs for extended relationship testing and identification of missing persons with incomplete reference pedigrees. Lastly, mixed-marker approaches (combining STRs with binary loci) for simple ancestry inference tests beyond forensic analysis bring advantages, and we discuss the genotyping options available. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
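A generic sketch of the frequency-based idea behind classifiers such as Snipper, here for biallelic SNPs under Hardy-Weinberg equilibrium with made-up frequencies; the paper's STR extension replaces the two-allele genotype probability with a multiallelic one.

```python
import numpy as np

# hypothetical reference allele frequencies per population, one entry per SNP
freqs = {"popA": np.array([0.9, 0.2, 0.7]),
         "popB": np.array([0.3, 0.8, 0.4])}

def log_lik(genotype, p):
    """genotype: counts (0, 1, 2) of the reference allele at each SNP."""
    g = np.asarray(genotype)
    # Hardy-Weinberg genotype probabilities: p^2, 2p(1-p), (1-p)^2
    hw = np.where(g == 2, p**2,
         np.where(g == 1, 2 * p * (1 - p), (1 - p)**2))
    return np.log(hw).sum()

query = [2, 0, 1]
scores = {pop: log_lik(query, p) for pop, p in freqs.items()}
best = max(scores, key=scores.get)   # assign to the highest-likelihood group
print(scores, "-> assigned to", best)
```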
A Bayesian observer replicates convexity context effects in figure-ground perception.
Goldreich, Daniel; Peterson, Mary A
2012-01-01
Peterson and Salvagio (2008) demonstrated convexity context effects in figure-ground perception. Subjects shown displays consisting of unfamiliar alternating convex and concave regions identified the convex regions as foreground objects progressively more frequently as the number of regions increased; this occurred only when the concave regions were homogeneously colored. The origins of these effects have been unclear. Here, we present a two-free-parameter Bayesian observer that replicates convexity context effects. The Bayesian observer incorporates two plausible expectations regarding three-dimensional scenes: (1) objects tend to be convex rather than concave, and (2) backgrounds tend (more than foreground objects) to be homogeneously colored. The Bayesian observer estimates the probability that a depicted scene is three-dimensional, and that the convex regions are figures. It responds stochastically by sampling from its posterior distributions. Like human observers, the Bayesian observer shows convexity context effects only for images with homogeneously colored concave regions. With optimal parameter settings, it performs similarly to the average human subject on the four display types tested. We propose that object convexity and background color homogeneity are environmental regularities exploited by human visual perception; vision achieves figure-ground perception by interpreting ambiguous images in light of these and other expected regularities in natural scenes.
Tracking composite material damage evolution using Bayesian filtering and flash thermography data
NASA Astrophysics Data System (ADS)
Gregory, Elizabeth D.; Holland, Steve D.
2016-05-01
We propose a method for tracking the condition of a composite part using Bayesian filtering of flash thermography data over the lifetime of the part. In this demonstration, composite panels were fabricated; impacted to induce subsurface delaminations; and loaded in compression over multiple time steps, causing the delaminations to grow in size. Flash thermography data was collected between each damage event to serve as a time history of the part. The flash thermography indicated some areas of damage but provided little additional information as to the exact nature or depth of the damage. Computed tomography (CT) data was also collected after each damage event and provided a high resolution volume model of damage that acted as truth. After each cycle, the condition estimate, from the flash thermography data and the Bayesian filter, was compared to 'ground truth'. The Bayesian process builds on the lifetime history of flash thermography scans and can give better estimates of material condition as compared to the most recent scan alone, which is common practice in the aerospace industry. Bayesian inference provides probabilistic estimates of damage condition that are updated as each new set of data becomes available. The method was tested on simulated data and then on an experimental data set.
Natural frequencies facilitate diagnostic inferences of managers
Hoffrage, Ulrich; Hafenbrädl, Sebastian; Bouquet, Cyril
2015-01-01
In Bayesian inference tasks, information about base rates, the hit rate and the false-alarm rate needs to be integrated according to Bayes’ rule after the result of a diagnostic test becomes known. Numerous studies have found that presenting the information in a Bayesian inference task in terms of natural frequencies leads to better performance than variants in which the information is presented in terms of probabilities or percentages. Natural frequencies are the tallies in a natural sample, in which hit rate and false-alarm rate are not normalized with respect to base rates. The present research replicates the beneficial effect of natural frequencies with four tasks from the domain of management, with management students as well as experienced executives as participants. The percentage of Bayesian responses was almost twice as high when information was presented in natural frequencies as when it was presented in percentages. In contrast to most tasks previously studied, the majority of numerical responses were lower than the Bayesian solutions. Having heard of Bayes’ rule prior to the study did not affect Bayesian performance. An implication of our work is that textbooks explaining Bayes’ rule should teach how to represent information in terms of natural frequencies instead of how to plug probabilities or percentages into a formula. PMID:26157397
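A worked natural-frequency calculation with illustrative numbers (not taken from the study's management tasks): base rate 1%, hit rate 80%, false-alarm rate 9.6%.

```python
# Out of 1000 people: 10 have the condition (1% base rate).
population = 1000
with_condition = 10
hits = 8            # 80% of the 10 test positive
false_alarms = 95   # ≈9.6% of the 990 without the condition test positive

# Bayes' rule reduces to a simple ratio of tallies:
p_condition_given_positive = hits / (hits + false_alarms)
print(f"P(condition | positive) = {p_condition_given_positive:.3f}")  # ~0.078
```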
Bayesian approach for counting experiment statistics applied to a neutrino point source analysis
NASA Astrophysics Data System (ADS)
Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.
2013-12-01
In this paper we present a model-independent analysis method following Bayesian statistics to analyse data from a generic counting experiment, and we apply it to the search for neutrinos from point sources. We discuss a test statistic, defined within a Bayesian framework, that is used in the search for a signal. In case no signal is found, we derive an upper limit without introducing approximations. The Bayesian approach gives us the full probability density function for both the background and the signal rate, so we have direct access to any signal upper limit. The upper limit derivation compares directly with a frequentist approach and is robust in the case of low-count observations. Furthermore, it also allows previous upper limits obtained by other analyses to be taken into account through prior information, without the ad hoc application of trial factors. To investigate the validity of the presented Bayesian approach, we applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and obtained a flux upper limit that is in agreement with the upper limits determined via a frequentist approach. The upper limit obtained also compares well with the previously published IceCube result using the same data set.
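The low-count Poisson case can be sketched directly: with n observed events, a known background b, and a flat prior on the signal rate s, the posterior p(s | n) ∝ (s + b)^n e^{-(s + b)} yields an upper limit by numerical integration (illustrative numbers below; the paper's full treatment also handles background uncertainty and informative priors).

```python
import numpy as np

n_obs, b = 3, 2.5                          # illustrative counts and background
s = np.linspace(0.0, 30.0, 30001)          # grid for the signal rate
log_post = n_obs * np.log(s + b) - (s + b) # unnormalized log-posterior, flat prior
post = np.exp(log_post - log_post.max())

ds = s[1] - s[0]
post /= post.sum() * ds                    # normalize the posterior pdf

cdf = np.cumsum(post) * ds
upper90 = s[np.searchsorted(cdf, 0.90)]    # 90% credible upper limit
print(f"90% upper limit on the signal rate: {upper90:.2f}")
```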
NASA Astrophysics Data System (ADS)
Kopka, Piotr; Wawrzynczak, Anna; Borysiewicz, Mieczyslaw
2016-11-01
In this paper the Bayesian methodology known as Approximate Bayesian Computation (ABC) is applied to the problem of atmospheric contamination source identification. The algorithm's input data are concentrations of the released substance registered on-line by the distributed sensor network. This paper presents the Sequential ABC algorithm in detail and tests its efficiency in estimating the probability distributions of the atmospheric release parameters of a mobile contamination source. The developed algorithms are tested using data from the Over-Land Atmospheric Diffusion (OLAD) field tracer experiment. The paper demonstrates estimation of seven parameters characterizing the contamination source: the source's starting position (x, y), its direction of motion (d), its velocity (v), the release rate (q), the start time of the release (ts) and its duration (td). Newly arriving concentrations dynamically update the probability distributions of the search parameters. The atmospheric dispersion Second-order Closure Integrated PUFF (SCIPUFF) model is used as the forward model to predict the concentrations at the sensor locations.
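A minimal rejection-ABC sketch of the same idea, with a toy one-dimensional Gaussian footprint standing in for SCIPUFF and only two unknowns (source position and release rate); the paper's Sequential ABC refines this by updating the parameter distributions as new concentrations arrive.

```python
import numpy as np

rng = np.random.default_rng(3)
sensors = np.linspace(0.0, 10.0, 8)     # hypothetical sensor positions

def forward(x_src, q):
    # toy dispersion model: Gaussian concentration footprint around the source
    return q * np.exp(-0.5 * ((sensors - x_src) / 1.5) ** 2)

# synthetic "observed" concentrations from a hidden source at x=4.2, q=5.0
obs = forward(4.2, 5.0) + rng.normal(0.0, 0.1, sensors.size)

accepted = []
eps = 0.8                                # acceptance tolerance
for _ in range(100000):
    x_try = rng.uniform(0.0, 10.0)       # prior on source position
    q_try = rng.uniform(0.0, 10.0)       # prior on release rate
    dist = np.linalg.norm(forward(x_try, q_try) - obs)
    if dist < eps:                       # keep parameters that reproduce the data
        accepted.append((x_try, q_try))

accepted = np.array(accepted)
print(f"{len(accepted)} accepted; posterior means (x, q): {accepted.mean(axis=0)}")
```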
Pidlisecky, Adam; Haines, S.S.
2011-01-01
Conventional processing methods for seismic cone penetrometer data present several shortcomings, most notably the absence of a robust velocity model uncertainty estimate. We propose a new seismic cone penetrometer testing (SCPT) data-processing approach that employs Bayesian methods to map measured data errors into quantitative estimates of model uncertainty. We first calculate travel-time differences for all permutations of seismic trace pairs. That is, we cross-correlate each trace at each measurement location with every trace at every other measurement location to determine travel-time differences that are not biased by the choice of any particular reference trace and to thoroughly characterize data error. We calculate a forward operator that accounts for the different ray paths for each measurement location, including refraction at layer boundaries. We then use a Bayesian inversion scheme to obtain the most likely slowness (the reciprocal of velocity) and a distribution of probable slowness values for each model layer. The result is a velocity model that is based on correct ray paths, with uncertainty bounds that are based on the data error. © NRC Research Press 2011.
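The all-pairs travel-time-difference step can be sketched with plain cross-correlation on synthetic traces (a uniform sampling interval is assumed; the paper's forward operator and Bayesian inversion are not reproduced here).

```python
import itertools
import numpy as np

rng = np.random.default_rng(5)
dt = 0.001  # seconds per sample

# a simple synthetic wavelet: derivative of a Gaussian
wavelet = np.diff(np.exp(-0.5 * ((np.arange(200) - 100) / 15.0) ** 2))

def make_trace(delay_samples, n=1000):
    tr = np.zeros(n)
    tr[delay_samples:delay_samples + wavelet.size] += wavelet
    return tr + rng.normal(0.0, 0.02, n)   # add measurement noise

traces = [make_trace(d) for d in (300, 340, 385, 430)]  # deeper = later arrival

def lag(a, b):
    # lag of the cross-correlation peak; positive means trace a lags trace b
    xc = np.correlate(a, b, mode="full")
    return (np.argmax(xc) - (len(b) - 1)) * dt

# all permutations of trace pairs, as in the abstract
for i, j in itertools.permutations(range(len(traces)), 2):
    print(f"dt({i},{j}) = {lag(traces[i], traces[j]):+.3f} s")
```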
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect of developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity with which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn, in an unsupervised manner, the behavior of individuals and of the crowd as a single entity, and to explore the relation between behavior and emotions in order to infer emotional states. Information about the motion patterns of individuals is described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. The model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states, yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance to existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
The multicategory case of the sequential Bayesian pixel selection and estimation procedure
NASA Technical Reports Server (NTRS)
Pore, M. D.; Dennis, T. B. (Principal Investigator)
1980-01-01
A Bayesian technique for stratified proportion estimation, and a sampling scheme based on minimizing the mean squared error of this estimator, were developed and tested on LANDSAT multispectral scanner data using the beta density function to model the prior distribution in the two-class case. An extension of this procedure to the k-class case is considered. A generalization of the beta function is shown to be a density function for the general case, which allows the procedure to be extended.
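The beta-to-Dirichlet generalization mentioned above is conjugate, so the k-class posterior follows in closed form; a sketch with illustrative class counts:

```python
import numpy as np

# Dirichlet(alpha) prior for k = 4 classes (Dirichlet generalizes the beta)
alpha = np.array([1.0, 1.0, 1.0, 1.0])
counts = np.array([120, 45, 30, 5])      # multinomial pixel counts per class

posterior = alpha + counts               # conjugate update: Dirichlet(alpha + counts)
a0 = posterior.sum()
post_mean = posterior / a0               # posterior mean proportions
post_var = posterior * (a0 - posterior) / (a0**2 * (a0 + 1))

print("posterior mean proportions:", post_mean)
print("posterior variances:       ", post_var)
```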
Exploiting Cross-sensitivity by Bayesian Decoding of Mixed Potential Sensor Arrays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kreller, Cortney
LANL mixed-potential electrochemical sensor (MPES) device arrays were coupled with advanced Bayesian inference treatment of the physical model of the relevant sensor-analyte interactions. We demonstrated that our approach could uniquely discriminate the composition of ternary gas mixtures with three discrete MPES sensors, with an average error of less than 2%. We also observed that the MPES exhibited excellent stability over a year of operation at elevated temperatures in the presence of test gases.
Bayesian networks and statistical analysis application to analyze the diagnostic test accuracy
NASA Astrophysics Data System (ADS)
Orzechowski, P.; Makal, Jaroslaw; Onisko, A.
2005-02-01
A computer-aided BPH diagnosis system based on a Bayesian network is described in the paper. Initial results are compared with those of a standard statistical method. Various statistical methods have been used successfully in medicine for years. However, the undoubted advantages of probabilistic methods make them useful in newly created systems, which are frequent in medicine but for which full and reliable domain knowledge is not available. The article presents the advantages of the computer-aided BPH diagnosis system for urologists in clinical practice.
Degen, Bernd; Blanc-Jolivet, Céline; Stierand, Katrin; Gillet, Elizabeth
2017-03-01
During the past decade, the use of DNA for forensic applications has been extensively implemented for plant and animal species, as well as in humans. Tracing back the geographical origin of an individual usually requires genetic assignment analysis. These approaches are based on reference samples that are grouped into populations or other aggregates and aim to identify the most likely group of origin. Often this grouping does not have a biological but rather a historical or political justification, such as "country of origin". In this paper, we present a new nearest-neighbour approach to individual assignment or classification within a given but potentially imperfect grouping of reference samples. This method, which is based on the genetic distance between individuals, functions better in many cases than commonly used methods. We demonstrate the operation of our assignment method using two data sets. One set is simulated for a large number of trees distributed in a 120 km by 120 km landscape with individual genotypes at 150 SNPs, and the other comprises experimental data for 1221 individuals of the African tropical tree species Entandrophragma cylindricum (Sapelli) genotyped at 61 SNPs. Judging by the level of correct self-assignment, our approach outperformed the commonly used frequency and Bayesian approaches by 15% for the simulated data set and by 5-7% for the Sapelli data set. Our new approach is less sensitive to overlapping sources of genetic differentiation, such as genetic differences among closely related species, phylogeographic lineages and isolation by distance, and thus operates better even for suboptimal groupings of individuals. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
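A generic nearest-neighbour assignment sketch on simulated 0/1/2 SNP genotypes; the simple Manhattan mismatch used here is a stand-in, not necessarily the genetic distance used in the paper, and the two simulated groups are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
n_snps = 61

# hypothetical reference panel: two groups with different allele frequencies
west = rng.binomial(2, 0.30, size=(100, n_snps))
east = rng.binomial(2, 0.60, size=(100, n_snps))
ref_geno = np.vstack([west, east])
ref_group = np.array(["west"] * 100 + ["east"] * 100)

def assign(query, genos, groups):
    # Manhattan distance on 0/1/2 allele-count codes; nearest neighbour wins
    d = np.abs(genos - query).sum(axis=1)
    return groups[np.argmin(d)]

query = ref_geno[17].copy()
print("assigned group:", assign(query, ref_geno, ref_group))

# leave-one-out self-assignment rate, as used to judge performance
hits = sum(assign(ref_geno[i],
                  np.delete(ref_geno, i, axis=0),
                  np.delete(ref_group, i)) == ref_group[i]
           for i in range(len(ref_geno)))
print("self-assignment rate:", hits / len(ref_geno))
```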
A program for the Bayesian Neural Network in the ROOT framework
NASA Astrophysics Data System (ADS)
Zhong, Jiahang; Huang, Run-Sheng; Lee, Shih-Chang
2011-12-01
We present a Bayesian Neural Network algorithm implemented in the TMVA package (Hoecker et al., 2007 [1]), within the ROOT framework (Brun and Rademakers, 1997 [2]). Compared to the conventional use of a neural network as a discriminator, this new implementation has advantages as a non-parametric regression tool, particularly for fitting probabilities. It provides functionalities including cost function selection, complexity control and uncertainty estimation. An example of such an application in High Energy Physics is shown. The algorithm is available with ROOT releases later than 5.29. Program summary: Program title: TMVA-BNN. Catalogue identifier: AEJX_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJX_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: BSD license. No. of lines in distributed program, including test data, etc.: 5094. No. of bytes in distributed program, including test data, etc.: 1,320,987. Distribution format: tar.gz. Programming language: C++. Computer: any computer system or cluster with a C++ compiler and a UNIX-like operating system. Operating system: most UNIX/Linux systems; the application programs were thoroughly tested under Fedora and Scientific Linux CERN. Classification: 11.9. External routines: ROOT package version 5.29 or higher (http://root.cern.ch). Nature of problem: non-parametric fitting of multivariate distributions. Solution method: an implementation of a neural network following the Bayesian statistical interpretation; uses the Laplace approximation for the Bayesian marginalizations; provides automatic complexity control and uncertainty estimation. Running time: time consumption for the training depends substantially on the size of the input sample, the NN topology, the number of training iterations, etc. For the example in this manuscript, about 7 min was used on a PC/Linux with 2.0 GHz processors.
A Bayesian Framework for Reliability Analysis of Spacecraft Deployments
NASA Technical Reports Server (NTRS)
Evans, John W.; Gallo, Luis; Kaminsky, Mark
2012-01-01
Deployable subsystems are essential to the mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications for which standardized data sources are not applicable for estimating reliability and assessing risk. In this study, a two-stage sequential Bayesian framework for reliability estimation of spacecraft deployments was developed for this purpose. The process was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the Optical Telescope Element. Initially, detailed studies of NASA deployment history ("heritage information") were conducted, extending over 45 years of spacecraft launches. This information was then coupled to a non-informative prior and a binomial likelihood function to create a posterior distribution for deployments of various subsystems, using Markov chain Monte Carlo sampling. Selected distributions were then coupled to a subsequent analysis, using test data and anomaly occurrences from successive ground-test deployments of scale-model test articles of JWST hardware, to update the NASA heritage data. This allowed a realistic prediction of the reliability of the complex Sunshield deployment, with credibility limits, within this two-stage Bayesian framework.
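The two-stage update can be sketched in conjugate form: a non-informative Beta prior updated first with heritage deployment counts and then with ground-test outcomes. The counts below are illustrative, not the JWST figures, and the study itself used MCMC sampling rather than this closed-form shortcut.

```python
from scipy import stats

a, b = 1.0, 1.0                            # stage 0: non-informative Beta(1, 1)

heritage_success, heritage_n = 48, 50      # stage 1: heritage deployments
a += heritage_success
b += heritage_n - heritage_success

test_success, test_n = 9, 9                # stage 2: scale-model ground tests
a += test_success
b += test_n - test_success

post = stats.beta(a, b)                    # posterior reliability distribution
print(f"posterior mean reliability: {post.mean():.4f}")
print(f"95% credibility interval: ({post.ppf(0.025):.4f}, {post.ppf(0.975):.4f})")
```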
Ritchie, Andrew M; Lo, Nathan; Ho, Simon Y W
2017-05-01
In Bayesian phylogenetic analyses of genetic data, prior probability distributions need to be specified for the model parameters, including the tree. When Bayesian methods are used for molecular dating, available tree priors include those designed for species-level data, such as the pure-birth and birth-death priors, and coalescent-based priors designed for population-level data. However, molecular dating methods are frequently applied to data sets that include multiple individuals across multiple species. Such data sets violate the assumptions of both the speciation and coalescent-based tree priors, making it unclear which should be chosen and whether this choice can affect the estimation of node times. To investigate this problem, we used a simulation approach to produce data sets with different proportions of within- and between-species sampling under the multispecies coalescent model. These data sets were then analyzed under pure-birth, birth-death, constant-size coalescent, and skyline coalescent tree priors. We also explored the ability of Bayesian model testing to select the best-performing priors. We confirmed the applicability of our results to empirical data sets from cetaceans, phocids, and coregonid whitefish. Estimates of node times were generally robust to the choice of tree prior, but some combinations of tree priors and sampling schemes led to large differences in the age estimates. In particular, the pure-birth tree prior frequently led to inaccurate estimates for data sets containing a mixture of inter- and intraspecific sampling, whereas the birth-death and skyline coalescent priors produced stable results across all scenarios. Model testing provided an adequate means of rejecting inappropriate tree priors. Our results suggest that tree priors do not strongly affect Bayesian molecular dating results in most cases, even when severely misspecified. However, the choice of tree prior can be significant for the accuracy of dating results in the case of data sets with mixed inter- and intraspecies sampling. [Bayesian phylogenetic methods; model testing; molecular dating; node time; tree prior.]. © The authors 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For permissions, please e-mail: journals.permission@oup.com.
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
Bayes Factor Approaches for Testing Interval Null Hypotheses
ERIC Educational Resources Information Center
Morey, Richard D.; Rouder, Jeffrey N.
2011-01-01
Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing evidence for 2 hypotheses against one another. One issue…
Byass, Peter; Huong, Dao Lan; Minh, Hoang Van
2003-01-01
Verbal autopsy (VA) has become an important tool in the past 20 years for determining cause of death in communities where there is no routine registration. In many cases, expert physicians have been used to interpret the VA findings and assign individual causes of death. However, this is time-consuming and not always repeatable. Other approaches, such as algorithms and neural networks, have been developed in some settings. This paper aims to develop a method that is simple, reliable and consistent, which could represent an advance in VA interpretation. The paper describes the development of a Bayesian probability model for VA interpretation as an attempt to find a better approach. The methodology and a preliminary implementation are described, with an evaluation based on VA material from rural Vietnam. The new model was tested against a series of 189 VA interviews from a rural community in Vietnam. Using this very basic model, over 70% of individual causes of death corresponded with those determined by two physicians, increasing to over 80% if the cases ascribed to old age or judged indeterminate by the physicians were excluded. Although there is a clear need to improve the preliminary model and to test it more extensively with larger and more varied datasets, these preliminary results suggest that there may be good potential in this probabilistic approach.
Unification of field theory and maximum entropy methods for learning probability densities
NASA Astrophysics Data System (ADS)
Kinney, Justin B.
2015-09-01
The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory thus provides a natural test of the maximum entropy null hypothesis and, furthermore, returns an alternative (lower entropy) density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided.
Bayesian analysis of the flutter margin method in aeroelasticity
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-08-27
A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method, which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-squares based estimation technique, which relies on a Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis-Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. It is shown that the probabilistic (Bayesian) approach reduces the number of test points required to provide a flutter speed estimate of a given accuracy and precision.
Ramachandran, Parameswaran; Sánchez-Taltavull, Daniel; Perkins, Theodore J
2017-01-01
Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing-with its unique statistical properties-became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca.
Gucciardi, Daniel F; Zhang, Chun-Qing; Ponnusamy, Vellapandian; Si, Gangyan; Stenling, Andreas
2016-04-01
The aims of this study were to assess the cross-cultural invariance of athletes' self-reports of mental toughness and to introduce and illustrate the application of approximate measurement invariance using Bayesian estimation for sport and exercise psychology scholars. Athletes from Australia (n = 353, Mage = 19.13, SD = 3.27, men = 161), China (n = 254, Mage = 17.82, SD = 2.28, men = 138), and Malaysia (n = 341, Mage = 19.13, SD = 3.27, men = 200) provided a cross-sectional snapshot of their mental toughness. The cross-cultural invariance of the mental toughness inventory in terms of (a) the factor structure (configural invariance), (b) factor loadings (metric invariance), and (c) item intercepts (scalar invariance) was tested using an approximate measurement framework with Bayesian estimation. Results indicated that approximate metric and scalar invariance was established. From a methodological standpoint, this study demonstrated the usefulness and flexibility of Bayesian estimation for single-sample and multigroup analyses of measurement instruments. Substantively, the current findings suggest that the measurement of mental toughness requires cultural adjustments to better capture the contextually salient (emic) aspects of this concept.
Discriminative Bayesian Dictionary Learning for Classification.
Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal
2016-12-01
We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and for object and scene-category classification, using five public datasets, and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.
Bayesian estimation of the transmissivity spatial structure from pumping test data
NASA Astrophysics Data System (ADS)
Demir, Mehmet Taner; Copty, Nadim K.; Trinchero, Paolo; Sanchez-Vila, Xavier
2017-06-01
Estimating the statistical parameters (mean, variance, and integral scale) that define the spatial structure of the transmissivity or hydraulic conductivity fields is a fundamental step for the accurate prediction of subsurface flow and contaminant transport. In practice, the determination of the spatial structure is a challenge because of spatial heterogeneity and data scarcity. In this paper, we describe a novel approach that uses time-drawdown data from multiple pumping tests to determine the transmissivity statistical spatial structure. The method builds on the pumping test interpretation procedure of Copty et al. (2011) (Continuous Derivation method, CD), which uses the time-drawdown data and its time derivative to estimate apparent transmissivity values as a function of radial distance from the pumping well. A Bayesian approach is then used to infer the statistical parameters of the transmissivity field by combining prior information about the parameters with the likelihood function, expressed in terms of the radially-dependent apparent transmissivities determined from the pumping tests. A major advantage of the proposed Bayesian approach is that the likelihood function is readily determined from randomly generated multiple realizations of the transmissivity field, without the need to solve the groundwater flow equation. Applying the method to synthetically-generated pumping test data, we demonstrate that, through a relatively simple procedure, information on the spatial structure of the transmissivity may be inferred from pumping test data. It is also shown that the prior parameter distribution has a significant influence on the estimates, given the non-uniqueness of the inverse problem. Results also indicate that the reliability of the estimated transmissivity statistical parameters increases with the number of available pumping tests.
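The key computational trick, estimating the likelihood purely from simulated realizations, can be illustrated with a generic simulation-based (synthetic-likelihood) sketch. The stand-in simulator below is only a placeholder for the geostatistical generation of apparent transmissivities, and the parameterization and data are invented.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

def mc_likelihood(theta, observed, simulate, n_real=500):
    # Estimate L(observed | theta) from simulated realizations only,
    # mirroring the idea that no flow equation needs to be solved:
    # simulate(theta) returns one synthetic apparent-transmissivity value.
    sims = np.array([simulate(theta) for _ in range(n_real)])
    return gaussian_kde(sims)(observed).prod()

def simulate(theta):
    # Hypothetical stand-in for the geostatistical simulator: apparent log-T
    # drawn around the field mean with spread set by the field variability.
    mu, sigma = theta
    return rng.normal(mu, sigma)

obs = np.array([2.1, 1.8, 2.4])   # apparent log-transmissivities (made up)
grid = [(mu, s) for mu in np.linspace(1, 3, 21) for s in (0.2, 0.5, 1.0)]
post = np.array([mc_likelihood(t, obs, simulate) for t in grid])  # flat prior
post /= post.sum()
print(grid[post.argmax()])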
NASA Astrophysics Data System (ADS)
Chen, Po-Hao; Botzolakis, Emmanuel; Mohan, Suyash; Bryan, R. N.; Cook, Tessa
2016-03-01
In radiology, diagnostic errors occur either through failure of detection or through incorrect interpretation. Errors are estimated to occur in 30-35% of all exams and contribute to 40-54% of medical malpractice litigations. In this work, we focus on reducing the incorrect interpretation of known imaging features. The existing literature categorizes the cognitive biases that can lead a radiologist to an incorrect diagnosis despite correct recognition of the abnormal imaging features: anchoring bias, framing effect, availability bias, and premature closure. Computational methods make a unique contribution here, as they do not exhibit the same cognitive biases as a human. Bayesian networks formalize the diagnostic process: they modify pre-test diagnostic probabilities using clinical and imaging features, arriving at a post-test probability for each possible diagnosis. To translate Bayesian networks to clinical practice, we implemented an entirely web-based, open-source software tool. In this tool, the radiologist first selects a network of choice (e.g., basal ganglia). Large, clearly labeled buttons displaying salient imaging features are then shown on the screen, serving both as a checklist and as the input mechanism. As the radiologist enters the value of an extracted imaging feature, the conditional probabilities of each possible diagnosis are updated. The software presents its level of diagnostic discrimination using a Pareto distribution chart, updated with each additional imaging feature. Active collaboration with the clinical radiologist is a feasible approach to software design and leads to design decisions closely coupling the complex mathematics of conditional probability in Bayesian networks with practice.
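The update the tool performs at each button press is ordinary Bayesian conditioning. A minimal sketch follows, using a naive-Bayes independence simplification and invented diagnoses, features, and probabilities; the real tool encodes a full network with proper conditional dependencies.

# Pre-test probabilities over diagnoses are multiplied by
# P(feature value | diagnosis) for each imaging feature entered.
priors = {"toxic/metabolic": 0.5, "neoplasm": 0.3, "infection": 0.2}
likelihoods = {  # P(feature present | diagnosis); all numbers invented
    "restricted_diffusion": {"toxic/metabolic": 0.7, "neoplasm": 0.2, "infection": 0.5},
    "enhancement":          {"toxic/metabolic": 0.1, "neoplasm": 0.8, "infection": 0.6},
}

def post_test(findings):
    # findings: dict feature -> True/False, as entered on the checklist
    post = dict(priors)
    for feat, present in findings.items():
        for dx in post:
            p = likelihoods[feat][dx]
            post[dx] *= p if present else (1 - p)
    z = sum(post.values())
    return {dx: v / z for dx, v in post.items()}

print(post_test({"restricted_diffusion": True, "enhancement": False}))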
Models and simulation of 3D neuronal dendritic trees using Bayesian networks.
López-Cruz, Pedro L; Bielza, Concha; Larrañaga, Pedro; Benavides-Piccione, Ruth; DeFelipe, Javier
2011-12-01
Neuron morphology is crucial for neuronal connectivity and brain information processing. Computational models are important tools for studying dendritic morphology and its role in brain function. We applied a class of probabilistic graphical models called Bayesian networks to generate virtual dendrites from layer III pyramidal neurons from three different regions of the neocortex of the mouse. A set of 41 morphological variables were measured from the 3D reconstructions of real dendrites and their probability distributions used in a machine learning algorithm to induce the model from the data. A simulation algorithm is also proposed to obtain new dendrites by sampling values from Bayesian networks. The main advantage of this approach is that it takes into account and automatically locates the relationships between variables in the data instead of using predefined dependencies. Therefore, the methodology can be applied to any neuronal class while at the same time exploiting class-specific properties. Also, a Bayesian network was defined for each part of the dendrite, allowing the relationships to change in the different sections and to model heterogeneous developmental factors or spatial influences. Several univariate statistical tests and a novel multivariate test based on Kullback-Leibler divergence estimation confirmed that virtual dendrites were similar to real ones. The analyses of the models showed relationships that conform to current neuroanatomical knowledge and support model correctness. At the same time, studying the relationships in the models can help to identify new interactions between variables related to dendritic morphology.
Probabilistic peak detection in CE-LIF for STR DNA typing.
Woldegebriel, Michael; van Asten, Arian; Kloosterman, Ate; Vivó-Truyols, Gabriel
2017-07-01
In this work, we present a novel probabilistic peak detection algorithm based on a Bayesian framework for forensic DNA analysis. The proposed method aims at an exhaustive use of raw electropherogram data from a laser-induced fluorescence multi-CE system. Because the raw data are informative down to the single data point, conventional threshold-based approaches discard relevant forensic information early in the data analysis pipeline. Our proposed method assigns each data point a posterior probability reflecting its relevance with respect to the peak detection criteria. Peaks of low intensity generated by a truly existing allele can thus retain evidential value, rather than being discarded outright and treated as a potential allele drop-out. This way of working exploits the information available in each individual data point and avoids making early (binary) decisions in the data analysis that can lead to error propagation. The proposed method was tested and compared to the application of a set threshold, as is current practice in forensic STR DNA profiling. The new method was found to yield a significant improvement in the number of alleles identified, regardless of peak height and deviation from Gaussian shape.
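A toy single-point version of this idea: instead of thresholding, assign each data point a posterior probability of carrying peak signal by comparing a noise-only likelihood with a peak likelihood marginalized over an unknown amplitude. The noise level, amplitude grid, and prior below are invented, and the paper's actual model operates on whole electropherograms rather than isolated points.

import numpy as np
from scipy.stats import norm

def point_posterior(y, baseline=0.0, sigma=1.0, prior_peak=0.05,
                    amp_grid=np.linspace(0.5, 20, 40)):
    # Two-hypothesis Bayes for one data point: baseline noise vs.
    # baseline + peak signal of unknown amplitude (flat grid prior).
    like_noise = norm.pdf(y, baseline, sigma)
    like_peak = norm.pdf(y, baseline + amp_grid, sigma).mean()
    num = prior_peak * like_peak
    return num / (num + (1 - prior_peak) * like_noise)

for y in (0.3, 2.0, 6.0):
    print(y, round(point_posterior(y), 3))

A low-intensity point receives an intermediate posterior rather than a hard accept/reject call, which is precisely what lets weak true-allele peaks retain evidential value downstream.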
Mycofier: a new machine learning-based classifier for fungal ITS sequences.
Delgado-Serrano, Luisa; Restrepo, Silvia; Bustos, Jose Ricardo; Zambrano, Maria Mercedes; Anzola, Juan Manuel
2016-08-11
The taxonomic and phylogenetic classification of fungi based on sequence analysis of the ITS1 genomic region has become a crucial component of fungal ecology and diversity studies. To date, however, there has been no accurate alignment-free classification tool for fungal ITS1 sequences suitable for large environmental surveys. This study describes the development of a machine learning-based classifier for the taxonomic assignment of fungal ITS1 sequences at the genus level. A fungal ITS1 sequence database was built using curated data, and training and test sets were generated from it. A naïve Bayesian classifier was built using features from the primary sequence, achieving 87% accuracy in classification at the genus level. The final model was based on a naïve Bayes algorithm using ITS1 sequences from 510 fungal genera. This classifier, denoted Mycofier, provides classification accuracy similar to that of BLASTN, but the database used for classification contains curated data, and the tool, being independent of alignment, is more efficient and contributes to the field, given the lack of an accurate classification tool for large volumes of fungal ITS1 sequence data. The software and source code for Mycofier are freely available at https://github.com/ldelgado-serrano/mycofier.git.
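The general shape of such a classifier, multinomial naive Bayes over sequence-derived features, can be sketched as follows. The k-mer size, smoothing constant, and toy sequences are assumptions for illustration, not Mycofier's actual settings.

import numpy as np
from collections import Counter
from itertools import product

K = 4  # k-mer size; Mycofier's actual feature set may differ
KMERS = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

def features(seq):
    v = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        j = KMERS.get(seq[i:i + K])
        if j is not None:
            v[j] += 1
    return v

def train(seqs, genera, alpha=1.0):
    # Multinomial naive Bayes with Laplace smoothing over k-mer counts
    classes = sorted(set(genera))
    counts = {c: alpha * np.ones(len(KMERS)) for c in classes}
    for s, g in zip(seqs, genera):
        counts[g] += features(s)
    logp = {c: np.log(v / v.sum()) for c, v in counts.items()}
    logprior = {c: np.log(genera.count(c) / len(genera)) for c in classes}
    return logp, logprior

def classify(seq, logp, logprior):
    f = features(seq)
    return max(logp, key=lambda c: logprior[c] + f @ logp[c])

seqs = ["ACGTACGTGGCA", "ACGTACGTGGCC", "TTGGCCAATTGG"]   # toy training data
genera = ["Fusarium", "Fusarium", "Alternaria"]
logp, logprior = train(seqs, genera)
print(classify("ACGTACGTGGCG", logp, logprior))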
Sidibé, Cheick Abou Kounta; Grosbois, Vladimir; Thiaucourt, François; Niang, Mamadou; Lesnoff, Matthieu; Roger, François
2012-08-01
A Bayesian approach allowing for conditional dependence between two tests was used to estimate, without a gold standard, the sensitivities of the complement fixation test (CFT) and the competitive enzyme-linked immunosorbent assay (cELISA), together with the serological prevalence of contagious bovine pleuropneumonia (CBPP), in a cattle population of the Central Delta of the Niger River in Mali, where CBPP is enzootic and the true prevalence and the animals' serological states were unknown. A difference between the sensitivities of the two tests was supported with high posterior probability (P = 0.99); they were estimated at 73.7% (95% probability interval [PI], 63.4-82.7) for cELISA and 42.3% (95% PI, 33.3-53.7) for CFT. Individual-level serological prevalence in the study population was estimated at 14.1% (95% PI, 10.8-16.9). Our results indicate that in enzootic areas, cELISA performs better in terms of sensitivity than CFT. However, a negative conditional dependence between the sensitivities of the two tests was detected, implying that to achieve maximum sensitivity the two tests should be applied in parallel.
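The "no gold standard" machinery is a latent-class model: given current parameter values, each animal's unknown infection status is imputed, after which prevalence, sensitivities, and specificities have conjugate Beta updates. The Gibbs sketch below assumes conditional independence between the tests for brevity (the study itself models the dependence) and uses invented counts; with two tests in one population the model is only weakly identified, so the priors matter.

import numpy as np

rng = np.random.default_rng(0)

# Counts by (test-1 result, test-2 result): (+,+), (+,-), (-,+), (-,-)
n = np.array([40, 12, 85, 400])          # illustrative, not the Mali data
patterns = [(1, 1), (1, 0), (0, 1), (0, 0)]

def gibbs(n, iters=5000, burn=1000):
    prev, se, sp = 0.2, np.array([0.7, 0.7]), np.array([0.95, 0.95])
    out = []
    for it in range(iters):
        # 1. Impute latent true-infection counts per test-result pattern
        y = np.zeros(4)
        for j, (t1, t2) in enumerate(patterns):
            a = prev * (se[0] if t1 else 1 - se[0]) * (se[1] if t2 else 1 - se[1])
            b = (1 - prev) * ((1 - sp[0]) if t1 else sp[0]) * ((1 - sp[1]) if t2 else sp[1])
            y[j] = rng.binomial(n[j], a / (a + b))
        # 2. Conjugate Beta(1,1) updates given the imputed states
        pos = y.sum(); neg = n.sum() - pos
        prev = rng.beta(1 + pos, 1 + neg)
        for k in range(2):
            tp = sum(y[j] for j, p in enumerate(patterns) if p[k] == 1)
            tn = sum(n[j] - y[j] for j, p in enumerate(patterns) if p[k] == 0)
            se[k] = rng.beta(1 + tp, 1 + (pos - tp))
            sp[k] = rng.beta(1 + tn, 1 + (neg - tn))
        if it >= burn:
            out.append((prev, *se, *sp))
    return np.array(out)

print(np.median(gibbs(n), axis=0))   # prev, Se1, Se2, Sp1, Sp2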
NASA Astrophysics Data System (ADS)
Eadie, Gwendolyn M.; Springford, Aaron; Harris, William E.
2017-02-01
We present a hierarchical Bayesian method for estimating the total mass and mass profile of the Milky Way Galaxy. The new hierarchical Bayesian approach further improves the framework presented by Eadie et al. and Eadie and Harris and builds upon the preliminary reports by Eadie et al. The method uses a distribution function f(E, L) to model the Galaxy and kinematic data from satellite objects, such as globular clusters (GCs), to trace the Galaxy's gravitational potential. A major advantage of the method is that it not only includes complete and incomplete data simultaneously in the analysis, but also incorporates measurement uncertainties in a coherent and meaningful way. We first test the hierarchical Bayesian framework, which includes measurement uncertainties, using the same data and power-law model assumed in Eadie and Harris, and find the results are similar but more strongly constrained. Next, we take advantage of the new statistical framework and incorporate all possible GC data, finding a cumulative mass profile with Bayesian credible regions. This profile implies a mass within 125 kpc of 4.8 × 10^11 M⊙ with a 95% Bayesian credible region of (4.0-5.8) × 10^11 M⊙. Our results also provide estimates of the true specific energies of all the GCs. By comparing these estimated energies to the measured energies of GCs with complete velocity measurements, we observe that (the few) remote tracers with complete measurements may play a large role in determining a total mass estimate of the Galaxy. Thus, our study stresses the need for more remote tracers with complete velocity measurements.
Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L
2016-02-10
Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure.
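The essence of the approach, averaging the likelihood over many plausible dose vectors instead of plugging in one best-estimate dose, can be sketched as follows. A simple logistic dose-response model and simulated shared/unshared multiplicative errors stand in for the study's actual dose-response model and two-dimensional Monte Carlo dosimetry; all numbers are invented.

import numpy as np
from scipy.special import expit

rng = np.random.default_rng(2)

# 300 subjects, 50 Monte Carlo dose vectors sharing an uncertain
# calibration factor (shared error) plus per-person noise (unshared error).
n, M = 300, 50
true_dose = rng.lognormal(0.0, 0.5, n)
shared = rng.lognormal(0.0, 0.3, M)                    # one factor per vector
dose_vectors = shared[:, None] * true_dose[None, :] * rng.lognormal(0, 0.2, (M, n))
y = rng.binomial(1, expit(-2.0 + 0.8 * true_dose))     # outcomes from true dose

def loglik(beta, d):
    # Logistic dose-response with a fixed intercept, for brevity
    p = expit(-2.0 + beta * d)
    return np.sum(y * np.log(p) + (1 - y) * np.log1p(-p))

betas = np.linspace(0, 2, 81)
# Model-averaged posterior: average the likelihood over dose realizations
# (flat prior on beta) rather than conditioning on one "best" dose vector.
ll = np.array([[loglik(b, d) for b in betas] for d in dose_vectors])
post = np.exp(ll - ll.max()).mean(axis=0)
post /= post.sum()
print("posterior mean beta:", (betas * post).sum())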
Molecular and morphologic data reveal multiple species in Peromyscus pectoralis
Bradley, Robert D.; Schmidly, David J.; Amman, Brian R.; Platt, Roy N.; Neumann, Kathy M.; Huynh, Howard M.; Muñiz-Martínez, Raúl; López-González, Celia; Ordóñez-Garza, Nicté
2015-01-01
DNA sequence and morphometric data were used to re-evaluate the taxonomy and systematics of Peromyscus pectoralis. Phylogenetic analyses (maximum likelihood and Bayesian inference) of DNA sequences from the mitochondrial cytochrome-b gene in 44 samples of P. pectoralis indicated 2 well-supported monophyletic clades. The 1st clade contained specimens from Texas historically assigned to P. p. laceianus; the 2nd comprised specimens previously referable to P. p. collinus, P. p. laceianus, and P. p. pectoralis obtained from northern and eastern Mexico. Levels of genetic variation (~7%) between these 2 clades indicated that the genetic divergence typically exceeded that reported for other species of Peromyscus. Samples of P. p. laceianus north and south of the Río Grande were not monophyletic. In addition, samples representing P. p. collinus and P. p. pectoralis formed 2 clades that differed genetically by 7.14%. Multivariate analyses of external and cranial measurements from 63 populations of P. pectoralis revealed 4 morpho-groups consistent with clades in the DNA sequence analysis: 1 from Texas and New Mexico assignable to P. p. laceianus; a 2nd from western and southern Mexico assignable to P. p. pectoralis; a 3rd from northern and central Mexico previously assigned to P. p. pectoralis but herein shown to represent an undescribed taxon; and a 4th from southeastern Mexico assignable to P. p. collinus. Based on the concordance of these results, populations from the United States are referred to as P. laceianus, whereas populations from Mexico are referred to as P. pectoralis (including some samples historically assigned to P. p. collinus, P. p. laceianus, and P. p. pectoralis). A new subspecies is described to represent populations south of the Río Grande in northern and central Mexico. Additional research is needed to discern if P. p. collinus warrants species recognition. PMID:26937045
New insights into the classification and nomenclature of cortical GABAergic interneurons.
DeFelipe, Javier; López-Cruz, Pedro L; Benavides-Piccione, Ruth; Bielza, Concha; Larrañaga, Pedro; Anderson, Stewart; Burkhalter, Andreas; Cauli, Bruno; Fairén, Alfonso; Feldmeyer, Dirk; Fishell, Gord; Fitzpatrick, David; Freund, Tamás F; González-Burgos, Guillermo; Hestrin, Shaul; Hill, Sean; Hof, Patrick R; Huang, Josh; Jones, Edward G; Kawaguchi, Yasuo; Kisvárday, Zoltán; Kubota, Yoshiyuki; Lewis, David A; Marín, Oscar; Markram, Henry; McBain, Chris J; Meyer, Hanno S; Monyer, Hannah; Nelson, Sacha B; Rockland, Kathleen; Rossier, Jean; Rubenstein, John L R; Rudy, Bernardo; Scanziani, Massimo; Shepherd, Gordon M; Sherwood, Chet C; Staiger, Jochen F; Tamás, Gábor; Thomson, Alex; Wang, Yun; Yuste, Rafael; Ascoli, Giorgio A
2013-03-01
A systematic classification and accepted nomenclature of neuron types is much needed but is currently lacking. This article describes a possible taxonomical solution for classifying GABAergic interneurons of the cerebral cortex based on a novel, web-based interactive system that allows experts to classify neurons with pre-determined criteria. Using Bayesian analysis and clustering algorithms on the resulting data, we investigated the suitability of several anatomical terms and neuron names for cortical GABAergic interneurons. Moreover, we show that supervised classification models could automatically categorize interneurons in agreement with experts' assignments. These results demonstrate a practical and objective approach to the naming, characterization and classification of neurons based on community consensus.
Learning time series for intelligent monitoring
NASA Technical Reports Server (NTRS)
Manganaris, Stefanos; Fisher, Doug
1994-01-01
We address the problem of classifying time series according to their morphological features in the time domain. In a supervised machine-learning framework, we induce a classification procedure from a set of preclassified examples. For each class, we infer a model that captures its morphological features using Bayesian model induction and the minimum message length approach to assign priors. In the performance task, we classify a time series into one of the learned classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. We report results from experiments in a monitoring domain of interest to NASA.
QUEST+: A general multidimensional Bayesian adaptive psychometric method.
Watson, Andrew B
2017-03-01
QUEST+ is a Bayesian adaptive psychometric testing method that allows an arbitrary number of stimulus dimensions, psychometric function parameters, and trial outcomes. It is a generalization and extension of the original QUEST procedure and incorporates many subsequent developments in the area of parametric adaptive testing. With a single procedure, it is possible to implement a wide variety of experimental designs, including conventional threshold measurement; measurement of psychometric function parameters, such as slope and lapse; estimation of the contrast sensitivity function; measurement of increment threshold functions; measurement of noise-masking functions; Thurstone scale estimation using pair comparisons; and categorical ratings on linear and circular stimulus dimensions. QUEST+ provides a general method to accelerate data collection in many areas of cognitive and perceptual science.
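A one-parameter special case conveys the QUEST+ recipe: maintain a grid posterior over the psychometric parameters and present, on each trial, the stimulus that minimizes the expected posterior entropy over outcomes. The Weibull form and all numerical settings below are illustrative rather than QUEST+ defaults.

import numpy as np

thresholds = np.linspace(-10, 0, 41)   # candidate thresholds (dB)
stimuli = np.linspace(-12, 2, 29)      # available stimulus levels (dB)
posterior = np.ones(len(thresholds)) / len(thresholds)

def p_correct(stim, thr, slope=3.5, guess=0.5, lapse=0.02):
    # Weibull psychometric function for a 2AFC task
    return guess + (1 - guess - lapse) * (1 - np.exp(-10 ** (slope * (stim - thr) / 20)))

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def next_stimulus(posterior):
    # Expected posterior entropy, averaged over the two possible outcomes
    scores = []
    for s in stimuli:
        pc = p_correct(s, thresholds)
        h = 0.0
        for lik in (pc, 1 - pc):
            w = posterior * lik
            pr = w.sum()
            h += pr * entropy(w / pr)
        scores.append(h)
    return stimuli[int(np.argmin(scores))]

def update(posterior, stim, correct):
    lik = p_correct(stim, thresholds)
    w = posterior * (lik if correct else 1 - lik)
    return w / w.sum()

rng = np.random.default_rng(4)
true_thr = -6.0
for trial in range(40):                 # simulated observer
    s = next_stimulus(posterior)
    posterior = update(posterior, s, rng.random() < p_correct(s, true_thr))
print("threshold estimate:", thresholds @ posterior)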
Bayesian Peptide Peak Detection for High Resolution TOF Mass Spectrometry.
Zhang, Jianqiu; Zhou, Xiaobo; Wang, Honghui; Suffredini, Anthony; Zhang, Lin; Huang, Yufei; Wong, Stephen
2010-11-01
In this paper, we address the issue of peptide ion peak detection for high resolution time-of-flight (TOF) mass spectrometry (MS) data. A novel Bayesian peptide ion peak detection method is proposed for TOF data with resolution of 10 000-15 000 full width at half-maximum (FWHM). MS spectra exhibit distinct characteristics at this resolution, which are captured in a novel parametric model. Based on the proposed parametric model, a Bayesian peak detection algorithm based on Markov chain Monte Carlo (MCMC) sampling is developed. The proposed algorithm is tested on both simulated and real datasets. The results show a significant improvement in detection performance over a commonly employed method, and they agree with experts' visual inspection. Moreover, better detection consistency is achieved across MS datasets from patients with identical pathological conditions.
NASA Astrophysics Data System (ADS)
Williams, Christopher J.; Moffitt, Christine M.
2003-03-01
An important emerging issue in fisheries biology is the health of free-ranging populations of fish, particularly with respect to the prevalence of certain pathogens. For many years, pathologists focused on captive populations and interest was in the presence or absence of certain pathogens, so it was economically attractive to test pooled samples of fish. Recently, investigators have begun to study individual fish prevalence from pooled samples. Estimation of disease prevalence from pooled samples is straightforward when assay sensitivity and specificity are perfect, but this assumption is unrealistic. Here we illustrate the use of a Bayesian approach for estimating disease prevalence from pooled samples when sensitivity and specificity are not perfect. We also focus on diagnostic plots to monitor the convergence of the Gibbs-sampling-based Bayesian analysis. The methods are illustrated with a sample data set.
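For a fixed assay sensitivity and specificity, the pooled-sample likelihood is simple enough to show directly. The sketch below computes a grid posterior for prevalence; the paper's Gibbs-sampling analysis additionally treats sensitivity and specificity as uncertain, which is where the convergence diagnostics matter. All counts and test characteristics are invented.

import numpy as np

def pool_posterior(n_pools, n_pos, pool_size, se, sp,
                   grid=np.linspace(0, 1, 1001)):
    # A pool is truly positive if it contains at least one infected fish;
    # the imperfect assay then flags it with prob. se (or 1-sp if clean).
    p_pool_infected = 1 - (1 - grid) ** pool_size
    p_test_pos = se * p_pool_infected + (1 - sp) * (1 - p_pool_infected)
    loglik = n_pos * np.log(p_test_pos) + (n_pools - n_pos) * np.log1p(-p_test_pos)
    post = np.exp(loglik - loglik.max())     # flat prior on prevalence
    return grid, post / post.sum()

grid, post = pool_posterior(n_pools=60, n_pos=9, pool_size=5, se=0.95, sp=0.98)
print("posterior mean prevalence:", (grid * post).sum())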
Bockman, Alexander; Fackler, Cameron; Xiang, Ning
2015-04-01
Acoustic performance for an interior requires an accurate description of the boundary materials' surface acoustic impedance. Analytical methods may be applied to a small class of test geometries, but inverse numerical methods provide greater flexibility. The parameter estimation problem requires minimizing the discrepancy between predicted and observed acoustic field pressures. The Bayesian-network sampling approach presented here mitigates other methods' susceptibility to the noise inherent in the experiment, model, and numerics. A geometry-agnostic method is developed here and its parameter estimation performance is demonstrated for an air-backed micro-perforated panel in an impedance tube. Good agreement is found with predictions from the ISO standard two-microphone impedance-tube method and with a theoretical model for the material. Data by-products exclusive to a Bayesian approach are analyzed to assess the sensitivity of the method to nuisance parameters.
Qian, Song S; Lyons, Regan E
2006-10-01
We present a Bayesian approach for characterizing background contaminant concentration distributions using data from sites that may have been contaminated. Our method, focused on estimation, resolves several technical problems of the existing methods sanctioned by the U.S. Environmental Protection Agency (USEPA) (a hypothesis-testing-based method), resulting in a simple and quick procedure for estimating background contaminant concentrations. The proposed Bayesian method is applied to two data sets from a federal facility regulated under the Resource Conservation and Recovery Act. The results are compared to background distributions identified using existing methods recommended by the USEPA. The two data sets represent low and moderate levels of censoring in the data. Although an unbiased estimator is elusive, we show that the proposed Bayesian estimation method will have a smaller bias than the EPA-recommended method.
Foster, Charles S P; Sauquet, Hervé; van der Merwe, Marlien; McPherson, Hannah; Rossetto, Maurizio; Ho, Simon Y W
2017-05-01
The evolutionary timescale of angiosperms has long been a key question in biology. Molecular estimates of this timescale have shown considerable variation, being influenced by differences in taxon sampling, gene sampling, fossil calibrations, evolutionary models, and choices of priors. Here, we analyze a data set comprising 76 protein-coding genes from the chloroplast genomes of 195 taxa spanning 86 families, including novel genome sequences for 11 taxa, to evaluate the impact of models, priors, and gene sampling on Bayesian estimates of the angiosperm evolutionary timescale. Using a Bayesian relaxed molecular-clock method, with a core set of 35 minimum and two maximum fossil constraints, we estimated that crown angiosperms arose 221 (251-192) Ma during the Triassic. Based on a range of additional sensitivity and subsampling analyses, we found that our date estimates were generally robust to large changes in the parameters of the birth-death tree prior and of the model of rate variation across branches. We found an exception to this when we implemented fossil calibrations in the form of highly informative gamma priors rather than as uniform priors on node ages. Under all other calibration schemes, including trials of seven maximum age constraints, we consistently found that the earliest divergences of angiosperm clades substantially predate the oldest fossils that can be assigned unequivocally to their crown group. Overall, our results and experiments with genome-scale data suggest that reliable estimates of the angiosperm crown age will require increased taxon sampling, significant methodological changes, and new information from the fossil record. [Angiospermae, chloroplast, genome, molecular dating, Triassic.]
Feng, Dai; Svetnik, Vladimir; Coimbra, Alexandre; Baumgartner, Richard
2014-01-01
The intraclass correlation coefficient (ICC) with fixed raters or, equivalently, the concordance correlation coefficient (CCC) for continuous outcomes is a widely accepted aggregate index of agreement in settings with a small number of raters. Quantifying the precision of the CCC by constructing its confidence interval (CI) is important in early drug development applications, in particular in the qualification of biomarker platforms. In recent years, several new methods have been proposed for the construction of CIs for the CCC, but a comprehensive comparison has not been attempted. The methods comprise the delta method and jackknifing, each with and without Fisher's Z-transformation, as well as Bayesian methods with vague priors. We carried out a simulation study, with data simulated from a multivariate normal as well as a heavier-tailed distribution (t-distribution with 5 degrees of freedom), to compare the state-of-the-art methods for constructing CIs for the CCC. When the data are normally distributed, jackknifing with Fisher's Z-transformation (JZ) tended to provide superior coverage, and the difference between it and the closest competitor, the Bayesian method with the Jeffreys prior, was in general minimal. For the nonnormal data, the jackknife methods, especially the JZ method, provided coverage probabilities closest to nominal, in contrast to the others, which yielded overly liberal coverage. Approaches based upon the delta method and the Bayesian method with conjugate prior generally provided slightly narrower intervals and larger lower bounds than the others, though this was offset by their poor coverage. Finally, we illustrate the utility of the CIs for the CCC in an example of a wake after sleep onset (WASO) biomarker, which is frequently used in clinical sleep studies of drugs for the treatment of insomnia.
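For reference, the CCC point estimate and the jackknife-with-Fisher-Z (JZ) interval that performed best in these simulations can be computed as follows; the two-rater data are invented.

import numpy as np

def ccc(x, y):
    # Lin's concordance correlation coefficient for two raters
    mx, my = x.mean(), y.mean()
    sx, sy = x.var(), y.var()          # population variances, as in Lin (1989)
    sxy = np.mean((x - mx) * (y - my))
    return 2 * sxy / (sx + sy + (mx - my) ** 2)

def ccc_ci_jackknife_z(x, y, z=1.96):
    # Leave-one-out pseudo-values on Fisher's Z scale, then back-transform
    n = len(x)
    full = np.arctanh(ccc(x, y))
    loo = np.array([np.arctanh(ccc(np.delete(x, i), np.delete(y, i)))
                    for i in range(n)])
    pseudo = n * full - (n - 1) * loo
    est, se = pseudo.mean(), pseudo.std(ddof=1) / np.sqrt(n)
    return np.tanh(est - z * se), np.tanh(est + z * se)

x = np.array([10.1, 11.2, 9.8, 12.0, 10.7, 11.5])
y = np.array([10.4, 11.0, 10.1, 12.3, 10.5, 11.9])
print(ccc(x, y), ccc_ci_jackknife_z(x, y))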
Inferring on the Intentions of Others by Hierarchical Bayesian Learning
Diaconescu, Andreea O.; Mathys, Christoph; Weber, Lilian A. E.; Daunizeau, Jean; Kasper, Lars; Lomakina, Ekaterina I.; Fehr, Ernst; Stephan, Klaas E.
2014-01-01
Inferring on others' (potentially time-varying) intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to “player” or “adviser” roles) interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI) scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i) employ hierarchical generative models to infer on the changing intentions of others, (ii) use volatility estimates to inform decision-making in social interactions, and (iii) integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition. PMID:25187943
2013-01-01
Background: There is a rising public and political demand for prospective cancer cluster monitoring, but there is little empirical evidence on the performance of established cluster detection tests under conditions of small and heterogeneous sample sizes and varying spatial scales, such as are the case for most existing population-based cancer registries. This simulation study therefore aims to evaluate different cluster detection methods, implemented in the open source environment R, in their ability to identify clusters of lung cancer using real-life data from an epidemiological cancer registry in Germany. Methods: Risk surfaces were constructed with two different spatial cluster types, representing a relative risk of RR = 2.0 or of RR = 4.0, in relation to the overall background incidence of lung cancer, separately for men and women. Lung cancer cases were sampled from this risk surface as geocodes using an inhomogeneous Poisson process. The realisations of the cancer cases were analysed within small spatial scales (census tracts, N = 1983) and within aggregated large spatial scales (communities, N = 78). Subsequently, they were submitted to the cluster detection methods. The test accuracy for cluster location was determined in terms of detection rates (DR), false-positive (FP) rates and positive predictive values. The Bayesian smoothing models were evaluated using ROC curves. Results: With a moderate risk increase (RR = 2.0), local cluster tests showed better DR (for both spatial aggregation scales > 0.90) and lower FP rates (both < 0.05) than the Bayesian smoothing methods. When the cluster RR was raised four-fold, the local cluster tests showed better DR with lower FPs only for the small spatial scale. At the large spatial scale, the Bayesian smoothing methods, especially those implementing a spatial neighbourhood, showed a substantially lower FP rate than the cluster tests. However, the risk increases at this scale were mostly diluted by data aggregation. Conclusion: High-resolution spatial scales seem more appropriate as a basis for cancer cluster testing and monitoring than the commonly used aggregated scales. We suggest the development of a two-stage approach that combines methods with high detection rates as a first-line screening with methods of higher predictive ability at the second stage. PMID:24314148
Scheduling viability tests for seeds in long-term storage based on a Bayesian Multi-Level Model
USDA-ARS's Scientific Manuscript database
Genebank managers conduct viability tests on stored seeds so they can replace lots that have viability near a critical threshold, such as 50 or 85% germination. Currently, these tests are typically scheduled at uniform intervals; testing every 5 years is common. A manager needs to balance the cost...
Persichetti, Maria Flaminia; Solano-Gallego, Laia; Vullo, Angela; Masucci, Marisa; Marty, Pierre; Delaunay, Pascal; Vitale, Fabrizio; Pennisi, Maria Grazia
2017-03-13
Anti-Leishmania antibodies are increasingly investigated in cats for epidemiological studies or for the diagnosis of clinical feline leishmaniosis. The immunofluorescent antibody test (IFAT), the enzyme-linked immunosorbent assay (ELISA) and western blot (WB) are the most frequently used serological tests. The aim of the present study was to assess the diagnostic performance of IFAT, ELISA and WB in detecting anti-L. infantum antibodies in feline serum samples obtained from endemic (n = 76) and non-endemic (n = 64) areas and from cats affected by feline leishmaniosis (n = 21), using a Bayesian approach without a gold standard. Cut-offs were set at a titre of 80 for IFAT and at 40 ELISA units for ELISA. WB was considered positive in the presence of at least an 18 kDa band. Statistical analysis was performed with a routine written in MATLAB within the Bayesian framework. Latent data and observations from the joint posterior were simulated by an iterative Markov chain Monte Carlo technique using the Gibbs sampler to estimate the sensitivity and specificity of the three tests. The median seroprevalence in the sample used for evaluating the performance of the tests was estimated at 0.27 [credible interval (CI) = 0.20-0.34]. The median sensitivity of the three methods was 0.97 (CI: 0.86-1.00), 0.75 (CI: 0.61-0.87) and 0.70 (CI: 0.56-0.83) for WB, IFAT and ELISA, respectively. Median specificity reached 0.99 (CI: 0.96-1.00) with WB, 0.97 (CI: 0.93-0.99) with IFAT and 0.98 (CI: 0.94-1.00) with ELISA. IFAT was more sensitive than ELISA (75 vs 70%) for the detection of subclinical infection, while ELISA was better than IFAT for diagnosing clinical leishmaniosis (98 vs 97%). The overall performance of all serological techniques was good, and the most accurate test for anti-Leishmania antibody detection in feline serum samples was WB.
Cyber-T web server: differential analysis of high-throughput data.
Kayala, Matthew A; Baldi, Pierre
2012-07-01
The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001;17:509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived by pooling measurements associated with probes in the same neighborhood. This approach flexibly addresses problems associated with low replication levels and technology biases, not only for DNA microarrays but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several useful new additions and improvements. Several preprocessing data normalization options, including logarithmic and variance stabilizing normalization (VSN) transforms, are included. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple-test correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.
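The heart of the method is the regularized variance: the per-probe sample variance is shrunk toward a background variance pooled from probes of similar intensity. A two-sample sketch is below; the prior weight nu0 and the degrees-of-freedom accounting are plausible assumptions in the spirit of Baldi and Long, not Cyber-T's exact defaults.

import numpy as np
from scipy import stats

def regularized_ttest(x1, x2, sigma0_sq, nu0=10):
    # Shrink each group's variance toward the background sigma0_sq,
    # weighting the prior as nu0 pseudo-observations.
    def reg_var(x):
        n = len(x)
        return (nu0 * sigma0_sq + (n - 1) * x.var(ddof=1)) / (nu0 + n - 2)
    n1, n2 = len(x1), len(x2)
    se = np.sqrt(reg_var(x1) / n1 + reg_var(x2) / n2)
    t = (x1.mean() - x2.mean()) / se
    df = n1 + n2 - 2 + 2 * nu0   # one plausible df accounting for the prior
    return t, 2 * stats.t.sf(abs(t), df)

x1 = np.array([7.1, 7.4, 6.9])   # log-expression, condition A (invented)
x2 = np.array([8.0, 8.3, 7.8])   # condition B
print(regularized_ttest(x1, x2, sigma0_sq=0.05))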
Computational approaches to protein inference in shotgun proteomics
2012-01-01
Shotgun proteomics has recently emerged as a powerful approach to characterizing proteomes in biological samples. Its overall objective is to identify the form and quantity of each protein in a high-throughput manner by coupling liquid chromatography with tandem mass spectrometry. As a consequence of its high-throughput nature, shotgun proteomics faces challenges with respect to the analysis and interpretation of experimental data. Among such challenges, the identification of proteins present in a sample has been recognized as an important computational task. This task generally consists of (1) assigning experimental tandem mass spectra to peptides derived from a protein database, and (2) mapping assigned peptides to proteins and quantifying the confidence of identified proteins. Protein identification is fundamentally a statistical inference problem, and a number of methods have been proposed to address its challenges. In this review we categorize current approaches into rule-based, combinatorial optimization and probabilistic inference techniques, and present them using integer programming and Bayesian inference frameworks. We also discuss the main challenges of protein identification and propose potential solutions with the goal of spurring innovative research in this area. PMID:23176300
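As a minimal instance of the combinatorial-optimization family of methods, a greedy parsimony heuristic selects a small protein set that covers all assigned peptides. The protein and peptide identifiers below are invented; real tools solve this exactly with integer programming or score it probabilistically.

def parsimonious_proteins(protein_to_peptides):
    # Greedy set cover: repeatedly pick the protein explaining the most
    # still-unexplained peptides.
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        best = max(protein_to_peptides,
                   key=lambda p: len(protein_to_peptides[p] & uncovered))
        gain = protein_to_peptides[best] & uncovered
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

db = {"P1": {"pepA", "pepB"}, "P2": {"pepB", "pepC", "pepD"}, "P3": {"pepD"}}
print(parsimonious_proteins(db))   # -> ['P2', 'P1']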
Costa, Mariellen C.; Oliveira, Paulo R. R.; Davanço, Paulo V.; de Camargo, Crisley; Laganaro, Natasha M.; Azeredo, Roberto A.; Simpson, James; Silveira, Luis F.
2017-01-01
The conservation of many endangered taxa relies on hybrid identification, and when hybrids become morphologically indistinguishable from the parental species, the use of molecular markers can assign individual admixture levels. Here, we present the puzzling case of the extinct in the wild Alagoas Curassow (Pauxi mitu), whose captive population descends from only three individuals. Hybridization with the Razor-billed Curassow (P. tuberosa) began more than eight generations ago, and admixture uncertainty affects the whole population. We applied an analysis framework that combined morphological diagnostic traits, Bayesian clustering analyses using 14 microsatellite loci, and mtDNA haplotypes to assess the ancestry of all individuals that were alive from 2008 to 2012. Simulated data revealed that our microsatellites could accurately assign an individual a hybrid origin until the second backcross generation, which permitted us to identify a pure group among the older, but still reproductive animals. No wild species has ever survived such a severe bottleneck, followed by hybridization, and studying the recovery capability of the selected pure Alagoas Curassow group might provide valuable insights into biological conservation theory. PMID:28056082
A Non-parametric Cutout Index for Robust Evaluation of Identified Proteins*
Serang, Oliver; Paulo, Joao; Steen, Hanno; Steen, Judith A.
2013-01-01
This paper proposes a novel, automated method for evaluating sets of proteins identified using mass spectrometry. The remaining peptide-spectrum match score distributions of protein sets are compared to an empirical absent peptide-spectrum match score distribution, and a Bayesian non-parametric method reminiscent of the Dirichlet process is presented to accurately perform this comparison. Thus, for a given protein set, the process computes the likelihood that the proteins identified are correctly identified. First, the method is used to evaluate protein sets chosen using different protein-level false discovery rate (FDR) thresholds, assigning each protein set a likelihood. The protein set assigned the highest likelihood is used to choose a non-arbitrary protein-level FDR threshold. Because the method can be used to evaluate any protein identification strategy (and is not limited to mere comparisons of different FDR thresholds), we subsequently use the method to compare and evaluate multiple simple methods for merging peptide evidence over replicate experiments. The general statistical approach can be applied to other types of data (e.g. RNA sequencing) and generalizes to multivariate problems. PMID:23292186
Bradbury, Ian R.; Hamilton, Lorraine C.; Rafferty, Sara; Meerburg, David; Poole, Rebecca; Dempson, J. Brian; Robertson, Martha J.; Reddin, David G.; Bourret, Vincent; Dionne, Mélanie; Chaput, Gerald J.; Sheehan, Timothy F.; King, Tim L.; Candy, John R.; Bernatchez, Louis
2014-01-01
Fisheries targeting mixtures of populations risk the over-utilization of minor stock constituents unless harvests are monitored and managed. We evaluated stock composition and exploitation of Atlantic salmon in a subsistence fishery in coastal Labrador, Canada, using genetic mixture analysis and individual assignment with a microsatellite baseline (15 loci, 11 829 individuals, 12 regional groups) encompassing the species' western Atlantic range. Bayesian and maximum likelihood mixture analyses of fishery samples over six years (2006-2011; 1 772 individuals) indicate contributions of adjacent stocks of 96-97%. Estimates of fishery-associated exploitation were highest for Labrador salmon (4.2-10.6% per year) and generally < 1% for other regions. Individual assignment of fishery samples indicated that non-local contributions to the fishery (e.g., Quebec, Newfoundland) were rare and primarily in southern Labrador, consistent with migration pathways utilizing the Strait of Belle Isle. This work illustrates how genetic analysis of mixed-stock Atlantic salmon fisheries in the northwest Atlantic using this new baseline can disentangle exploitation and reveal complex migratory behaviours.
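The maximum-likelihood flavour of genetic mixture analysis reduces to an EM iteration over stock proportions once per-individual genotype likelihoods under each baseline population are in hand. A compact sketch with invented likelihoods follows; computing the likelihoods from the 15-locus baseline allele frequencies is omitted.

import numpy as np

def mixture_em(lik, iters=200):
    # lik[i, s]: likelihood of fish i's multilocus genotype under stock s.
    # Returns maximum-likelihood mixture (stock) proportions.
    n, S = lik.shape
    w = np.full(S, 1.0 / S)
    for _ in range(iters):
        r = w * lik                        # E-step: posterior source of each fish
        r /= r.sum(axis=1, keepdims=True)
        w = r.mean(axis=0)                 # M-step: update mixture proportions
    return w

lik = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.7, 0.3]])
print(mixture_em(lik))   # mostly stock 0, with a minor stock-1 component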
Nowakowska, Marzena
2017-04-01
The development of Bayesian logistic regression models for classifying road accident severity is discussed. Previously exploited informative priors (method of moments, maximum likelihood estimation, and two-stage Bayesian updating), along with the original idea of a Boot prior proposal, are investigated for the case in which no expert opinion is available. In addition, two possible approaches to updating the priors, in the form of unbalanced and balanced training data sets, are presented. The obtained Bayesian logistic models are assessed on the basis of the deviance information criterion (DIC), highest probability density (HPD) intervals, and coefficients of variation estimated for the model parameters. Verification of model accuracy is based on sensitivity, specificity and the harmonic mean of sensitivity and specificity, all calculated from a test data set. The models obtained from the balanced training data set have better classification quality than those obtained from the unbalanced training data set. The two-stage Bayesian updating prior model and the Boot prior model, both identified with the use of the balanced training data set, outperform the non-informative, method-of-moments, and maximum likelihood estimation prior models. It is important to note that one should be careful when interpreting the parameters, since different priors can lead to different models.
Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks
NASA Astrophysics Data System (ADS)
Sun, Wei; Chang, K. C.
2005-05-01
Probabilistic inference for Bayesian networks is in general NP-hard, whether exact algorithms or approximate methods are used. However, for very complex networks, only approximate methods such as stochastic sampling can provide a solution under a time constraint. Several simulation methods are currently available. They include logic sampling (the first proposed stochastic method for Bayesian networks), the likelihood weighting algorithm (the most commonly used simulation method because of its simplicity and efficiency), the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these available simulation methods, then we propose an improved importance sampling algorithm called the linear Gaussian importance sampling algorithm for general hybrid models (LGIS). LGIS is aimed at hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses a linear function and additive Gaussian noise to approximate the true conditional probability distribution of a continuous variable given both its parents and the evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal importance function from previous samples. We tested the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. The performance comparison with other well-known methods such as junction tree (JT) and likelihood weighting (LW) shows that LGIS is very promising.
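For contrast with LGIS, the likelihood-weighting baseline on a two-node hybrid network looks like this: sample the non-evidence variables from their priors and weight each sample by the likelihood of the continuous evidence. LGIS would instead adapt a linear-Gaussian importance function from earlier samples; all numbers here are invented.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Hybrid network: discrete root D in {0,1}; continuous child X | D ~ N(mu[D], 1).
# Evidence: X = 1.2. Estimate P(D=1 | X=1.2) by likelihood weighting.
prior_d1, mu = 0.3, np.array([0.0, 2.0])
x_obs, n_samples = 1.2, 20000

d = (rng.random(n_samples) < prior_d1).astype(int)   # sample non-evidence nodes
w = norm.pdf(x_obs, loc=mu[d], scale=1.0)            # weight by evidence likelihood
print("LW estimate:", (w * (d == 1)).sum() / w.sum())

num = prior_d1 * norm.pdf(x_obs, 2.0, 1.0)           # exact answer, for comparison
print("exact:", num / (num + (1 - prior_d1) * norm.pdf(x_obs, 0.0, 1.0)))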
Espino-Hernandez, Gabriela; Gustafson, Paul; Burstyn, Igor
2011-05-14
In epidemiological studies, explanatory variables are frequently subject to measurement error. The aim of this paper is to develop a Bayesian method to correct for measurement error in multiple continuous exposures in individually matched case-control studies, a topic that has not been widely investigated. The new method is illustrated using data from an individually matched case-control study of the association between thyroid hormone levels during pregnancy and exposure to perfluorinated acids. The objective of the motivating study was to examine the risk of maternal hypothyroxinemia due to exposure to three perfluorinated acids measured on a continuous scale. Results from the proposed method are compared with those obtained from a naive analysis. Using a Bayesian approach, the developed method considers a classical measurement error model for the exposures, the conditional logistic regression likelihood as the disease model, and a random-effect exposure model. Proper and diffuse prior distributions are assigned, and results from a quality control experiment are used to estimate the measurement error variability of the perfluorinated acids. As a result, posterior distributions and 95% credible intervals of the odds ratios are computed. A sensitivity analysis of the method's performance in this application under different measurement error variabilities was performed. The proposed Bayesian method to correct for measurement error is feasible and can be implemented using statistical software. For the study on perfluorinated acids, a comparison of the inferences corrected for measurement error with those that ignore it indicates that little adjustment is manifested at the level of measurement error actually exhibited in the exposures. Nevertheless, the sensitivity analysis shows that more substantial adjustments arise if larger measurement errors are assumed. In individually matched case-control studies, the use of the conditional logistic regression likelihood as a disease model in the presence of measurement error in multiple continuous exposures can be justified by introducing a random-effect exposure model. The proposed method can be successfully implemented in WinBUGS to correct individually matched case-control studies for several mismeasured continuous exposures under a classical measurement error model.
Probabilistic numerical methods for PDE-constrained Bayesian inverse problems
NASA Astrophysics Data System (ADS)
Cockayne, Jon; Oates, Chris; Sullivan, Tim; Girolami, Mark
2017-06-01
This paper develops meshless methods for probabilistically describing discretisation error in the numerical solution of partial differential equations. This construction enables the solution of Bayesian inverse problems while accounting for the impact of the discretisation of the forward problem. In particular, this drives statistical inferences to be more conservative in the presence of significant solver error. Theoretical results are presented describing rates of convergence for the posteriors in both the forward and inverse problems. This method is tested on a challenging inverse problem with a nonlinear forward model.
Filipponi, A; Di Cicco, A; Principi, E
2012-12-01
A Bayesian data-analysis approach to data sets of maximum undercooling temperatures recorded in repeated melting-cooling cycles of high-purity samples is proposed. The crystallization phenomenon is described in terms of a nonhomogeneous Poisson process driven by a temperature-dependent sample nucleation rate J(T). The method was extensively tested by computer simulations and applied to real data for undercooled liquid Ge. It proved to be particularly useful in the case of scarce data sets where the usage of binned data would degrade the available experimental information.
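Under the nonhomogeneous-Poisson description, the likelihood of the recorded maximum undercooling (nucleation) temperatures follows from the integrated nucleation hazard along the cooling ramp. The sketch below assumes an exponential form for J(T), a constant cooling rate, and invented data and constants, purely to show the structure of the computation; the paper's actual rate model and fitted values differ.

import numpy as np

def loglik(J0, b, T_nuc, T_m=1211.0, rate=1.0):
    # NHPP likelihood of nucleation at temperatures T_nuc while cooling
    # linearly from T_m at `rate` K/s, with J(T) = J0 * exp(-b * (T - T_m)).
    J = J0 * np.exp(-b * (T_nuc - T_m))
    # Integrated hazard from T_m down to T, divided by the cooling rate:
    Lam = (J0 / (b * rate)) * (np.exp(-b * (T_nuc - T_m)) - 1.0)
    return np.sum(np.log(J / rate) - Lam)

T_obs = np.array([1050.0, 1075.0, 1040.0, 1090.0, 1060.0])  # made-up data (K)
grid = [(J0, b) for J0 in np.logspace(-4, 0, 30) for b in np.linspace(0.01, 0.1, 30)]
ll = np.array([loglik(J0, b, T_obs) for J0, b in grid])
post = np.exp(ll - ll.max())
post /= post.sum()                      # flat prior on the grid
print("MAP (J0, b):", grid[int(np.argmax(post))])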
A Bayesian approach to parameter and reliability estimation in the Poisson distribution.
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1972-01-01
For life testing procedures, a Bayesian analysis is developed with respect to a random intensity parameter in the Poisson distribution. Bayes estimators are derived for the Poisson parameter and the reliability function based on uniform and gamma prior distributions of that parameter. A Monte Carlo procedure is implemented to make possible an empirical mean-squared error comparison between Bayes and existing minimum variance unbiased, as well as maximum likelihood, estimators. As expected, the Bayes estimators have mean-squared errors that are appreciably smaller than those of the other two.
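With a gamma prior the computation is fully conjugate, and the Bayes estimator of the reliability function R(t) = exp(-lambda t) has a closed form via the gamma moment generating function. A sketch with illustrative hyperparameters (the paper also studies the uniform-prior case, omitted here):

import numpy as np

def poisson_bayes(x, a0=1.0, b0=1.0, t=1.0):
    # Gamma(a0, b0) prior (rate parameterization) on the Poisson intensity;
    # x holds observed counts per unit time. Posterior is Gamma(a, b).
    a = a0 + np.sum(x)
    b = b0 + len(x)
    lam_bayes = a / b                    # posterior mean of lambda
    r_bayes = (b / (b + t)) ** a         # posterior mean of exp(-lambda t)
    return lam_bayes, r_bayes

x = np.array([0, 2, 1, 0, 1])
print(poisson_bayes(x))   # compare with the MLE: lam = 0.8, R = exp(-0.8) = 0.449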
Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items
ERIC Educational Resources Information Center
Penfield, Randall D.
2006-01-01
This study applied the maximum expected information (MEI) and the maximum posterior-weighted information (MPI) approaches of computer adaptive testing item selection to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…
Do Staphylococcus epidermidis Genetic Clusters Predict Isolation Sources?
Tolo, Isaiah; Thomas, Jonathan C.; Fischer, Rebecca S. B.; Brown, Eric L.; Gray, Barry M.
2016-01-01
Staphylococcus epidermidis is a ubiquitous colonizer of human skin and a common cause of medical device-associated infections. The extent to which the population genetic structure of S. epidermidis distinguishes commensal from pathogenic isolates is unclear. Previously, Bayesian clustering of 437 multilocus sequence types (STs) in the international database revealed a population structure of six genetic clusters (GCs) that may reflect the species' ecology. Here, we first verified the presence of six GCs, including two (GC3 and GC5) with significant admixture, in an updated database of 578 STs. Next, a single nucleotide polymorphism (SNP) assay was developed that accurately assigned 545 (94%) of 578 STs to GCs. Finally, the hypothesis that GCs could distinguish isolation sources was tested by SNP typing and GC assignment of 154 isolates from hospital patients with bacteremia and those with blood culture contaminants and from nonhospital carriage. GC5 was isolated almost exclusively from hospital sources. GC1 and GC6 were isolated from all sources but were overrepresented in isolates from nonhospital and infection sources, respectively. GC2, GC3, and GC4 were relatively rare in this collection. No association was detected between fdh-positive isolates (GC2 and GC4) and nonhospital sources. Using a machine learning algorithm, GCs predicted hospital and nonhospital sources with 80% accuracy and predicted infection and contaminant sources with 45% accuracy, which was comparable to the results seen with a combination of five genetic markers (icaA, IS256, sesD [bhp], mecA, and arginine catabolic mobile element [ACME]). Thus, analysis of population structure with subgenomic data shows the distinction of hospital and nonhospital sources and the near-inseparability of sources within a hospital. PMID:27076664
Yao, Guiqing Lily; Novielli, Nicola; Manaseki-Holland, Semira; Chen, Yen-Fu; van der Klink, Marcel; Barach, Paul; Chilton, Peter J; Lilford, Richard J
2012-12-01
We developed a method to estimate the expected cost-effectiveness of a service intervention at the design stage and 'road-tested' the method on an intervention to improve patient handover of care between hospital and community. The method comprises a nine-step evaluation framework: 1. Identification of multiple endpoints and arranging them into manageable groups; 2. Estimation of baseline overall and preventable risk; 3. Bayesian elicitation of expected effectiveness of the planned intervention; 4. Assigning utilities to groups of endpoints; 5. Costing the intervention; 6. Estimating health service costs associated with preventable adverse events; 7. Calculating health benefits; 8. Cost-effectiveness calculation; 9. Sensitivity and headroom analysis. Literature review suggested that adverse events follow 19% of patient discharges, and that one-third are preventable by improved handover (ie, 6.3% of all discharges). The intervention to improve handover would reduce the incidence of adverse events by 21% (ie, from 6.3% to 4.7%) according to the elicitation exercise. Potentially preventable adverse events were classified by severity and duration. Utilities were assigned to each category of adverse event. The costs associated with each category of event were obtained from the literature. The unit cost of the intervention was €16.6, which would yield a Quality Adjusted Life Year (QALY) gain per discharge of 0.010. The resulting cost saving was €14.3 per discharge. The intervention is cost-effective at approximately €214 per QALY under the base case, and remains cost-effective as long as the effectiveness exceeds 1.6%. We offer a usable framework to assist in ex ante health economic evaluations of health service interventions.
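As a sanity check on the arithmetic, the headline figures combine as net cost divided by QALY gain. The sketch below uses the rounded numbers quoted in the abstract, so it only approximately reproduces the published €214/QALY:

```python
# Worked cost-effectiveness calculation from the abstract's rounded figures.
# The published €214/QALY uses unrounded inputs, so expect a nearby value.
unit_cost = 16.6        # € per discharge, cost of the intervention
cost_saving = 14.3      # € per discharge from prevented adverse events
qaly_gain = 0.010       # QALYs gained per discharge

icer = (unit_cost - cost_saving) / qaly_gain
print(f"ICER ≈ €{icer:.0f} per QALY")   # ≈ €230 with these rounded inputs
```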
Vandergast, Amy; Wood, Dustin A.; Thompson, Andrew R.; Fisher, Mark; Barrows, Cameron W.; Grant, Tyler J.
2016-01-01
Aim The frequency and severity of habitat alterations and disturbance are predicted to increase in upcoming decades, and understanding how disturbance affects population integrity is paramount for adaptive management. Although rarely is population genetic sampling conducted at multiple time points, pre- and post-disturbance comparisons may provide one of the clearest methods to measure these impacts. We examined how genetic properties of the federally threatened Coachella Valley fringe-toed lizard (Uma inornata) responded to severe drought and habitat fragmentation across its range. Location Coachella Valley, California, USA. Methods We used 11 microsatellites to examine population genetic structure and diversity in 1996 and 2008, before and after a historic drought. We used Bayesian assignment methods and F-statistics to estimate genetic structure. We compared allelic richness across years to measure loss of genetic diversity and employed approximate Bayesian computing methods and heterozygote excess tests to explore the recent demographic history of populations. Finally, we compared effective population size across years and to abundance estimates to determine whether diversity remained low despite post-drought recovery. Results Genetic structure increased between sampling periods, likely as a result of population declines during the historic drought of the late 1990s–early 2000s, and habitat loss and fragmentation that precluded post-drought genetic rescue. Simulations supported recent demographic declines in 3 of 4 main preserves, and in one preserve, we detected significant loss of allelic richness. Effective population sizes were generally low across the range, with estimates ≤100 in most sites. Main conclusions Fragmentation and drought appear to have acted synergistically to induce genetic change over a short time frame. Progressive deterioration of connectivity, low Ne and measurable loss of genetic diversity suggest that conservation efforts have not maintained the genetic integrity of this species. Genetic sampling over time can help evaluate population trends to guide management.
Mukesh; Kumar, Ved P; Sharma, Lalit K; Shukla, Malay; Sathyakumar, Sambandam
2015-01-01
The hangul (Cervus elaphus hanglu) is of great conservation concern because it represents the easternmost and only hope for an Asiatic survivor of the red deer species in the Indian subcontinent. Despite the rigorous conservation efforts of the Department of Wildlife Protection in Jammu & Kashmir, the hangul population has experienced a severe decline in numbers and range contraction in the past few decades. The hangul population, once abundant, has largely become confined to the Dachigam landscape, with a recent population estimate of 218 individuals. We investigated the genetic variability and demographic history of the hangul population and found relatively low diversity estimates compared to other red deer populations of the world. Neutrality tests, which are used to evaluate demographic effects, did not support population expansion, and the multimodal pattern of mismatch distribution indicated that the hangul population is under demographic equilibrium. Furthermore, the hangul population did not exhibit any signature of bottleneck footprints in the past, and Coalescent Bayesian Skyline plot analysis revealed that the population had not experienced any dramatic changes in effective population size over the last several thousand years. We observed strong evidence of sub-structuring in the population, wherein the majority of individuals were assigned to different clusters in Bayesian cluster analysis. Population viability analysis demonstrated insignificant changes in the mean population size, with a positive growth rate projected for the next hundred years. We discuss the phylogenetic status of the hangul for the first time among the other red deer subspecies of the world and strongly recommend upgrading the hangul's IUCN conservation status, treating it as discrete from the other red deer subspecies, to draw more conservation attention from national and international bodies.
Uncertainty Quantification of Hypothesis Testing for the Integrated Knowledge Engine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cuellar, Leticia
2012-05-31
The Integrated Knowledge Engine (IKE) is a tool for Bayesian analysis, based on Bayesian belief networks, or Bayesian networks for short. A Bayesian network is a graphical model (a directed acyclic graph) that represents the probabilistic structure of many variables assuming a localized type of dependency called the Markov property. The Markov property in this instance makes any node (random variable) independent of its non-descendants given information about its parents. A direct consequence of this property is that it is relatively easy to incorporate new evidence and derive the appropriate consequences, which in general is not an easy or feasible task. Typically we use Bayesian networks as predictive models for a small subset of the variables, either the leaf nodes or the root nodes. In IKE, since most applications deal with diagnostics, we are interested in predicting the likelihood of the root nodes given new observations on any of the children nodes. The root nodes represent the various possible outcomes of the analysis, and an important problem is to determine when we have gathered enough evidence to lean toward one of these particular outcomes. This document presents criteria to decide when the evidence gathered is sufficient to draw a particular conclusion or decide in favor of a particular outcome, by quantifying the uncertainty in the conclusions that are drawn from the data. The material in this document is organized as follows: Section 2 briefly presents a forensics Bayesian network, and we explore evaluating the information provided by new evidence by looking first at the posterior distribution of the nodes of interest, and then at the corresponding posterior odds ratios. Section 3 presents a third alternative: Bayes factors. In Section 4 we finalize by showing the relation between posterior odds ratios and Bayes factors and showing examples of these cases, and in Section 5 we conclude by providing clear guidelines on how to use these for the type of Bayesian networks used in IKE.
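The relationship between posterior probabilities, posterior odds and Bayes factors that the report exploits can be illustrated on a minimal two-hypothesis example; the priors and likelihoods below are invented, not taken from an IKE network:

```python
# Two competing root hypotheses and one piece of evidence E.
# Posterior odds = prior odds * Bayes factor.
prior = {"H1": 0.5, "H2": 0.5}
like_E = {"H1": 0.8, "H2": 0.2}     # P(E | hypothesis), illustrative values

post_unnorm = {h: prior[h] * like_E[h] for h in prior}
z = sum(post_unnorm.values())
post = {h: p / z for h, p in post_unnorm.items()}   # posterior probabilities

bayes_factor = like_E["H1"] / like_E["H2"]          # BF(H1 vs H2) = 4
posterior_odds = post["H1"] / post["H2"]            # = prior odds * BF = 4
print(post, bayes_factor, posterior_odds)
```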
Bayesian energy landscape tilting: towards concordant models of molecular ensembles.
Beauchamp, Kyle A; Pande, Vijay S; Das, Rhiju
2014-03-18
Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and ³J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Calculating shock arrival in expansion tubes and shock tunnels using Bayesian changepoint analysis
NASA Astrophysics Data System (ADS)
James, Christopher M.; Bourke, Emily J.; Gildfind, David E.
2018-06-01
To understand the flow conditions generated in expansion tubes and shock tunnels, shock speeds are generally calculated from shock arrival times at high-frequency wall-mounted pressure transducers. These calculations require that the shock arrival times are obtained accurately. This can be non-trivial, especially for expansion tubes, because pressure rises may be small and shock speeds high. Inaccurate shock arrival times can be a significant source of uncertainty. To help address this problem, this paper investigates two separate but complementary techniques. Principally, it proposes using a Bayesian changepoint detection method to automatically calculate shock arrival, potentially reducing error and simplifying the shock arrival finding process. To complement this, a technique for filtering the raw data without losing the shock arrival time is also presented and investigated. To test the validity of the proposed techniques, tests are performed using both a theoretical step change with different levels of noise and real experimental data. It was found that, with conditions added to ensure that a real shock arrival time was found, the Bayesian changepoint analysis method was able to automatically find the shock arrival time, even for noisy signals.
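A minimal sketch of this style of changepoint detection, assuming a step signal with known noise level and profiling the segment means at their maximum-likelihood values (an approximation to the full posterior, not the authors' implementation):

```python
# Bayesian-flavoured changepoint detection for a noisy step signal.
# Segment means are profiled out at their MLEs and the noise sd is assumed
# known, so the "posterior" over the changepoint k is approximate.
import numpy as np

rng = np.random.default_rng(1)
n, k_true, sigma = 200, 120, 0.5
y = np.where(np.arange(n) < k_true, 0.0, 3.0) + rng.normal(0, sigma, n)

ks = np.arange(2, n - 1)                      # candidate changepoints
loglik = np.array([
    -0.5 * (np.sum((y[:k] - y[:k].mean()) ** 2)
            + np.sum((y[k:] - y[k:].mean()) ** 2)) / sigma**2
    for k in ks
])
post = np.exp(loglik - loglik.max())          # uniform prior over k
post /= post.sum()
print("MAP changepoint:", ks[np.argmax(post)])   # should be near k_true
```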
A large scale test of the gaming-enhancement hypothesis.
Przybylski, Andrew K; Wang, John C
2016-01-01
A growing research literature suggests that regular electronic game play and game-based training programs may confer practically significant benefits to cognitive functioning. Most evidence supporting this idea, the gaming-enhancement hypothesis, has been collected in small-scale studies of university students and older adults. This research investigated the hypothesis in a general way with a large sample of 1,847 school-aged children. Our aim was to examine the relations between young people's gaming experiences and an objective test of reasoning performance. Using a Bayesian hypothesis testing approach, evidence for the gaming-enhancement and null hypotheses was compared. Results provided no substantive evidence supporting the idea that having a preference for or regularly playing commercially available games was positively associated with reasoning ability. Evidence ranged from equivocal to very strong in support of the null hypothesis over the predicted effect. The discussion focuses on the value of Bayesian hypothesis testing for investigating electronic gaming effects, the importance of open science practices, and pre-registered designs to improve the quality of future work.
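One common way to quantify evidence for a point null of this kind, which may or may not match the authors' exact choice, is the Savage-Dickey density ratio; a sketch for a normal mean with known noise level and an invented effect-size prior:

```python
# Savage-Dickey Bayes factor for H0: mu = 0 nested in H1: mu ~ N(0, tau^2).
# BF01 = posterior density at mu = 0 divided by prior density at mu = 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sigma, tau, n = 1.0, 1.0, 100          # known noise sd, prior sd (illustrative)
x = rng.normal(0.0, sigma, n)          # data simulated under the null

v_post = 1.0 / (1.0 / tau**2 + n / sigma**2)    # conjugate posterior variance
m_post = v_post * n * x.mean() / sigma**2       # conjugate posterior mean

bf01 = stats.norm.pdf(0, m_post, np.sqrt(v_post)) / stats.norm.pdf(0, 0, tau)
print(f"BF01 = {bf01:.2f}  (>1 favours the null)")
```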
Ellis, Ian O.; Green, Andrew R.; Hanka, Rudolf
2008-01-01
Background We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. Methodology/Principal Findings We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. Conclusions/Significance Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. PMID:18698346
Determination of SB2 masses and age: introduction of the mass ratio in the Bayesian analysis
NASA Astrophysics Data System (ADS)
Giarrusso, M.; Leone, F.; Tognelli, E.; Degl'Innocenti, S.; Prada Moroni, P. G.
2018-04-01
Stellar age assignment still represents a difficult task in astrophysics. This unobservable fundamental parameter can be estimated only through indirect methods, as, in general, can the mass. Bayesian analysis is a statistical approach largely used to derive stellar properties by taking into account the available information about the quantities of interest. In this paper we propose to apply the method to double-lined spectroscopic binaries (SB2), for which the only available information about the masses is the observed mass ratio of the two components. We validated the method on a synthetic sample of Pre-Main Sequence (PMS) SB2 systems, showing the capability of the technique to recover the simulated age and masses. Then, we applied our procedure to the PMS eclipsing binaries Parenago 1802 and RX J0529.4+0041 A, whose masses of both components are known, by treating them as SB2 systems. The estimated masses are in agreement with those dynamically measured. We conclude that the method, if based on high-resolution and high signal-to-noise spectroscopy, represents a robust way to infer the masses of the very numerous SB2 systems together with their age, allowing one to date the host astrophysical environments.
Working memory training in older adults: Bayesian evidence supporting the absence of transfer.
Guye, Sabrina; von Bastian, Claudia C
2017-12-01
The question of whether working memory training leads to generalized improvements in untrained cognitive abilities is a longstanding and heatedly debated one. Previous research provides mostly ambiguous evidence regarding the presence or absence of transfer effects in older adults. Thus, to draw decisive conclusions regarding the effectiveness of working memory training interventions, methodologically sound studies with larger sample sizes are needed. In this study, we investigated whether or not a computer-based working memory training intervention induced near and far transfer in a large sample of 142 healthy older adults (65 to 80 years). To this end, we randomly assigned participants either to the experimental group, which completed 25 sessions of adaptive, process-based working memory training, or to the active, adaptive visual search control group. Bayesian linear mixed-effects models were used to estimate performance improvements on the level of abilities, using multiple indicator tasks for near (working memory) and far transfer (fluid intelligence, shifting, and inhibition). Our data provided consistent evidence supporting the absence of near transfer to untrained working memory tasks and the absence of far transfer effects to all of the assessed abilities. Our results suggest that working memory training is not an effective way to improve general cognitive functioning in old age. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jesus, J.F.; Valentim, R.; Andrade-Oliveira, F., E-mail: jfjesus@itapeva.unesp.br, E-mail: valentim.rodolfo@unifesp.br, E-mail: felipe.oliveira@port.ac.uk
Creation of Cold Dark Matter (CCDM), in the context of the Einstein field equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Bayesian Evidence (BE). These criteria allow models to be compared on goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH₀ model, can be discarded from the current analysis. Three other scenarios are discarded either because of poor fitting or because of an excess of free parameters. A method of increasing Bayesian evidence through reparameterization, in order to reduce parameter degeneracy, is also developed.
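The two information criteria named above are simple functions of the maximized log-likelihood; a minimal sketch, with placeholder log-likelihood values standing in for actual fits:

```python
# AIC and BIC from a maximized log-likelihood, number of parameters k,
# and sample size n. Lower is better for both.
import numpy as np

def aic(loglik_max, k):
    # Akaike Information Criterion
    return 2 * k - 2 * loglik_max

def bic(loglik_max, k, n):
    # Bayesian Information Criterion
    return k * np.log(n) - 2 * loglik_max

# placeholder maximized log-likelihoods for two fitted models on n data points
n, ll_m1, ll_m2 = 580, -275.2, -273.9
print("Delta AIC:", aic(ll_m2, k=2) - aic(ll_m1, k=1))
print("Delta BIC:", bic(ll_m2, k=2, n=n) - bic(ll_m1, k=1, n=n))
# exp(-Delta BIC / 2) gives a rough Bayes-factor-style evidence approximation
```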
Bayesian component separation: The Planck experience
NASA Astrophysics Data System (ADS)
Wehus, Ingunn Kathrine; Eriksen, Hans Kristian
2018-05-01
Bayesian component separation techniques have played a central role in the data reduction process of Planck. The most important strength of this approach is its global nature, in which a parametric and physical model is fitted to the data. Such physical modeling allows the user to constrain very general data models, and jointly probe cosmological, astrophysical and instrumental parameters. This approach also supports statistically robust goodness-of-fit tests in terms of data-minus-model residual maps, which are essential for identifying residual systematic effects in the data. The main challenges are high code complexity and computational cost. Whether or not these costs are justified for a given experiment depends on its final uncertainty budget. We therefore predict that the importance of Bayesian component separation techniques is likely to increase with time for intensity mapping experiments, similar to what has happened in the CMB field, as observational techniques mature, and their overall sensitivity improves.
Confirmatory Factor Analysis Alternative: Free, Accessible CBID Software.
Bott, Marjorie; Karanevich, Alex G; Garrard, Lili; Price, Larry R; Mudaranthakam, Dinesh Pal; Gajewski, Byron
2018-02-01
New software that performs Classical and Bayesian Instrument Development (CBID) is reported that seamlessly integrates expert (content validity) and participant data (construct validity) to produce entire reliability estimates with smaller sample requirements. The free CBID software can be accessed through a website and used by clinical investigators in new instrument development. Demonstrations are presented of the three approaches using the CBID software: (a) traditional confirmatory factor analysis (CFA), (b) Bayesian CFA using flat uninformative prior, and (c) Bayesian CFA using content expert data (informative prior). Outcomes of usability testing demonstrate the need to make the user-friendly, free CBID software available to interdisciplinary researchers. CBID has the potential to be a new and expeditious method for instrument development, adding to our current measurement toolbox. This allows for the development of new instruments for measuring determinants of health in smaller diverse populations or populations of rare diseases.
Structure Learning in Bayesian Sensorimotor Integration
Genewein, Tim; Hez, Eduard; Razzaghpanah, Zeynab; Braun, Daniel A.
2015-01-01
Previous studies have shown that sensorimotor processing can often be described by Bayesian learning, in particular the integration of prior and feedback information depending on its degree of reliability. Here we test the hypothesis that the integration process itself can be tuned to the statistical structure of the environment. We exposed human participants to a reaching task in a three-dimensional virtual reality environment where we could displace the visual feedback of their hand position in a two-dimensional plane. When introducing statistical structure between the two dimensions of the displacement, we found that over the course of several days participants adapted their feedback integration process in order to exploit this structure for performance improvement. In control experiments we found that this adaptation process critically depended on performance feedback and could not be induced by verbal instructions. Our results suggest that structural learning is an important meta-learning component of Bayesian sensorimotor integration. PMID:26305797
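The reliability-weighted integration that such studies build on can be written in a few lines; the means and variances below are illustrative, not values from the experiment:

```python
# Standard Bayesian cue combination: the estimate weights each source by its
# precision (inverse variance), and the combined variance shrinks accordingly.
prior_mean, prior_var = 0.0, 4.0     # illustrative prior belief
cue_mean, cue_var = 2.0, 1.0         # illustrative feedback cue

w = (1 / cue_var) / (1 / cue_var + 1 / prior_var)   # weight on the cue
estimate = w * cue_mean + (1 - w) * prior_mean
combined_var = 1 / (1 / cue_var + 1 / prior_var)
print(estimate, combined_var)        # 1.6, 0.8
```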
To P or Not to P: Backing Bayesian Statistics.
Buchinsky, Farrel J; Chadha, Neil K
2017-12-01
In biomedical research, it is imperative to differentiate chance variation from truth before we generalize what we see in a sample of subjects to the wider population. For decades, we have relied on null hypothesis significance testing, where we calculate P values for our data to decide whether to reject a null hypothesis. This methodology is subject to substantial misinterpretation and errant conclusions. Instead of working backward by calculating the probability of our data if the null hypothesis were true, Bayesian statistics allow us instead to work forward, calculating the probability of our hypothesis given the available data. This methodology gives us a mathematical means of incorporating our "prior probabilities" from previous study data (if any) to produce new "posterior probabilities." Bayesian statistics tell us how confidently we should believe what we believe. It is time to embrace and encourage their use in our otolaryngology research.
Characterizing reliability in a product/process design-assurance program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerscher, W.J. III; Booker, J.M.; Bement, T.R.
1997-10-01
Over the years many advancing techniques in the area of reliability engineering have surfaced in the military sphere of influence, and one of these techniques is Reliability Growth Testing (RGT). Private industry has reviewed RGT as part of the solution to its reliability concerns, but many practical considerations have slowed its implementation. Its objective is to demonstrate the reliability requirement of a new product with a specified confidence. This paper speaks directly to that objective but discusses a somewhat different approach to achieving it. Rather than conducting testing as a continuum and developing statistical confidence bands around the results, this Bayesian updating approach starts with a reliability estimate characterized by large uncertainty and then proceeds to reduce the uncertainty by folding in fresh information in a Bayesian framework.
Predicting Football Matches Results using Bayesian Networks for English Premier League (EPL)
NASA Astrophysics Data System (ADS)
Razali, Nazim; Mustapha, Aida; Yatim, Faiz Ahmad; Aziz, Ruhaya Ab
2017-08-01
Association football prediction models have become increasingly popular in the last few years, and many different prediction approaches have been proposed with the aim of evaluating the attributes that lead a football team to lose, draw or win a match. Three types of approaches have been considered for predicting football match results: statistical approaches, machine learning approaches and Bayesian approaches. Lately, many studies regarding football prediction models have been produced using Bayesian approaches. This paper proposes Bayesian Networks (BNs) to predict the results of football matches in terms of home win (H), away win (A) and draw (D). The English Premier League (EPL) for the three seasons 2010-2011, 2011-2012 and 2012-2013 was selected and reviewed. K-fold cross-validation was used to test the accuracy of the prediction model. The required football data were sourced from http://www.football-data.co.uk. BNs achieved a predictive accuracy of 75.09% on average across the three seasons. It is hoped that the results could be used as a benchmark for future research in predicting football match results.
2010-01-01
Background Methods for the calculation and application of quantitative electromyographic (EMG) statistics for the characterization of EMG data detected from forearm muscles of individuals with and without pain associated with repetitive strain injury are presented. Methods A classification procedure using a multi-stage application of Bayesian inference is presented that characterizes a set of motor unit potentials acquired using needle electromyography. The utility of this technique in characterizing EMG data obtained from both normal individuals and those presenting with symptoms of "non-specific arm pain" is explored and validated. The efficacy of the Bayesian technique is compared with simple voting methods. Results The aggregate Bayesian classifier presented is found to perform with accuracy equivalent to that of majority voting on the test data, with an overall accuracy greater than 0.85. Theoretical foundations of the technique are discussed, and are related to the observations found. Conclusions Aggregation of motor unit potential conditional probability distributions estimated using quantitative electromyographic analysis, may be successfully used to perform electrodiagnostic characterization of "non-specific arm pain." It is expected that these techniques will also be able to be applied to other types of electrodiagnostic data. PMID:20156353
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness at detecting genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.
NASA Astrophysics Data System (ADS)
Fox, Neil I.; Micheas, Athanasios C.; Peng, Yuqiang
2016-07-01
This paper introduces the use of Bayesian full Procrustes shape analysis in object-oriented meteorological applications. In particular, the Procrustes methodology is used to generate mean forecast precipitation fields from a set of ensemble forecasts. This approach has advantages over other ensemble averaging techniques in that it can produce a forecast that retains the morphological features of the precipitation structures and present the range of forecast outcomes represented by the ensemble. The production of the ensemble mean avoids the problems of smoothing that result from simple pixel or cell averaging, while producing credible sets that retain information on ensemble spread. Also in this paper, the full Bayesian Procrustes scheme is used as an object verification tool for precipitation forecasts. This is an extension of a previously presented Procrustes shape analysis based verification approach into a full Bayesian format designed to handle the verification of precipitation forecasts that match objects from an ensemble of forecast fields to a single truth image. The methodology is tested on radar reflectivity nowcasts produced in the Warning Decision Support System - Integrated Information (WDSS-II) by varying parameters in the K-means cluster tracking scheme.
Nguyen, Hoang T; Merriman, Tony R; Black, Michael A
2014-01-01
Recent advances in high-throughput sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions in which complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigation of CN variation using high-throughput sequencing data. This methodology was developed using data from the 1000 Genomes Project from the CCL3L1 locus, and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts. The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci. The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEFB103A). CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.
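The clustering step can be imitated with an off-the-shelf Bayesian normal mixture; the sketch below uses scikit-learn's variational implementation on synthetic scores as a generic stand-in for the paper's model, not the CNVrd2 code itself:

```python
# Cluster segmentation-style scores into groups with a Bayesian Gaussian
# mixture; surplus components receive near-zero weight automatically.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(3)
# synthetic scores for three copy-number groups (invented centres and spread)
scores = np.concatenate([rng.normal(mu, 0.08, 60) for mu in (-0.5, 0.0, 0.5)])

bgm = BayesianGaussianMixture(n_components=6, random_state=0)
labels = bgm.fit(scores.reshape(-1, 1)).predict(scores.reshape(-1, 1))
# components with non-negligible weight correspond to distinct groups
print(np.round(bgm.weights_, 2))
```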
Porter, Teresita M.; Golding, G. Brian
2012-01-01
Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and, to an increasing extent, also in amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, make primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and NBC runs significantly faster than the other tested methods. All methods performed poorly with the shortest 50–100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys. PMID:22558215
ERIC Educational Resources Information Center
Wilcox, Rand R.; Serang, Sarfaraz
2017-01-01
The article provides perspectives on p values, null hypothesis testing, and alternative techniques in light of modern robust statistical methods. Null hypothesis testing and "p" values can provide useful information provided they are interpreted in a sound manner, which includes taking into account insights and advances that have…
The Use of Time Series Analysis and t Tests with Serially Correlated Data.
ERIC Educational Resources Information Center
Nicolich, Mark J.; Weinstein, Carol S.
1981-01-01
Results of three methods of analysis applied to simulated autocorrelated data sets with an intervention point (varying in autocorrelation degree, variance of error term, and magnitude of intervention effect) are compared and presented. The three methods are: t tests; maximum likelihood Box-Jenkins (ARIMA); and Bayesian Box-Jenkins. (Author/AEF)
ERIC Educational Resources Information Center
de la Torre, Jimmy; Patz, Richard J.
2005-01-01
This article proposes a practical method that capitalizes on the availability of information from multiple tests measuring correlated abilities given in a single test administration. By simultaneously estimating different abilities with the use of a hierarchical Bayesian framework, more precise estimates for each ability dimension are obtained.…
ERIC Educational Resources Information Center
Lee, Soo; Suh, Youngsuk
2018-01-01
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
NASA Astrophysics Data System (ADS)
Chen, X.; Murakami, H.; Hahn, M. S.; Hammond, G. E.; Rockhold, M. L.; Rubin, Y.
2010-12-01
Tracer testing under natural or forced gradient flow provides useful information for characterizing subsurface properties, by monitoring and modeling the tracer plume migration in a heterogeneous aquifer. At the Hanford 300 Area, non-reactive tracer experiments, in addition to constant-rate injection tests and electromagnetic borehole flowmeter (EBF) profiling, were conducted to characterize the heterogeneous hydraulic conductivity field. A Bayesian data assimilation technique, method of anchored distributions (MAD), is applied to assimilate the experimental tracer test data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the prior information of the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using the constant-rate injection tests and the EBF data. The posterior distribution of the random field is obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. The parallel three-dimensional flow and transport code PFLOTRAN is implemented to cope with the highly transient flow boundary conditions at the site and to meet the computational demand of the proposed method. The validation results show that the field conditioned on the tracer test data better reproduces the tracer transport behavior compared to the field characterized previously without the tracer test data. A synthetic study proves that the proposed method can effectively assimilate tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. These characterization results will improve conceptual models developed for the site, including reactive transport models. The study successfully demonstrates the capability of MAD to assimilate multi-scale multi-type field data within a consistent Bayesian framework. The MAD framework can potentially be applied to combine geophysical data with other types of data in site characterization.
Leontaridou, Maria; Gabbert, Silke; Van Ierland, Ekko C; Worth, Andrew P; Landsiedel, Robert
2016-07-01
This paper offers a Bayesian Value-of-Information (VOI) analysis for guiding the development of non-animal testing strategies, balancing information gains from testing with the expected social gains and costs from the adoption of regulatory decisions. Testing is assumed to have value if, and only if, the information revealed from testing triggers a welfare-improving decision on the use (or non-use) of a substance. As an illustration, our VOI model is applied to a set of five individual non-animal prediction methods used for skin sensitisation hazard assessment, seven battery combinations of these methods, and 236 sequential 2-test and 3-test strategies. Their expected values are quantified and compared to the expected value of the local lymph node assay (LLNA) as the animal method. We find that battery and sequential combinations of non-animal prediction methods reveal a significantly higher expected value than the LLNA. This holds for the entire range of prior beliefs. Furthermore, our results illustrate that the testing strategy with the highest expected value does not necessarily have to follow the order of key events in the sensitisation adverse outcome pathway (AOP). 2016 FRAME.
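The core VOI logic, that testing has value only when its outcome can flip the optimal decision, can be sketched with a stylized two-action example; all probabilities, payoffs and test accuracies below are invented for illustration:

```python
# Stylized value-of-information calculation for a single imperfect test.
p_sens = 0.3                    # prior P(substance is a sensitiser), invented
u = {("use", True): -100, ("use", False): 50,   # invented social payoffs
     ("ban", True): 0,     ("ban", False): 0}
se, sp = 0.9, 0.8               # invented test sensitivity and specificity

def best_eu(p):                 # expected utility of the optimal act at belief p
    eu_use = p * u[("use", True)] + (1 - p) * u[("use", False)]
    eu_ban = p * u[("ban", True)] + (1 - p) * u[("ban", False)]
    return max(eu_use, eu_ban)

p_pos = se * p_sens + (1 - sp) * (1 - p_sens)         # P(test positive)
post_pos = se * p_sens / p_pos                        # P(sensitiser | positive)
post_neg = (1 - se) * p_sens / (1 - p_pos)            # P(sensitiser | negative)

voi = (p_pos * best_eu(post_pos) + (1 - p_pos) * best_eu(post_neg)
       - best_eu(p_sens))
print(f"expected value of testing: {voi:.1f}")  # test only if this exceeds its cost
```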
Fosgate, G T; Petzer, I M; Karzis, J
2013-04-01
Screening tests for mastitis can play an important role in proactive mastitis control programs. The primary objective of this study was to compare the sensitivity and specificity of milk electrical conductivity (EC) to the California mastitis test (CMT) in commercial dairy cattle in South Africa using Bayesian methods without a perfect reference test. A total of 1848 quarter milk specimens were collected from 173 cows sampled during six sequential farm visits. Of these samples, 25.8% yielded pathogenic bacterial isolates. The most frequently isolated species were coagulase negative Staphylococci (n=346), Streptococcus agalactiae (n=54), and Staphylococcus aureus (n=42). The overall cow-level prevalence of mastitis was 54% based on the Bayesian latent class (BLC) analysis. The CMT was more accurate than EC for classification of cows having somatic cell counts >200,000/mL and for isolation of a bacterial pathogen. BLC analysis also suggested an overall benefit of CMT over EC but the statistical evidence was not strong (P=0.257). The Bayesian model estimated the sensitivity and specificity of EC (measured via resistance) at a cut-point of >25 mΩ/cm to be 89.9% and 86.8%, respectively. The CMT had a sensitivity and specificity of 94.5% and 77.7%, respectively, when evaluated at the weak positive cut-point. EC was useful for identifying milk specimens harbouring pathogens but was not able to differentiate among evaluated bacterial isolates. Screening tests can be used to improve udder health as part of a proactive management plan. Copyright © 2012 Elsevier Ltd. All rights reserved.
The frequentist implications of optional stopping on Bayesian hypothesis tests.
Sanborn, Adam N; Hills, Thomas T
2014-04-01
Null hypothesis significance testing (NHST) is the most commonly used statistical methodology in psychology. The probability of achieving a value as extreme or more extreme than the statistic obtained from the data is evaluated, and if it is low enough, the null hypothesis is rejected. However, because common experimental practice often clashes with the assumptions underlying NHST, these calculated probabilities are often incorrect. Most commonly, experimenters use tests that assume that sample sizes are fixed in advance of data collection but then use the data to determine when to stop; in the limit, experimenters can use data monitoring to guarantee that the null hypothesis will be rejected. Bayesian hypothesis testing (BHT) provides a solution to these ills because the stopping rule used is irrelevant to the calculation of a Bayes factor. In addition, there are strong mathematical guarantees on the frequentist properties of BHT that are comforting for researchers concerned that stopping rules could influence the Bayes factors produced. Here, we show that these guaranteed bounds have limited scope and often do not apply in psychological research. Specifically, we quantitatively demonstrate the impact of optional stopping on the resulting Bayes factors in two common situations: (1) when the truth is a combination of the hypotheses, such as in a heterogeneous population, and (2) when a hypothesis is composite, taking multiple parameter values, such as the alternative hypothesis in a t-test. We found that, for these situations, while the Bayesian interpretation remains correct regardless of the stopping rule used, the choice of stopping rule can, in some situations, greatly increase the chance of experimenters finding evidence in the direction they desire. We suggest ways to control these frequentist implications of stopping rules on BHT.
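A small simulation of the simplest scenario discussed, monitoring a Bayes factor under a true point null and stopping as soon as BF10 > 3, illustrates the bounded frequentist behaviour (illustrative settings, not the paper's):

```python
# Sequential monitoring of a Bayes factor under a true point null.
# For this simple point-null case the martingale bound P(BF10 ever > 3) <= 1/3
# applies; the paper's point is that such bounds break down in other settings.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sigma, tau, n_max, reps = 1.0, 1.0, 500, 2000   # invented settings
hits = 0
for _ in range(reps):
    x = rng.normal(0.0, sigma, n_max)           # truth: the point null
    for n in range(10, n_max + 1, 10):          # peek every 10 observations
        xbar = x[:n].mean()
        # marginal sds of the sample mean under H1 (mu ~ N(0, tau^2)) and H0
        bf10 = (stats.norm.pdf(xbar, 0, np.sqrt(tau**2 + sigma**2 / n))
                / stats.norm.pdf(xbar, 0, sigma / np.sqrt(n)))
        if bf10 > 3:
            hits += 1
            break
print("proportion stopping with BF10 > 3:", hits / reps)
```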
Internal Medicine residents use heuristics to estimate disease probability.
Phang, Sen Han; Ravani, Pietro; Schaefer, Jeffrey; Wright, Bruce; McLaughlin, Kevin
2015-01-01
Training in Bayesian reasoning may have limited impact on the accuracy of probability estimates. In this study, our goal was to explore whether residents previously exposed to Bayesian reasoning use heuristics rather than Bayesian reasoning to estimate disease probabilities. We predicted that if residents use heuristics then post-test probability estimates would be increased by non-discriminating clinical features or a high anchor for a target condition. We randomized 55 Internal Medicine residents to different versions of four clinical vignettes and asked them to estimate probabilities of target conditions. We manipulated the clinical data for each vignette to be consistent either with 1) using a representativeness heuristic, by adding non-discriminating prototypical clinical features of the target condition, or 2) using an anchoring-with-adjustment heuristic, by providing a high or low anchor for the target condition. When presented with additional non-discriminating data the odds of diagnosing the target condition were increased (odds ratio (OR) 2.83, 95% confidence interval [1.30, 6.15], p = 0.009). Similarly, the odds of diagnosing the target condition were increased when a high anchor preceded the vignette (OR 2.04, [1.09, 3.81], p = 0.025). Our findings suggest that despite previous exposure to the use of Bayesian reasoning, residents use heuristics, such as the representativeness heuristic and anchoring with adjustment, to estimate probabilities. Potential reasons for attribute substitution include the relative cognitive ease of heuristics vs. Bayesian reasoning, or the possibility that residents in clinical practice use gist traces rather than precise probability estimates when diagnosing.
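The normative calculation the residents departed from is the usual odds-times-likelihood-ratio update; a minimal sketch with invented numbers, where a truly non-discriminating feature has likelihood ratio 1 and should leave the estimate unchanged:

```python
# Bayesian post-test probability via odds and a likelihood ratio (LR).
def post_test_prob(pre_test_prob, likelihood_ratio):
    odds = pre_test_prob / (1 - pre_test_prob)   # convert probability to odds
    post_odds = odds * likelihood_ratio          # Bayes update on the odds scale
    return post_odds / (1 + post_odds)           # convert back to probability

print(post_test_prob(0.20, 4.0))   # discriminating finding: 0.20 -> 0.50
print(post_test_prob(0.20, 1.0))   # non-discriminating finding: stays 0.20
```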
ERIC Educational Resources Information Center
McLeod, Lori D.; Lewis, Charles; Thissen, David.
With the increased use of computerized adaptive testing, which allows for continuous testing, new concerns about test security have evolved, one being the assurance that items in an item pool are safeguarded from theft. In this paper, the risk of score inflation and procedures to detect test takers using item preknowledge are explored. When test…
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples, thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities, to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
A Rational Analysis of the Selection Task as Optimal Data Selection.
ERIC Educational Resources Information Center
Oaksford, Mike; Chater, Nick
1994-01-01
Experimental data on human reasoning in hypothesis-testing tasks is reassessed in light of a Bayesian model of optimal data selection in inductive hypothesis testing. The rational analysis provided by the model suggests that reasoning in such tasks may be rational rather than subject to systematic bias. (SLD)
Estimating and Testing Mediation Effects with Censored Data
ERIC Educational Resources Information Center
Wang, Lijuan; Zhang, Zhiyong
2011-01-01
This study investigated influences of censored data on mediation analysis. Mediation effect estimates can be biased and inefficient with censoring on any one of the input, mediation, and output variables. A Bayesian Tobit approach was introduced to estimate and test mediation effects with censored data. Simulation results showed that the Bayesian…
Bayesian Estimation of Multi-Unidimensional Graded Response IRT Models
ERIC Educational Resources Information Center
Kuo, Tzu-Chun
2015-01-01
Item response theory (IRT) has gained an increasing popularity in large-scale educational and psychological testing situations because of its theoretical advantages over classical test theory. Unidimensional graded response models (GRMs) are useful when polytomous response items are designed to measure a unified latent trait. They are limited in…
Testing Adaptive Toolbox Models: A Bayesian Hierarchical Approach
ERIC Educational Resources Information Center
Scheibehenne, Benjamin; Rieskamp, Jorg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox…
Meija, Juris; Chartrand, Michelle M G
2018-01-01
Isotope delta measurements are normalized against international reference standards. Although multi-point normalization is becoming a standard practice, the existing uncertainty evaluation practices are either undocumented or incomplete. For multi-point normalization, we present errors-in-variables regression models for explicit accounting of the measurement uncertainty of the international standards along with the uncertainty attributed to their assigned values. This manuscript presents a framework to account for the uncertainty that arises due to a small number of replicate measurements, and discusses multi-laboratory data reduction while accounting for inevitable correlations between laboratories due to the use of identical reference materials for calibration. Both frequentist and Bayesian methods of uncertainty analysis are discussed.
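An errors-in-variables normalization fit of this general kind can be sketched with scipy's orthogonal distance regression as a generic stand-in for the paper's models; all values below are invented:

```python
# Errors-in-variables fit of a delta-scale normalization line, where both the
# measured values and the assigned reference values carry uncertainty.
import numpy as np
from scipy import odr

ref_assigned = np.array([-55.5, 0.0, 50.3])    # assigned delta values (invented)
ref_measured = np.array([-54.9, 0.4, 49.6])    # measured values (invented)
u_assigned = np.array([0.2, 0.1, 0.2])         # standard uncertainties (invented)
u_measured = np.array([0.3, 0.3, 0.3])

linear = odr.Model(lambda beta, x: beta[0] * x + beta[1])
data = odr.RealData(ref_measured, ref_assigned, sx=u_measured, sy=u_assigned)
fit = odr.ODR(data, linear, beta0=[1.0, 0.0]).run()
print("slope, intercept:", fit.beta, "std errors:", fit.sd_beta)
```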
NASA Astrophysics Data System (ADS)
Camara, Johanna E.; Duewer, David L.; Gasca Aragon, Hugo; Lippa, Katrice A.; Toman, Blaza
2013-01-01
The 2009 CCQM-K80 'Comparison of value-assigned CRMs and PT materials: creatinine in human serum' is the first in a series of key comparisons directly testing the chemical measurement services provided to customers by National Metrology Institutes (NMIs) and Designated Institutes. CCQM-K80 compared the assigned serum creatinine values of certified reference materials (CRMs) using measurements made on these materials under repeatability conditions. Six NMIs submitted 17 CRM materials for evaluation, all intended for sale to customers. These materials represent nearly all of the higher-order CRMs then available for this clinically important measurand. The certified creatinine mass fraction in the materials ranged from 3 mg/kg to 57 mg/kg. All materials were stored and prepared according to the specifications provided by each NMI. Samples were processed and analyzed under repeatability conditions by one analyst using isotope dilution liquid chromatography-mass spectrometry. The instrumental repeatability imprecision, expressed as a percent relative standard deviation, was 1.2%. Given the number of materials and the time required for each analysis, the measurements were made in two measurement campaigns ('runs'). In both campaigns, replicate analyses (two injections of one preparation separated in time) were made on each of two or three independently prepared aliquots from one randomly selected unit of each of the 17 materials. The mean value, between-campaign, between-aliquot and between-replicate variance components, standard uncertainty of the mean value, and the number of degrees of freedom associated with the standard uncertainty were estimated using a linear mixed model. Since several of the uncertainties estimated using this traditional frequentist approach were associated with a single degree of freedom, Markov Chain Monte Carlo Bayesian analysis was used to estimate 95% level-of-confidence coverage intervals, U95. Uncertainty-weighted generalized distance regression was used to establish the key comparison reference function (KCRF) relating the assigned values to the repeatability measurements. Parametric bootstrap Monte Carlo was used to estimate 95% level-of-confidence coverage intervals for the degrees of equivalence of materials, d ± U95(d), and of the participating NMIs, D ± U95(D). Because of the wide range of creatinine mass fraction in the materials, these degrees of equivalence are expressed in percent relative form: %d ± U95(%d) and %D ± U95(%D). On the basis of leave-one-out cross-validation, the assigned values for 16 of the 17 materials were deemed equivalent at the 95% level of confidence. These materials were used to define the KCRF. The excluded material was identified as having a marginally underestimated assigned uncertainty, giving it large and potentially anomalous influence on the KCRF. However, this material's %d of 1.4 ± 1.5 indicates that it is equivalent with the other materials at the 95% level of confidence. The median |%d| for all 17 of the materials is 0.3 with a median U95(%d) of 1.9. All of these higher-order CRMs for creatinine in human serum are equivalent within their assigned uncertainties. The median |%D| for the participating NMIs is 0.3 with a median U95(%D) of 2.1. These results demonstrate that all participating NMIs have the ability to correctly value-assign CRMs and proficiency test materials for creatinine in human serum and similar measurands.
A novel Bayesian approach to acoustic emission data analysis.
Agletdinov, E; Pomponi, E; Merson, D; Vinogradov, A
2016-12-01
The acoustic emission (AE) technique is a popular tool for materials characterization and non-destructive testing. Originating from the stochastic motion of defects in solids, AE is a random process by nature. A challenging problem arises whenever an attempt is made to identify specific points corresponding to changes in the trends of the fluctuating AE time series. A general Bayesian framework is proposed for the analysis of AE time series, aimed at automatically finding the breakpoints that signal a crossover in the dynamics of the underlying AE sources. Copyright © 2016 Elsevier B.V. All rights reserved.
A product Pearson-type VII density distribution
NASA Astrophysics Data System (ADS)
Nadarajah, Saralees; Kotz, Samuel
2008-01-01
The Pearson-type VII distributions (containing the Student's t distributions) are becoming increasingly prominent and are being considered as competitors to the normal distribution. Motivated by real examples in decision sciences, Bayesian statistics, probability theory and physics, a new Pearson-type VII distribution is introduced by taking the product of two Pearson-type VII pdfs. Various structural properties of this distribution are derived, including its cdf, moments, mean deviation about the mean, mean deviation about the median, entropy, asymptotic distribution of the extreme order statistics, maximum likelihood estimates and the Fisher information matrix. Finally, an application to a Bayesian testing problem is illustrated.
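The construction itself is easy to reproduce numerically: multiply two Pearson-type VII pdfs and recover the normalizing constant by quadrature. Parameter values below are arbitrary:

```python
# Product of two Pearson-type VII densities, normalized numerically.
import numpy as np
from scipy import integrate, special

def pearson7(x, m, s):
    # Pearson type VII pdf with shape m > 1/2 and scale s > 0
    c = special.gamma(m) / (s * np.sqrt(np.pi) * special.gamma(m - 0.5))
    return c * (1 + (x / s) ** 2) ** (-m)

m1, s1, m2, s2 = 1.5, 1.0, 2.0, 0.7          # arbitrary parameter choices
norm, _ = integrate.quad(lambda x: pearson7(x, m1, s1) * pearson7(x, m2, s2),
                         -np.inf, np.inf)

def product_pdf(x):
    return pearson7(x, m1, s1) * pearson7(x, m2, s2) / norm

check, _ = integrate.quad(product_pdf, -np.inf, np.inf)
print(check)   # ~1.0, confirming the product density is properly normalized
```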
Diagnostics for insufficiencies of posterior calculations in Bayesian signal inference.
Dorn, Sebastian; Oppermann, Niels; Ensslin, Torsten A
2013-11-01
We present an error-diagnostic validation method for posterior distributions in Bayesian signal inference, advancing previous work. It transfers deviations from the correct posterior into characteristic deviations from a uniform distribution of a quantity constructed for this purpose. We show that this method is able to reveal and discriminate several kinds of numerical and approximation errors, as well as their impact on the posterior distribution. To this end, we present four typical analytical examples of posteriors with incorrect variance, skewness, position of the maximum, or normalization. We show further how this test can be applied to multidimensional signals.
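The idea of mapping posterior errors onto deviations from uniformity can be sketched with a simple conjugate-normal simulation; everything below (prior, noise level, sample sizes) is a hypothetical toy setup rather than the authors' construction. If the posterior is computed correctly, the posterior CDF evaluated at the true parameter is uniform on [0, 1], which a Kolmogorov-Smirnov test can check.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu0, tau, sigma, n = 0.0, 1.0, 1.0, 10   # prior N(mu0, tau^2), known noise sigma

u = []
for _ in range(2000):
    mu_true = rng.normal(mu0, tau)              # draw the truth from the prior
    y = rng.normal(mu_true, sigma, n)           # simulate data
    prec = 1 / tau**2 + n / sigma**2            # conjugate posterior N(m, s^2)
    m = (mu0 / tau**2 + y.sum() / sigma**2) / prec
    s = prec ** -0.5
    u.append(stats.norm.cdf(mu_true, m, s))     # posterior CDF at the truth

# For a correct posterior, u is uniform on [0, 1]; deviations flag errors.
print(stats.kstest(u, "uniform"))
```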
Bayesian data analysis for newcomers.
Kruschke, John K; Liddell, Torrin M
2018-02-01
This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
Link failure detection in a parallel computer
Archer, Charles J.; Blocksome, Michael A.; Megerian, Mark G.; Smith, Brian E.
2010-11-09
Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
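The checkerboard assignment at the heart of this claim can be sketched in a few lines; the 4x3 mesh size, coordinate representation, and set-based "message" bookkeeping are illustrative assumptions, not the patent's implementation. Parity of the coordinate sum splits every adjacent pair across the two groups, so one round of messages exercises every link.

```python
# Toy sketch of the two-group ("checkerboard") assignment on a rectangular mesh:
# nodes with even coordinate-sum parity go to group 1, odd to group 2, so every
# pair of adjacent nodes is split across groups and each link is tested once.
from itertools import product

NX, NY = 4, 3
group = {(x, y): 1 if (x + y) % 2 == 0 else 2
         for x, y in product(range(NX), range(NY))}

def neighbors(x, y):
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < NX and 0 <= y + dy < NY:
            yield (x + dx, y + dy)

# Group-1 nodes "send" a test message over each adjacent link; group-2 nodes check receipt.
links_tested = {frozenset((n, m)) for n in group if group[n] == 1
                for m in neighbors(*n)}
total_links = NX * (NY - 1) + NY * (NX - 1)
print(len(links_tested), "of", total_links, "links covered")
```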
Joshi, Aditya; Vaidyanathan, Srinivas; Mondol, Samrat; Edgaonkar, Advait; Ramakrishnan, Uma
2013-01-01
Today, most wild tigers live in small, isolated Protected Areas within human dominated landscapes in the Indian subcontinent. Future survival of tigers depends on increasing local population size, as well as maintaining connectivity between populations. While significant conservation effort has been invested in increasing tiger population size, few initiatives have focused on landscape-level connectivity and on understanding the effect different landscape elements have on maintaining connectivity. We combined individual-based genetic and landscape ecology approaches to address this issue in six protected areas with varying tiger densities and separation in the Central Indian tiger landscape. We non-invasively sampled 55 tigers from different protected areas within this landscape. Maximum-likelihood and Bayesian genetic assignment tests indicate long-range tiger dispersal (on the order of 650 km) between protected areas. Further geo-spatial analyses revealed that tiger connectivity was affected by landscape elements such as human settlements, road density and host-population tiger density, but not by distance between populations. Our results elucidate the importance of landscape and habitat viability outside and between protected areas and provide a quantitative approach to test functionality of tiger corridors. We suggest future management strategies aim to minimize urban expansion between protected areas to maximize tiger connectivity. Achieving this goal in the context of ongoing urbanization and need to sustain current economic growth exerts enormous pressure on the remaining tiger habitats and emerges as a big challenge to conserve wild tigers in the Indian subcontinent. PMID:24223132
USDA-ARS?s Scientific Manuscript database
Toxoplasma gondii infects virtually all warm-blooded animals worldwide. Serological tests, including the modified agglutination test (MAT), are often used to determine exposure to the parasite. The MAT can be used for all hosts because it does not need species-specific reagents and has been shown to...
Dynamic Bayesian Networks as a Probabilistic Metamodel for Combat Simulations
2014-09-18
A test commonly used for large data sets is the method of comparison presented in Section 5.5. Section 4.3.3 covers a Kullback-Leibler divergence goodness-of-fit test, which was proposed in the first paper as a method that might improve the results.
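As a concrete illustration of a Kullback-Leibler divergence comparison between a metamodel's output distribution and simulation data, here is a minimal sketch; the two normal samples and the binning are invented inputs, not the report's data.

```python
import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(3)
sim = rng.normal(0.0, 1.0, 5000)        # metamodel output (hypothetical)
ref = rng.normal(0.1, 1.1, 5000)        # simulation "truth" data (hypothetical)

# Discretize both samples on a common grid; entropy(p, q) returns KL(p || q).
bins = np.linspace(-5, 5, 41)
p, _ = np.histogram(sim, bins, density=True)
q, _ = np.histogram(ref, bins, density=True)
eps = 1e-12                              # avoid zero cells
print("KL divergence:", entropy(p + eps, q + eps))
```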
ERIC Educational Resources Information Center
Denbleyker, John Nickolas
2012-01-01
The shortcomings of the proportion above cut (PAC) statistic used so prominently in the educational landscape render it a very problematic measure for making correct inferences with student test data. The limitations of PAC-based statistics are more pronounced with cross-test comparisons due to their dependency on cut-score locations. A better…
Fournier, Auriel M. V.; Sullivan, Alexis R.; Bump, Joseph K.; Perkins, Marie; Shieldcastle, Mark C.; King, Sammy L.
2017-01-01
Stable hydrogen isotope (δD) methods for tracking animal movement are widely used yet often produce low resolution assignments. Incorporating prior knowledge of abundance, distribution or movement patterns can ameliorate this limitation, but data are lacking for most species. We demonstrate how observations reported by citizen scientists can be used to develop robust estimates of species distributions and to constrain δD assignments. We developed a Bayesian framework to refine isotopic estimates of migrant animal origins conditional on species distribution models constructed from citizen scientist observations. To illustrate this approach, we analysed the migratory connectivity of the Virginia rail Rallus limicola, a secretive and declining migratory game bird in North America. Citizen science observations enabled both estimation of sampling bias and construction of bias-corrected species distribution models. Conditioning δD assignments on these species distribution models yielded comparably high-resolution assignments. Most Virginia rails wintering across five Gulf Coast sites spent the previous summer near the Great Lakes, although a considerable minority originated from the Chesapeake Bay watershed or Prairie Pothole region of North Dakota. Conversely, the majority of migrating Virginia rails from a site in the Great Lakes most likely spent the previous winter on the Gulf Coast between Texas and Louisiana. Synthesis and applications. In this analysis, Virginia rail migratory connectivity does not fully correspond to the administrative flyways used to manage migratory birds. This example demonstrates that with the increasing availability of citizen science data to create species distribution models, our framework can produce high-resolution estimates of migratory connectivity for many animals, including cryptic species. Empirical evidence of links between seasonal habitats will help enable effective habitat management, hunting quotas and population monitoring and also highlight critical knowledge gaps.
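A toy version of conditioning an isotopic assignment on a species-distribution prior, on a hypothetical 1-D grid of 50 cells; the isoscape values, residual SD, and observed δD are invented for illustration and are not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical 1-D "map" of 50 cells: SDM-derived relative abundance as prior,
# and an expected feather deltaD value per cell (e.g., from a precipitation isoscape).
prior = rng.random(50); prior /= prior.sum()
expected_dD = np.linspace(-120.0, -40.0, 50)
sd_dD = 12.0                          # assumed isotopic residual SD

observed_dD = -95.0                   # one sampled bird (hypothetical)

# Likelihood of the observation in each cell, then Bayes' rule against the SDM prior.
loglik = -0.5 * ((observed_dD - expected_dD) / sd_dD) ** 2
post = prior * np.exp(loglik - loglik.max())
post /= post.sum()
print("most probable origin cell:", post.argmax())
```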
FBST for Cointegration Problems
NASA Astrophysics Data System (ADS)
Diniz, M.; Pereira, C. A. B.; Stern, J. M.
2008-11-01
In order to estimate causal relations, time series econometrics has to be aware of spurious correlation, a problem first mentioned by Yule [21]. To solve the problem, one can work with differenced series or use multivariate models like VAR or VEC models. In this case, the analysed series will present a long-run relation, i.e. a cointegration relation. Even though the Bayesian literature about inference on VAR/VEC models is quite advanced, Bauwens et al. [2] highlight that "the topic of selecting the cointegrating rank has not yet given very useful and convincing results." This paper presents the Full Bayesian Significance Test applied to cointegration rank selection tests in multivariate (VAR/VEC) time series models and shows how to implement it using data sets available in the literature as well as simulated ones. A standard non-informative prior is assumed.
Parameter Estimation for a Turbulent Buoyant Jet Using Approximate Bayesian Computation
NASA Astrophysics Data System (ADS)
Christopher, Jason D.; Wimer, Nicholas T.; Hayden, Torrey R. S.; Lapointe, Caelan; Grooms, Ian; Rieker, Gregory B.; Hamlington, Peter E.
2016-11-01
Approximate Bayesian Computation (ABC) is a powerful tool that allows sparse experimental or other "truth" data to be used for the prediction of unknown model parameters in numerical simulations of real-world engineering systems. In this presentation, we introduce the ABC approach and then use ABC to predict unknown inflow conditions in simulations of a two-dimensional (2D) turbulent, high-temperature buoyant jet. For this test case, truth data are obtained from a simulation with known boundary conditions and problem parameters. Using spatially-sparse temperature statistics from the 2D buoyant jet truth simulation, we show that the ABC method provides accurate predictions of the true jet inflow temperature. The success of the ABC approach in the present test suggests that ABC is a useful and versatile tool for engineering fluid dynamics research.
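A rejection-ABC loop in miniature, illustrating the mechanics only: the "simulator" below is a trivial stand-in (a noisy mean with an unknown offset and a uniform prior), not the 2D buoyant-jet solver, and the tolerance epsilon is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in "simulator": the unknown parameter plays the role of an inflow
# condition; the summary statistic is the mean of noisy readings (hypothetical).
def simulate(theta, n=50):
    return rng.normal(theta, 2.0, n).mean()

truth_stat = simulate(5.0)            # "truth" data from a known-parameter run

# ABC rejection: keep prior draws whose simulated summary lies within epsilon.
prior_draws = rng.uniform(0.0, 10.0, 20_000)
eps = 0.2
accepted = np.array([t for t in prior_draws
                     if abs(simulate(t) - truth_stat) < eps])
print(f"posterior mean ~ {accepted.mean():.2f} from {accepted.size} accepted draws")
```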
Bayesian operational modal analysis with asynchronous data, part I: Most probable value
NASA Astrophysics Data System (ADS)
Zhu, Yi-Chen; Au, Siu-Kui
2018-01-01
In vibration tests, multiple sensors are used to obtain detailed mode shape information about the tested structure. Time synchronisation among data channels is required in conventional modal identification approaches. Modal identification can be more flexibly conducted if this is not required. Motivated by the potential gain in feasibility and economy, this work proposes a Bayesian frequency domain method for modal identification using asynchronous 'output-only' ambient data, i.e. 'operational modal analysis'. It provides a rigorous means for identifying the global mode shape taking into account the quality of the measured data and their asynchronous nature. This paper (Part I) proposes an efficient algorithm for determining the most probable values of modal properties. The method is validated using synthetic and laboratory data. The companion paper (Part II) investigates identification uncertainty and challenges in applications to field vibration data.
Bayesian approach to MSD-based analysis of particle motion in live cells.
Monnier, Nilah; Guo, Syuan-Ming; Mori, Masashi; He, Jun; Lénárt, Péter; Bathe, Mark
2012-08-08
Quantitative tracking of particle motion using live-cell imaging is a powerful approach to understanding the mechanism of transport of biological molecules, organelles, and cells. However, inferring complex stochastic motion models from single-particle trajectories in an objective manner is nontrivial due to noise from sampling limitations and biological heterogeneity. Here, we present a systematic Bayesian approach to multiple-hypothesis testing of a general set of competing motion models based on particle mean-square displacements that automatically classifies particle motion, properly accounting for sampling limitations and correlated noise while appropriately penalizing model complexity according to Occam's Razor to avoid over-fitting. We test the procedure rigorously using simulated trajectories for which the underlying physical process is known, demonstrating that it chooses the simplest physical model that explains the observed data. Further, we show that computed model probabilities provide a reliability test for the downstream biological interpretation of associated parameter values. We subsequently illustrate the broad utility of the approach by applying it to disparate biological systems including experimental particle trajectories from chromosomes, kinetochores, and membrane receptors undergoing a variety of complex motions. This automated and objective Bayesian framework easily scales to large numbers of particle trajectories, making it ideal for classifying the complex motion of large numbers of single molecules and cells from high-throughput screens, as well as single-cell-, tissue-, and organism-level studies. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
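As a rough illustration of MSD-based model comparison, the sketch below fits two competing motion models to the time-averaged MSD of an invented Brownian trajectory and scores them with BIC; this is a crude least-squares stand-in for the paper's full Bayesian multiple-hypothesis machinery, and the diffusivity, lag range, and noise are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
# Synthetic 2-D Brownian trajectory, D = 0.5 (hypothetical units), dt = 1.
D, n = 0.5, 500
xy = np.cumsum(rng.normal(0, np.sqrt(2 * D), (n, 2)), axis=0)

# Time-averaged MSD over a range of lags.
lags = np.arange(1, 50)
msd = np.array([np.mean(np.sum((xy[L:] - xy[:-L]) ** 2, axis=1)) for L in lags])

# Compare diffusion (MSD = 4 D t) vs directed motion (MSD = 4 D t + v^2 t^2),
# using BIC on least-squares residuals as a crude model-evidence proxy.
def bic(resid, k):
    return len(resid) * np.log(np.mean(resid ** 2)) + k * np.log(len(resid))

A1 = lags[:, None].astype(float)
d_hat, *_ = np.linalg.lstsq(4 * A1, msd, rcond=None)
A2 = np.column_stack([4 * lags, lags ** 2]).astype(float)
p_hat, *_ = np.linalg.lstsq(A2, msd, rcond=None)
print("BIC diffusion:", bic(msd - 4 * d_hat[0] * lags, 1))
print("BIC directed :", bic(msd - A2 @ p_hat, 2))
```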
NASA Astrophysics Data System (ADS)
Mawardi, Muhamad Iqbal; Padmadisastra, Septiadi; Tantular, Bertho
2018-03-01
Configural Frequency Analysis (CFA) is a method for cell-wise testing in contingency tables, used in exploratory searches for types and antitypes; it detects discrepancy from the model through significant differences between observed and expected frequencies. The analysis focuses on interactions among categories from different variables, not on interactions among the variables themselves. One extension of the CFA method is Bayesian CFA, an alternative that pursues the same goal as the frequentist version with the advantages that adjustment of the experiment-wise significance level α is not necessary and that it can test whether groups of types and antitypes form composite types or composite antitypes. Hence, this research presents the concept of Bayesian CFA and shows how it works on real data. The data are based on a case study in a company concerning a decrease of the Brand Awareness & Image of motor X on the Top Of Mind Unit indicator in Cirebon City (30.8% among users and 9.8% among non-users). The B-CFA results identified four deviating configurations; one of them, configuration 2212, needs particular attention from the company when determining a promotion strategy to maintain and improve Top Of Mind Unit in Cirebon City.
Testing Bayesian and heuristic predictions of mass judgments of colliding objects
Sanborn, Adam N.
2014-01-01
Mass judgments of colliding objects have been used to explore people's understanding of the physical world because they are ecologically relevant, yet people display biases that are most easily explained by a small set of heuristics. Recent work has challenged the heuristic explanation, by producing the same biases from a model that copes with perceptual uncertainty by using Bayesian inference with a prior based on the correct combination rules from Newtonian mechanics (noisy Newton). Here I test the predictions of the leading heuristic model (Gilden and Proffitt, 1989) against the noisy Newton model using a novel manipulation of the standard mass judgment task: making one of the objects invisible post-collision. The noisy Newton model uses the remaining information to predict above-chance performance, while the leading heuristic model predicts chance performance when one or the other final velocity is occluded. An experiment using two different types of occlusion showed better-than-chance performance and response patterns that followed the predictions of the noisy Newton model. The results demonstrate that people can make sensible physical judgments even when information critical for the judgment is missing, and that a Bayesian model can serve as a guide in these situations. Possible algorithmic-level accounts of this task that more closely correspond to the noisy Newton model are explored. PMID:25206345
van der Meer, Aize Franciscus; Touw, Daniël J; Marcus, Marco A E; Neef, Cornelis; Proost, Johannes H
2012-10-01
Observational data sets can be used for population pharmacokinetic (PK) modeling. However, these data sets are generally less precisely recorded than experimental data sets. This article aims to investigate the influence of erroneous records on population PK modeling and individual maximum a posteriori Bayesian (MAPB) estimation. A total of 1123 patient records of neonates who were administered vancomycin were used for population PK modeling by iterative 2-stage Bayesian (ITSB) analysis. Cut-off values for weighted residuals were tested for exclusion of records from the analysis. A simulation study was performed to assess the influence of erroneous records on population modeling and individual MAPB estimation; the cut-off values for weighted residuals were also tested in the simulation study. Registration errors had limited influence on the outcomes of population PK modeling but can have detrimental effects on individual MAPB estimation. A population PK model created from a data set with many registration errors has little influence on subsequent MAPB estimates for precisely recorded data. A weighted residual value of 2 for concentration measurements has good discriminative power for identification of erroneous records. ITSB analysis and its individual estimates are hardly affected by most registration errors. Large registration errors can be detected by weighted residuals of concentration.
Remaining useful life assessment of lithium-ion batteries in implantable medical devices
NASA Astrophysics Data System (ADS)
Hu, Chao; Ye, Hui; Jain, Gaurav; Schmidt, Craig
2018-01-01
This paper presents a prognostic study on lithium-ion batteries in implantable medical devices, in which a hybrid data-driven/model-based method is employed for remaining useful life assessment. The method is developed on and evaluated against data from two sets of lithium-ion prismatic cells used in implantable applications exhibiting distinct fade performance: 1) eight cells from Medtronic, PLC whose rates of capacity fade appear to be stable and gradually decrease over a 10-year test duration; and 2) eight cells from Manufacturer X whose rates appear to be greater and show sharp increase after some period over a 1.8-year test duration. The hybrid method enables online prediction of remaining useful life for predictive maintenance/control. It consists of two modules: 1) a sparse Bayesian learning module (data-driven) for inferring capacity from charge-related features; and 2) a recursive Bayesian filtering module (model-based) for updating empirical capacity fade models and predicting remaining useful life. A generic particle filter is adopted to implement recursive Bayesian filtering for the cells from the first set, whose capacity fade behavior can be represented by a single fade model; a multiple model particle filter with fixed-lag smoothing is proposed for the cells from the second data set, whose capacity fade behavior switches between multiple fade models.
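For flavor, a bootstrap particle filter in miniature: it tracks a single fade-rate parameter of an assumed exponential capacity-fade model on synthetic data and extrapolates remaining useful life to an assumed 80% end-of-life threshold. The model form, noise levels, and threshold are illustrative assumptions, far simpler than the paper's multiple-model filter with fixed-lag smoothing.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic capacity data from an assumed exponential fade model C_k = exp(-b*k).
b_true, K, noise = 0.0008, 200, 0.005
obs = np.exp(-b_true * np.arange(K)) + rng.normal(0, noise, K)

# Bootstrap particle filter over the fade-rate parameter b.
N = 2000
b = rng.uniform(0.0002, 0.002, N)               # particles from a vague prior
w = np.full(N, 1.0 / N)
for k in range(K):
    b += rng.normal(0, 2e-6, N)                 # artificial parameter evolution
    w *= np.exp(-0.5 * ((obs[k] - np.exp(-b * k)) / noise) ** 2)
    w /= w.sum()
    if 1.0 / np.sum(w ** 2) < N / 2:            # resample when ESS drops
        idx = rng.choice(N, N, p=w)
        b, w = b[idx], np.full(N, 1.0 / N)

b_est = np.sum(w * b)
t_eol = np.log(1.0 / 0.8) / b_est               # cycle count at the 80% threshold
print(f"b ~ {b_est:.2e}; predicted RUL: {t_eol - K:.0f} cycles")
```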
Microcomputer Network for Computerized Adaptive Testing (CAT)
1984-03-01
NPRDC TR 84-33. Quan, Baldwin; Park, Thomas A.; Sandahl, Gary; John H. [surname illegible]. Keywords: computerized adaptive testing (CAT); Bayesian sequential testing. The remainder of this record is an unrecoverable scan of the report documentation page.
NASA Astrophysics Data System (ADS)
Skataric, Maja; Bose, Sandip; Zeroug, Smaine; Tilke, Peter
2017-02-01
It is not uncommon in the field of non-destructive evaluation that multiple measurements encompassing a variety of modalities are available for analysis and interpretation for determining the underlying states of nature of the materials or parts being tested. Despite, and sometimes due to, the richness of data, significant challenges arise in the interpretation, manifested as ambiguities and inconsistencies due to various uncertain factors in the physical properties (inputs), environment, measurement device properties, human errors, and the measurement data (outputs). Most of these uncertainties cannot be described by any rigorous mathematical means, and modeling of all possibilities is usually infeasible for many real-time applications. In this work, we will discuss an approach based on Hierarchical Bayesian Graphical Models (HBGM) for the improved interpretation of complex (multi-dimensional) problems with parametric uncertainties that lack usable physical models. In this setting, the input space of the physical properties is specified through prior distributions based on domain knowledge and expertise, which are represented as Gaussian mixtures to model the various possible scenarios of interest for non-destructive testing applications. Forward models are then used offline to generate the expected distribution of the proposed measurements, which are used to train a hierarchical Bayesian network. In Bayesian analysis, all model parameters are treated as random variables, and inference of the parameters is made on the basis of the posterior distribution given the observed data. The parameters of the posterior distribution learned during training can therefore be used to build an efficient classifier for differentiating new observed data in real time on the basis of pre-trained models. We will illustrate the implementation of the HBGM approach to ultrasonic measurements used for cement evaluation of cased wells in the oil industry.
NASA Astrophysics Data System (ADS)
Titus, Benjamin M.; Daly, Marymegan
2017-03-01
Specialist and generalist life histories are expected to result in contrasting levels of genetic diversity at the population level, and symbioses are expected to lead to patterns that reflect a shared biogeographic history and co-diversification. We test these assumptions using mtDNA sequencing and a comparative phylogeographic approach for six co-occurring crustacean species that are symbiotic with sea anemones on western Atlantic coral reefs, yet vary in their host specificities: four are host specialists and two are host generalists. We first conducted species discovery analyses to delimit cryptic lineages, followed by classic population genetic diversity analyses for each delimited taxon, and then reconstructed the demographic history for each taxon using traditional summary statistics, Bayesian skyline plots, and approximate Bayesian computation to test for signatures of recent and concerted population expansion. The genetic diversity values recovered here contravene the expectations of the specialist-generalist variation hypothesis and classic population genetics theory; all specialist lineages had greater genetic diversity than generalists. Demography suggests recent population expansions in all taxa, although Bayesian skyline plots and approximate Bayesian computation suggest the timing and magnitude of these events were idiosyncratic. These results do not meet the a priori expectation of concordance among symbiotic taxa and suggest that intrinsic aspects of species biology may contribute more to phylogeographic history than extrinsic forces that shape whole communities. The recovery of two cryptic specialist lineages adds an additional layer of biodiversity to this symbiosis and contributes to an emerging pattern of cryptic speciation in the specialist taxa. Our results underscore the differences in the evolutionary processes acting on marine systems from the terrestrial processes that often drive theory. Finally, we continue to highlight the Florida Reef Tract as an important biodiversity hotspot.
Bayesian Inference for Signal-Based Seismic Monitoring
NASA Astrophysics Data System (ADS)
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sigeti, David E.; Pelak, Robert A.
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that θ is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a general beta-function prior for θ, enabling sequential analysis in which a small number of new simulations may be done and the resulting posterior for θ used as a prior to inform the next stage of power analysis.
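The binomial model described here is easy to reproduce. The sketch below uses a uniform Beta(1, 1) prior (a special case of the paper's general beta prior) and Monte Carlo simulation to estimate the pre-data probability of reaching 95% confidence that θ > 1/2, as a function of the true θ and the sample size; the grid of θ values and sample sizes is illustrative.

```python
import numpy as np
from scipy import stats

# Posterior for theta under a uniform Beta(1, 1) prior after s successes in
# n trials is Beta(1 + s, 1 + n - s); we call a result "confident" when
# P(theta > 1/2 | data) >= 0.95, and estimate the pre-data probability of
# reaching that confidence as a function of the true theta and sample size n.
def predictive_prob(theta_true, n, level=0.95, trials=20_000, seed=0):
    rng = np.random.default_rng(seed)
    s = rng.binomial(n, theta_true, trials)                  # simulated outcomes
    p_better = stats.beta.sf(0.5, 1.0 + s, 1.0 + n - s)      # P(theta > 1/2 | data)
    return np.mean(p_better >= level)

for theta_true in (0.55, 0.70, 0.90):
    probs = [predictive_prob(theta_true, n) for n in (10, 30, 100)]
    print(theta_true, [round(p, 3) for p in probs])
```

Running this shows the behavior the abstract describes: near θ = 1/2 the chance of demonstrating an improvement stays small even for moderate sample sizes, while it approaches one for θ well above 1/2.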
Vexler, Albert; Tanajian, Hovig; Hutson, Alan D
In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K-sample distributions. Recognizing that recent statistical software packages do not sufficiently address K-sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p-values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p-value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p-value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
Isotropy of low redshift type Ia supernovae: A Bayesian analysis
NASA Astrophysics Data System (ADS)
Andrade, U.; Bengaly, C. A. P.; Alcaniz, J. S.; Santos, B.
2018-04-01
The standard cosmology strongly relies upon the cosmological principle, which consists of the hypotheses of large-scale isotropy and homogeneity of the Universe. Testing these assumptions is, therefore, crucial to determining whether there are deviations from the standard cosmological paradigm. In this paper, we use the latest type Ia supernova compilations, namely JLA and Union2.1, to test the cosmological isotropy at low redshift ranges (z < 0.1). This is performed through a Bayesian model selection analysis, in which we compare the standard, isotropic model with one including a dipole correction due to peculiar velocities. The full covariance matrix of SN distance uncertainties is taken into account. We find that the JLA sample favors the standard model, whilst the Union2.1 results are inconclusive, yet the constraints from both compilations are in agreement with previous analyses. We conclude that there is no evidence for a dipole anisotropy from nearby supernova compilations, albeit this test should be greatly improved with the much larger data sets from upcoming cosmological surveys.
The Development of Bayesian Theory and Its Applications in Business and Bioinformatics
NASA Astrophysics Data System (ADS)
Zhang, Yifei
2018-03-01
Bayesian theory originated from an essay by the British mathematician Thomas Bayes, published in 1763; after its development in the 20th century, Bayesian statistics has taken a significant part in statistical study across all fields. Due to recent breakthroughs in high-dimensional integration, Bayesian statistics has been improved and perfected, and it can now be used to solve problems that classical statistics failed to solve. This paper summarizes the history, concepts and applications of Bayesian statistics in five parts: the history of Bayesian statistics, the weaknesses of classical statistics, Bayesian theory, its development, and its applications. The first two parts compare Bayesian statistics and classical statistics from a macroscopic perspective, and the last three parts focus on Bayesian theory in detail -- from introducing particular Bayesian concepts to outlining their development and, finally, their applications.
Bayesian demography 250 years after Bayes
Bijak, Jakub; Bryant, John
2016-01-01
Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889
Advances in the Application of Decision Theory to Test-Based Decision Making.
ERIC Educational Resources Information Center
van der Linden, Wim J.
This paper reviews recent research in the Netherlands on the application of decision theory to test-based decision making about personnel selection and student placement. The review is based on an earlier model proposed for the classification of decision problems, and emphasizes an empirical Bayesian framework. Classification decisions with…
A Tutorial on Multiple Testing: False Discovery Control
NASA Astrophysics Data System (ADS)
Chatelain, F.
2016-09-01
This paper presents an overview of criteria and methods in multiple testing, with an emphasis on the false discovery rate control. The popular Benjamini and Hochberg procedure is described. The rationale for this approach is explained through a simple Bayesian interpretation. Some state-of-the-art variations and extensions are also presented.
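For reference, a minimal implementation of the Benjamini and Hochberg step-up procedure described here; the p-values at the end are made-up illustrative inputs.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of discoveries at FDR level q (BH step-up procedure)."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    thresh = q * np.arange(1, m + 1) / m         # BH critical values i*q/m
    below = p[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True                       # reject the k smallest p-values
    return mask

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.36]
print(benjamini_hochberg(pvals, q=0.05))         # only the two smallest survive
```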
A large scale test of the gaming-enhancement hypothesis
Wang, John C.
2016-01-01
A growing research literature suggests that regular electronic game play and game-based training programs may confer practically significant benefits to cognitive functioning. Most evidence supporting this idea, the gaming-enhancement hypothesis, has been collected in small-scale studies of university students and older adults. This research investigated the hypothesis in a general way with a large sample of 1,847 school-aged children. Our aim was to examine the relations between young people's gaming experiences and an objective test of reasoning performance. Using a Bayesian hypothesis testing approach, evidence for the gaming-enhancement and null hypotheses was compared. Results provided no substantive evidence supporting the idea that having a preference for or regularly playing commercially available games was positively associated with reasoning ability. Evidence ranged from equivocal to very strong in support of the null hypothesis over the predicted effect. The discussion focuses on the value of Bayesian hypothesis testing for investigating electronic gaming effects, the importance of open science practices, and pre-registered designs to improve the quality of future work. PMID:27896035
The Long Exercise Test in Periodic Paralysis: A Bayesian Analysis.
Simmons, Daniel B; Lanning, Julie; Cleland, James C; Puwanant, Araya; Twydell, Paul T; Griggs, Robert C; Tawil, Rabi; Logigian, Eric L
2018-05-12
The long exercise test (LET) is used to assess the diagnosis of periodic paralysis (PP), but LET methodology and normal "cut-off" values vary. To determine optimal LET methodology and cut-offs, we reviewed LET data (abductor digiti minimi (ADM) motor response amplitude, area) from 55 PP patients (32 genetically definite) and 125 controls. Receiver operating characteristic (ROC) curves were constructed and area-under-the-curve (AUC) calculated to compare 1) peak-to-nadir versus baseline-to-nadir methodologies, and 2) amplitude versus area decrements. Using Bayesian principles, optimal "cut-off" decrements that achieved 95% post-test probability of PP were calculated for various pre-test probabilities (PreTPs). AUC was highest for peak-to-nadir methodology and equal for amplitude and area decrements. For PreTP ≤50%, optimal decrement cut-offs (peak-to-nadir) were >40% (amplitude) or >50% (area). For confirmation of PP, our data endorse the diagnostic utility of peak-to-nadir LET methodology using 40% amplitude or 50% area decrement cut-offs for PreTPs ≤50%. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.
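The Bayesian step, from pre-test probability through a likelihood ratio to post-test probability, can be written in a few lines. The sensitivity and specificity below are hypothetical illustrative values, not the actual operating characteristics of the decrement cut-offs studied.

```python
# Bayes' rule in odds form: post-test odds = pre-test odds x likelihood ratio.
# Hypothetical sensitivity/specificity for a decrement cut-off, for illustration.
def post_test_probability(pretest, sensitivity, specificity):
    lr_positive = sensitivity / (1.0 - specificity)
    odds = pretest / (1.0 - pretest) * lr_positive
    return odds / (1.0 + odds)

for pretest in (0.1, 0.3, 0.5):
    p = post_test_probability(pretest, sensitivity=0.70, specificity=0.98)
    print(f"pre-test {pretest:.0%} -> post-test {p:.0%}")
```

With these invented numbers, only at a 50% pre-test probability does a positive result push the post-test probability past 95%, which mirrors the kind of pre-test-dependent cut-off reasoning the abstract describes.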
Fuentes-Contreras, Eduardo; Basoalto, Esteban; Franck, Pierre; Lavandero, Blas; Knight, Alan L; Ramírez, Claudio C
2014-04-01
The genetic structure of adult codling moth, Cydia pomonella (L.), populations was characterized both inside a managed apple, Malus domestica Borkdhausen, orchard and in surrounding unmanaged hosts and nonhost trees in central Chile during 2006-2007. Adult males were collected using an array of sex pheromone-baited traps. Five microsatellite genetic markers were used to study the population genetic structure across both spatial (1-100 ha) and temporal (generations within a season) gradients. Analysis of molecular variance (AMOVA) found a significant, but weak, association in both the spatial and temporal genetic structures. Discriminant analysis also found significant differentiation between the first and second generation for traps located either inside or outside the managed orchard. The Bayesian assignment test detected three genetic clusters during each of the two generations, which corresponded to different areas within the unmanaged and managed apple orchard interface. The lack of a strong spatial structure at a local scale was hypothesized to result from active adult movement between the managed and unmanaged hosts and the asymmetry in the insecticide selection pressure inside and outside the managed habitats. These data highlight the importance of developing area-wide management programs that incorporate management tactics effective at the landscape level for successful codling moth control.
Kimura, Yuri; Hawkins, Melissa T R; McDonough, Molly M; Jacobs, Louis L; Flynn, Lawrence J
2015-09-28
Time calibration derived from the fossil record is essential for molecular phylogenetic and evolutionary studies. Fossil mice and rats, discovered in the Siwalik Group of Pakistan, have served as one of the best-known fossil calibration points in molecular phylogenic studies. Although these fossils have been widely used as the 12 Ma date for the Mus/Rattus split or a more basal split, conclusive paleontological evidence for the nodal assignments has been absent. This study analyzes newly recognized characters that demonstrate lineage separation in the fossil record of Siwalik murines and examines the most reasonable nodal placement of the diverging lineages in a molecular phylogenetic tree by ancestral state reconstruction. Our specimen-based approach strongly indicates that Siwalik murines of the Karnimata clade are fossil members of the Arvicanthini-Otomyini-Millardini clade, which excludes Rattus and its relatives. Combining the new interpretation with the widely accepted hypothesis that the Progonomys clade includes Mus, the lineage separation event in the Siwalik fossil record represents the Mus/Arvicanthis split. Our test analysis on Bayesian age estimates shows that this new calibration point provides more accurate estimates of murine divergence than previous applications. Thus, we define this fossil calibration point and refine two other fossil-based points for molecular dating.
Sample Size for Tablet Compression and Capsule Filling Events During Process Validation.
Charoo, Naseem Ahmad; Durivage, Mark; Rahman, Ziyaur; Ayad, Mohamad Haitham
2017-12-01
During solid dosage form manufacturing, the uniformity of dosage units (UDU) is ensured by testing samples at 2 stages, that is, blend stage and tablet compression or capsule/powder filling stage. The aim of this work is to propose a sample size selection approach based on quality risk management principles for process performance qualification (PPQ) and continued process verification (CPV) stages by linking UDU to potential formulation and process risk factors. Bayes success run theorem appeared to be the most appropriate approach among various methods considered in this work for computing sample size for PPQ. The sample sizes for high-risk (reliability level of 99%), medium-risk (reliability level of 95%), and low-risk factors (reliability level of 90%) were estimated to be 299, 59, and 29, respectively. Risk-based assignment of reliability levels was supported by the fact that at low defect rate, the confidence to detect out-of-specification units would decrease which must be supplemented with an increase in sample size to enhance the confidence in estimation. Based on level of knowledge acquired during PPQ and the level of knowledge further required to comprehend process, sample size for CPV was calculated using Bayesian statistics to accomplish reduced sampling design for CPV. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
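The quoted sample sizes follow from the Bayes success-run relation n = ln(1 - C)/ln(R) for zero failures; this sketch reproduces the 299/59/29 figures with confidence C = 0.95.

```python
import math

# Bayes success-run theorem: the number of consecutive passing units needed to
# demonstrate reliability R with confidence C (zero failures) is n = ln(1-C)/ln(R).
def success_run_n(reliability, confidence=0.95):
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

for r in (0.99, 0.95, 0.90):         # high-, medium-, low-risk reliability levels
    print(f"reliability {r:.0%}: n = {success_run_n(r)}")
```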
Martínez-Díaz, Yesenia; González-Rodríguez, Antonio; Rico-Ponce, Héctor Rómulo; Rocha-Ramírez, Víctor; Ovando-Medina, Isidro; Espinosa-García, Francisco J
2017-01-01
Jatropha curcas L. (Euphorbiaceae) is a shrub native to Mexico and Central America, which produces seeds with a high oil content that can be converted to biodiesel. The genetic diversity of this plant has been widely studied, but it is not known whether the diversity of the seed oil chemical composition correlates with neutral genetic diversity. The total seed oil content and the diversity of fatty acid and phorbol ester profiles were quantified, and the genetic diversity obtained from simple sequence repeats was analyzed in native populations of J. curcas in Mexico. Using the fatty acid profiles, a discriminant analysis recognized three groups of individuals according to geographical origin. Bayesian assignment analysis revealed two genetic groups, while the genetic structure of the populations could not be explained by isolation-by-distance. Genetic and fatty acid profile data were not correlated based on a Mantel test. Also, phorbol ester content and genetic diversity were not associated. Multiple linear regression analysis showed that total oil content was associated with altitude and seasonality of temperature. The content of unsaturated fatty acids was associated with altitude. Therefore, the cultivation planning of J. curcas should take into account chemical variation related to environmental factors. © 2017 Wiley-VHCA AG, Zurich, Switzerland.
Using population genetic tools to develop a control strategy for feral cats (Felis catus) in Hawai'i
Hansen, H.; Hess, S.C.; Cole, D.; Banko, P.C.
2007-01-01
Population genetics can provide information about the demographics and dynamics of invasive species that is beneficial for developing effective control strategies. We studied the population genetics of feral cats on Hawai'i Island by microsatellite analysis to evaluate genetic diversity and population structure, assess gene flow and connectivity among three populations, identify potential source populations, characterise population dynamics, and evaluate sex-biased dispersal. High genetic diversity, low structure, and high number of migrants per generation supported high gene flow that was not limited spatially. Migration rates revealed that most migration occurred out of West Mauna Kea. Effective population size estimates indicated increasing cat populations despite control efforts. Despite high gene flow, relatedness estimates declined significantly with increased geographic distance and Bayesian assignment tests revealed the presence of three population clusters. Genetic structure and relatedness estimates indicated male-biased dispersal, primarily from Mauna Kea, suggesting that this population should be targeted for control. However, recolonisation seems likely, given the great dispersal ability that may not be inhibited by barriers such as lava flows. Genetic monitoring will be necessary to assess the effectiveness of future control efforts. Management of other invasive species may benefit by employing these population genetic tools. © CSIRO 2007.
Samberg, Leah H; Fishman, Lila; Allendorf, Fred W
2013-01-01
Conservation strategies are increasingly driven by our understanding of the processes and patterns of gene flow across complex landscapes. The expansion of population genetic approaches into traditional agricultural systems requires understanding how social factors contribute to that landscape, and thus to gene flow. This study incorporates extensive farmer interviews and population genetic analysis of barley landraces (Hordeum vulgare) to build a holistic picture of farmer-mediated gene flow in an ancient, traditional agricultural system in the highlands of Ethiopia. We analyze barley samples at 14 microsatellite loci across sites at varying elevations and locations across a contiguous mountain range, and across farmer-identified barley types and management strategies. Genetic structure is analyzed using population-based and individual-based methods, including measures of population differentiation and genetic distance, multivariate Principal Coordinate Analysis, and Bayesian assignment tests. Phenotypic analysis links genetic patterns to traits identified by farmers. We find that differential farmer management strategies lead to markedly different patterns of population structure across elevation classes and barley types. The extent to which farmer seed management appears as a stronger determinant of spatial structure than the physical landscape highlights the need for incorporation of social, landscape, and genetic data for the design of conservation strategies in human-influenced landscapes. PMID:24478796
Huang, Jie; Chen, Zigui; Song, Weibo; Berger, Helmut
2014-01-01
Classifications of the Urostyloidea were mainly based on morphology and morphogenesis. Since molecular phylogenies have largely been based on limited sampling, mostly using single-gene information, incongruence between morphological data and gene sequences has arisen. In this work, the three-gene data (SSU-rDNA, ITS1-5.8S-ITS2 and LSU-rDNA) comprising 12 genera in the “core urostyloids” are sequenced, and the phylogenies based on these different markers are compared using maximum-likelihood and Bayesian algorithms and tested by unconstrained and constrained analyses. The molecular phylogeny supports the following conclusions: (1) the monophyly of the core group of Urostyloidea is well supported while the whole Urostyloidea is not monophyletic; (2) Thigmokeronopsis and Apokeronopsis are clearly separated from the pseudokeronopsids in analyses of all three gene markers, supporting their exclusion from the Pseudokeronopsidae and the inclusion in the Urostylidae; (3) Diaxonella and Apobakuella should be assigned to the Urostylidae; (4) Bergeriella, Monocoronella and Neourostylopsis flavicana share a most recent common ancestor; (5) all molecular trees support the transfer of Metaurostylopsis flavicana to the recently proposed genus Neourostylopsis; (6) all molecular phylogenies fail to separate the morphologically well-defined genera Uroleptopsis and Pseudokeronopsis; and (7) Arcuseries gen. nov. containing three distinctly deviating Anteholosticha species is established. PMID:24140978
Somers, Christopher M; Graham, Carly F; Martino, Jessica A; Frasier, Timothy R; Lance, Stacey L; Gardiner, Laura E; Poulin, Ray G
2017-01-01
On the North American Great Plains, several snake species reach their northern range limit where they rely on sparsely distributed hibernacula located in major river valleys. Independent colonization histories for the river valleys and barriers to gene flow caused by the lack of suitable habitat between them may have produced genetically differentiated snake populations. To test this hypothesis, we used 10 microsatellite loci to examine the population structure of two species of conservation concern in Canada: the eastern yellow-bellied racer (Coluber constrictor flaviventris) and bullsnake (Pituophis catenifer sayi) in 3 major river valleys in southern Saskatchewan. Fixation indices (FST) showed that populations in river valleys were significantly differentiated for both species (racers, FST = 0.096, P = 0.001; bullsnakes FST = 0.045-0.157, P = 0.001). Bayesian assignment (STRUCTURE) and ordination (DAPC) strongly supported genetically differentiated groups in the geographically distinct river valleys. Finer-scale subdivision of populations within river valleys was not apparent based on our data, but is a topic that should be investigated further. Our findings highlight the importance of major river valleys for snakes at the northern extent of their ranges, and raise the possibility that populations in each river valley may warrant separate management strategies.
A Bayesian framework for knowledge attribution: evidence from semantic integration.
Powell, Derek; Horne, Zachary; Pinillos, N Ángel; Holyoak, Keith J
2015-06-01
We propose a Bayesian framework for the attribution of knowledge, and apply this framework to generate novel predictions about knowledge attribution for different types of "Gettier cases", in which an agent is led to a justified true belief yet has made erroneous assumptions. We tested these predictions using a paradigm based on semantic integration. We coded the frequencies with which participants falsely recalled the word "thought" as "knew" (or a near synonym), yielding an implicit measure of conceptual activation. Our experiments confirmed the predictions of our Bayesian account of knowledge attribution across three experiments. We found that Gettier cases due to counterfeit objects were not treated as knowledge (Experiment 1), but those due to intentionally-replaced evidence were (Experiment 2). Our findings are not well explained by an alternative account focused only on luck, because accidentally-replaced evidence activated the knowledge concept more strongly than did similar false belief cases (Experiment 3). We observed a consistent pattern of results across a number of different vignettes that varied the quality and type of evidence available to agents, the relative stakes involved, and surface details of content. Accordingly, the present findings establish basic phenomena surrounding people's knowledge attributions in Gettier cases, and provide explanations of these phenomena within a Bayesian framework. Copyright © 2015 Elsevier B.V. All rights reserved.
Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer
2018-01-01
This contribution presents a novel methodology for myoelectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.
Natural frequencies improve Bayesian reasoning in simple and complex inference tasks
Hoffrage, Ulrich; Krauss, Stefan; Martignon, Laura; Gigerenzer, Gerd
2015-01-01
Representing statistical information in terms of natural frequencies rather than probabilities improves performance in Bayesian inference tasks. This beneficial effect of natural frequencies has been demonstrated in a variety of applied domains such as medicine, law, and education. Yet all the research and applications so far have been limited to situations where one dichotomous cue is used to infer which of two hypotheses is true. Real-life applications, however, often involve situations where cues (e.g., medical tests) have more than one value, where more than two hypotheses (e.g., diseases) are considered, or where more than one cue is available. In Study 1, we show that natural frequencies, compared to information stated in terms of probabilities, consistently increase the proportion of Bayesian inferences made by medical students in four conditions—three cue values, three hypotheses, two cues, or three cues—by an average of 37 percentage points. In Study 2, we show that teaching natural frequencies for simple tasks with one dichotomous cue and two hypotheses leads to a transfer of learning to complex tasks with three cue values and two cues, with a proportion of 40 and 81% correct inferences, respectively. Thus, natural frequencies facilitate Bayesian reasoning in a much broader class of situations than previously thought. PMID:26528197
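A minimal contrast of the two formats, with invented numbers (1% prevalence, 80% sensitivity, 10% false-positive rate): both routes give the same answer, but the natural-frequency version mirrors the count-based reasoning these studies teach.

```python
# The same inference in probability format vs natural-frequency format,
# with hypothetical numbers: prevalence 1%, sensitivity 80%, false-positive rate 10%.
prevalence, sens, fpr = 0.01, 0.80, 0.10

# Probability format (Bayes' rule):
p_pos = prevalence * sens + (1 - prevalence) * fpr
print("P(disease | positive) =", prevalence * sens / p_pos)

# Natural-frequency format: imagine 1,000 people.
n = 1000
sick = n * prevalence                 # 10 are sick
true_pos = sick * sens                # 8 of them test positive
false_pos = (n - sick) * fpr          # 99 healthy people also test positive
print("8 of", int(true_pos + false_pos), "positives are sick:",
      true_pos / (true_pos + false_pos))
```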
Spatial distribution of psychotic disorders in an urban area of France: an ecological study.
Pignon, Baptiste; Schürhoff, Franck; Baudin, Grégoire; Ferchiou, Aziz; Richard, Jean-Romain; Saba, Ghassen; Leboyer, Marion; Kirkbride, James B; Szöke, Andrei
2016-05-18
Previous analyses of neighbourhood variations of non-affective psychotic disorders (NAPD) have focused mainly on incidence. However, prevalence studies provide important insights on factors associated with disease evolution as well as for healthcare resource allocation. This study aimed to investigate the distribution of prevalent NAPD cases in an urban area in France. The number of cases in each neighbourhood was modelled as a function of potential confounders and ecological variables, namely: migrant density, economic deprivation and social fragmentation. This was modelled using statistical models of increasing complexity: frequentist models (using Poisson and negative binomial regressions), and several Bayesian models. For each model, the validity of its assumptions was checked, and the models were compared on goodness of fit to the data, in order to test for possible spatial variation in prevalence. Data showed significant overdispersion (invalidating the Poisson regression model) and residual autocorrelation (suggesting the need to use Bayesian models). The best Bayesian model was Leroux's model (i.e. a model with both strong correlation between neighbouring areas and weaker correlation between areas further apart), with economic deprivation as an explanatory variable (OR = 1.13, 95% CI [1.02-1.25]). In comparison with frequentist methods, the Bayesian model showed a better fit. The number of cases showed non-random spatial distribution and was linked to economic deprivation.
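To make the frequentist model-checking step concrete, the following sketch, assuming illustrative neighbourhood data and the statsmodels library, fits a Poisson regression with a population offset, computes the Pearson dispersion statistic that would flag overdispersion, and refits with a negative binomial family; the variable names are placeholders, and the spatial Bayesian models (e.g., Leroux's) are beyond this sketch.

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "cases": [12, 30, 7, 25, 41, 9],                  # prevalent cases per neighbourhood
    "deprivation": [0.2, 1.1, -0.5, 0.8, 1.9, -0.1],  # deprivation index
    "population": [5000, 8000, 3000, 7000, 9000, 2500],
})
X = sm.add_constant(df[["deprivation"]])
offset = np.log(df["population"])                     # exposure offset

poisson = sm.GLM(df["cases"], X, family=sm.families.Poisson(), offset=offset).fit()
dispersion = poisson.pearson_chi2 / poisson.df_resid
print(f"dispersion = {dispersion:.2f}")               # values well above 1 invalidate Poisson

negbin = sm.GLM(df["cases"], X, family=sm.families.NegativeBinomial(), offset=offset).fit()
print(np.exp(negbin.params["deprivation"]))           # rate ratio for deprivation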
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Jiangjiang; Li, Weixuan; Zeng, Lingzao
Surrogate models are commonly used in Bayesian approaches such as Markov Chain Monte Carlo (MCMC) to avoid repetitive CPU-demanding model evaluations. However, the approximation error of a surrogate may lead to biased estimations of the posterior distribution. This bias can be corrected by constructing a very accurate surrogate or implementing MCMC in a two-stage manner. Since the two-stage MCMC requires extra original model evaluations, the computational cost is still high. If the information of measurement is incorporated, a locally accurate approximation of the original model can be adaptively constructed with low computational cost. Based on this idea, we propose a Gaussian process (GP) surrogate-based Bayesian experimental design and parameter estimation approach for groundwater contaminant source identification problems. A major advantage of the GP surrogate is that it provides a convenient estimation of the approximation error, which can be incorporated in the Bayesian formula to avoid over-confident estimation of the posterior distribution. The proposed approach is tested with a numerical case study. Without sacrificing the estimation accuracy, the new approach achieves a speed-up of about 200 times compared to our previous work using two-stage MCMC.
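A minimal sketch of the central idea, assuming a toy one-parameter forward model and scikit-learn's GaussianProcessRegressor in place of the authors' surrogate: the GP's predictive standard deviation is added to the measurement noise in the Gaussian likelihood, so the posterior widens wherever the surrogate is poorly trained.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def forward_model(theta):                  # stand-in for a CPU-demanding simulator
    return np.sin(3.0 * theta) + theta

rng = np.random.default_rng(0)
design = rng.uniform(0, 2, size=(12, 1))   # a few expensive model runs
gp = GaussianProcessRegressor(normalize_y=True).fit(design, forward_model(design[:, 0]))

y_obs, sigma_obs = 1.3, 0.05               # measurement and its noise (illustrative)

def log_likelihood(theta):
    mu, sd = gp.predict(np.atleast_2d(theta), return_std=True)
    total_var = sigma_obs**2 + sd[0]**2    # data noise plus surrogate error
    return -0.5 * ((y_obs - mu[0])**2 / total_var + np.log(2 * np.pi * total_var))

grid = np.linspace(0, 2, 400)              # grid posterior under a uniform prior
post = np.exp([log_likelihood(t) for t in grid])
post /= np.trapz(post, grid)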
NASA Astrophysics Data System (ADS)
Gopalan, Giri; Hrafnkelsson, Birgir; Aðalgeirsdóttir, Guðfinna; Jarosch, Alexander H.; Pálsson, Finnur
2018-03-01
Bayesian hierarchical modeling can assist the study of glacial dynamics and ice flow properties. This approach will allow glaciologists to make fully probabilistic predictions for the thickness of a glacier at unobserved spatio-temporal coordinates, and it will also allow for the derivation of posterior probability distributions for key physical parameters such as ice viscosity and basal sliding. The goal of this paper is to develop a proof of concept for a Bayesian hierarchical model that uses exact analytical solutions for the shallow ice approximation (SIA) introduced by Bueler et al. (2005). A suite of test simulations utilizing these exact solutions suggests that this approach is able to adequately model numerical errors and produce useful physical parameter posterior distributions and predictions. A byproduct of the development of the Bayesian hierarchical model is the derivation of a novel finite difference method for solving the SIA partial differential equation (PDE). An additional novelty of this work is the correction of numerical errors induced through a numerical solution using a statistical model. This error correcting process models numerical errors that accumulate forward in time and spatial variation of numerical errors between the dome, interior, and margin of a glacier.
Shankle, William R; Pooley, James P; Steyvers, Mark; Hara, Junko; Mangrola, Tushar; Reisberg, Barry; Lee, Michael D
2013-01-01
Determining how cognition affects functional abilities is important in Alzheimer disease and related disorders. A total of 280 patients (normal, or with Alzheimer disease and related disorders) received a total of 1514 assessments using the functional assessment staging test (FAST) procedure and the MCI Screen. A hierarchical Bayesian cognitive processing model was created by embedding a signal detection theory model of the MCI Screen delayed recognition memory task into a hierarchical Bayesian framework. The signal detection theory model used latent parameters of discriminability (memory process) and response bias (executive function) to predict, simultaneously, recognition memory performance for each patient and each FAST severity group. The observed recognition memory data did not distinguish the 6 FAST severity stages, but the latent parameters completely separated them. The latent parameters were also used successfully to transform the ordinal FAST measure into a continuous measure reflecting the underlying continuum of functional severity. Hierarchical Bayesian cognitive processing models applied to recognition memory data from clinical practice settings accurately translated a latent measure of cognition into a continuous measure of functional severity for both individuals and FAST groups. Such a translation links 2 levels of brain information processing and may enable more accurate correlations with other levels, such as those characterized by biomarkers.
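For readers unfamiliar with the signal detection quantities, the following sketch computes the equal-variance discriminability (d') and response bias (criterion) from hit and false-alarm rates; the rates are illustrative, and the paper's hierarchical model estimates these latent parameters jointly across patients and FAST groups rather than from point rates.

from scipy.stats import norm

def sdt_parameters(hit_rate, false_alarm_rate):
    z_h, z_f = norm.ppf(hit_rate), norm.ppf(false_alarm_rate)
    d_prime = z_h - z_f               # discriminability (memory process)
    criterion = -0.5 * (z_h + z_f)    # response bias (executive function)
    return d_prime, criterion

print(sdt_parameters(0.85, 0.20))     # e.g., d' ~ 1.88, criterion ~ -0.10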
Bayesian approach for assessing non-inferiority in a three-arm trial with pre-specified margin.
Ghosh, Samiran; Ghosh, Santu; Tiwari, Ram C
2016-02-28
Non-inferiority trials are becoming increasingly popular for comparative effectiveness research. However, inclusion of the placebo arm, whenever possible, gives rise to a three-arm trial, which rests on less burdensome assumptions than a standard two-arm non-inferiority trial. Most of the past developments in a three-arm trial consider defining a pre-specified fraction of the unknown effect size of the reference drug, that is, without directly specifying a fixed non-inferiority margin. However, in some recent developments, a more direct approach is being considered with a pre-specified fixed margin, albeit in the frequentist setup. The Bayesian paradigm provides a natural path to integrate historical and current trials' information via sequential learning. In this paper, we propose a Bayesian approach for simultaneous testing of non-inferiority and assay sensitivity in a three-arm trial with normal responses. For the experimental arm, in the absence of historical information, non-informative priors are assumed under two situations, namely when (i) variance is known and (ii) variance is unknown. A Bayesian decision criterion is derived and compared with the frequentist method using simulation studies. Finally, several published clinical trial examples are reanalyzed to demonstrate the benefit of the proposed procedure. Copyright © 2015 John Wiley & Sons, Ltd.
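A minimal Monte Carlo sketch of the simultaneous test in the known-variance case: under vague priors each arm mean has a normal posterior, and the joint posterior probability of non-inferiority (experimental vs. reference with a fixed margin) and assay sensitivity (reference vs. placebo) is estimated by sampling. All numbers are illustrative, not from the paper.

import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0                                   # assumed known common SD
margin = 0.8                                  # pre-specified non-inferiority margin
arms = {"exp": (5.1, 60), "ref": (5.4, 60), "pla": (3.2, 60)}   # (sample mean, n)

draws = {k: rng.normal(m, sigma / np.sqrt(n), 100_000) for k, (m, n) in arms.items()}

non_inferior = draws["exp"] > draws["ref"] - margin
assay_sensitive = draws["ref"] > draws["pla"]
print((non_inferior & assay_sensitive).mean())    # joint posterior probability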
Internal Medicine residents use heuristics to estimate disease probability
Phang, Sen Han; Ravani, Pietro; Schaefer, Jeffrey; Wright, Bruce; McLaughlin, Kevin
2015-01-01
Background Training in Bayesian reasoning may have limited impact on accuracy of probability estimates. In this study, our goal was to explore whether residents previously exposed to Bayesian reasoning use heuristics rather than Bayesian reasoning to estimate disease probabilities. We predicted that if residents use heuristics then post-test probability estimates would be increased by non-discriminating clinical features or a high anchor for a target condition. Method We randomized 55 Internal Medicine residents to different versions of four clinical vignettes and asked them to estimate probabilities of target conditions. We manipulated the clinical data for each vignette to be consistent with either 1) use of the representativeness heuristic, by adding non-discriminating prototypical clinical features of the target condition, or 2) use of the anchoring-with-adjustment heuristic, by providing a high or low anchor for the target condition. Results When presented with additional non-discriminating data the odds of diagnosing the target condition were increased (odds ratio (OR) 2.83, 95% confidence interval [1.30, 6.15], p = 0.009). Similarly, the odds of diagnosing the target condition were increased when a high anchor preceded the vignette (OR 2.04, [1.09, 3.81], p = 0.025). Conclusions Our findings suggest that despite previous exposure to the use of Bayesian reasoning, residents use heuristics, such as the representativeness heuristic and anchoring with adjustment, to estimate probabilities. Potential reasons for attribute substitution include the relative cognitive ease of heuristics vs. Bayesian reasoning, or perhaps residents in their clinical practice use gist traces rather than precise probability estimates when diagnosing. PMID:27004080
Bayesian Estimation of Small Effects in Exercise and Sports Science.
Mengersen, Kerrie L; Drovandi, Christopher C; Robert, Christian P; Pyne, David B; Gore, Christopher J
2016-01-01
The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and in a case study example provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL), and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and was able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a Placebo. The conclusions are consistent with those obtained using a 'magnitude-based inference' approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.
Bayesian Knowledge Fusion in Prognostics and Health Management—A Case Study
NASA Astrophysics Data System (ADS)
Rabiei, Masoud; Modarres, Mohammad; Mohammad-Djafari, Ali
2011-03-01
In the past few years, a research effort has been in progress at the University of Maryland to develop a Bayesian framework based on Physics of Failure (PoF) for risk assessment and fleet management of aging airframes. Despite significant achievements in modelling of crack growth behavior using fracture mechanics, it is still of great interest to find practical techniques for monitoring crack growth instances using nondestructive inspection and to integrate such inspection results with the fracture mechanics models to improve the predictions. The ultimate goal of this effort is to develop an integrated probabilistic framework for utilizing all of the available information to come up with enhanced (less uncertain) predictions for structural health of the aircraft in future missions. Such information includes material level fatigue models and test data, health monitoring measurements and inspection field data. In this paper, a case study of using a Bayesian fusion technique for integrating information from multiple sources in a structural health management problem is presented.
A Bayesian Approach for Sensor Optimisation in Impact Identification
Mallardo, Vincenzo; Sharif Khodaei, Zahra; Aliabadi, Ferri M. H.
2016-01-01
This paper presents a Bayesian approach for optimizing the position of sensors aimed at impact identification in composite structures under operational conditions. The uncertainty in the sensor data has been represented by statistical distributions of the recorded signals. An optimisation strategy based on the genetic algorithm is proposed to find the best sensor combination aimed at locating impacts on composite structures. A Bayesian-based objective function is adopted in the optimisation procedure as an indicator of the performance of meta-models developed for different sensor combinations to locate various impact events. To represent a real structure under operational load and to increase the reliability of the Structural Health Monitoring (SHM) system, the probability of malfunctioning sensors is included in the optimisation. The reliability and the robustness of the procedure are tested with experimental and numerical examples. Finally, the proposed optimisation algorithm is applied to a composite stiffened panel for both the uniform and non-uniform probability of impact occurrence. PMID:28774064
Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures
Moore, Brian R.; Höhna, Sebastian; May, Michael R.; Rannala, Bruce; Huelsenbeck, John P.
2016-01-01
Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM. PMID:27512038
Convergence among cave catfishes: long-branch attraction and a Bayesian relative rates test.
Wilcox, T P; García de León, F J; Hendrickson, D A; Hillis, D M
2004-06-01
Convergence has long been of interest to evolutionary biologists. Cave organisms appear to be ideal candidates for studying convergence in morphological, physiological, and developmental traits. Here we report apparent convergence in two cave-catfishes that were described on morphological grounds as congeners: Prietella phreatophila and Prietella lundbergi. We collected mitochondrial DNA sequence data from 10 species of catfishes, representing five of the seven genera in Ictaluridae, as well as seven species from a broad range of siluriform outgroups. Analysis of the sequence data under parsimony supports a monophyletic Prietella. However, both maximum-likelihood and Bayesian analyses support polyphyly of the genus, with P. lundbergi sister to Ictalurus and P. phreatophila sister to Ameiurus. The topological difference between parsimony and the other methods appears to result from long-branch attraction between the Prietella species. Similarly, the sequence data do not support several other relationships within Ictaluridae supported by morphology. We develop a new Bayesian method for examining variation in molecular rates of evolution across a phylogeny.
Validation of the thermal challenge problem using Bayesian Belief Networks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McFarland, John; Swiler, Laura Painton
The thermal challenge problem has been developed at Sandia National Laboratories as a testbed for demonstrating various types of validation approaches and prediction methods. This report discusses one particular methodology to assess the validity of a computational model given experimental data. This methodology is based on Bayesian Belief Networks (BBNs) and can incorporate uncertainty in experimental measurements, in physical quantities, and model uncertainties. The approach uses the prior and posterior distributions of model output to compute a validation metric based on Bayesian hypothesis testing (a Bayes' factor). This report discusses various aspects of the BBN, specifically in the context of the thermal challenge problem. A BBN is developed for a given set of experimental data in a particular experimental configuration. The development of the BBN and the method for "solving" the BBN to develop the posterior distribution of model output through Markov Chain Monte Carlo sampling is discussed in detail. The use of the BBN to compute a Bayes' factor is demonstrated.
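As a generic illustration of turning prior and posterior distributions of a model-data discrepancy into a Bayes' factor, the sketch below uses the Savage-Dickey density ratio at zero discrepancy, with kernel density estimates over Monte Carlo samples; this is a common construction, not necessarily the exact metric used in the report.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)
prior_disc = rng.normal(0.0, 1.0, 50_000)        # prior on model-data discrepancy
posterior_disc = rng.normal(0.15, 0.30, 50_000)  # e.g., samples from the BBN posterior

bf_01 = gaussian_kde(posterior_disc)(0.0)[0] / gaussian_kde(prior_disc)(0.0)[0]
print(f"Bayes' factor for zero discrepancy: {bf_01:.2f}")   # > 1 favours model validity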
ERIC Educational Resources Information Center
Marmolejo-Ramos, Fernando; Cousineau, Denis
2017-01-01
The number of articles showing dissatisfaction with the null hypothesis statistical testing (NHST) framework has been progressively increasing over the years. Alternatives to NHST have been proposed and the Bayesian approach seems to have achieved the highest amount of visibility. In this last part of the special issue, a few alternative…
NASA Astrophysics Data System (ADS)
Ni, Yanchun; Lu, Xilin; Lu, Wensheng
2017-03-01
The field non-destructive vibration test plays an important role in the area of structural health monitoring. It assists in monitoring the health status and reducing the risk caused by the poor performance of structures. As the most economic field test among the various vibration tests, the ambient vibration test is the most popular and is widely used to assess the physical condition of a structure under operational service. Based on ambient vibration data, modal identification provides an important basis for model updating and damage detection during the service life of a structure. It has been proved that modal identification works well in the investigation of the dynamic performance of different kinds of structures. In this paper, the objective structure is a high-rise multi-function office building. The whole building is composed of seven three-story structural units. Each unit comprises one complete floor and two L-shaped floors to form large spaces along the vertical direction. There are 56 viscous dampers installed in the building to improve the energy dissipation capacity. Due to the special features of the structure, field vibration tests and further modal identification were performed to investigate its dynamic performance. Twenty-nine setups were designed to cover all the degrees of freedom of interest. About two years later, another field test was carried out to measure the building for 48 h to investigate the variance and distribution of the modal parameters. A Fast Bayesian FFT method was employed to perform the modal identification. This Bayesian method not only provides the most probable values of the modal parameters but also assesses the associated posterior uncertainty analytically, which is especially relevant in field vibration tests, where uncertainty arises from measurement noise, sensor alignment error, modelling error, etc. A shaking table test was also implemented, including cases with and without dampers, which assisted in investigating the effect of the dampers. The modal parameters obtained from the different tests were investigated separately and then compared with each other.
Hypothesis Testing as an Act of Rationality
NASA Astrophysics Data System (ADS)
Nearing, Grey
2017-04-01
Statistical hypothesis testing is ad hoc in two ways. First, setting probabilistic rejection criteria is, as Neyman (1957) put it, an act of will rather than an act of rationality. Second, physical theories like conservation laws do not inherently admit probabilistic predictions, and so we must use what are called epistemic bridge principles to connect model predictions with the actual methods of hypothesis testing. In practice, these bridge principles are likelihood functions, error functions, or performance metrics. I propose that the reason we are faced with these problems is that we have historically failed to account for a fundamental component of basic logic - namely the portion of logic that explains how epistemic states evolve in the presence of empirical data. This component of Cox's (1946) calculus of logic is called information theory (Knuth, 2005), and adding information theory to our hypothetico-deductive account of science yields straightforward solutions to both of the above problems. This also yields a straightforward method for dealing with Popper's (1963) problem of verisimilitude by facilitating a quantitative approach to measuring process isomorphism. In practice, this involves data assimilation. Finally, information theory allows us to reliably bound measures of epistemic uncertainty, thereby avoiding the problem of Bayesian incoherency under misspecified priors (Grünwald, 2006). I therefore propose solutions to four of the fundamental problems inherent in both hypothetico-deductive and/or Bayesian hypothesis testing. - Neyman (1957) Inductive Behavior as a Basic Concept of Philosophy of Science. - Cox (1946) Probability, Frequency and Reasonable Expectation. - Knuth (2005) Lattice Duality: The Origin of Probability and Entropy. - Grünwald (2006) Bayesian Inconsistency under Misspecification. - Popper (1963) Conjectures and Refutations: The Growth of Scientific Knowledge.
Emerging Concepts of Data Integration in Pathogen Phylodynamics
Baele, Guy; Suchard, Marc A.; Rambaut, Andrew; Lemey, Philippe
2017-01-01
Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. [Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504
Peterson, M A; de Gelder, B; Rapcsak, S Z; Gerhardstein, P C; Bachoud-Lévi, A
2000-01-01
In three experiments we investigated whether conscious object recognition is necessary or sufficient for effects of object memories on figure assignment. In experiment 1, we examined a brain-damaged participant, AD, whose conscious object recognition is severely impaired. AD's responses about figure assignment do reveal effects from memories of object structure, indicating that conscious object recognition is not necessary for these effects, and identifying the figure-ground test employed here as a new implicit test of access to memories of object structure. In experiments 2 and 3, we tested a second brain-damaged participant, WG, for whom conscious object recognition was relatively spared. Nevertheless, effects from memories of object structure on figure assignment were not evident in WG's responses about figure assignment in experiment 2, indicating that conscious object recognition is not sufficient for effects of object memories on figure assignment. WG's performance sheds light on AD's performance, and has implications for the theoretical understanding of object memory effects on figure assignment.
Bayesian model for matching the radiometric measurements of aerospace and field ocean color sensors.
Salama, Mhd Suhyb; Su, Zhongbo
2010-01-01
A Bayesian model is developed to match aerospace ocean color observations to field measurements and derive the spatial variability of match-up sites. The performance of the model is tested against populations of synthesized spectra and full and reduced resolutions of MERIS data. The model derived the scale difference between a synthesized satellite pixel and point measurements with R² > 0.88 and relative error < 21% in the spectral range from 400 nm to 695 nm. The sub-pixel variabilities of the reduced-resolution MERIS image are derived with relative errors of less than 12% in heterogeneous regions. The method is generic and applicable to different sensors.
NASA Astrophysics Data System (ADS)
Walker, David M.; Allingham, David; Lee, Heung Wing Joseph; Small, Michael
2010-02-01
Small world network models have been effective in capturing the variable behaviour of reported case data of the SARS coronavirus outbreak in Hong Kong during 2003. Simulations of these models have previously been realized using informed “guesses” of the proposed model parameters and tested for consistency with the reported data by surrogate analysis. In this paper we attempt to provide statistically rigorous parameter distributions using Approximate Bayesian Computation sampling methods. We find that such sampling schemes are a useful framework for fitting parameters of stochastic small world network models where simulation of the system is straightforward but expressing a likelihood is cumbersome.
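The fitting strategy described above can be illustrated with a bare-bones ABC rejection sampler: draw candidate parameters from the prior, run the simulator, and keep draws whose simulated summary lands within a tolerance of the observed one. The one-parameter toy simulator below merely stands in for the stochastic small-world network model.

import numpy as np

rng = np.random.default_rng(3)
observed_summary = 120.0                  # e.g., total reported cases

def simulate_summary(p_link):             # placeholder for the network simulation
    return rng.poisson(400 * p_link)

accepted = []
for _ in range(20_000):
    p_link = rng.uniform(0.0, 1.0)        # prior draw
    if abs(simulate_summary(p_link) - observed_summary) <= 10:    # tolerance
        accepted.append(p_link)

print(np.mean(accepted), np.percentile(accepted, [2.5, 97.5]))    # approximate posterior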
Bayesian estimation of self-similarity exponent
NASA Astrophysics Data System (ADS)
Makarava, Natallia; Benmehdi, Sabah; Holschneider, Matthias
2011-08-01
In this study we propose a Bayesian approach to the estimation of the Hurst exponent in terms of linear mixed models. The method is applicable even to unevenly sampled signals and signals with gaps. We test our method using artificial fractional Brownian motion of different lengths and compare it with the detrended fluctuation analysis technique. The estimation of the Hurst exponent of a Rosenblatt process is shown as an example of an H-self-similar process with non-Gaussian distributions. Additionally, we perform an analysis with real data, the Dow-Jones Industrial Average closing values, and analyze the temporal variation of its Hurst exponent.
Partitioning Ocean Wave Spectra Obtained from Radar Observations
NASA Astrophysics Data System (ADS)
Delaye, Lauriane; Vergely, Jean-Luc; Hauser, Daniele; Guitton, Gilles; Mouche, Alexis; Tison, Celine
2016-08-01
2D wave spectra of ocean waves can be partitioned into several wave components to better characterize the scene. We present here two methods of component detection: one based on a watershed algorithm and the other based on a Bayesian approach. We tested both methods on a set of simulated data for SWIM, the Ku-band real aperture radar carried on the CFOSAT (China-France Oceanography Satellite) mission, whose launch is planned for mid-2018. We present the results and the limits of both approaches and show that the Bayesian method can also be applied to other kinds of wave spectrum observations, such as those obtained with the airborne radar wave spectrometer KuROS.
Utility-based designs for randomized comparative trials with categorical outcomes
Murray, Thomas A.; Thall, Peter F.; Yuan, Ying
2016-01-01
A general utility-based testing methodology for design and conduct of randomized comparative clinical trials with categorical outcomes is presented. Numerical utilities of all elementary events are elicited to quantify their desirabilities. These numerical values are used to map the categorical outcome probability vector of each treatment to a mean utility, which is used as a one-dimensional criterion for constructing comparative tests. Bayesian tests are presented, including fixed sample and group sequential procedures, assuming Dirichlet-multinomial models for the priors and likelihoods. Guidelines are provided for establishing priors, eliciting utilities, and specifying hypotheses. Efficient posterior computation is discussed, and algorithms are provided for jointly calibrating test cutoffs and sample size to control overall type I error and achieve specified power. Asymptotic approximations for the power curve are used to initialize the algorithms. The methodology is applied to re-design a completed trial that compared two chemotherapy regimens for chronic lymphocytic leukemia, in which an ordinal efficacy outcome was dichotomized and toxicity was ignored to construct the trial’s design. The Bayesian tests also are illustrated by several types of categorical outcomes arising in common clinical settings. Freely available computer software for implementation is provided. PMID:27189672
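A minimal sketch of the core construction, assuming illustrative utilities and outcome counts: each arm gets a Dirichlet posterior over the categorical outcome probabilities, each posterior draw is mapped to a mean utility, and the posterior probability that one arm's mean utility exceeds the other's is compared with a calibrated cutoff.

import numpy as np

rng = np.random.default_rng(4)
utilities = np.array([100, 60, 30, 0])   # elicited utilities, best to worst outcome
prior = np.ones(4)                       # symmetric Dirichlet prior
counts_a = np.array([18, 9, 5, 3])       # observed outcome counts, arm A
counts_b = np.array([12, 10, 8, 5])      # observed outcome counts, arm B

p_a = rng.dirichlet(prior + counts_a, 50_000)   # posterior draws of probability vectors
p_b = rng.dirichlet(prior + counts_b, 50_000)
mean_u_a = p_a @ utilities                      # map each draw to a mean utility
mean_u_b = p_b @ utilities
print((mean_u_a > mean_u_b).mean())             # compare with the calibrated test cutoff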
Uncertainty Estimates of Psychoacoustic Thresholds Obtained from Group Tests
NASA Technical Reports Server (NTRS)
Rathsam, Jonathan; Christian, Andrew
2016-01-01
Adaptive psychoacoustic test methods, in which the next signal level depends on the response to the previous signal, are the most efficient for determining psychoacoustic thresholds of individual subjects. In many tests conducted in the NASA psychoacoustic labs, the goal is to determine thresholds representative of the general population. To do this economically, non-adaptive testing methods are used in which three or four subjects are tested at the same time with predetermined signal levels. This approach requires us to identify techniques for assessing the uncertainty in the resulting group-average psychoacoustic thresholds. In this presentation we examine the Delta Method and the Nonparametric Bootstrap (frequentist methods), the Generalized Linear Model (GLM), and Markov Chain Monte Carlo Posterior Estimation (a Bayesian approach). Each technique is exercised on a manufactured, theoretical dataset and then on datasets from two psychoacoustics facilities at NASA. The Delta Method is the simplest to implement and accurate for the cases studied. The GLM is found to be the least robust, and the Bootstrap takes the longest to calculate. The Bayesian Posterior Estimate is the most versatile technique examined because it allows the inclusion of prior information.
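Of the techniques compared, the Nonparametric Bootstrap is the simplest to sketch: resample the subjects' thresholds with replacement and read a percentile interval off the distribution of resampled means. The threshold values below are illustrative.

import numpy as np

rng = np.random.default_rng(5)
thresholds = np.array([41.2, 38.7, 44.1, 40.3, 39.8, 42.6, 37.9, 43.4])   # dB, per subject

boot_means = np.array([
    rng.choice(thresholds, size=thresholds.size, replace=True).mean()
    for _ in range(10_000)
])
print(boot_means.mean(), np.percentile(boot_means, [2.5, 97.5]))   # group threshold and CI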
Incorporating uncertainty into medical decision making: an approach to unexpected test results.
Bianchi, Matt T; Alexander, Brian M; Cash, Sydney S
2009-01-01
The utility of diagnostic tests derives from the ability to translate the population concepts of sensitivity and specificity into information that will be useful for the individual patient: the predictive value of the result. As the array of available diagnostic testing broadens, there is a temptation to de-emphasize history and physical findings and defer to the objective rigor of technology. However, diagnostic test interpretation is not always straightforward. One significant barrier to routine use of probability-based test interpretation is the uncertainty inherent in pretest probability estimation, the critical first step of Bayesian reasoning. The context in which this uncertainty presents the greatest challenge is when test results oppose clinical judgment. It is in this situation that decision support would be most helpful. The authors propose a simple graphical approach that incorporates uncertainty in pretest probability and has specific application to the interpretation of unexpected results. This method quantitatively demonstrates how uncertainty in disease probability may be amplified when test results are unexpected (opposing clinical judgment), even for tests with high sensitivity and specificity. The authors provide a simple nomogram for determining whether an unexpected test result suggests that one should "switch diagnostic sides." This graphical framework overcomes the limitation of pretest probability uncertainty in Bayesian analysis and guides decision making when it is most challenging: interpretation of unexpected test results.
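The underlying calculation can be sketched by propagating a range of pretest probabilities through Bayes' rule for both a positive and a negative result, which makes visible how much an unexpected result can move (or fail to move) the diagnosis; the sensitivity and specificity values are illustrative.

def posttest(pretest, sens, spec, positive=True):
    # probability of disease after a positive vs. a negative test result
    if positive:
        return (pretest * sens) / (pretest * sens + (1 - pretest) * (1 - spec))
    return (pretest * (1 - sens)) / (pretest * (1 - sens) + (1 - pretest) * spec)

sens, spec = 0.95, 0.90
for pretest in (0.05, 0.20, 0.50):        # interval of uncertain clinical judgment
    print(pretest,
          round(posttest(pretest, sens, spec, positive=True), 3),
          round(posttest(pretest, sens, spec, positive=False), 3))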
Bayesian Forecasting Tool to Predict the Need for Antidote in Acute Acetaminophen Overdose.
Desrochers, Julie; Wojciechowski, Jessica; Klein-Schwartz, Wendy; Gobburu, Jogarao V S; Gopalakrishnan, Mathangi
2017-08-01
Acetaminophen (APAP) overdose is the leading cause of acute liver injury in the United States. Patients with elevated plasma acetaminophen concentrations (PACs) require hepatoprotective treatment with N-acetylcysteine (NAC). These patients have been primarily risk-stratified using the Rumack-Matthew nomogram. Previous studies of acute APAP overdoses found that the nomogram failed to accurately predict the need for the antidote. The objectives of this study were to develop a population pharmacokinetic (PK) model for APAP following acute overdose and evaluate the utility of population PK model-based Bayesian forecasting in NAC administration decisions. Limited APAP concentrations from a retrospective cohort of acutely overdosed subjects from the Maryland Poison Center were used to develop the population PK model and to investigate the effect of type of APAP products and other prognostic factors. The externally validated population PK model was used as a prior for Bayesian forecasting to predict the individual PK profile when one or two observed PACs were available. The Bayesian-forecasted APAP concentration-time profiles inferred from one (first) or two (first and second) PAC observations were also tested for their ability to predict the observed NAC decisions. A one-compartment model with first-order absorption and elimination adequately described the data, with single activated charcoal and APAP products as significant covariates on absorption and bioavailability. The Bayesian-forecasted individual concentration-time profiles had acceptable bias (6.2% and 9.8%) and accuracy (40.5% and 41.9%) when either one or two PACs were considered, respectively. The sensitivity and negative predictive value of the Bayesian-forecasted NAC decisions using one PAC were 84% and 92.6%, respectively. The population PK analysis provided a platform for acceptably predicting an individual's concentration-time profile following acute APAP overdose with at least one PAC and the individual's covariate profile, and can potentially be used for making early NAC administration decisions. © 2017 Pharmacotherapy Publications, Inc.
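A minimal sketch of the forecasting step under stated assumptions: a one-compartment model with first-order absorption, lognormal priors on clearance and volume centred at population values, and a MAP update from a single observed concentration. All population values, the dose, and the observation are illustrative, not the published model.

import numpy as np
from scipy.optimize import minimize

dose, F, ka = 10_000.0, 0.9, 1.2          # mg, bioavailability, absorption rate (1/h)
pop_cl, pop_v = 20.0, 50.0                # population clearance (L/h) and volume (L)
omega_cl, omega_v = 0.3, 0.25             # lognormal prior SDs
sigma = 5.0                               # residual SD (mg/L)

def conc(t, cl, v):
    ke = cl / v
    return F * dose * ka / (v * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

t_obs, c_obs = 4.0, 120.0                 # one observed PAC (h, mg/L)

def neg_log_post(x):                      # negative log posterior (MAP objective)
    log_cl, log_v = x
    pred = conc(t_obs, np.exp(log_cl), np.exp(log_v))
    loglik = -0.5 * ((c_obs - pred) / sigma) ** 2
    logprior = (-0.5 * ((log_cl - np.log(pop_cl)) / omega_cl) ** 2
                - 0.5 * ((log_v - np.log(pop_v)) / omega_v) ** 2)
    return -(loglik + logprior)

fit = minimize(neg_log_post, x0=[np.log(pop_cl), np.log(pop_v)])
cl_i, v_i = np.exp(fit.x)
print(conc(12.0, cl_i, v_i))              # forecast the concentration at 12 h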
Prediction of new bioactive molecules using a Bayesian belief network.
Abdo, Ammar; Leclère, Valérie; Jacques, Philippe; Salim, Naomie; Pupin, Maude
2014-01-27
Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However, identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using a Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on another side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The most similar activity is assigned to the unknown compound. We applied this new approach on eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for highly diverse data sets to 99% for low-diversity ones) with a short calculation time. Experiments also showed that BBNC is particularly effective for homogeneous data sets but has been found to perform less well with structurally heterogeneous sets. However, it is important to stress that we believe that using several approaches whenever possible for activity prediction can often give a broader understanding of the data than using only one approach alone. Thus, BBNC is a useful addition to the computational chemist's toolbox.
Signatures of selection in five Italian cattle breeds detected by a 54K SNP panel.
Mancini, Giordano; Gargani, Maria; Chillemi, Giovanni; Nicolazzi, Ezequiel Luis; Marsan, Paolo Ajmone; Valentini, Alessio; Pariset, Lorraine
2014-02-01
In this study we used a medium density panel of SNP markers to perform population genetic analysis in five Italian cattle breeds. The BovineSNP50 BeadChip was used to genotype a total of 2,935 bulls of the Piedmontese, Marchigiana, Italian Holstein, Italian Brown and Italian Pezzata Rossa breeds. To determine a genome-wide pattern of positive selection we mapped the FST values against genome location. The highest FST peaks were obtained on BTA6 and BTA13, where some candidate genes are located. We identified selection signatures peculiar to each breed which suggest selection for genes involved in milk or meat traits. The genetic structure was investigated by using multidimensional scaling of the genetic distance matrix and a Bayesian approach implemented in the STRUCTURE software. The genotyping data showed a clear partitioning of the cattle genetic diversity into distinct breeds if a number of clusters equal to the number of populations was given. Assuming a lower number of clusters, the beef breeds group together. Both methods showed all five breeds separated in well-defined clusters, and the Bayesian approach assigned individuals to the breed of origin. The work is of interest not only because it enriches the knowledge on the process of evolution but also because the results generated could have implications for selective breeding programs.
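The scan step can be sketched with Wright's FST computed per SNP from two breeds' allele frequencies (ignoring the sample-size corrections used in practice); peaks in this statistic are the candidate selection signatures. The frequencies below are illustrative.

import numpy as np

p1 = np.array([0.10, 0.55, 0.80, 0.32])   # allele frequencies, breed 1
p2 = np.array([0.12, 0.20, 0.15, 0.30])   # allele frequencies, breed 2

p_bar = (p1 + p2) / 2
h_t = 2 * p_bar * (1 - p_bar)                        # total expected heterozygosity
h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2    # mean within-breed heterozygosity
fst = (h_t - h_s) / h_t
print(fst.round(3), int(fst.argmax()))               # per-SNP FST and the top peak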
NASA Astrophysics Data System (ADS)
Arabzadeh, Vida; Niaki, S. T. A.; Arabzadeh, Vahid
2017-10-01
One of the most important processes in the early stages of construction projects is to estimate the cost involved. This process involves a wide range of uncertainties, which make it a challenging task. Because of unknown issues, using the experience of experts or looking for similar cases are the conventional methods to deal with cost estimation. The current study presents data-driven methods for cost estimation based on the application of artificial neural network (ANN) and regression models. The learning algorithms of the ANN are Levenberg-Marquardt and Bayesian regularization. Moreover, the regression models are hybridized with a genetic algorithm to obtain better estimates of the coefficients. The methods are applied in a real case, where the input parameters of the models are assigned based on the key issues involved in a spherical tank construction. The results reveal that a high correlation exists between the estimated cost and the real cost, and that both ANNs perform better than the hybridized regression models. In addition, the ANN with the Levenberg-Marquardt learning algorithm (LMNN) obtains a better estimation than the ANN with the Bayesian regularization learning algorithm (BRNN). The correlation between real data and estimated values is over 90%, while the mean square error is around 0.4. The proposed LMNN model can be effective in reducing uncertainty and complexity in the early stages of a construction project.
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R2 of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank
2018-01-01
In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis. PMID:29740363
Determining open cluster membership. A Bayesian framework for quantitative member classification
NASA Astrophysics Data System (ADS)
Stott, Jonathan J.
2018-01-01
Aims: My goal is to develop a quantitative algorithm for assessing open cluster membership probabilities. The algorithm is designed to work with single-epoch observations. In its simplest form, only one set of program images and one set of reference images are required. Methods: The algorithm is based on a two-stage joint astrometric and photometric assessment of cluster membership probabilities. The probabilities were computed within a Bayesian framework using any available prior information. Where possible, the algorithm emphasizes simplicity over mathematical sophistication. Results: The algorithm was implemented and tested against three observational fields using published survey data. M 67 and NGC 654 were selected as cluster examples while a third, cluster-free, field was used for the final test data set. The algorithm shows good quantitative agreement with the existing surveys and has a false-positive rate significantly lower than the astrometric or photometric methods used individually.
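A minimal sketch of the two-stage idea: a two-component (cluster plus field) mixture in which astrometric and photometric likelihoods multiply and a prior fixes the expected member fraction. The Gaussian component densities and all parameter values are illustrative stand-ins for the paper's models.

import numpy as np
from scipy.stats import norm

prior_cluster = 0.3                        # assumed prior member fraction

def membership_probability(pm, color_offset):
    # astrometric term: proper motion (mas/yr) relative to the cluster mean
    l_ast_c = norm.pdf(pm, loc=-4.0, scale=0.5)
    l_ast_f = norm.pdf(pm, loc=0.0, scale=5.0)
    # photometric term: offset from the cluster sequence (mag)
    l_pho_c = norm.pdf(color_offset, loc=0.0, scale=0.05)
    l_pho_f = norm.pdf(color_offset, loc=0.0, scale=0.5)
    num = prior_cluster * l_ast_c * l_pho_c
    den = num + (1 - prior_cluster) * l_ast_f * l_pho_f
    return num / den

print(membership_probability(pm=-3.8, color_offset=0.02))   # likely member
print(membership_probability(pm=2.5, color_offset=0.30))    # likely field star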
The development of a probabilistic approach to forecast coastal change
Lentz, Erika E.; Hapke, Cheryl J.; Rosati, Julie D.; Wang, Ping; Roberts, Tiffany M.
2011-01-01
This study demonstrates the applicability of a Bayesian probabilistic model as an effective tool in predicting post-storm beach changes along sandy coastlines. Volume change and net shoreline movement are modeled for two study sites at Fire Island, New York in response to two extratropical storms in 2007 and 2009. Both study areas include modified areas adjacent to unmodified areas in morphologically different segments of coast. Predicted outcomes are evaluated against observed changes to test model accuracy and uncertainty along 163 cross-shore transects. Results show strong agreement in the cross validation of predictions vs. observations, with 70-82% accuracies reported. Although no consistent spatial pattern in inaccurate predictions could be determined, the highest prediction uncertainties appeared in locations that had been recently replenished. Further testing and model refinement are needed; however, these initial results show that Bayesian networks have the potential to serve as important decision-support tools in forecasting coastal change.
Model Diagnostics for Bayesian Networks
ERIC Educational Resources Information Center
Sinharay, Sandip
2006-01-01
Bayesian networks are frequently used in educational assessments primarily for learning about students' knowledge and skills. There is a lack of work on assessing the fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess the fit of simple Bayesian networks. A…
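Posterior predictive model checking itself is easy to illustrate outside the Bayesian-network setting; in the conjugate beta-binomial sketch below, parameters drawn from the posterior generate replicated data sets, and a discrepancy statistic compared across replicated and observed data yields a posterior predictive p-value.

import numpy as np

rng = np.random.default_rng(6)
y_obs = rng.binomial(1, 0.7, size=50)            # observed binary responses

post_a = 1 + y_obs.sum()                         # Beta(1, 1) prior, conjugate update
post_b = 1 + (y_obs == 0).sum()

stat = lambda y: y.sum()                         # discrepancy statistic
reps = []
for _ in range(5_000):
    theta = rng.beta(post_a, post_b)             # posterior draw
    y_rep = rng.binomial(1, theta, size=y_obs.size)
    reps.append(stat(y_rep))

ppp = np.mean(np.array(reps) >= stat(y_obs))     # values near 0 or 1 signal misfit
print(ppp)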
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B; Neyer, Franz J; van Aken, Marcel AG
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are introduced using a simplified example. Thereafter, the advantages and pitfalls of the specification of prior knowledge are discussed. To illustrate Bayesian methods explained in this study, in a second example a series of studies that examine the theoretical framework of dynamic interactionism are considered. In the Discussion the advantages and disadvantages of using Bayesian statistics are reviewed, and guidelines on how to report on Bayesian statistics are provided. PMID:24116396
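One of the ingredients discussed above, the influence of the prior, can be shown in a few lines: the same data combined with different beta priors give visibly different posteriors. The data (17 successes in 20 trials) and the priors are illustrative.

from scipy.stats import beta

successes, failures = 17, 3
priors = {"flat Beta(1,1)": (1, 1),
          "sceptical Beta(5,5)": (5, 5),
          "pessimistic Beta(2,8)": (2, 8)}

for name, (a, b) in priors.items():
    post = beta(a + successes, b + failures)     # conjugate beta-binomial update
    lo, hi = post.interval(0.95)
    print(f"{name}: mean {post.mean():.2f}, 95% interval [{lo:.2f}, {hi:.2f}]")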
Community Detection Algorithm Combining Stochastic Block Model and Attribute Data Clustering
NASA Astrophysics Data System (ADS)
Kataoka, Shun; Kobayashi, Takuto; Yasuda, Muneki; Tanaka, Kazuyuki
2016-11-01
We propose a new algorithm to detect the community structure in a network that utilizes both the network structure and vertex attribute data. Suppose we have the network structure together with the vertex attribute data, that is, the information assigned to each vertex associated with the community to which it belongs. The problem addressed in this paper is the detection of the community structure from the information of both the network structure and the vertex attribute data. Our approach is based on a Bayesian approach that models the posterior probability distribution of the community labels. The detection of the community structure in our method is achieved by using belief propagation and an EM algorithm. We numerically verified the performance of our method using computer-generated networks and real-world networks.
Towards an Artificial Space Object Taxonomy
NASA Astrophysics Data System (ADS)
Wilkins, M.; Schumacher, P.; Jah, M.; Pfeffer, A.
2013-09-01
Object recognition is the first step in positively identifying a resident space object (RSO), i.e. assigning an RSO to a category such as GPS satellite or space debris. Object identification is the process of deciding that two RSOs are in fact one and the same. Provided we have appropriately defined a satellite taxonomy that allows us to place a given RSO into a particular class of object without any ambiguity, one can assess the probability of assignment to a particular class by determining how well the object satisfies the unique criteria of belonging to that class. Ultimately, tree-based taxonomies delineate unique signatures by defining the minimum amount of information required to positively identify an RSO. Therefore, taxonomic trees can be used to depict hypotheses in a Bayesian object recognition and identification process. This work describes a new RSO taxonomy along with specific reasoning behind the choice of groupings. An alternative taxonomy was recently presented at the Sixth Conference on Space Debris in Darmstadt, Germany [1]. The best example of a taxonomy that enjoys almost universal scientific acceptance is the classical Linnaean biological taxonomy. A strength of Linnaean taxonomy is that it can be used to organize the different kinds of living organisms, simply and practically. Every species can be given a unique name. This uniqueness and stability are a result of the acceptance by biologists specializing in taxonomy, not merely of the binomial names themselves. Fundamentally, the taxonomy is governed by rules for the use of these names, and these are laid down in formal Nomenclature Codes. We seek to provide a similar formal nomenclature system for RSOs through a defined tree-based taxonomy structure. Each categorization at any level, beginning with the most general or inclusive, is called a taxon. Taxon names are defined by a type, which can be a specimen or a taxon of lower rank, and a diagnosis, a statement intended to supply characters that differentiate the taxon from others with which it is likely to be confused. Each taxon will have a set of uniquely distinguishing features that will allow one to place a given object into a specific group without any ambiguity. When a new object does not fall into a specific taxon that is already defined, the entire tree structure will need to be evaluated to determine if a new taxon should be created. Ultimately, an online learning process to facilitate tree growth would be desirable. One can assess the probability of assignment to a particular taxon by determining how well the object satisfies the unique criteria of belonging to that taxon. Therefore, we can use taxonomic trees in a Bayesian process to assign prior probabilities to each of our object recognition and identification hypotheses. We will show that this taxonomy is robust by demonstrating specific stressing classification examples. We will also demonstrate how to implement this taxonomy in Figaro, an open source probabilistic programming language.
BiomeNet: A Bayesian Model for Inference of Metabolic Divergence among Microbial Communities
Chipman, Hugh; Gu, Hong; Bielawski, Joseph P.
2014-01-01
Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic subnetworks among microbial communities. To infer the structure of community-level metabolic interactions, BiomeNet applies a mixed-membership modelling framework to enzyme abundance information. The basic idea is that the mixture components of the model (metabolic reactions, subnetworks, and networks) are shared across all groups (microbiome samples), but the mixture proportions vary from group to group; through this framework, the model can capture nested structures within the data. BiomeNet is unique in modelling each metagenome sample as a mixture of complex metabolic systems (metabosystems), which are themselves composed of mixtures of tightly connected metabolic subnetworks. BiomeNet differs from other unsupervised methods by allowing researchers to discriminate groups of samples through the metabolic patterns it discovers in the data, and by providing a framework for interpreting them. We describe a collapsed Gibbs sampler for inference of the mixture weights under BiomeNet, and we use simulation to validate the inference algorithm. Application of BiomeNet to human gut metagenomes revealed a metabosystem with greater prevalence among inflammatory bowel disease (IBD) patients. Based on the discriminatory subnetworks for this metabosystem, we inferred that the community is likely to be closely associated with the human gut epithelium, resistant to dietary interventions, and liable to interfere with human uptake of an antioxidant connected to IBD. Because this metabosystem has a greater capacity to exploit host-associated glycans, we speculate that IBD-associated communities might arise from opportunist growth of bacteria that can circumvent the host's nutrient-based mechanism for bacterial partner selection. PMID:25412107
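BiomeNet's full three-level model is not reproduced here; as a stripped-down analogue of its collapsed Gibbs sampler, the sketch below runs an LDA-style sampler in which reactions play the role of words, subnetworks of topics, and samples of documents. All function and variable names are ours.

```python
import numpy as np

def collapsed_gibbs(docs, V, K, alpha=0.1, beta=0.1, n_iter=200, seed=0):
    """LDA-style collapsed Gibbs sampler.

    docs : list of token lists (reaction indices per microbiome sample)
    V    : number of distinct reactions
    K    : number of latent subnetworks ("topics")
    Returns per-sample mixture weights over subnetworks.
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    ndk = np.zeros((D, K))            # sample-subnetwork counts
    nkv = np.zeros((K, V))            # subnetwork-reaction counts
    nk = np.zeros(K)
    z = [rng.integers(K, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):
        for i, v in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkv[k, v] += 1; nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, v in enumerate(doc):
                k = z[d][i]
                ndk[d, k] -= 1; nkv[k, v] -= 1; nk[k] -= 1
                # full conditional for the latent subnetwork of this token
                p = (ndk[d] + alpha) * (nkv[:, v] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkv[k, v] += 1; nk[k] += 1
    theta = (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)
    return theta
```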
Bayesian decision support for coding occupational injury data.
Nanda, Gaurav; Grattan, Kathleen M; Chu, MyDzung T; Davis, Letitia K; Lehto, Mark R
2016-06-01
Studies on autocoding injury data have found that machine learning algorithms perform well for categories that occur frequently but often struggle with rare categories. Therefore, manual coding, although resource-intensive, cannot be eliminated. We propose a Bayesian decision support system to autocode a large portion of the data, filter cases for manual review, and assist human coders by presenting them with the top k prediction choices and a confusion matrix of predictions from Bayesian models. We studied the prediction performance of Single-Word (SW) and Two-Word-Sequence (TW) Naïve Bayes models on a sample of data from the 2011 Survey of Occupational Injury and Illness (SOII). We used the agreement between the prediction results of the SW and TW models, together with various prediction strength thresholds, for autocoding and for filtering cases for manual review. We also studied the sensitivity of the top k predictions of the SW model, the TW model, and the SW-TW combination, and then compared the accuracy of the manually assigned codes in SOII data with that of the proposed system. The accuracy of the proposed system, assuming well-trained coders review a subset of only 26% of cases flagged for review, was estimated to be comparable (86.5%) to the accuracy of the original coding of the data set (range: 73%-86.8%). Overall, the TW model had higher sensitivity than the SW model, and the accuracy of the prediction results increased when the two models agreed and at higher prediction strength thresholds. The sensitivity of the top five predictions was 93%. The proposed system seems promising for coding injury data as it offers comparable accuracy with less manual coding. Accurate and timely coded occupational injury data are useful for surveillance as well as for prevention activities that aim to make workplaces safer.
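A minimal sketch of the SW/TW agreement-plus-threshold routing described above, using scikit-learn; the training arrays (`train_narratives`, `train_codes`) and the 0.9 threshold are placeholders, not values from the paper.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Single-word (unigram) and two-word-sequence (bigram) Naive Bayes models.
sw = make_pipeline(CountVectorizer(ngram_range=(1, 1)), MultinomialNB())
tw = make_pipeline(CountVectorizer(ngram_range=(2, 2)), MultinomialNB())
sw.fit(train_narratives, train_codes)   # assumed injury-narrative training data
tw.fit(train_narratives, train_codes)

def route(narrative, threshold=0.9, k=5):
    """Autocode when the models agree with strength; otherwise flag for review."""
    p_sw = sw.predict_proba([narrative])[0]
    p_tw = tw.predict_proba([narrative])[0]
    best_sw = sw.classes_[p_sw.argmax()]
    best_tw = tw.classes_[p_tw.argmax()]
    if best_sw == best_tw and max(p_sw.max(), p_tw.max()) >= threshold:
        return ("autocode", best_sw)
    top_k = sw.classes_[np.argsort(p_sw)[::-1][:k]]   # choices shown to the coder
    return ("manual_review", list(top_k))
```

Cases where the two models disagree, or where neither is confident, go to a human coder along with the top-k choices, mirroring the division of labour the paper proposes.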
Bayesian Hypothesis Testing for Psychologists: A Tutorial on the Savage-Dickey Method
ERIC Educational Resources Information Center
Wagenmakers, Eric-Jan; Lodewyckx, Tom; Kuriyal, Himanshu; Grasman, Raoul
2010-01-01
In the field of cognitive psychology, the "p"-value hypothesis test has established a stranglehold on statistical reporting. This is unfortunate, as the "p"-value provides at best a rough estimate of the evidence that the data provide for the presence of an experimental effect. An alternative and arguably more appropriate measure of evidence is…
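The abstract is truncated above; for concreteness, the Savage-Dickey ratio named in the title reduces, for a point null nested within a continuous prior, to the posterior density divided by the prior density at the null value. A minimal worked example for a binomial rate, with our own numbers rather than the tutorial's:

```python
from scipy.stats import beta

# H0: theta = 0.5 versus H1: theta ~ Beta(1, 1); data: k successes in n trials.
k, n = 62, 100
a, b = 1.0, 1.0                              # prior hyperparameters
prior_at_null = beta.pdf(0.5, a, b)
posterior_at_null = beta.pdf(0.5, a + k, b + n - k)   # conjugate update
bf01 = posterior_at_null / prior_at_null     # Savage-Dickey density ratio
# bf01 < 1 means the data move mass away from theta = 0.5 (evidence for H1).
```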
A Bayesian Hierarchical Selection Model for Academic Growth with Missing Data
ERIC Educational Resources Information Center
Allen, Jeff
2017-01-01
Using a sample of schools testing annually in grades 9-11 with a vertically linked series of assessments, a latent growth curve model is used to model test scores with student intercepts and slopes nested within school. Missed assessments can occur because of student mobility, student dropout, absenteeism, and other reasons. Missing data…
On the predictive information criteria for model determination in seismic hazard analysis
NASA Astrophysics Data System (ADS)
Varini, Elisa; Rotondi, Renata
2016-04-01
Many statistical tools have been developed for evaluating, understanding, and comparing models, from both frequentist and Bayesian perspectives. In particular, the problem of model selection can be addressed according to whether the primary goal is explanation or, alternatively, prediction. In the former case, the criteria for model selection are defined over the parameter space, whose physical interpretation can be difficult; in the latter case, they are defined over the space of the observations, which has a more direct physical meaning. Frequentist approaches to model selection are generally based on asymptotic approximations that may be poor for small data sets (e.g. the F-test, the Kolmogorov-Smirnov test, etc.); moreover, these methods often apply only under specific assumptions on the models (e.g. models have to be nested in the likelihood ratio test). In the Bayesian context, among the criteria for explanation, the ratio of the observed marginal densities for two competing models, known as the Bayes factor (BF), is commonly used for both model choice and model averaging (Kass and Raftery, J. Am. Stat. Ass., 1995). However, the BF does not apply to improper priors and, even when the prior is proper, it is not robust to the specification of the prior. These limitations extend to two well-known penalized-likelihood methods, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), since both can be shown to approximate -2 log BF. From the perspective that a model is only as good as its predictions, the predictive information criteria aim at evaluating the predictive accuracy of Bayesian models or, in other words, at estimating the expected out-of-sample prediction error using a bias-correction adjustment of the within-sample error (Gelman et al., Stat. Comput., 2014). In particular, the Watanabe criterion is fully Bayesian because it averages the predictive distribution over the posterior distribution of the parameters rather than conditioning on a point estimate, but it is hardly applicable to data that are not conditionally independent given the parameters (Watanabe, J. Mach. Learn. Res., 2010). A solution is given by the Ando and Tsay criterion, in which the joint density may be decomposed into a product of conditional densities (Ando and Tsay, Int. J. Forecast., 2010). The above-mentioned criteria are global summary measures of model performance, but more detailed analysis may be required to discover the reasons for poor global performance; in that case, a retrospective predictive analysis is performed on each individual observation. In this study we perform a Bayesian analysis of Italian data sets with four versions of a long-term hazard model known as the stress release model (Vere-Jones, J. Physics Earth, 1978; Bebbington and Harte, Geophys. J. Int., 2003; Varini and Rotondi, Environ. Ecol. Stat., 2015), and we illustrate their performance as evaluated by the Bayes factor, the predictive information criteria, and retrospective predictive analysis.
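As a concrete anchor for the Watanabe criterion (WAIC) mentioned above, here is a minimal implementation from an S-by-n matrix of pointwise log-likelihoods over posterior draws, following the Gelman et al. formulation cited in the abstract; the array name and the deviance scaling are our choices.

```python
import numpy as np

def waic(log_lik):
    """log_lik : (S, n) pointwise log-likelihoods for S posterior draws."""
    S = log_lik.shape[0]
    # log pointwise predictive density, computed stably via log-sum-exp
    lppd = np.sum(np.logaddexp.reduce(log_lik, axis=0) - np.log(S))
    # effective number of parameters: pointwise posterior variance of log-lik
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return -2 * (lppd - p_waic)      # on the deviance scale; lower is better
```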
Biasing Influences on Test Level Assignments for Hearing Impaired Students.
ERIC Educational Resources Information Center
Wolk, Steve
1985-01-01
Possible biasing influences of student characteristics were considered for teachers' judgments of appropriate test level assignments for about 1,300 hearing impaired special education students. Analyses indicated the presence of strong influences of race and severity of handicapping condition, as well as of sex, upon change in level assignments,…
Understanding Test-Type Assignment: Why Do Special Educators Make Unexpected Test-Type Assignments?
ERIC Educational Resources Information Center
Cho, Hyun-Jeong; Kingston, Neal
2014-01-01
We interviewed special educators (a) whose students with disabilities (SWDs) were proficient on the 2008 general education assessment but were assigned to the 2009 alternate assessment based on modified achievement standards (AA-MAS), and (b) whose students with mild disabilities took the 2008 alternate assessment based on alternate achievement…
12 CFR 563e.28 - Assigned ratings.
Code of Federal Regulations, 2010 CFR
2010-01-01
12 CFR 563e.28, Banks and Banking (2010): § 563e.28 Assigned ratings. (a) Ratings in general. Subject to paragraphs (b)…: performance under the lending, investment, and service tests, the community development test, the small savings…
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
ERIC Educational Resources Information Center
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B.; Neyer, Franz J.; van Aken, Marcel A. G.
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are…
Bartlett, Jonathan W; Keogh, Ruth H
2018-06-01
Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
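To make the contrast with regression calibration concrete, the simulation below shows the naive estimate's attenuation under classical measurement error and its correction by regression calibration; a full Bayesian fit would add priors and MCMC and is not shown. All numbers are illustrative, and the reliability ratio is treated as known purely for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta1 = 5000, 2.0
x = rng.normal(0, 1, n)                    # true covariate (unobserved in practice)
w = x + rng.normal(0, 0.8, n)              # error-prone measurement
y = 1.0 + beta1 * x + rng.normal(0, 1, n)

naive = np.polyfit(w, y, 1)[0]             # slope attenuated toward zero
# Regression calibration: replace w with E[x | w], here with the error
# variance (and hence the reliability ratio) assumed known.
lam = np.var(x) / (np.var(x) + 0.8**2)
x_hat = np.mean(w) + lam * (w - np.mean(w))
calibrated = np.polyfit(x_hat, y, 1)[0]    # approximately recovers beta1
```

With these settings the naive slope comes out near 1.2 against a true value of 2.0, while the calibrated slope lands close to 2.0.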
A Bayesian Model for the Prediction and Early Diagnosis of Alzheimer's Disease.
Alexiou, Athanasios; Mantzavinos, Vasileios D; Greig, Nigel H; Kamal, Mohammad A
2017-01-01
Alzheimer's disease treatment is still an open problem. The diversity of symptoms, the alterations in common pathophysiology, the existence of asymptomatic cases, the different types of sporadic and familial Alzheimer's, and their relation to other types of dementia and to comorbidities have created a climate of myth and fear around the leading disease of the twenty-first century. Many recent failed clinical trials and novel medications have pointed to early diagnosis as the most critical element of treatment, even though scientists have tested the amyloid hypothesis and a few related drugs. Unfortunately, recent studies indicate that the disease begins at very young ages, making it difficult to determine the right time for treatment. Taking into consideration all these multivariate aspects and unreliable factors working against an appropriate treatment, we focused our research on a non-classical statistical evaluation of the best-known and most widely accepted Alzheimer's biomarkers. In this paper, therefore, we report the code and a few experimental results of a computational Bayesian tool dedicated to the correlation and assessment of several Alzheimer's biomarkers, which exports a probabilistic medical prognostic process. This statistical software runs in the Bayesian software WinBUGS and is based on the latest Alzheimer's classification and on the known relative probabilities of the various biomarkers, correlated with Alzheimer's progression through a set of discrete distributions. A user-friendly web page has been implemented to support medical doctors and researchers, who can upload Alzheimer's test results and receive statistics on the development or presence of Alzheimer's disease due to abnormal results in one or more biomarkers.
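The paper's WinBUGS code is not reproduced here; as a toy stand-in for the kind of discrete-distribution prognosis it describes, the sketch below chains invented per-biomarker probabilities through Bayes' rule. Every probability and biomarker label is made up for illustration.

```python
# Invented numbers: prevalence and per-biomarker abnormality rates.
prior_ad = 0.10                              # assumed P(AD)
# (P(abnormal | AD), P(abnormal | no AD)) for each hypothetical biomarker
rates = {"amyloid_PET":          (0.90, 0.15),
         "CSF_tau":              (0.80, 0.20),
         "hippocampal_atrophy":  (0.70, 0.25)}

def prognosis(abnormal):
    """abnormal: dict biomarker -> bool test result; returns P(AD | results)."""
    odds = prior_ad / (1 - prior_ad)
    for marker, (p_ad, p_no) in rates.items():
        if abnormal[marker]:
            odds *= p_ad / p_no              # likelihood ratio for an abnormal result
        else:
            odds *= (1 - p_ad) / (1 - p_no)  # likelihood ratio for a normal result
    return odds / (1 + odds)

print(prognosis({"amyloid_PET": True, "CSF_tau": True,
                 "hippocampal_atrophy": False}))   # about 0.52 with these numbers
```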
Pfeiffer, John M.; Johnson, Nathan A.; Randklev, Charles R.; Howells, Robert G.; Williams, James D.
2016-01-01
The Central Texas endemic freshwater mussel, Quadrula mitchelli (Simpson in Dall, 1896), had been presumed extinct until relict populations were recently rediscovered. To help guide ongoing and future conservation efforts focused on Q. mitchelli, we set out to resolve several uncertainties regarding its evolutionary history, specifically its unknown generic position and untested species boundaries. We designed a molecular matrix consisting of two loci (cytochrome c oxidase subunit I and internal transcribed spacer I) and 57 terminal taxa to test the generic position of Q. mitchelli using Bayesian inference and maximum likelihood phylogenetic reconstruction. We also employed two Bayesian species validation methods to test five a priori species models (i.e. hypotheses of species delimitation). Our study is the first to test the generic position of Q. mitchelli, and we found robust support for its inclusion in the genus Fusconaia. Accordingly, we introduce the binomial Fusconaia mitchelli comb. nov. to accurately represent the systematic position of the species. We resolved F. mitchelli individuals into two well-supported and divergent clades that were generally distinguished as distinct species by the Bayesian species validation methods, although alternative hypotheses of species delineation were also supported. Despite strong evidence of genetic isolation within F. mitchelli, we do not advocate species-level status for the two clades, as they are allopatrically distributed and no morphological, behavioral, or ecological characters are known to distinguish them. These results are discussed in the context of the systematics, distribution, and conservation of F. mitchelli.
Developing and Testing a Model to Predict Outcomes of Organizational Change
Gustafson, David H; Sainfort, François; Eichler, Mary; Adams, Laura; Bisognano, Maureen; Steudel, Harold
2003-01-01
Objective: To test the effectiveness of a Bayesian model employing subjective probability estimates for predicting success and failure of health care improvement projects. Data Sources: Experts' subjective assessment data for model development, and independent retrospective data on 221 health care improvement projects in the United States, Canada, and the Netherlands, collected between 1996 and 2000, for validation. Methods: A panel of theoretical and practical experts and the literature on organizational change were used to identify factors predicting the outcome of improvement efforts. A Bayesian model was developed to estimate the probability of successful change using subjective estimates of likelihood ratios and prior odds elicited from the panel of experts. A subsequent retrospective empirical analysis of change efforts in 198 health care organizations was performed to validate the model. Logistic regression and ROC analysis were used to evaluate the model's performance under three alternative definitions of success. Data Collection: For the model development, experts' subjective assessments were elicited using an integrative group process. For the validation study, a staff person intimately involved in each improvement project responded to a written survey asking questions about model factors and project outcomes. Results: Logistic regression chi-square statistics and areas under the ROC curve demonstrated a high level of model performance in predicting success. Chi-square statistics were significant at the 0.001 level, and areas under the ROC curve were greater than 0.84. Conclusions: A subjective Bayesian model was effective in predicting the outcomes of actual improvement projects. Additional prospective evaluations, as well as testing of the impact of this model as an intervention, are warranted. PMID:12785571
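The model's core arithmetic, as described, is prior odds multiplied by expert-elicited likelihood ratios; a minimal sketch follows, with factor names, levels, and numbers all invented.

```python
def success_probability(prior_odds, answers, likelihood_ratios):
    """Posterior probability of project success from expert-elicited inputs.

    prior_odds       : panel's prior odds of success
    answers          : dict factor -> observed level for this project
    likelihood_ratios: dict factor -> {level: LR}, elicited from the expert panel
    """
    odds = prior_odds
    for factor, level in answers.items():
        odds *= likelihood_ratios[factor][level]
    return odds / (1 + odds)

# Invented example: two factors, each with elicited likelihood ratios.
lrs = {"leadership_support": {"strong": 3.0, "weak": 0.4},
       "staff_involvement":  {"high": 2.0, "low": 0.5}}
p = success_probability(1.0, {"leadership_support": "strong",
                              "staff_involvement": "low"}, lrs)
print(f"P(success) = {p:.2f}")   # odds 1.0 * 3.0 * 0.5 = 1.5, so p = 0.60
```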
Bayesian Estimation in the One-Parameter Latent Trait Model.
1980-03-01
Simultaneous Estimation of Regression Functions for Marine Corps Technical Training Specialties.
1985-01-03
Applies Bayesian techniques for simultaneous estimation to the specification of regression weights for selection tests used in various Marine Corps technical training courses.
Wu, X; Lund, M S; Sun, D; Zhang, Q; Su, G
2015-10-01
One of the factors affecting the reliability of genomic prediction is the relationship among the animals of interest. This study investigated the reliability of genomic prediction in various scenarios regarding the relationship between test and training animals and the relationships among animals within the training data set. Different training data sets were generated from EuroGenomics data, with a group of Nordic Holstein bulls (born in 2005 and afterwards) as a common test data set. Genomic breeding values were predicted using a genomic best linear unbiased prediction (GBLUP) model and a Bayesian mixture model. The results showed that a closer relationship between test and training animals led to a higher reliability of genomic predictions for the test animals, while a closer relationship among training animals resulted in a lower reliability. In addition, the Bayesian mixture model in general led to a slightly higher reliability of genomic prediction, especially in the scenario of distant relationships between training and test animals. Therefore, to prevent a decrease in reliability, constant updates of the training population with animals from more recent generations are required. Moreover, a training population consisting of less-related animals is favourable for the reliability of genomic prediction.
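A minimal GBLUP-style predictor via its kernel-ridge equivalence, using a VanRaden-style genomic relationship matrix; the heritability-based shrinkage, the crude scaling, and all names are our simplifying assumptions, and the Bayesian mixture model is not shown.

```python
import numpy as np

def gebv_predict(M_train, M_test, y_train, h2=0.5):
    """Predict genomic breeding values for test animals, GBLUP-style.

    M_* : centred genotype matrices (animals x markers)
    h2  : assumed heritability, which fixes the shrinkage parameter
    """
    c = M_train.shape[1]                        # crude scaling of G
    G_tt = M_train @ M_train.T / c              # train-train relationships
    G_st = M_test @ M_train.T / c               # test-train relationships
    lam = (1 - h2) / h2                         # residual-to-genetic variance ratio
    n = len(y_train)
    alpha = np.linalg.solve(G_tt + lam * np.eye(n), y_train - y_train.mean())
    return y_train.mean() + G_st @ alpha        # GEBVs for the test animals
```

The closer the test animals are related to the training set, the larger the entries of `G_st` and the more information the prediction borrows, which is the mechanism behind the reliability pattern the study reports.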