Memory Indexing: A Novel Method for Tracing Memory Processes in Complex Cognitive Tasks
ERIC Educational Resources Information Center
Renkewitz, Frank; Jahn, Georg
2012-01-01
We validate an eye-tracking method applicable for studying memory processes in complex cognitive tasks. The method is tested with a task on probabilistic inferences from memory. It provides valuable data on the time course of processing, thus clarifying previous results on heuristic probabilistic inference. Participants learned cue values of…
Donnarumma, Francesco; Maisto, Domenico; Pezzulo, Giovanni
2016-01-01
How do humans and other animals face novel problems for which predefined solutions are not available? Human problem solving is linked to flexible reasoning and inference rather than to slow trial-and-error learning. It has received considerable attention since the early days of cognitive science, giving rise to well-known cognitive architectures such as SOAR and ACT-R, but its computational and brain mechanisms remain incompletely known. Furthermore, it is still unclear whether problem solving is a “specialized” domain or module of cognition, in the sense that it requires computations that are fundamentally different from those supporting perception and action systems. Here we advance a novel view of human problem solving as probabilistic inference with subgoaling. In this perspective, key insights from cognitive architectures are retained, such as the importance of using subgoals to split problems into subproblems. However, here the underlying computations use probabilistic inference methods analogous to those that are increasingly popular in the study of perception and action systems. To test our model we focus on the widely used Tower of Hanoi (ToH) task, and show that our proposed method can reproduce characteristic idiosyncrasies of human problem solvers: their sensitivity to the “community structure” of the ToH and their difficulties in executing so-called “counterintuitive” movements. Our analysis reveals that subgoals have two key roles in probabilistic inference and problem solving. First, prior beliefs on (likely) useful subgoals carve the problem space and define an implicit metric for the problem at hand, a metric to which humans are sensitive. Second, subgoals are used as waypoints in the probabilistic problem solving inference and permit finding effective solutions; when such subgoals are unavailable, problem-solving deficits arise.
Our study thus suggests that a probabilistic inference scheme enhanced with subgoals provides a comprehensive framework to study problem solving and its deficits. PMID:27074140
Campbell, Kieran R; Yau, Christopher
2017-03-15
Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference, based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an Empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses.
A note on probabilistic models over strings: the linear algebra approach.
Bouchard-Côté, Alexandre
2013-12-01
Probabilistic models over strings have played a key role in developing methods that take into consideration indels as phylogenetically informative events. There is an extensive literature on using automata and transducers on phylogenies to do inference on these probabilistic models, in which an important theoretical question is the complexity of computing the normalization of a class of string-valued graphical models. This question has been investigated using tools from combinatorics, dynamic programming, and graph theory, and has practical applications in Bayesian phylogenetics. In this work, we revisit this theoretical question from a different point of view, based on linear algebra. The main contribution is a set of results based on this linear algebra view that facilitate the analysis and design of inference algorithms on string-valued graphical models. As an illustration, we use this method to give a new elementary proof of a known result on the complexity of inference on the "TKF91" model, a well-known probabilistic model over strings. Compared to previous work, our proof method is easier to extend to other models, since it relies on a novel weak condition, triangular transducers, which is easy to establish in practice. The linear algebra view provides a concise way of describing transducer algorithms and their compositions, and opens the possibility of transferring fast linear algebra libraries (for example, based on GPUs), as well as low-rank matrix approximation methods, to string-valued inference problems.
Zhang, Lei; Zeng, Zhi; Ji, Qiang
2011-09-01
Chain graph (CG) is a hybrid probabilistic graphical model (PGM) capable of modeling heterogeneous relationships among random variables. So far, however, its application in image and video analysis is very limited due to the lack of principled learning and inference methods for a CG of general topology. To overcome this limitation, we introduce methods to extend the conventional chain-like CG model to a CG model with more general topology, along with the associated methods for learning and inference in such a general CG model. Specifically, we propose techniques to systematically construct a generally structured CG, to parameterize this model, to derive its joint probability distribution, to perform joint parameter learning, and to perform probabilistic inference in this model. To demonstrate the utility of such an extended CG, we apply it to two challenging image and video analysis problems: human activity recognition and image segmentation. The experimental results show improved performance of the extended CG model over the conventional directed or undirected PGMs. This study demonstrates the promise of the extended CG for effective modeling and inference of complex real-world problems.
Serang, Oliver
2014-01-01
Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called "causal independence"). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to O(k log(k)²) and the space to O(k log(k)), where k is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions.
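The convolution-tree construction described above can be sketched in a few lines: combine the count distributions of the input variables pairwise in a balanced tree, so that each internal node holds the exact distribution of a partial sum. This is an illustrative sketch using naive convolution (the cited bounds require FFT-based convolution); it is not the paper's implementation:

```python
def convolve(p, q):
    # Exact distribution of X + Y for independent X ~ p, Y ~ q,
    # where p[i] = P(X = i) and q[j] = P(Y = j).
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def convolution_tree(dists):
    # Pairwise (balanced-tree) combination of the input distributions;
    # the result is the exact distribution of the total count.
    while len(dists) > 1:
        nxt = []
        for i in range(0, len(dists) - 1, 2):
            nxt.append(convolve(dists[i], dists[i + 1]))
        if len(dists) % 2:          # odd one out is carried up a level
            nxt.append(dists[-1])
        dists = nxt
    return dists[0]
```

For three fair Bernoulli variables, `convolution_tree([[0.5, 0.5]] * 3)` recovers the Binomial(3, 0.5) distribution over the sum.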
Orhan, A Emin; Ma, Wei Ji
2017-07-26
Animals perform near-optimal probabilistic inference in a wide range of psychophysical tasks. Probabilistic inference requires trial-to-trial representation of the uncertainties associated with task variables and subsequent use of this representation. Previous work has implemented such computations using neural networks with hand-crafted and task-dependent operations. We show that generic neural networks trained with a simple error-based learning rule perform near-optimal probabilistic inference in nine common psychophysical tasks. In a probabilistic categorization task, error-based learning in a generic network simultaneously explains a monkey's learning curve and the evolution of qualitative aspects of its choice behavior. In all tasks, the number of neurons required for a given level of performance grows sublinearly with the input population size, a substantial improvement on previous implementations of probabilistic inference. The trained networks develop a novel sparsity-based probabilistic population code. Our results suggest that probabilistic inference emerges naturally in generic neural networks trained with error-based learning rules. Behavioural tasks often require probability distributions to be inferred about task-specific variables. Here, the authors demonstrate that generic neural networks can be trained using a simple error-based learning rule to perform such probabilistic computations efficiently without any need for task-specific operations.
A Scalable Approach to Probabilistic Latent Space Inference of Large-Scale Networks
Yin, Junming; Ho, Qirong; Xing, Eric P.
2014-01-01
We propose a scalable approach for making inference about latent spaces of large networks. With a succinct representation of networks as a bag of triangular motifs, a parsimonious statistical model, and an efficient stochastic variational inference algorithm, we are able to analyze real networks with over a million vertices and hundreds of latent roles on a single machine in a matter of hours, a setting that is out of reach for many existing methods. When compared to the state-of-the-art probabilistic approaches, our method is several orders of magnitude faster, with competitive or improved accuracy for latent space recovery and link prediction. PMID:25400487
Real-time value-driven diagnosis
NASA Technical Reports Server (NTRS)
Dambrosio, Bruce
1995-01-01
Diagnosis is often thought of as an isolated task in theoretical reasoning (reasoning with the goal of updating our beliefs about the world). We present a decision-theoretic interpretation of diagnosis as a task in practical reasoning (reasoning with the goal of acting in the world), and sketch components of our approach to this task. These components include an abstract problem description, a decision-theoretic model of the basic task, a set of inference methods suitable for evaluating the decision representation in real-time, and a control architecture to provide the needed continuing coordination between the agent and its environment. A principal contribution of this work is the representation and inference methods we have developed, which extend previously available probabilistic inference methods and narrow, somewhat, the gap between probabilistic and logical models of diagnosis.
Teaching machines to find mantle composition
NASA Astrophysics Data System (ADS)
Atkins, Suzanne; Tackley, Paul; Trampert, Jeannot; Valentine, Andrew
2017-04-01
The composition of the mantle affects many geodynamical processes by altering factors such as the density, the location of phase changes, and melting temperature. The inferences we make about mantle composition also determine how we interpret the changes in velocity, reflections, attenuation and scattering seen by seismologists. However, the bulk composition of the mantle is very poorly constrained. Inferences are made from meteorite samples, rock samples from the Earth, and geophysical data. All of these approaches require significant assumptions and the inferences made are subject to large uncertainties. Here we present a new method for inferring mantle composition, based on pattern recognition machine learning, which uses large-scale in situ observations of the mantle to make fully probabilistic inferences of composition for convection simulations. Our method has an advantage over other petrological approaches because we use large-scale geophysical observations. This means that we average over much greater length scales and do not need to rely on extrapolating from localised samples of the mantle or planetary disk. Another major advantage of our method is that it is fully probabilistic. This allows us to include all of the uncertainties inherent in the inference process, giving us far more information about the reliability of the result than other methods. Finally, our method includes the impact of composition on mantle convection. This allows us to make much more precise inferences from geophysical data than other geophysical approaches, which attempt to invert one observation with no consideration of the relationship between convection and composition. We use a sampling-based inversion method, using hundreds of convection simulations run using StagYY with self-consistent mineral physics properties calculated using the PerpleX package. 
The observations from these simulations are used to train a neural network to make a probabilistic inference for major element oxide composition of the mantle. We find we can constrain bulk mantle FeO molar percent, FeO/MgO and FeO/SiO2 using observations of the temperature and density structure of the mantle in convection simulations.
Bröder, A
2000-09-01
The boundedly rational "Take-The-Best" heuristic (TTB) was proposed by G. Gigerenzer, U. Hoffrage, and H. Kleinbölting (1991) as a model of fast and frugal probabilistic inferences. Although the simple lexicographic rule proved to be successful in computer simulations, direct empirical demonstrations of its adequacy as a psychological model are lacking because of several methodological problems. In 4 experiments with a total of 210 participants, this question was addressed. Whereas Experiment 1 showed that TTB is not valid as a universal hypothesis about probabilistic inferences, up to 28% of participants in Experiment 2 and 53% of participants in Experiment 3 were classified as TTB users. Experiment 4 revealed that investment costs for information seem to be a relevant factor leading participants to switch to a noncompensatory TTB strategy. The observed individual differences in strategy use imply the recommendation of an idiographic approach to decision-making research.
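The lexicographic rule tested above is simple to state as code: consult cues in descending order of validity and let the first cue that discriminates between the two options decide. A minimal sketch, assuming binary cue values and hypothetical validities (not Bröder's experimental materials):

```python
def take_the_best(cues_a, cues_b, validities):
    """Take-The-Best: compare options A and B on cues ordered by
    validity; the first discriminating cue alone determines the choice
    (noncompensatory: later cues are never consulted)."""
    order = sorted(range(len(validities)), key=lambda i: -validities[i])
    for i in order:
        if cues_a[i] != cues_b[i]:          # first discriminating cue
            return "A" if cues_a[i] > cues_b[i] else "B"
    return "guess"                          # no cue discriminates
```

For example, with validities (0.8, 0.6), the first cue decides whenever it discriminates, even if every remaining cue favors the other option.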
Praveen, Paurush; Fröhlich, Holger
2013-01-01
Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal-to-noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, such as pathway databases, GO terms, and protein domain data, and is flexible enough to integrate new sources, if available. PMID:23826291
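The Noisy-OR combination described above has a standard closed form: an interaction is taken to be supported unless every information source independently fails to support it. A minimal sketch with hypothetical per-source support probabilities (an illustration of the Noisy-OR rule, not the authors' code):

```python
def noisy_or(supports):
    """Combine per-source supports p_k for an edge into one consensus
    prior: P(edge) = 1 - prod_k (1 - p_k). A single strong source is
    enough to drive the combined probability up."""
    prob_absent = 1.0
    for p in supports:
        prob_absent *= (1.0 - p)
    return 1.0 - prob_absent
```

Two independent sources each supporting an edge with probability 0.5 combine to 0.75, reflecting how the rule "picks up the strongest support" while still accumulating weaker evidence.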
USDA-ARS?s Scientific Manuscript database
Objective: To examine the risk factors of developing functional decline and make probabilistic predictions by using a tree-based method that allows higher order polynomials and interactions of the risk factors. Methods: The conditional inference tree analysis, a data mining approach, was used to con...
POPPER, a simple programming language for probabilistic semantic inference in medicine.
Robson, Barry
2015-01-01
Our previous reports described the use of the Hyperbolic Dirac Net (HDN) as a method for probabilistic inference from medical data, and a proposed probabilistic medical Semantic Web (SW) language, Q-UEL, to provide that data. Like a traditional Bayes Net, that HDN provided estimates of joint and conditional probabilities, and was static, with no need for evolution due to "reasoning". Use of the SW will require, however, (a) at least the semantic triple with more elaborate relations than conditional ones, as seen in use of most verbs and prepositions, and (b) rules for logical, grammatical, and definitional manipulation that can generate changes in the inference net. Here is described the simple POPPER language for medical inference. It can be written automatically by Q-UEL, or by hand. Based on studies with our medical students, it is believed that a tool like this may help in medical education and that a physician unfamiliar with SW science can understand it. It is here used to explore the considerable challenges of assigning probabilities, and not least what the meaning and utility of inference net evolution would be for a physician.
Campbell, Kieran R.
2016-01-01
Single cell gene expression profiling can be used to quantify transcriptional dynamics in temporal processes, such as cell differentiation, using computational methods to label each cell with a ‘pseudotime’ where true time series experimentation is too difficult to perform. However, owing to the high variability in gene expression between individual cells, there is an inherent uncertainty in the precise temporal ordering of the cells. Pre-existing methods for pseudotime estimation have predominantly given point estimates precluding a rigorous analysis of the implications of uncertainty. We use probabilistic modelling techniques to quantify pseudotime uncertainty and propagate this into downstream differential expression analysis. We demonstrate that reliance on a point estimate of pseudotime can lead to inflated false discovery rates and that probabilistic approaches provide greater robustness and measures of the temporal resolution that can be obtained from pseudotime inference. PMID:27870852
Heck, Daniel W; Hilbig, Benjamin E; Moshagen, Morten
2017-08-01
Decision strategies explain how people integrate multiple sources of information to make probabilistic inferences. In the past decade, increasingly sophisticated methods have been developed to determine which strategy explains decision behavior best. We extend these efforts to test psychologically more plausible models (i.e., strategies), including a new, probabilistic version of the take-the-best (TTB) heuristic that implements a rank order of error probabilities based on sequential processing. Within a coherent statistical framework, deterministic and probabilistic versions of TTB and other strategies can be compared directly using model selection by minimum description length or the Bayes factor. In an experiment with inferences from given information, only three of 104 participants were best described by the psychologically plausible, probabilistic version of TTB. As in previous studies, most participants were classified as users of weighted-additive, a strategy that integrates all available information and approximates rational decisions.
OncoNEM: inferring tumor evolution from single-cell sequencing data.
Ross, Edith M; Markowetz, Florian
2016-04-15
Single-cell sequencing promises a high-resolution view of genetic heterogeneity and clonal evolution in cancer. However, methods to infer tumor evolution from single-cell sequencing data lag behind methods developed for bulk-sequencing data. Here, we present OncoNEM, a probabilistic method for inferring intra-tumor evolutionary lineage trees from somatic single nucleotide variants of single cells. OncoNEM identifies homogeneous cellular subpopulations and infers their genotypes as well as a tree describing their evolutionary relationships. In simulation studies, we assess OncoNEM's robustness and benchmark its performance against competing methods. Finally, we show its applicability in case studies of muscle-invasive bladder cancer and essential thrombocythemia.
Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs
LeGault, Laura H.; Dewey, Colin N.
2013-01-01
Motivation: Alternative splicing and other processes that allow for different transcripts to be derived from the same gene are significant forces in the eukaryotic cell. RNA-Seq is a promising technology for analyzing alternative transcripts, as it does not require prior knowledge of transcript structures or genome sequences. However, analysis of RNA-Seq data in the presence of genes with large numbers of alternative transcripts is currently challenging due to efficiency, identifiability and representation issues. Results: We present RNA-Seq models and associated inference algorithms based on the concept of probabilistic splice graphs, which alleviate these issues. We prove that our models are often identifiable and demonstrate that our inference methods for quantification and differential processing detection are efficient and accurate. Availability: Software implementing our methods is available at http://deweylab.biostat.wisc.edu/psginfer. Contact: cdewey@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23846746
General Purpose Probabilistic Programming Platform with Effective Stochastic Inference
2018-04-01
Report covering probabilistic programming platforms including Venture, BayesDB, Picture, and MetaProb. The methods section outlines the research approach. The results and discussion section gives representative quantitative and qualitative results, including modeling via CrossCat, a probabilistic method that emulates many of the judgment calls ordinarily made by a human data analyst.
Nonadditive entropy maximization is inconsistent with Bayesian updating.
Pressé, Steve
2014-11-01
The maximum entropy method, used to infer probabilistic models from data, is a special case of Bayes's model inference prescription, which, in turn, is grounded in basic propositional logic. By contrast to the maximum entropy method, the compatibility of nonadditive entropy maximization with Bayes's model inference prescription has never been established. Here we demonstrate that nonadditive entropy maximization is incompatible with Bayesian updating and discuss the immediate implications of this finding. We focus our attention on special cases as illustrations.
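As a concrete illustration of inferring a probabilistic model by (additive) entropy maximization, a classic exercise is recovering a die's face distribution from its observed mean: maximizing Shannon entropy under a mean constraint gives p_i proportional to exp(lam * i), with lam chosen to match the mean. The sketch below is illustrative only; the bisection solver and tolerances are arbitrary choices, not anything from the paper:

```python
import math

def maxent_die(mu, lo=-10.0, hi=10.0, iters=200):
    """Max-entropy distribution over faces 1..6 with mean mu.
    Solution form: p_i ∝ exp(lam * i); lam is found by bisection,
    since the implied mean is monotonically increasing in lam."""
    def mean(lam):
        w = [math.exp(lam * i) for i in range(1, 7)]
        z = sum(w)
        return sum(i * wi for i, wi in zip(range(1, 7), w)) / z

    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(mid) < mu:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * i) for i in range(1, 7)]
    z = sum(w)
    return [wi / z for wi in w]
```

With mu = 3.5 the constraint is uninformative and the uniform distribution is recovered; with mu = 4.5 the solution tilts exponentially toward the high faces.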
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
NASA Technical Reports Server (NTRS)
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both a broad perspective on data collection and evaluation issues and a narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining the risk-informed decision-making environment sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Event-Based Media Enrichment Using an Adaptive Probabilistic Hypergraph Model.
Liu, Xueliang; Wang, Meng; Yin, Bao-Cai; Huet, Benoit; Li, Xuelong
2015-11-01
Nowadays, with the continual development of digital capture technologies and social media services, a vast number of media documents are captured and shared online to help attendees record their experience during events. In this paper, we present a method combining semantic inference and multimodal analysis for automatically finding media content to illustrate events using an adaptive probabilistic hypergraph model. In this model, media items are taken as vertices in the weighted hypergraph and the task of enriching media to illustrate events is formulated as a ranking problem. In our method, each hyperedge is constructed using the K-nearest neighbors of a given media document. We also employ a probabilistic representation, which assigns each vertex to a hyperedge in a probabilistic way, to further exploit the correlation among media data. Furthermore, we optimize the hypergraph weights in a regularization framework, which is solved as a second-order cone problem. The approach is initiated by seed media and then used to rank the media documents using a transductive inference process. The results obtained from validating the approach on an event dataset collected from EventMedia demonstrate the effectiveness of the proposed approach.
Statistical inference for remote sensing-based estimates of net deforestation
Ronald E. McRoberts; Brian F. Walters
2012-01-01
Statistical inference requires expression of an estimate in probabilistic terms, usually in the form of a confidence interval. An approach to constructing confidence intervals for remote sensing-based estimates of net deforestation is illustrated. The approach is based on post-classification methods using two independent forest/non-forest classifications because...
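A net change estimate can be put in the probabilistic terms the abstract calls for with a standard large-sample confidence interval. The sketch below is a generic difference-of-proportions interval with hypothetical inputs (loss and gain fractions from two independent classifications), not the authors' post-classification estimator:

```python
import math

def net_change_ci(p_loss, n_loss, p_gain, n_gain, z=1.96):
    """Approximate 95% CI for net deforestation expressed as
    (loss fraction - gain fraction), treating the two proportion
    estimates as independent. z=1.96 gives the usual 95% level."""
    diff = p_loss - p_gain
    se = math.sqrt(p_loss * (1 - p_loss) / n_loss
                   + p_gain * (1 - p_gain) / n_gain)
    return diff - z * se, diff + z * se
```

For example, a 10% loss and 4% gain estimated from 1000 sample units each yields a point estimate of 6% net loss with an interval of roughly plus or minus two percentage points.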
Design-based Sample and Probability Law-Assumed Sample: Their Role in Scientific Investigation.
ERIC Educational Resources Information Center
Ojeda, Mario Miguel; Sahai, Hardeo
2002-01-01
Discusses some key statistical concepts in probabilistic and non-probabilistic sampling to provide an overview for understanding the inference process. Suggests a statistical model constituting the basis of statistical inference and provides a brief review of the finite population descriptive inference and a quota sampling inferential theory.…
Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR)
NASA Astrophysics Data System (ADS)
Peters, Christina; Malz, Alex; Hlozek, Renée
2018-01-01
The Bayesian Estimation Applied to Multiple Species (BEAMS) framework employs probabilistic supernova type classifications to do photometric SN cosmology. This work extends BEAMS to replace high-confidence spectroscopic redshifts with photometric redshift probability density functions, a capability that will be essential in the era of the Large Synoptic Survey Telescope and other next-generation photometric surveys, where it will not be possible to perform spectroscopic follow-up on every SN. We present the Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR) Bayesian hierarchical model for constraining the cosmological parameters from photometric lightcurves and host galaxy photometry, which includes selection effects and is extensible to uncertainty in the redshift-dependent supernova type proportions. We create a pair of realistic mock catalogs of joint posteriors over supernova type, redshift, and distance modulus informed by photometric supernova lightcurves and over redshift from simulated host galaxy photometry. We perform inference under our model to obtain a joint posterior probability distribution over the cosmological parameters and compare our results with other methods, namely: a spectroscopic subset, a subset of high-probability photometrically classified supernovae, and reducing the photometric redshift probability to a single measurement and error bar.
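The core statistical move, replacing a point redshift with a photometric redshift pdf that is marginalised inside the supernova likelihood, can be sketched with a toy grid posterior over Omega_m in flat LambdaCDM. Everything below (Gaussian photo-z pdfs, fixed H0, the mock catalogue) is an illustrative assumption, not SCIPPR's actual hierarchical model, which also handles type probabilities and selection effects.

```python
import numpy as np

C_KM_S, H0 = 299792.458, 70.0        # flat LambdaCDM with fixed H0 (assumption)

def trap(y, x):
    """Simple trapezoid rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def mu_model(z, om):
    """Distance modulus mu(z; Omega_m) for a flat universe."""
    zs = np.linspace(1e-4, z, 256)
    Ez = np.sqrt(om * (1 + zs) ** 3 + (1.0 - om))
    dc = C_KM_S / H0 * trap(1.0 / Ez, zs)          # comoving distance [Mpc]
    return 5.0 * np.log10((1 + z) * dc) + 25.0     # mu from d_L in Mpc

# Mock catalogue: true redshifts, noisy distance moduli, Gaussian photo-z pdfs.
rng = np.random.default_rng(1)
z_true = rng.uniform(0.1, 0.8, 60)
sig_mu, sig_z = 0.1, 0.03
mu_obs = np.array([mu_model(z, 0.3) for z in z_true]) + rng.normal(0, sig_mu, 60)

# Grid posterior over Omega_m, marginalising each SN's photo-z pdf.
om_grid = np.linspace(0.05, 0.6, 56)
z_grid = np.linspace(0.01, 1.0, 120)
logpost = np.zeros_like(om_grid)
for i, om in enumerate(om_grid):
    mu_grid = np.array([mu_model(z, om) for z in z_grid])
    for zc, m in zip(z_true, mu_obs):
        pz = np.exp(-0.5 * ((z_grid - zc) / sig_z) ** 2)      # photo-z pdf
        like = np.exp(-0.5 * ((m - mu_grid) / sig_mu) ** 2)   # SN likelihood
        logpost[i] += np.log(trap(pz * like, z_grid))         # marginalise z

best = om_grid[np.argmax(logpost)]
print(f"posterior peak at Omega_m ~ {best:.2f}")
```

The inner integral is the key step: each supernova contributes a likelihood averaged over its redshift pdf rather than evaluated at a single spectroscopic value.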
Quantum Enhanced Inference in Markov Logic Networks
NASA Astrophysics Data System (ADS)
Wittek, Peter; Gogolin, Christian
2017-04-01
Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
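A tiny worked example of the classical inference the abstract starts from: ground a one-rule MLN over two constants and estimate a query marginal with Gibbs sampling, checking against exact enumeration. The formulas and weights below are made up for illustration.

```python
import itertools
import math
import random

# Tiny MLN over constants {A, B}, with Friends(A,B) and Friends(B,A) observed
# true.  Weighted formulas (illustrative weights):
#   1.5 : Smokes(x) ^ Friends(x,y) -> Smokes(y)   (two groundings)
#   0.8 : Smokes(A)                               (unit-clause bias)
def log_weight(sA, sB):
    lw = 0.0
    if (not sA) or sB:          # Smokes(A) ^ Friends(A,B) -> Smokes(B)
        lw += 1.5
    if (not sB) or sA:          # Smokes(B) ^ Friends(B,A) -> Smokes(A)
        lw += 1.5
    if sA:
        lw += 0.8
    return lw

# Exact marginal P(Smokes(B)) by enumerating the four possible worlds.
Z = sum(math.exp(log_weight(a, b)) for a, b in itertools.product([0, 1], [0, 1]))
exact = sum(math.exp(log_weight(a, 1)) for a in (0, 1)) / Z

# Gibbs sampling over the two query atoms.
random.seed(0)
sA, sB, hits, n = 1, 1, 0, 30000
for _ in range(n):
    p1 = math.exp(log_weight(1, sB))        # resample Smokes(A) given Smokes(B)
    sA = random.random() < p1 / (p1 + math.exp(log_weight(0, sB)))
    p1 = math.exp(log_weight(sA, 1))        # resample Smokes(B) given Smokes(A)
    sB = random.random() < p1 / (p1 + math.exp(log_weight(sA, 0)))
    hits += sB
print(exact, hits / n)
```

Real MLN engines work over far larger groundings, which is why the exploitable symmetry (lifting) and sampling speedups discussed in the paper matter.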
Probabilistic drug connectivity mapping
2014-01-01
Background: The aim of connectivity mapping is to match drugs using drug-treatment gene expression profiles from multiple cell lines. This can be viewed as an information retrieval task, with the goal of finding the most relevant profiles for a given query drug. We infer the relevance for retrieval by data-driven probabilistic modeling of the drug responses, resulting in probabilistic connectivity mapping, and further consider the available cell lines as different data sources. We use a special type of probabilistic model to separate what is shared and specific between the sources, in contrast to earlier connectivity mapping methods that have intentionally aggregated all available data, neglecting information about the differences between the cell lines. Results: We show that the probabilistic multi-source connectivity mapping method is superior to alternatives in finding functionally and chemically similar drugs from the Connectivity Map data set. We also demonstrate that an extension of the method is capable of retrieving combinations of drugs that match different relevant parts of the query drug response profile. Conclusions: The probabilistic modeling-based connectivity mapping method provides a promising alternative to earlier methods. Principled integration of data from different cell lines helps to identify relevant responses for specific drug repositioning applications. PMID:24742351
Probabilistic numerical methods for PDE-constrained Bayesian inverse problems
NASA Astrophysics Data System (ADS)
Cockayne, Jon; Oates, Chris; Sullivan, Tim; Girolami, Mark
2017-06-01
This paper develops meshless methods for probabilistically describing discretisation error in the numerical solution of partial differential equations. This construction enables the solution of Bayesian inverse problems while accounting for the impact of the discretisation of the forward problem. In particular, this drives statistical inferences to be more conservative in the presence of significant solver error. Theoretical results are presented describing rates of convergence for the posteriors in both the forward and inverse problems. This method is tested on a challenging inverse problem with a nonlinear forward model.
NASA Astrophysics Data System (ADS)
Ndu, Obibobi Kamtochukwu
To ensure that estimates of risk and reliability inform design and resource allocation decisions in the development of complex engineering systems, early engagement in the design life cycle is necessary. An unfortunate constraint on the accuracy of such estimates at this stage of concept development is the limited amount of high fidelity design and failure information available on the actual system under development. Applying the human ability to learn from experience and augment our state of knowledge to evolve better solutions mitigates this limitation. However, the challenge lies in formalizing a methodology that takes this highly abstract, but fundamentally human, cognitive ability and extends it to the field of risk analysis while maintaining the tenets of generalization, Bayesian inference, and probabilistic risk analysis. We introduce an integrated framework for inferring the reliability, or other probabilistic measures of interest, of a new system or a conceptual variant of an existing system. Abstractly, our framework is based on learning from the performance of precedent designs and then applying the acquired knowledge, appropriately adjusted based on degree of relevance, to the inference process. This dissertation presents a method for inferring properties of the conceptual variant using a pseudo-spatial model that describes the spatial configuration of the family of systems to which the concept belongs. Through non-metric multidimensional scaling, we formulate the pseudo-spatial model based on rank-ordered subjective expert perception of design similarity between systems that elucidate the psychological space of the family. By a novel extension of Kriging methods for analysis of geospatial data to our "pseudo-space of comparable engineered systems", we develop a Bayesian inference model that allows prediction of the probabilistic measure of interest.
Localization of the lumbar discs using machine learning and exact probabilistic inference.
Oktay, Ayse Betul; Akgul, Yusuf Sinan
2011-01-01
We propose a novel fully automatic approach to localize the lumbar intervertebral discs in MR images with PHOG based SVM and a probabilistic graphical model. At the local level, our method assigns a score to each pixel in the target image that indicates whether it is a disc center or not. At the global level, we define a chain-like graphical model that represents the lumbar intervertebral discs and we use an exact inference algorithm to localize the discs. Our main contributions are the employment of the SVM with the PHOG based descriptor, which is robust against variations of the discs, and a graphical model that reflects the linear nature of the vertebral column. Our inference algorithm runs in polynomial time and produces globally optimal results. The developed system is validated on a real spine MRI dataset and the final localization results are favorable compared to the results reported in the literature.
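The chain model and exact inference step can be sketched as a standard Viterbi (max-sum) dynamic program: unary scores stand in for the PHOG-based SVM outputs, and a pairwise term encodes the roughly regular disc spacing. All scores and geometry below are synthetic, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the pipeline: 6 lumbar discs along 100 candidate image rows.
# `unary` plays the role of the PHOG+SVM disc-centre score at each row.
n_rows, n_discs = 100, 6
true_pos = [15, 30, 45, 60, 75, 90]
unary = rng.normal(0.0, 0.5, (n_discs, n_rows))
for d, p in enumerate(true_pos):
    unary[d, p] += 5.0                      # strong response at the true centres

def pairwise(prev_row, row, mean_gap=15.0, sd=3.0):
    """Chain potential: consecutive discs are ordered, roughly mean_gap apart."""
    if row <= prev_row:
        return -np.inf
    return -0.5 * ((row - prev_row - mean_gap) / sd) ** 2

# Viterbi (max-sum) on the chain: exact MAP localisation in polynomial time.
score = unary[0].copy()
back = np.zeros((n_discs, n_rows), dtype=int)
for d in range(1, n_discs):
    new = np.empty(n_rows)
    for r in range(n_rows):
        cand = score + np.array([pairwise(p, r) for p in range(n_rows)])
        back[d, r] = int(np.argmax(cand))
        new[r] = cand[back[d, r]]
    score = new + unary[d]

# Backtrack the jointly optimal disc positions.
pos = [int(np.argmax(score))]
for d in range(n_discs - 1, 0, -1):
    pos.append(int(back[d, pos[-1]]))
pos = pos[::-1]
print(pos)
```

Because the graph is a chain, this dynamic program is exact and runs in O(discs x rows^2), matching the polynomial-time, globally optimal claim in the abstract.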
Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang
2011-01-01
An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows (“explaining away”) and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons. PMID:22219717
Pajak, Bozena; Fine, Alex B; Kleinschmidt, Dave F; Jaeger, T Florian
2016-12-01
We present a framework of second and additional language (L2/Ln) acquisition motivated by recent work on socio-indexical knowledge in first language (L1) processing. The distribution of linguistic categories covaries with socio-indexical variables (e.g., talker identity, gender, dialects). We summarize evidence that implicit probabilistic knowledge of this covariance is critical to L1 processing, and propose that L2/Ln learning uses the same type of socio-indexical information to probabilistically infer latent hierarchical structure over previously learned and new languages. This structure guides the acquisition of new languages based on their inferred place within that hierarchy, and is itself continuously revised based on new input from any language. This proposal unifies L1 processing and L2/Ln acquisition as probabilistic inference under uncertainty over socio-indexical structure. It also offers a new perspective on crosslinguistic influences during L2/Ln learning, accommodating gradient and continued transfer (both negative and positive) from previously learned to novel languages, and vice versa.
Nonadditive entropy maximization is inconsistent with Bayesian updating
NASA Astrophysics Data System (ADS)
Pressé, Steve
2014-11-01
The maximum entropy method—used to infer probabilistic models from data—is a special case of Bayes's model inference prescription which, in turn, is grounded in basic propositional logic. By contrast to the maximum entropy method, the compatibility of nonadditive entropy maximization with Bayes's model inference prescription has never been established. Here we demonstrate that nonadditive entropy maximization is incompatible with Bayesian updating and discuss the immediate implications of this finding. We focus our attention on special cases as illustrations.
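The additive (Shannon) case that the paper treats as the Bayesian-compatible benchmark can be made concrete: maximising Shannon entropy under a mean constraint yields the Gibbs/exponential form p_i proportional to exp(-lambda*x_i), with the multiplier lambda fixed by the constraint. Below this is solved numerically by bisection for Jaynes' loaded-die example; the setup is illustrative, not taken from the paper.

```python
import numpy as np

x = np.arange(1, 7)          # faces of a die
target_mean = 4.5            # constraint: E[x] = 4.5 (Jaynes' loaded-die example)

def mean_of(lam):
    """Mean of the Gibbs distribution p_i ~ exp(-lam * x_i)."""
    p = np.exp(-lam * x)
    p /= p.sum()
    return p @ x

# mean_of is monotonically decreasing in lam, so bisection finds the unique
# multiplier satisfying the constraint.
lo, hi = -10.0, 10.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_of(mid) > target_mean:
        lo = mid
    else:
        hi = mid
lam = (lo + hi) / 2
p = np.exp(-lam * x)
p /= p.sum()
print(lam, p @ x)            # maxent die weights tilted toward high faces
```

The exponential-family form is what makes Shannon maxent consistent with Bayesian updating; the paper's point is that nonadditive (e.g., Tsallis-type) entropy maximization breaks this correspondence.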
Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity.
Pecevski, Dejan; Maass, Wolfgang
2016-01-01
Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such a neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p* that generates the examples it receives. This holds even if p* contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference.
Probabilistic Methods for Structural Design and Reliability
NASA Technical Reports Server (NTRS)
Chamis, Christos C.; Whitlow, Woodrow, Jr. (Technical Monitor)
2002-01-01
This report describes a formal method to quantify structural damage tolerance and reliability in the presence of a multitude of uncertainties in turbine engine components. The method is based at the material behavior level, where primitive variables with their respective scatter ranges are used to describe behavior. Computational simulation is then used to propagate the uncertainties to the structural scale where damage tolerance and reliability are usually specified. Several sample cases are described to illustrate the effectiveness, versatility, and maturity of the method. Typical results from this method demonstrate that it is mature and that it can be used to probabilistically evaluate turbine engine structural components. It may be inferred from the results that the method is suitable for probabilistically predicting the remaining life in aging or in deteriorating structures, for making strategic projections and plans, and for achieving better, cheaper, faster products that give competitive advantages in world markets.
The Probability Heuristics Model of Syllogistic Reasoning.
ERIC Educational Resources Information Center
Chater, Nick; Oaksford, Mike
1999-01-01
Proposes a probability heuristic model for syllogistic reasoning and confirms the rationality of this heuristic by an analysis of the probabilistic validity of syllogistic reasoning that treats logical inference as a limiting case of probabilistic inference. Meta-analysis and two experiments involving 40 adult participants and using generalized…
Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks
NASA Astrophysics Data System (ADS)
Sun, Wei; Chang, K. C.
2005-05-01
Probabilistic inference for Bayesian networks is in general NP-hard using either exact algorithms or approximate methods. However, for very complex networks, only the approximate methods such as stochastic sampling can provide a solution given any time constraint. There are several simulation methods currently available. They include logic sampling (the first proposed stochastic method for Bayesian networks), the likelihood weighting algorithm (the most commonly used simulation method because of its simplicity and efficiency), the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these available simulation methods, then we propose an improved importance sampling algorithm called the linear Gaussian importance sampling algorithm for general hybrid models (LGIS). LGIS is aimed at hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses a linear function and Gaussian additive noise to approximate the true conditional probability distribution for a continuous variable given both its parents and evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal importance function from the previous samples. We test the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. The performance comparison with other well-known methods such as junction tree (JT) and likelihood weighting (LW) shows that LGIS is very promising.
Probabilistic numerics and uncertainty in computations
Hennig, Philipp; Osborne, Michael A.; Girolami, Mark
2015-01-01
We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data have led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimizers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations. PMID:26346321
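The probabilistic reinterpretation of a classic numerical task can be made concrete with Bayesian quadrature: place a GP prior on the integrand, condition on a few evaluations, and read off both a point estimate and an uncertainty for the integral. The kernel, lengthscale, and integrand below are illustrative choices, not from the paper.

```python
import numpy as np
from math import erf, sqrt, pi

# Bayesian quadrature sketch for I = int_0^1 f(x) dx with an RBF-kernel GP.
l = 0.2                                   # illustrative lengthscale
f = lambda x: np.sin(3 * x) + x           # integrand with a known true integral

k = lambda a, b: np.exp(-(a - b) ** 2 / (2 * l ** 2))
def kmean(xi):
    """Closed form of z_i = int_0^1 k(x, x_i) dx for the RBF kernel."""
    return l * sqrt(pi / 2) * (erf((1 - xi) / (sqrt(2) * l))
                               + erf(xi / (sqrt(2) * l)))

X = np.linspace(0.05, 0.95, 8)            # a few integrand evaluations
y = f(X)
K = k(X[:, None], X[None, :]) + 1e-8 * np.eye(len(X))
z = np.array([kmean(xi) for xi in X])

post_mean = z @ np.linalg.solve(K, y)     # posterior mean of the integral
g = np.linspace(0, 1, 400)                # grid approximation of int int k dx dx'
kk = k(g[:, None], g[None, :]).mean()
post_var = kk - z @ np.linalg.solve(K, z) # posterior variance of the integral

true_I = (1 - np.cos(3)) / 3 + 0.5
print(post_mean, post_var, true_I)
```

The variance term is the point of the exercise: unlike a plain quadrature rule, the output quantifies how much the limited number of evaluations leaves the integral uncertain.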
Speech Enhancement Using Gaussian Scale Mixture Models
Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.
2011-01-01
This paper presents a novel probabilistic approach to speech enhancement. Instead of a deterministic logarithmic relationship, we assume a probabilistic relationship between the frequency coefficients and the log-spectra. The speech model in the log-spectral domain is a Gaussian mixture model (GMM). The frequency coefficients obey a zero-mean Gaussian whose covariance equals the exponential of the log-spectra. This results in a Gaussian scale mixture model (GSMM) for the speech signal in the frequency domain, since the log-spectra can be regarded as scaling factors. The probabilistic relation between frequency coefficients and log-spectra allows these to be treated as two random variables, both to be estimated from the noisy signals. Expectation-maximization (EM) was used to train the GSMM and Bayesian inference was used to compute the posterior signal distribution. Because exact inference of this full probabilistic model is computationally intractable, we developed two approaches to enhance the efficiency: the Laplace method and a variational approximation. The proposed methods were applied to enhance speech corrupted by Gaussian noise and speech-shaped noise (SSN). For both approximations, signals reconstructed from the estimated frequency coefficients provided higher signal-to-noise ratio (SNR) and those reconstructed from the estimated log-spectra produced lower word recognition error rate because the log-spectra fit the inputs to the recognizer better. Our algorithms effectively reduced the SSN, which algorithms based on spectral analysis were not able to suppress. PMID:21359139
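The GMM building block fitted in the log-spectral domain can be sketched with a minimal 1-D EM loop; the full GSMM and its Laplace/variational approximations are beyond this sketch, and all data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic "log-spectra": a two-component mixture.
data = np.concatenate([rng.normal(0.0, 1.0, 400), rng.normal(5.0, 1.0, 400)])

K = 2
mu = np.array([data.min(), data.max()])   # crude spread-out initialisation
var = np.ones(K)
pi = np.ones(K) / K

for _ in range(60):
    # E-step: responsibilities r[n, k] = P(component k | x_n)
    d2 = (data[:, None] - mu[None, :]) ** 2
    r = pi * np.exp(-0.5 * d2 / var) / np.sqrt(2 * np.pi * var)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted maximum-likelihood updates
    Nk = r.sum(axis=0)
    mu = (r * data[:, None]).sum(axis=0) / Nk
    var = (r * (data[:, None] - mu[None, :]) ** 2).sum(axis=0) / Nk
    pi = Nk / len(data)

print(np.sort(mu))
```

In the paper's setting the mixture is over log-spectral vectors and the exponentiated means become the scale factors of the frequency-domain Gaussians.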
Probabilistic brains: knowns and unknowns
Pouget, Alexandre; Beck, Jeffrey M; Ma, Wei Ji; Latham, Peter E
2015-01-01
There is strong behavioral and physiological evidence that the brain both represents probability distributions and performs probabilistic inference. Computational neuroscientists have started to shed light on how these probabilistic representations and computations might be implemented in neural circuits. One particularly appealing aspect of these theories is their generality: they can be used to model a wide range of tasks, from sensory processing to high-level cognition. To date, however, these theories have only been applied to very simple tasks. Here we discuss the challenges that will emerge as researchers start focusing their efforts on real-life computations, with a focus on probabilistic learning, structural learning and approximate inference. PMID:23955561
On the psychology of the recognition heuristic: retrieval primacy as a key determinant of its use.
Pachur, Thorsten; Hertwig, Ralph
2006-09-01
The recognition heuristic is a prime example of a boundedly rational mind tool that rests on an evolved capacity, recognition, and exploits environmental structures. When originally proposed, it was conjectured that no other probabilistic cue reverses the recognition-based inference (D. G. Goldstein & G. Gigerenzer, 2002). More recent studies challenged this view and gave rise to the argument that recognition enters inferences just like any other probabilistic cue. By linking research on the heuristic with research on recognition memory, the authors argue that the retrieval of recognition information is not tantamount to the retrieval of other probabilistic cues. Specifically, the retrieval of subjective recognition precedes that of an objective probabilistic cue and occurs at little to no cognitive cost. This retrieval primacy gives rise to 2 predictions, both of which have been empirically supported: Inferences in line with the recognition heuristic (a) are made faster than inferences inconsistent with it and (b) are more prevalent under time pressure. Suspension of the heuristic, in contrast, requires additional time, and direct knowledge of the criterion variable, if available, can trigger such suspension. Copyright 2006 APA
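The decision rule itself is simple enough to state as code; the city names and recognition set below are invented for illustration. The retrieval-primacy argument concerns the step where recognition is checked: it is claimed to be available faster and more cheaply than any further probabilistic cue.

```python
# Minimal decision rule for the recognition heuristic in a two-alternative
# city-size task.  Recognition set and names are hypothetical examples.
recognized = {"Munich", "Hamburg", "Cologne"}

def infer_larger(city_a, city_b):
    """Return the city inferred to be larger, or None if the heuristic is silent."""
    ra, rb = city_a in recognized, city_b in recognized
    if ra and not rb:
        return city_a      # recognition discriminates: choose the recognized city
    if rb and not ra:
        return city_b
    return None            # both or neither recognized: other cues are needed

print(infer_larger("Munich", "Leverkusen"))
```

When the function returns None, or when a decision maker suspends the heuristic using criterion knowledge, additional cue retrieval is required, which is exactly where the predicted response-time costs arise.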
Modeling the Evolution of Beliefs Using an Attentional Focus Mechanism
Marković, Dimitrije; Gläscher, Jan; Bossaerts, Peter; O’Doherty, John; Kiebel, Stefan J.
2015-01-01
For making decisions in everyday life we often have first to infer the set of environmental features that are relevant for the current task. Here we investigated the computational mechanisms underlying the evolution of beliefs about the relevance of environmental features in a dynamical and noisy environment. For this purpose we designed a probabilistic Wisconsin card sorting task (WCST) with belief solicitation, in which subjects were presented with stimuli composed of multiple visual features. At each moment in time a particular feature was relevant for obtaining reward, and participants had to infer which feature was relevant and report their beliefs accordingly. To test the hypothesis that attentional focus modulates the belief update process, we derived and fitted several probabilistic and non-probabilistic behavioral models, which either incorporate a dynamical model of attentional focus, in the form of a hierarchical winner-take-all neuronal network, or a diffusive model, without attention-like features. We used Bayesian model selection to identify the most likely generative model of subjects’ behavior and found that attention-like features in the behavioral model are essential for explaining subjects’ responses. Furthermore, we demonstrate a method for integrating both connectionist and Bayesian models of decision making within a single framework that allowed us to infer hidden belief processes of human subjects. PMID:26495984
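The solicited belief process can be sketched as a simple Bayesian observer: a posterior over which feature is currently relevant, updated by reward likelihoods and mixed with a uniform distribution to allow for rule switches. The reward probability and hazard rate below are illustrative, not the paper's fitted parameters, and the attentional winner-take-all dynamics are omitted.

```python
import numpy as np

# Bayesian observer for a probabilistic WCST-like task: infer which of three
# features (say colour / shape / number) currently determines reward.
p_reward_match = 0.8   # P(reward | chosen card matches the relevant feature)
hazard = 0.05          # per-trial probability that the relevant feature switches

belief = np.ones(3) / 3
# Each trial: (did the chosen card match feature k?, reward received?)
trials = [([1, 0, 0], 1), ([1, 1, 0], 1), ([1, 0, 1], 1), ([1, 0, 0], 1)]
for match, reward in trials:
    like = np.array([p_reward_match if m else 1 - p_reward_match for m in match])
    if not reward:
        like = 1 - like
    belief = like * belief          # Bayes rule over the three hypotheses
    belief /= belief.sum()
    # volatility: mix with uniform to account for possible rule switches
    belief = (1 - hazard) * belief + hazard / 3

print(belief)
```

Comparing such normative updates against reported beliefs is what lets the authors test whether an attention-like bottleneck distorts the update, as their model comparison suggests.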
Stan : A Probabilistic Programming Language
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carpenter, Bob; Gelman, Andrew; Hoffman, Matthew D.
Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can also be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
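A minimal Stan program of the kind the abstract describes, a normal location-scale model with weakly informative priors, is shown below as a Python string, since compiling and sampling require an installed Stan interface (such as pystan or cmdstanpy); the pystan-3 call sketched in the comments is approximate.

```python
# A minimal Stan program: the data / parameters / model blocks define the
# log probability function over (mu, sigma) conditioned on the data y.
stan_code = """
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);        // weakly informative priors
  sigma ~ cauchy(0, 5);
  y ~ normal(mu, sigma);     // vectorised likelihood
}
"""

# With pystan 3 this would be run roughly as:
#   import stan
#   posterior = stan.build(stan_code, data={"N": len(y), "y": y})
#   fit = posterior.sample(num_chains=4, num_samples=1000)
print("blocks:", [b for b in ("data", "parameters", "model") if b in stan_code])
```

The same program text can be compiled from the command line with cmdstan or from R with rstan, which is the multi-interface design the abstract emphasises.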
Stan : A Probabilistic Programming Language
Carpenter, Bob; Gelman, Andrew; Hoffman, Matthew D.; ...
2017-01-01
Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectationmore » propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can also be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.« less
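The core object a Stan program defines, a log posterior density whose gradient drives the samplers, can be illustrated in plain Python. The normal-mean model below is an arbitrary example, and the finite-difference gradient stands in for Stan's automatic differentiation:

```python
import math

# Hand-coded log posterior for a normal-mean model with known unit variance:
# theta ~ Normal(0, 5);  y_i ~ Normal(theta, 1).  A Stan program defines
# exactly this kind of function, but computes gradients automatically.
data = [1.2, 0.7, 1.9, 1.4]

def log_posterior(theta):
    lp = -0.5 * (theta / 5.0) ** 2                     # prior (up to a constant)
    lp += sum(-0.5 * (y - theta) ** 2 for y in data)   # likelihood
    return lp

def grad(theta, eps=1e-6):
    """Finite-difference stand-in for Stan's automatic differentiation."""
    return (log_posterior(theta + eps) - log_posterior(theta - eps)) / (2 * eps)

# The gradient vanishes at the posterior mean, sum(y) / (n + 1/25).
mode = sum(data) / (len(data) + 1 / 25.0)
print(mode, grad(mode))
```

Hamiltonian Monte Carlo and the No-U-Turn sampler then use this gradient to propose distant, high-acceptance moves through the parameter space.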
Inference for Continuous-Time Probabilistic Programming
2017-12-01
Parzen window density estimator to jointly model the inter-camera travel time intervals, locations of exits/entrances, and velocities of objects ... asked to travel across the scene multiple times. Even in such a scenario they formed groups and made social interactions, which Fig. 7: Topology of ... Inference for Continuous-Time Probabilistic Programming, University of California at Riverside, December 2017, final technical report, approved for ...
The Emergence of Probabilistic Reasoning in Very Young Infants: Evidence from 4.5- and 6-Month-Olds
ERIC Educational Resources Information Center
Denison, Stephanie; Reed, Christie; Xu, Fei
2013-01-01
How do people make rich inferences from such sparse data? Recent research has explored this inferential ability by investigating probabilistic reasoning in infancy. For example, 8- and 11-month-old infants can make inferences from samples to populations and vice versa (Denison & Xu, 2010a; Xu & Denison, 2009; Xu & Garcia, 2008a). The…
Bhaskar, Anand; Javanmard, Adel; Courtade, Thomas A; Tse, David
2017-03-15
Genetic variation in human populations is influenced by geographic ancestry due to spatial locality in historical mating and migration patterns. Spatial population structure in genetic datasets has been traditionally analyzed using either model-free algorithms, such as principal components analysis (PCA) and multidimensional scaling, or using explicit spatial probabilistic models of allele frequency evolution. We develop a general probabilistic model and an associated inference algorithm that unify the model-based and data-driven approaches to visualizing and inferring population structure. Our spatial inference algorithm can also be effectively applied to the problem of population stratification in genome-wide association studies (GWAS), where hidden population structure can create fictitious associations when population ancestry is correlated with both the genotype and the trait. Our algorithm Geographic Ancestry Positioning (GAP) relates local genetic distances between samples to their spatial distances, and can be used for visually discerning population structure as well as accurately inferring the spatial origin of individuals on a two-dimensional continuum. On both simulated and several real datasets from diverse human populations, GAP exhibits substantially lower error in reconstructing spatial ancestry coordinates compared to PCA. We also develop an association test that uses the ancestry coordinates inferred by GAP to accurately account for ancestry-induced correlations in GWAS. Based on simulations and analysis of a dataset of 10 metabolic traits measured in a Northern Finland cohort, which is known to exhibit significant population structure, we find that our method has superior power to current approaches. Our software is available at https://github.com/anand-bhaskar/gap . abhaskar@stanford.edu or ajavanma@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. 
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323
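The distorted-probability account in this abstract can be sketched by running Bayes' rule on subjectively weighted probabilities. The inverse-S weighting function below is the Tversky-Kahneman form, and the gamma value and task probabilities are illustrative assumptions rather than the study's fitted parameters:

```python
# Sketch of distorted-probability Bayesian updating in an urn-ball task.
# The (inverted) S-shaped weighting function is the Tversky-Kahneman form;
# gamma and the task probabilities are illustrative assumptions.

def w(p, gamma=0.6):
    """Inverse-S probability weighting: overweights small p, underweights large p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def posterior(prior, lik_a, lik_b, gamma=None):
    """P(urn A | ball), optionally applying weighting before Bayes' rule."""
    if gamma is not None:
        prior, lik_a, lik_b = w(prior, gamma), w(lik_a, gamma), w(lik_b, gamma)
    num = prior * lik_a
    return num / (num + (1 - prior) * lik_b)

# Prior 0.2 for urn A; the drawn ball colour has likelihood 0.8 under A, 0.4 under B.
print(posterior(0.2, 0.8, 0.4))             # normative Bayesian answer
print(posterior(0.2, 0.8, 0.4, gamma=0.6))  # distorted answer: small prior is overweighted
```

In a hierarchical Bayesian fit, each participant's gamma would be drawn from a weakly informative group-level prior, which is the individual-differences structure the abstract argues for.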
Data analysis using scale-space filtering and Bayesian probabilistic reasoning
NASA Technical Reports Server (NTRS)
Kulkarni, Deepak; Kutulakos, Kiriakos; Robinson, Peter
1991-01-01
This paper describes a program for analysis of output curves from Differential Thermal Analyzer (DTA). The program first extracts probabilistic qualitative features from a DTA curve of a soil sample, and then uses Bayesian probabilistic reasoning to infer the mineral in the soil. The qualifier module employs a simple and efficient extension of scale-space filtering suitable for handling DTA data. We have observed that points can vanish from contours in the scale-space image when filtering operations are not highly accurate. To handle the problem of vanishing points, perceptual organizations heuristics are used to group the points into lines. Next, these lines are grouped into contours by using additional heuristics. Probabilities are associated with these contours using domain-specific correlations. A Bayes tree classifier processes probabilistic features to infer the presence of different minerals in the soil. Experiments show that the algorithm that uses domain-specific correlation to infer qualitative features outperforms a domain-independent algorithm that does not.
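The scale-space idea behind the qualifier module can be shown in one dimension: smoothing a curve with Gaussians of increasing width removes fine-scale peaks while coarse peaks persist, which is what lets qualitative features be tracked across scales. The curve below is synthetic, not real differential-thermal-analysis data:

```python
import math

# 1-D scale-space sketch: convolve a synthetic "DTA curve" with Gaussians
# of increasing width (edge samples are replicated) and count local maxima.
def smooth(y, sigma):
    r = int(3 * sigma) + 1
    kernel = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-r, r + 1)]
    norm = sum(kernel)
    return [sum(kernel[j + r] * y[min(max(i + j, 0), len(y) - 1)]
                for j in range(-r, r + 1)) / norm
            for i in range(len(y))]

def n_peaks(y):
    return sum(1 for i in range(1, len(y) - 1) if y[i - 1] < y[i] > y[i + 1])

# A slow oscillation carrying a fast, small-amplitude ripple.
curve = [math.sin(0.4 * i) + 0.4 * math.sin(2.1 * i) for i in range(120)]
counts = [n_peaks(smooth(curve, s)) for s in (1, 2, 4)]
print(counts)   # the peak count does not grow as the scale increases
```

The vanishing of contour points the paper mentions corresponds to peaks that disappear between two of these scales; the grouping heuristics then re-link the survivors into contours.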
Divide et impera: subgoaling reduces the complexity of probabilistic inference and problem solving
Maisto, Domenico; Donnarumma, Francesco; Pezzulo, Giovanni
2015-01-01
It has long been recognized that humans (and possibly other animals) usually break problems down into smaller and more manageable problems using subgoals. Despite a general consensus that subgoaling helps problem solving, it is still unclear what the mechanisms guiding online subgoal selection are during the solution of novel problems for which predefined solutions are not available. Under which conditions does subgoaling lead to optimal behaviour? When is subgoaling better than solving a problem from start to finish? Which is the best number and sequence of subgoals to solve a given problem? How are these subgoals selected during online inference? Here, we present a computational account of subgoaling in problem solving. Following Occam's razor, we propose that good subgoals are those that permit planning solutions and controlling behaviour using less information resources, thus yielding parsimony in inference and control. We implement this principle using approximate probabilistic inference: subgoals are selected using a sampling method that considers the descriptive complexity of the resulting sub-problems. We validate the proposed method using a standard reinforcement learning benchmark (four-rooms scenario) and show that the proposed method requires less inferential steps and permits selecting more compact control programs compared to an equivalent procedure without subgoaling. Furthermore, we show that the proposed method offers a mechanistic explanation of the neuronal dynamics found in the prefrontal cortex of monkeys that solve planning problems. Our computational framework provides a novel integrative perspective on subgoaling and its adaptive advantages for planning, control and learning, such as for example lowering cognitive effort and working memory load. PMID:25652466
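A toy rendering of "good subgoals yield parsimonious plans" can be built on the four-rooms benchmark the abstract mentions: score each cell as a candidate subgoal by the plan length it induces (start to subgoal plus subgoal to goal), and the doorway cells come out as optimal waypoints. The map and the plan-length score are simplifications of the paper's descriptive-complexity criterion:

```python
from collections import deque

# Four-rooms grid: 'S' start, 'G' goal, 'd' doorways, '#' walls.
layout = [
    "#########",
    "#S..#...#",
    "#...d...#",
    "#...#...#",
    "##d###d##",
    "#...#...#",
    "#...d..G#",
    "#...#...#",
    "#########",
]

def bfs(start):
    """Shortest grid distances from `start`, treating '#' as impassable."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            step = (r + dr, c + dc)
            if layout[step[0]][step[1]] != "#" and step not in dist:
                dist[step] = dist[(r, c)] + 1
                queue.append(step)
    return dist

cells = [(r, c) for r, row in enumerate(layout)
         for c, ch in enumerate(row) if ch != "#"]
start = next(p for p in cells if layout[p[0]][p[1]] == "S")
goal = next(p for p in cells if layout[p[0]][p[1]] == "G")
from_start, from_goal = bfs(start), bfs(goal)
# Cost of routing the plan through a candidate subgoal.
cost = {p: from_start[p] + from_goal[p] for p in cells}
doors = [p for p in cells if layout[p[0]][p[1]] == "d"]
print(min(cost.values()), [cost[d] for d in doors])
```

Every doorway achieves the minimum possible cost (the start-goal distance), so routing plans through doorways loses nothing while splitting the problem into two smaller rooms, which is the compression intuition the paper formalizes.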
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tobias, Benjamin John; Palaniyappan, Sasikumar; Gautier, Donald Cort
Images of the R2DTO resolution target were obtained during laser-driven-radiography experiments performed at the TRIDENT laser facility, and analysis of these images using the Bayesian Inference Engine (BIE) determines a most probable full-width half maximum (FWHM) spot size of 78 μm. However, significant uncertainty prevails due to variation in the measured detector blur. Propagating this uncertainty in detector blur through the forward model results in an interval of probabilistic ambiguity spanning approximately 35-195 μm when the laser energy impinges on a thick (1 mm) tantalum target. In other phases of the experiment, laser energy is deposited on a thin (~100 nm) aluminum target placed 250 μm ahead of the tantalum converter. When the energetic electron beam is generated in this manner, upstream from the bremsstrahlung converter, the inferred spot size shifts to a range of much larger values, approximately 270-600 μm FWHM. This report discusses methods applied to obtain these intervals as well as concepts necessary for interpreting the result within a context of probabilistic quantitative inference.
2018-02-15
address the problem that probabilistic inference algorithms are difficult and tedious to implement, by expressing them in terms of a small number of ... building blocks, which are automatic transformations on probabilistic programs. On one hand, our curation of these building blocks reflects the way human ... reasoning with low-level computational optimization, so the speed and accuracy of the generated solvers are competitive with state-of-the-art systems.
Probabilistic generation of random networks taking into account information on motifs occurrence.
Bois, Frederic Y; Gayraud, Ghislaine
2015-01-01
Because of the huge number of graphs possible even with a small number of nodes, inference on network structure is known to be a challenging problem. Generating large random directed graphs with prescribed probabilities of occurrences of some meaningful patterns (motifs) is also difficult. We show how to generate such random graphs according to a formal probabilistic representation, using fast Markov chain Monte Carlo methods to sample them. As an illustration, we generate realistic graphs with several hundred nodes mimicking a gene transcription interaction network in Escherichia coli.
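The abstract's idea of sampling graphs from a formal probability distribution via MCMC can be sketched with a toy Metropolis chain. The target density, the choice of reciprocal edge pairs as the "motif", and all tuning values are invented for illustration; the paper's construction handles much richer motif constraints:

```python
import math
import random

# Toy Metropolis sampler over directed graphs on 8 nodes, biased toward a
# prescribed count of a simple motif: reciprocal edge pairs (mutual dyads).
random.seed(0)
n, target_recip, strength = 8, 6, 2.0

def recip_pairs(es):
    """Number of mutual dyads i<->j in the edge set."""
    return sum(1 for (i, j) in es if (j, i) in es) // 2

def log_target(es):
    return -strength * abs(recip_pairs(es) - target_recip)

edges = set()
for _ in range(20000):
    i, j = random.randrange(n), random.randrange(n)
    if i == j:
        continue
    proposal = set(edges)
    proposal.symmetric_difference_update({(i, j)})   # toggle one directed edge
    # The toggle proposal is symmetric, so the plain Metropolis rule applies.
    if random.random() < math.exp(min(0.0, log_target(proposal) - log_target(edges))):
        edges = proposal
print(recip_pairs(edges))   # concentrates near the prescribed motif count
```

After mixing, the chain visits graphs whose motif count hovers around the prescribed value, the same accept/reject logic used at scale in the paper's E. coli-sized examples.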
Dorrough, Angela R; Glöckner, Andreas; Betsch, Tilmann; Wille, Anika
2017-10-01
To make decisions in probabilistic inference tasks, individuals integrate relevant information partly in an automatic manner. Thereby, potentially irrelevant stimuli that are additionally presented can intrude on the decision process (e.g., Söllner, Bröder, Glöckner, & Betsch, 2014). We investigate whether such an intrusion effect can also be caused by potentially irrelevant or even misleading knowledge activated from memory. In four studies that combine a standard information board paradigm from decision research with a standard manipulation from social psychology, we investigate the case of stereotypes and demonstrate that stereotype knowledge can yield intrusion biases in probabilistic inferences from description. The magnitude of these biases increases with stereotype accessibility and decreases with a clarification of the rational solution. Copyright © 2017 Elsevier B.V. All rights reserved.
A prior-based integrative framework for functional transcriptional regulatory network inference
Siahpirani, Alireza F.
2017-01-01
Abstract Transcriptional regulatory networks specify regulatory proteins controlling the context-specific expression levels of genes. Inference of genome-wide regulatory networks is central to understanding gene regulation, but remains an open challenge. Expression-based network inference is among the most popular methods to infer regulatory networks, however, networks inferred from such methods have low overlap with experimentally derived (e.g. ChIP-chip and transcription factor (TF) knockouts) networks. Currently we have a limited understanding of this discrepancy. To address this gap, we first develop a regulatory network inference algorithm, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge. Second, we comprehensively analyze our and other state-of-the-art methods on different expression perturbation datasets. Networks inferred by integrating sequence-specific motifs with expression have substantially greater agreement with experimentally derived networks, while remaining more predictive of expression than motif-based networks. Our analysis suggests natural genetic variation as the most informative perturbation for network inference, and, identifies core TFs whose targets are predictable from expression. Multiple reasons make the identification of targets of other TFs difficult, including network architecture and insufficient variation of TF mRNA level. Finally, we demonstrate the utility of our inference algorithm to infer stress-specific regulatory networks and for regulator prioritization. PMID:27794550
High Resolution Soil Water from Regional Databases and Satellite Images
NASA Technical Reports Server (NTRS)
Morris, Robin D.; Smelyanskly, Vadim N.; Coughlin, Joseph; Dungan, Jennifer; Clancy, Daniel (Technical Monitor)
2002-01-01
This viewgraph presentation provides information on the ways in which plant growth can be inferred from satellite data and can then be used to infer soil water. There are several steps in this process, the first of which is the acquisition of data from satellite observations and relevant information databases such as the State Soil Geographic Database (STATSGO). Then probabilistic analysis and inversion with the Bayes' theorem reveals sources of uncertainty. The Markov chain Monte Carlo method is also used.
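The Bayes-plus-MCMC inversion step mentioned here can be illustrated with a minimal random-walk Metropolis sampler. The scalar "soil moisture" parameter, the Gaussian observation model, the uniform prior, and all numbers are invented for the sketch and are not part of the presentation's actual pipeline:

```python
import math
import random

# Toy Bayesian inversion: recover a scalar parameter from noisy observations
# with a random-walk Metropolis sampler.
random.seed(1)
true_theta = 0.3
obs = [true_theta + random.gauss(0, 0.05) for _ in range(50)]

def log_post(theta):
    if not 0.0 <= theta <= 1.0:          # uniform prior on [0, 1]
        return -math.inf
    return sum(-0.5 * ((y - theta) / 0.05) ** 2 for y in obs)

theta, samples = 0.5, []
for step in range(5000):
    cand = theta + random.gauss(0, 0.02)
    if math.log(random.random()) < log_post(cand) - log_post(theta):
        theta = cand
    if step >= 1000:                     # discard burn-in
        samples.append(theta)
mean = sum(samples) / len(samples)
print(round(mean, 3))                    # close to the true value 0.3
```

The spread of the retained samples, not just their mean, is the point of the approach: it quantifies exactly the "sources of uncertainty" the presentation refers to.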
Improved probabilistic inference as a general learning mechanism with action video games.
Green, C Shawn; Pouget, Alexandre; Bavelier, Daphne
2010-09-14
Action video game play benefits performance in an array of sensory, perceptual, and attentional tasks that go well beyond the specifics of game play [1-9]. That a training regimen may induce improvements in so many different skills is notable because the majority of studies on training-induced learning report improvements on the trained task but limited transfer to other, even closely related, tasks ([10], but see also [11-13]). Here we ask whether improved probabilistic inference may explain such broad transfer. By using a visual perceptual decision making task [14, 15], the present study shows for the first time that action video game experience does indeed improve probabilistic inference. A neural model of this task [16] establishes how changing a single parameter, namely the strength of the connections between the neural layer providing the momentary evidence and the layer integrating the evidence over time, captures improvements in action-gamers behavior. These results were established in a visual, but also in a novel auditory, task, indicating generalization across modalities. Thus, improved probabilistic inference provides a general mechanism for why action video game playing enhances performance in a wide variety of tasks. In addition, this mechanism may serve as a signature of training regimens that are likely to produce transfer of learning. Copyright © 2010 Elsevier Ltd. All rights reserved.
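The single-parameter account in this abstract, stronger connections between the evidence layer and the integration layer, can be caricatured with a simple accumulator. The gain, noise levels, and trial counts below are illustrative assumptions, not the parameters of the cited neural model:

```python
import random

# Toy two-alternative accumulator: momentary evidence is scaled by a
# feedforward connection strength (gain) before being integrated with
# internal noise. Raising the gain improves the signal relative to the
# integration noise, as hypothesized for action gamers.
random.seed(2)

def accuracy(gain, trials=3000, steps=40, signal=0.05):
    correct = 0
    for _ in range(trials):
        x = 0.0
        for _ in range(steps):
            evidence = signal + random.gauss(0, 0.3)      # noisy sensory input
            x += gain * evidence + random.gauss(0, 0.3)   # integration noise
        correct += x > 0
    return correct / trials

low, high = accuracy(gain=0.5), accuracy(gain=1.5)
print(low, high)
```

Because only the input is amplified while the integration noise stays fixed, a higher gain yields higher accuracy at the same number of evidence samples, mirroring the across-task transfer the study reports.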
Acoustic emission based damage localization in composites structures using Bayesian identification
NASA Astrophysics Data System (ADS)
Kundu, A.; Eaton, M. J.; Al-Jumali, S.; Sikdar, S.; Pullin, R.
2017-05-01
Acoustic emission based damage detection in composite structures is based on detection of ultra high frequency packets of acoustic waves emitted from damage sources (such as fibre breakage, fatigue fracture, amongst others) with a network of distributed sensors. This non-destructive monitoring scheme requires solving an inverse problem where the measured signals are linked back to the location of the source. This in turn enables rapid deployment of mitigative measures. The presence of significant amount of uncertainty associated with the operating conditions and measurements makes the problem of damage identification quite challenging. The uncertainties stem from the fact that the measured signals are affected by the irregular geometries, manufacturing imprecision, imperfect boundary conditions, existing damages/structural degradation, amongst others. This work aims to tackle these uncertainties within a framework of automated probabilistic damage detection. The method trains a probabilistic model of the parametrized input and output model of the acoustic emission system with experimental data to give probabilistic descriptors of damage locations. A response surface modelling the acoustic emission as a function of parametrized damage signals collected from sensors would be calibrated with a training dataset using Bayesian inference. This is used to deduce damage locations in the online monitoring phase. During online monitoring, the spatially correlated time data is utilized in conjunction with the calibrated acoustic emissions model to infer the probabilistic description of the acoustic emission source within a hierarchical Bayesian inference framework. The methodology is tested on a composite structure consisting of carbon fibre panel with stiffeners and damage source behaviour has been experimentally simulated using standard H-N sources. 
The methodology presented in this study would be applicable in the current form to structural damage detection under varying operational loads and would be investigated in future studies.
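The inverse problem described above, linking measured signals back to a source location, can be sketched as grid-based Bayesian localization from arrival times. The sensor geometry, wave speed, and noise scale are invented, and the study's hierarchical model with a calibrated response surface is far richer than this sketch:

```python
import math

# Grid-based Bayesian localization of an acoustic-emission source from
# arrival times at four corner sensors on a unit panel.
sensors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
speed, sigma = 5.0, 0.002          # wave speed (m/ms) and timing noise (ms)
source = (0.62, 0.37)              # "true" source used to simulate arrivals
arrivals = [math.dist(source, s) / speed for s in sensors]

best, best_lp = None, -math.inf
for i in range(101):               # evaluate the log posterior on a 101x101 grid
    for j in range(101):
        p = (i / 100.0, j / 100.0)
        lp = sum(-0.5 * ((t - math.dist(p, s) / speed) / sigma) ** 2
                 for t, s in zip(arrivals, sensors))
        if lp > best_lp:
            best, best_lp = p, lp
print(best)   # maximum a posteriori grid cell, at the true source
```

Normalizing the exponentiated grid values instead of taking the argmax yields the full probabilistic descriptor of the source location that the methodology produces during online monitoring.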
Cartwright, Reed A; Hussin, Julie; Keebler, Jonathan E M; Stone, Eric A; Awadalla, Philip
2012-01-06
Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
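The core computation, a posterior probability of de novo mutation from pedigree-aware genotype inference, can be sketched for a single trio at one biallelic site. The flat genotype prior, the mutation rate, and the genotype likelihoods are illustrative stand-ins for the full pedigree model in the abstract:

```python
# Toy trio model for flagging a de novo mutation at one biallelic site.
# Genotypes are alt-allele counts (0, 1, 2).
def transmit(parent):
    """P(transmitted allele is the alternate allele | parent genotype)."""
    return {0: 0.0, 1: 0.5, 2: 1.0}[parent]

def mendel(mom, dad):
    """P(child genotype | parent genotypes) under error-free inheritance."""
    pm, pd = transmit(mom), transmit(dad)
    return [(1 - pm) * (1 - pd), pm * (1 - pd) + (1 - pm) * pd, pm * pd]

def p_de_novo(lik_mom, lik_dad, lik_child, mu=1e-8):
    """Posterior probability that the child's genotype arose by mutation."""
    num = den = 0.0
    for m in range(3):
        for d in range(3):
            prior_md = lik_mom[m] * lik_dad[d] / 9.0   # flat prior over genotypes
            inherit = mendel(m, d)
            for c in range(3):
                p_c = (1 - mu) * inherit[c] + mu / 3.0  # inheritance or mutation
                den += prior_md * p_c * lik_child[c]
                num += prior_md * (mu / 3.0) * lik_child[c]
    return num / den

certain_ref = [1.0, 1e-12, 1e-12]   # confidently homozygous reference
certain_het = [1e-12, 1.0, 1e-12]   # confidently heterozygous
print(p_de_novo(certain_ref, certain_ref, certain_het))  # Mendelian violation: ~1
print(p_de_novo(certain_ref, certain_ref, certain_ref))  # consistent trio: ~0
```

Note how the answer hinges on the likelihoods: with less confident genotype calls, sequencing error becomes the better explanation than a 1e-8 mutation, which is exactly why the paper models error and mutation jointly.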
NASA Technical Reports Server (NTRS)
Warner, James E.; Zubair, Mohammad; Ranjan, Desh
2017-01-01
This work investigates novel approaches to probabilistic damage diagnosis that utilize surrogate modeling and high performance computing (HPC) to achieve substantial computational speedup. Motivated by Digital Twin, a structural health management (SHM) paradigm that integrates vehicle-specific characteristics with continual in-situ damage diagnosis and prognosis, the methods studied herein yield near real-time damage assessments that could enable monitoring of a vehicle's health while it is operating (i.e. online SHM). High-fidelity modeling and uncertainty quantification (UQ), both critical to Digital Twin, are incorporated using finite element method simulations and Bayesian inference, respectively. The crux of the proposed Bayesian diagnosis methods, however, is the reformulation of the numerical sampling algorithms (e.g. Markov chain Monte Carlo) used to generate the resulting probabilistic damage estimates. To this end, three distinct methods are demonstrated for rapid sampling that utilize surrogate modeling and exploit various degrees of parallelism for leveraging HPC. The accuracy and computational efficiency of the methods are compared on the problem of strain-based crack identification in thin plates. While each approach has inherent problem-specific strengths and weaknesses, all approaches are shown to provide accurate probabilistic damage diagnoses and several orders of magnitude computational speedup relative to a baseline Bayesian diagnosis implementation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brewer, Brendon J.; Foreman-Mackey, Daniel; Hogg, David W., E-mail: bj.brewer@auckland.ac.nz
We present and implement a probabilistic (Bayesian) method for producing catalogs from images of stellar fields. The method is capable of inferring the number of sources N in the image and can also handle the challenges introduced by noise, overlapping sources, and an unknown point-spread function. The luminosity function of the stars can also be inferred, even when the precise luminosity of each star is uncertain, via the use of a hierarchical Bayesian model. The computational feasibility of the method is demonstrated on two simulated images with different numbers of stars. We find that our method successfully recovers the input parameter values along with principled uncertainties even when the field is crowded. We also compare our results with those obtained from the SExtractor software. While the two approaches largely agree about the fluxes of the bright stars, the Bayesian approach provides more accurate inferences about the faint stars and the number of stars, particularly in the crowded case.
NASA Astrophysics Data System (ADS)
Gromek, Katherine Emily
A novel computational and inference framework of the physics-of-failure (PoF) reliability modeling for complex dynamic systems has been established in this research. The PoF-based reliability models are used to perform a real-time simulation of system failure processes, so that system-level reliability modeling constitutes inferences from checking the status of component-level reliability at any given time. The "agent autonomy" concept is applied as a solution method for the system-level probabilistic PoF-based (i.e. PPoF-based) modeling. This concept originated in artificial intelligence (AI) as a leading computational approach to modeling multi-agent systems (MAS). The concept of agent autonomy in the context of reliability modeling was first proposed by M. Azarkhail [1], where a fundamentally new idea of system representation by autonomous intelligent agents for the purpose of reliability modeling was introduced. The contribution of the current work lies in the further development of the agent autonomy concept, particularly the refined agent classification within the scope of PoF-based system reliability modeling, new approaches to the learning and autonomy properties of the intelligent agents, and modeling of interacting failure mechanisms within the dynamic engineering system. The autonomous property of intelligent agents is defined as an agent's ability to self-activate, deactivate, or completely redefine its role in the analysis. This property of agents, together with the ability to model interacting failure mechanisms of system elements, makes the agent autonomy approach fundamentally different from all existing methods of probabilistic PoF-based reliability modeling. 1. Azarkhail, M., "Agent Autonomy Approach to Physics-Based Reliability Modeling of Structures and Mechanical Systems", PhD thesis, University of Maryland, College Park, 2007.
Bapst, D W; Wright, A M; Matzke, N J; Lloyd, G T
2016-07-01
Dated phylogenies of fossil taxa allow palaeobiologists to estimate the timing of major divergences and placement of extinct lineages, and to test macroevolutionary hypotheses. Recently developed Bayesian 'tip-dating' methods simultaneously infer and date the branching relationships among fossil taxa, and infer putative ancestral relationships. Using a previously published dataset for extinct theropod dinosaurs, we contrast the dated relationships inferred by several tip-dating approaches and evaluate potential downstream effects on phylogenetic comparative methods. We also compare tip-dating analyses to maximum-parsimony trees time-scaled via alternative a posteriori approaches including via the probabilistic cal3 method. Among tip-dating analyses, we find opposing but strongly supported relationships, despite similarity in inferred ancestors. Overall, tip-dating methods infer divergence dates often millions (or tens of millions) of years older than the earliest stratigraphic appearance of that clade. Model-comparison analyses of the pattern of body-size evolution found that the support for evolutionary mode can vary across and between tree samples from cal3 and tip-dating approaches. These differences suggest that model and software choice in dating analyses can have a substantial impact on the dated phylogenies obtained and broader evolutionary inferences. © 2016 The Author(s).
Probing the Small-scale Structure in Strongly Lensed Systems via Transdimensional Inference
NASA Astrophysics Data System (ADS)
Daylan, Tansu; Cyr-Racine, Francis-Yan; Diaz Rivero, Ana; Dvorkin, Cora; Finkbeiner, Douglas P.
2018-02-01
Strong lensing is a sensitive probe of the small-scale density fluctuations in the Universe. We implement a pipeline to model strongly lensed systems using probabilistic cataloging, which is a transdimensional, hierarchical, and Bayesian framework to sample from a metamodel (union of models with different dimensionality) consistent with observed photon count maps. Probabilistic cataloging allows one to robustly characterize modeling covariances within and across lens models with different numbers of subhalos. Unlike traditional cataloging of subhalos, it does not require model subhalos to improve the goodness of fit above the detection threshold. Instead, it allows the exploitation of all information contained in the photon count maps—for instance, when constraining the subhalo mass function. We further show that, by not including these small subhalos in the lens model, fixed-dimensional inference methods can significantly mismodel the data. Using a simulated Hubble Space Telescope data set, we show that the subhalo mass function can be probed even when many subhalos in the sample catalogs are individually below the detection threshold and would be absent in a traditional catalog. The implemented software, Probabilistic Cataloger (PCAT) is made publicly available at https://github.com/tdaylan/pcat.
Modality, probability, and mental models.
Hinterecker, Thomas; Knauff, Markus; Johnson-Laird, P N
2016-10-01
We report 3 experiments investigating novel sorts of inference, such as: A or B or both. Therefore, possibly (A and B). Where the contents were sensible assertions, for example, Space tourism will achieve widespread popularity in the next 50 years or advances in material science will lead to the development of antigravity materials in the next 50 years, or both. Most participants accepted the inferences as valid, though they are invalid in modal logic and in probabilistic logic too. But the theory of mental models predicts that individuals should accept them. In contrast, inferences of this sort—A or B but not both. Therefore, A or B or both—are both logically valid and probabilistically valid. Yet, as the model theory also predicts, most reasoners rejected them. The participants' estimates of probabilities showed that their inferences tended not to be based on probabilistic validity, but that they did rate acceptable conclusions as more probable than unacceptable conclusions. We discuss the implications of the results for current theories of reasoning. PsycINFO Database Record (c) 2016 APA, all rights reserved
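The two inference patterns in this abstract can be checked by brute force over truth-value assignments, reading "possible" the way mental model theory does (the conclusion holds in at least one model of the premise). This is a crude propositional rendering, not a full modal-logic treatment:

```python
from itertools import product

rows = list(product([True, False], repeat=2))
inclusive = [(a, b) for a, b in rows if a or b]   # models of "A or B or both"
exclusive = [(a, b) for a, b in rows if a != b]   # models of "A or B, not both"

# Pattern 1: "A or B or both; therefore possibly (A and B)". The conclusion
# holds in one model of the premise, so model theory predicts acceptance
# (modal logic does not license it, since the premise leaves open that
# A and B exclude each other).
model_possible = any(a and b for a, b in inclusive)

# Pattern 2: "A or B but not both; therefore A or B or both". Logically
# valid: every model of the premise satisfies the conclusion. Yet reasoners
# tend to reject it, as model theory also predicts.
logically_valid = all(a or b for a, b in exclusive)
print(model_possible, logically_valid)
```

The dissociation (accepting the invalid pattern, rejecting the valid one) is what the paper takes as evidence that reasoners consult models rather than validity.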
Ambiguity and Uncertainty in Probabilistic Inference.
1983-09-01
Only fragments of the scanned report abstract survive (AD-A133 418, "Ambiguity and Uncertainty in Probabilistic Inference," H. J. Einhorn et al., Center for Decision Research, University of Chicago, September 1983): "…whether one was to judge the likelihood that the majority or minority position was true. In order to sample a wide range of values of n and p, 40… been demonstrated experimentally (Becker & Brownson, 1964; Yates & Zukowski, 1976). On the other hand, the process by which such second-order uncertainty…"
Encoding probabilistic brain atlases using Bayesian inference.
Van Leemput, Koen
2009-06-01
This paper addresses the problem of creating probabilistic brain atlases from manually labeled training data. Probabilistic atlases are typically constructed by counting the relative frequency of occurrence of labels in corresponding locations across the training images. However, such an "averaging" approach generalizes poorly to unseen cases when the number of training images is limited, and provides no principled way of aligning the training datasets using deformable registration. In this paper, we generalize the generative image model implicitly underlying standard "average" atlases, using mesh-based representations endowed with an explicit deformation model. Bayesian inference is used to infer the optimal model parameters from the training data, leading to a simultaneous group-wise registration and atlas estimation scheme that encompasses standard averaging as a special case. We also use Bayesian inference to compare alternative atlas models in light of the training data, and show how this leads to a data compression problem that is intuitive to interpret and computationally feasible. Using this technique, we automatically determine the optimal amount of spatial blurring, the best deformation field flexibility, and the most compact mesh representation. We demonstrate, using 2-D training datasets, that the resulting models are better at capturing the structure in the training data than conventional probabilistic atlases. We also present experiments of the proposed atlas construction technique in 3-D, and show the resulting atlases' potential in fully-automated, pulse sequence-adaptive segmentation of 36 neuroanatomical structures in brain MRI scans.
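The "averaging" construction that the paper generalizes can be sketched in a few lines. This is a minimal illustration, assuming hypothetical toy label maps and label counts; the paper's actual contribution (mesh-based deformable models selected by Bayesian model comparison) goes well beyond this baseline.

```python
import numpy as np

def average_atlas(label_maps, num_labels):
    """Build a conventional probabilistic atlas by counting, at each voxel,
    the relative frequency of each label across the training label maps."""
    maps = np.stack(label_maps)                 # shape: (n_subjects, *spatial)
    counts = np.stack([(maps == k).sum(axis=0)  # occurrences of label k per voxel
                       for k in range(num_labels)], axis=-1)
    return counts / maps.shape[0]               # per-voxel label probabilities

# Toy example: three 2x2 "segmentations" with labels {0, 1}.
atlas = average_atlas([np.array([[0, 1], [1, 1]]),
                       np.array([[0, 1], [0, 1]]),
                       np.array([[0, 0], [1, 1]])], num_labels=2)
```

As the abstract notes, such frequency counting generalizes poorly with few training images, which motivates the generative, deformation-aware model proposed in the paper.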
Source Detection with Bayesian Inference on ROSAT All-Sky Survey Data Sample
NASA Astrophysics Data System (ADS)
Guglielmetti, F.; Voges, W.; Fischer, R.; Boese, G.; Dose, V.
2004-07-01
We employ Bayesian inference for the joint estimation of sources and background on ROSAT All-Sky Survey (RASS) data. The probabilistic method allows for detection improvement of faint extended celestial sources compared to the Standard Analysis Software System (SASS). Background maps were estimated in a single step together with the detection of sources without pixel censoring. Consistent uncertainties of background and sources are provided. The source probability is evaluated for single pixels as well as for pixel domains to enhance source detection of weak and extended sources.
Data free inference with processed data products
Chowdhary, K.; Najm, H. N.
2014-07-12
Here, we consider the context of probabilistic inference of model parameters given error bars or confidence intervals on model output values, when the raw data are unavailable. We introduce a class of algorithms in a Bayesian framework, relying on maximum entropy arguments and approximate Bayesian computation methods, to generate data sets consistent with the given summary statistics. Once we obtain consistent data sets, we pool the respective posteriors to arrive at a single, averaged density on the parameters. This approach allows us to perform accurate forward uncertainty propagation consistent with the reported statistics.
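A toy illustration of the idea (not the authors' implementation): suppose only a sample mean and standard error are reported, under an assumed Gaussian model with known unit noise. One can generate data sets that reproduce those statistics exactly, compute each posterior, and pool the results. All numbers below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reported summary: mean 2.0, standard error 0.5, from n = 4
# observations of a Gaussian with known noise sd = 1 (so sd = se * sqrt(n)).
reported_mean, reported_se, n = 2.0, 0.5, 4

def consistent_dataset():
    """Draw a synthetic data set whose mean and standard error match
    the reported summary statistics exactly."""
    x = rng.normal(size=n)
    x = (x - x.mean()) / x.std(ddof=1)          # rescale to mean 0, sample sd 1
    return reported_mean + x * reported_se * np.sqrt(n)

def posterior_mean(data, prior_sd=10.0):
    """Posterior mean for the unknown mean under a conjugate N(0, prior_sd^2) prior."""
    prec = 1.0 / prior_sd**2 + len(data)        # posterior precision (noise sd = 1)
    return data.sum() / prec

# Pool posteriors over many consistent data sets (here: average their means).
pooled = np.mean([posterior_mean(consistent_dataset()) for _ in range(200)])
```

In this conjugate toy case every consistent data set yields the same posterior mean; with richer models and statistics, the pooled density genuinely mixes over the unknown data.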
Galtier, N; Boursot, P
2000-03-01
A new, model-based method was devised to locate nucleotide changes in a given phylogenetic tree. For each site, the posterior probability of any possible change in each branch of the tree is computed. This probabilistic method is a valuable alternative to the maximum parsimony method when base composition is skewed (i.e., different from 25% A, 25% C, 25% G, 25% T): computer simulations showed that parsimony misses more rare --> common than common --> rare changes, resulting in biased inferred change matrices, whereas the new method appeared unbiased. The probabilistic method was applied to the analysis of the mutation and substitution processes in the mitochondrial control region of mouse. Distinct change patterns were found at the polymorphism (within species) and divergence (between species) levels, rejecting the hypothesis of a neutral evolution of base composition in mitochondrial DNA.
Network inference using informative priors.
Mukherjee, Sach; Speed, Terence P
2008-09-23
Recent years have seen much interest in the study of systems characterized by multiple interacting components. A class of statistical models called graphical models, in which graphs are used to represent probabilistic relationships between variables, provides a framework for formal inference regarding such systems. In many settings, the object of inference is the network structure itself. This problem of "network inference" is well known to be a challenging one. However, in scientific settings there is very often existing information regarding network connectivity. A natural idea then is to take account of such information during inference. This article addresses the question of incorporating prior information into network inference. We focus on directed models called Bayesian networks, and use Markov chain Monte Carlo to draw samples from posterior distributions over network structures. We introduce prior distributions on graphs capable of capturing information regarding network features including edges, classes of edges, degree distributions, and sparsity. We illustrate our approach in the context of systems biology, applying our methods to network inference in cancer signaling.
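A minimal sketch of one such prior, a concordance-style prior that rewards edges agreeing with a believed network. The lam parameter, the single-edge-flip proposal, and the prior network below are illustrative assumptions; a real sampler would also score the data likelihood and enforce acyclicity.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_prior(adj, prior_adj, lam=2.0):
    """Informative structure prior: reward edges that agree with prior
    knowledge and penalize edges that conflict with it."""
    agree = np.sum((adj == 1) & (prior_adj == 1))
    conflict = np.sum((adj == 1) & (prior_adj == 0))
    return lam * (agree - conflict)

def mh_structure_sampler(prior_adj, steps=5000, lam=2.0):
    """Metropolis-Hastings over graph structures via single-edge flips.
    (Prior-only for brevity: a real run adds the marginal likelihood term.)"""
    adj = np.zeros_like(prior_adj)
    for _ in range(steps):
        i, j = rng.integers(0, adj.shape[0], size=2)
        if i == j:
            continue
        prop = adj.copy()
        prop[i, j] ^= 1                          # flip edge i -> j
        if np.log(rng.random()) < log_prior(prop, prior_adj, lam) - log_prior(adj, prior_adj, lam):
            adj = prop
    return adj

prior_net = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])  # believed edges
sample = mh_structure_sampler(prior_net)
```

The same scheme extends to priors on classes of edges, degree distributions, and sparsity by adding further terms to `log_prior`.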
Perceptual learning as improved probabilistic inference in early sensory areas.
Bejjanki, Vikranth R; Beck, Jeffrey M; Lu, Zhong-Lin; Pouget, Alexandre
2011-05-01
Extensive training on simple tasks such as fine orientation discrimination results in large improvements in performance, a form of learning known as perceptual learning. Previous models have argued that perceptual learning is due to either sharpening and amplification of tuning curves in early visual areas or to improved probabilistic inference in later visual areas (at the decision stage). However, early theories are inconsistent with the conclusions of psychophysical experiments manipulating external noise, whereas late theories cannot explain the changes in neural responses that have been reported in cortical areas V1 and V4. Here we show that we can capture both the neurophysiological and behavioral aspects of perceptual learning by altering only the feedforward connectivity in a recurrent network of spiking neurons so as to improve probabilistic inference in early visual areas. The resulting network shows modest changes in tuning curves, in line with neurophysiological reports, along with a marked reduction in the amplitude of pairwise noise correlations.
Damage Tolerance and Reliability of Turbine Engine Components
NASA Technical Reports Server (NTRS)
Chamis, Christos C.
1999-01-01
This report describes a formal method to quantify structural damage tolerance and reliability in the presence of a multitude of uncertainties in turbine engine components. The method is based at the material behavior level where primitive variables with their respective scatter ranges are used to describe behavior. Computational simulation is then used to propagate the uncertainties to the structural scale where damage tolerance and reliability are usually specified. Several sample cases are described to illustrate the effectiveness, versatility, and maturity of the method. Typical results from this method demonstrate that it is mature and that it can be used to probabilistically evaluate turbine engine structural components. It may be inferred from the results that the method is suitable for probabilistically predicting the remaining life in aging or deteriorating structures, for making strategic projections and plans, and for achieving better, cheaper, faster products that give competitive advantages in world markets.
SEX-DETector: A Probabilistic Approach to Study Sex Chromosomes in Non-Model Organisms
Muyle, Aline; Käfer, Jos; Zemp, Niklaus; Mousset, Sylvain; Picard, Franck; Marais, Gabriel AB
2016-01-01
We propose a probabilistic framework to infer autosomal and sex-linked genes from RNA-seq data of a cross for any sex chromosome type (XY, ZW, and UV). Sex chromosomes (especially the non-recombining and repeat-dense Y, W, U, and V) are notoriously difficult to sequence. Strategies have been developed to obtain partially assembled sex chromosome sequences. Most of them remain difficult to apply to numerous non-model organisms, either because they require a reference genome, or because they are designed for evolutionarily old systems. Sequencing a cross (parents and progeny) by RNA-seq to study the segregation of alleles and infer sex-linked genes is a cost-efficient strategy, which also provides expression level estimates. However, the lack of a proper statistical framework has limited a broader application of this approach. Tests on empirical Silene data show that our method identifies 20–35% more sex-linked genes than existing pipelines, while making reliable inferences for downstream analyses. Approximately 12 individuals are needed for optimal results based on simulations. For species with an unknown sex-determination system, the method can assess the presence and type (XY vs. ZW) of sex chromosomes through a model comparison strategy. The method is particularly well optimized for sex chromosomes of young or intermediate age, which are expected in thousands of yet unstudied lineages. Any organisms, including non-model ones for which nothing is known a priori, that can be bred in the lab, are suitable for our method. SEX-DETector and its implementation in a Galaxy workflow are made freely available. PMID:27492231
Data Analysis with Graphical Models: Software Tools
NASA Technical Reports Server (NTRS)
Buntine, Wray L.
1994-01-01
Probabilistic graphical models (directed and undirected Markov fields, and combined in chain graphs) are used widely in expert systems, image processing and other areas as a framework for representing and reasoning with probabilities. They come with corresponding algorithms for performing probabilistic inference. This paper discusses an extension to these models by Spiegelhalter and Gilks, plates, used to graphically model the notion of a sample. This offers a graphical specification language for representing data analysis problems. When combined with general methods for statistical inference, this also offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper outlines the framework and then presents some basic tools for the task: a graphical version of the Pitman-Koopman Theorem for the exponential family, problem decomposition, and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.
NASA Astrophysics Data System (ADS)
Nawaz, Muhammad Atif; Curtis, Andrew
2018-04-01
We introduce a new Bayesian inversion method that estimates the spatial distribution of geological facies from attributes of seismic data, showing how the usual probabilistic inverse problem can be solved within an optimization framework while still providing full probabilistic results. Our mathematical model consists of seismic attributes as observed data, which are assumed to have been generated by the geological facies. The method infers the post-inversion (posterior) probability density of the facies plus some other unknown model parameters, from the seismic attributes and geological prior information. Most previous research in this domain is based on the localized likelihoods assumption, whereby the seismic attributes at a location are assumed to depend on the facies only at that location. Such an assumption is unrealistic because of imperfect seismic data acquisition and processing, and fundamental limitations of seismic imaging methods. In this paper, we relax this assumption: we allow probabilistic dependence between seismic attributes at a location and the facies in any neighbourhood of that location through a spatial filter. We term such likelihoods quasi-localized.
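The relaxation can be illustrated with a hypothetical 1-D toy: attributes are generated from the facies through a spatial filter, so the likelihood at a location involves a neighbourhood of facies values. The filter, facies means, and noise level below are invented for illustration and are not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

facies = np.array([0, 0, 1, 1, 1, 0, 0])      # true facies labels (2 classes)
means = np.array([1.0, 3.0])                   # attribute mean per facies class
w = np.array([0.25, 0.5, 0.25])                # spatial filter (neighbourhood weights)

# Quasi-localized forward model: attribute = filtered facies response + noise.
attr = np.convolve(means[facies], w, mode="same") + 0.1 * rng.normal(size=7)

def loglik(candidate, attr, sigma=0.1):
    """Gaussian log-likelihood of the attribute profile given a facies map,
    with attributes depending on a neighbourhood of facies via the filter."""
    pred = np.convolve(means[candidate], w, mode="same")
    return -0.5 * np.sum(((attr - pred) / sigma) ** 2)

# The true facies map explains the attributes far better than a uniform map.
assert loglik(facies, attr) > loglik(np.zeros(7, dtype=int), attr)
```

A purely localized likelihood would replace the convolution with a pointwise lookup, which is exactly the assumption the paper argues is unrealistic.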
Bishop, Christopher M
2013-02-13
Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications.
Rasmussen, Peter M.; Smith, Amy F.; Sakadžić, Sava; Boas, David A.; Pries, Axel R.; Secomb, Timothy W.; Østergaard, Leif
2017-01-01
Objective In vivo imaging of the microcirculation and network-oriented modeling have emerged as powerful means of studying microvascular function and understanding its physiological significance. Network-oriented modeling may provide the means of summarizing vast amounts of data produced by high-throughput imaging techniques in terms of key, physiological indices. To estimate such indices with sufficient certainty, however, network-oriented analysis must be robust to the inevitable presence of uncertainty due to measurement errors as well as model errors. Methods We propose the Bayesian probabilistic data analysis framework as a means of integrating experimental measurements and network model simulations into a combined and statistically coherent analysis. The framework naturally handles noisy measurements and provides posterior distributions of model parameters as well as physiological indices associated with uncertainty. Results We applied the analysis framework to experimental data from three rat mesentery networks and one mouse brain cortex network. We inferred distributions for more than five hundred unknown pressure and hematocrit boundary conditions. Model predictions were consistent with previous analyses, and remained robust when measurements were omitted from model calibration. Conclusion Our Bayesian probabilistic approach may be suitable for optimizing data acquisition and for analyzing and reporting large datasets acquired as part of microvascular imaging studies. PMID:27987383
Modeling Array Stations in SIG-VISA
NASA Astrophysics Data System (ADS)
Ding, N.; Moore, D.; Russell, S.
2013-12-01
We add support for array stations to SIG-VISA, a system for nuclear monitoring using probabilistic inference on seismic signals. Array stations comprise a large portion of the IMS network; they can provide increased sensitivity and more accurate directional information compared to single-component stations. Our existing model assumed that signals were independent at each station, an assumption that fails when stations are closely spaced, as in an array. The new model removes that assumption by jointly modeling signals across array elements. This is done by extending our existing Gaussian process (GP) regression models, also known as kriging, from a 3-dimensional single-component space of events to a 6-dimensional space of station-event pairs. For each array and each event attribute (including coda decay, coda height, amplitude transfer, and travel time), we model the joint distribution across array elements using a Gaussian process that learns the correlation lengthscale across the array, thereby incorporating information from array stations into the probabilistic inference framework. To evaluate the effectiveness of our model, we perform 'probabilistic beamforming' on new events using our GP model, i.e., we compute the event azimuth having highest posterior probability under the model, conditioned on the signals at array elements. We compare the results from our probabilistic inference model to the beamforming currently performed by IMS station processing.
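The core modeling device, GP regression (kriging) with a learned correlation lengthscale across array elements, can be sketched as follows. The kernel form, element positions, residual values, and lengthscale below are illustrative assumptions, not the SIG-VISA implementation.

```python
import numpy as np

def rbf(a, b, lengthscale):
    """Squared-exponential kernel; the lengthscale controls how strongly
    nearby array elements are correlated."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

# Hypothetical element positions (km) within one array, with an attribute
# value (e.g. a travel-time residual) observed at each element.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y = np.array([0.30, 0.28, 0.31, 0.10])
x_new = np.array([[0.5, 0.5]])               # held-out element location

# GP posterior mean at the held-out element (noise-free, jittered for stability).
K = rbf(X, X, lengthscale=1.5) + 1e-6 * np.eye(len(X))
k_star = rbf(x_new, X, lengthscale=1.5)
pred = k_star @ np.linalg.solve(K, y)
```

In the paper's setting the lengthscale itself is learned per array and per attribute, so the model discovers how far correlations extend across each array.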
Sukumaran, Jeet; Knowles, L Lacey
2018-06-01
The development of process-based probabilistic models for historical biogeography has transformed the field by grounding it in modern statistical hypothesis testing. However, most of these models abstract away biological differences, reducing species to interchangeable lineages. We present here the case for reintegration of biology into probabilistic historical biogeographical models, allowing a broader range of questions about biogeographical processes beyond ancestral range estimation or simple correlation between a trait and a distribution pattern, as well as allowing us to assess how inferences about ancestral ranges themselves might be impacted by differential biological traits. We show how new approaches to inference might cope with the computational challenges resulting from the increased complexity of these trait-based historical biogeographical models.
Probabilistic Low-Rank Multitask Learning.
Kong, Yu; Shao, Ming; Li, Kang; Fu, Yun
2018-03-01
In this paper, we consider the problem of learning multiple related tasks simultaneously with the goal of improving the generalization performance of individual tasks. The key challenge is to effectively exploit the shared information across multiple tasks as well as preserve the discriminative information for each individual task. To address this, we propose a novel probabilistic model for multitask learning (MTL) that can automatically balance between low-rank and sparsity constraints. The former assumes a low-rank structure of the underlying predictive hypothesis space to explicitly capture the relationship of different tasks and the latter learns the incoherent sparse patterns private to each task. We derive and perform inference via variational Bayesian methods. Experimental results on both regression and classification tasks on real-world applications demonstrate the effectiveness of the proposed method in dealing with the MTL problems.
Data Analysis Recipes: Using Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Hogg, David W.; Foreman-Mackey, Daniel
2018-05-01
Markov Chain Monte Carlo (MCMC) methods for sampling probability density functions (combined with abundant computational resources) have transformed the sciences, especially in performing probabilistic inferences, or fitting models to data. In this primarily pedagogical contribution, we give a brief overview of the most basic MCMC method and some practical advice for the use of MCMC in real inference problems. We give advice on method choice, tuning for performance, methods for initialization, tests of convergence, troubleshooting, and use of the chain output to produce or report parameter estimates with associated uncertainties. We argue that autocorrelation time is the most important test for convergence, as it directly connects to the uncertainty on the sampling estimate of any quantity of interest. We emphasize that sampling is a method for doing integrals; this guides our thinking about how MCMC output is best used.
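A minimal sketch of the workflow the abstract describes, run on a standard-normal target: sample with the basic Metropolis method, then estimate the integrated autocorrelation time to judge convergence and effective sample size. The proposal scale and windowing constant are illustrative choices, not the paper's recommendations.

```python
import numpy as np

rng = np.random.default_rng(42)

def metropolis(logp, x0, step, n):
    """Simplest Metropolis sampler with a symmetric Gaussian proposal."""
    chain = np.empty(n)
    x, lp = x0, logp(x0)
    for i in range(n):
        xp = x + step * rng.normal()
        lpp = logp(xp)
        if np.log(rng.random()) < lpp - lp:   # Metropolis accept/reject
            x, lp = xp, lpp
        chain[i] = x
    return chain

def autocorr_time(chain, c=5.0):
    """Integrated autocorrelation time with a self-consistent window cut-off."""
    x = chain - chain.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf /= acf[0]
    tau = 1.0
    for m in range(1, len(acf)):
        tau += 2.0 * acf[m]
        if m >= c * tau:                      # stop once the window is wide enough
            break
    return tau

chain = metropolis(lambda x: -0.5 * x * x, x0=0.0, step=2.4, n=10000)
tau = autocorr_time(chain)
print(f"tau ~ {tau:.1f}, effective sample size ~ {len(chain) / tau:.0f}")
```

Dividing the chain length by tau gives the effective number of independent samples, which is what controls the uncertainty of any sampling estimate.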
Balkanization and Unification of Probabilistic Inferences
ERIC Educational Resources Information Center
Yu, Chong-Ho
2005-01-01
Many research-related classes in social sciences present probability as a unified approach based upon mathematical axioms, but neglect the diversity of various probability theories and their associated philosophical assumptions. Although currently the dominant statistical and probabilistic approach is the Fisherian tradition, the use of Fisherian…
The cerebellum and decision making under uncertainty.
Blackwood, Nigel; Ffytche, Dominic; Simmons, Andrew; Bentall, Richard; Murray, Robin; Howard, Robert
2004-06-01
This study aimed to identify the neural basis of probabilistic reasoning, a type of inductive inference that aids decision making under conditions of uncertainty. Eight normal subjects performed two separate two-alternative-choice tasks (the balls in a bottle and personality survey tasks) while undergoing functional magnetic resonance imaging (fMRI). The experimental conditions within each task were chosen so that they differed only in their requirement to make a decision under conditions of uncertainty (probabilistic reasoning and frequency determination required) or under conditions of certainty (frequency determination required). The same visual stimuli and motor responses were used in the experimental conditions. We provide evidence that the neo-cerebellum, in conjunction with the premotor cortex, inferior parietal lobule and medial occipital cortex, mediates the probabilistic inferences that guide decision making under uncertainty. We hypothesise that the neo-cerebellum constructs internal working models of uncertain events in the external world, and that such probabilistic models subserve the predictive capacity central to induction.
Zonta, Zivko J; Flotats, Xavier; Magrí, Albert
2014-08-01
The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.
Remembrance of inferences past: Amortization in human hypothesis generation.
Dasgupta, Ishita; Schulz, Eric; Goodman, Noah D; Gershman, Samuel J
2018-05-21
Bayesian models of cognition assume that people compute probability distributions over hypotheses. However, the required computations are frequently intractable or prohibitively expensive. Since people often encounter many closely related distributions, selective reuse of computations (amortized inference) is a computationally efficient use of the brain's limited resources. We present three experiments that provide evidence for amortization in human probabilistic reasoning. When sequentially answering two related queries about natural scenes, participants' responses to the second query systematically depend on the structure of the first query. This influence is sensitive to the content of the queries, only appearing when the queries are related. Using a cognitive load manipulation, we find evidence that people amortize summary statistics of previous inferences, rather than storing the entire distribution. These findings support the view that the brain trades off accuracy and computational cost, to make efficient use of its limited cognitive resources to approximate probabilistic inference.
Don't Fear Optimality: Sampling for Probabilistic-Logic Sequence Models
NASA Astrophysics Data System (ADS)
Thon, Ingo
One of the current challenges in artificial intelligence is modeling dynamic environments that change due to the actions or activities undertaken by people or agents. The task of inferring hidden states, e.g. the activities or intentions of people, based on observations is called filtering. Standard probabilistic models such as Dynamic Bayesian Networks are able to solve this task efficiently using approximate methods such as particle filters. However, these models do not support logical or relational representations. The key contribution of this paper is the upgrade of a particle filter algorithm for use with a probabilistic logical representation through the definition of a proposal distribution. The performance of the algorithm depends largely on how well this distribution fits the target distribution. We adopt the idea of logical compilation into Binary Decision Diagrams for sampling. This allows us to use the optimal proposal distribution, which is normally prohibitively slow.
Probabilistic graphlet transfer for photo cropping.
Zhang, Luming; Song, Mingli; Zhao, Qi; Liu, Xiao; Bu, Jiajun; Chen, Chun
2013-02-01
As one of the most basic photo manipulation processes, photo cropping is widely used in the printing, graphic design, and photography industries. In this paper, we introduce graphlets (i.e., small connected subgraphs) to represent a photo's aesthetic features, and propose a probabilistic model to transfer aesthetic features from the training photo onto the cropped photo. In particular, by segmenting each photo into a set of regions, we construct a region adjacency graph (RAG) to represent the global aesthetic feature of each photo. Graphlets are then extracted from the RAGs, and these graphlets capture the local aesthetic features of the photos. Finally, we cast photo cropping as a candidate-searching procedure on the basis of a probabilistic model, and infer the parameters of the cropped photos using Gibbs sampling. The proposed method is fully automatic. Subjective evaluations have shown that it is preferred over a number of existing approaches.
Saul: Towards Declarative Learning Based Programming
Kordjamshidi, Parisa; Roth, Dan; Wu, Hao
2015-01-01
We present Saul, a new probabilistic programming language designed to address some of the shortcomings of programming languages that aim at advancing and simplifying the development of AI systems. Such languages need to interact with messy, naturally occurring data, to allow a programmer to specify what needs to be done at an appropriate level of abstraction rather than at the data level, to be developed on a solid theory that supports moving to and reasoning at this level of abstraction and, finally, to support flexible integration of these learning and inference models within an application program. Saul is an object-functional programming language written in Scala that facilitates these by (1) allowing a programmer to learn, name and manipulate named abstractions over relational data; (2) supporting seamless incorporation of trainable (probabilistic or discriminative) components into the program, and (3) providing a level of inference over trainable models to support composition and make decisions that respect domain and application constraints. Saul is developed over a declaratively defined relational data model, can use piecewise learned factor graphs with declaratively specified learning and inference objectives, and it supports inference over probabilistic models augmented with declarative knowledge-based constraints. We describe the key constructs of Saul and exemplify its use in developing applications that require relational feature engineering and structured output prediction. PMID:26635465
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pichara, Karim; Protopapas, Pavlos
We present an automatic classification method for astronomical catalogs with missing data. We use Bayesian networks and a probabilistic graphical model that allows us to perform inference to predict missing values given observed data and dependency relationships between variables. To learn a Bayesian network from incomplete data, we use an iterative algorithm that utilizes sampling methods and expectation maximization to estimate the distributions and probabilistic dependencies of variables from data with missing values. To test our model, we use three catalogs with missing data (SAGE, Two Micron All Sky Survey, and UBVI) and one complete catalog (MACHO). We examine how classification accuracy changes when information from missing data catalogs is included, how our method compares to traditional missing data approaches, and at what computational cost. Integrating these catalogs with missing data, we find that classification of variable objects improves by a few percent and by 15% for quasar detection while keeping the computational cost the same.
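The core idea above can be sketched in miniature. The following is an illustrative example only, not the paper's implementation: exact inference in a tiny discrete Bayesian network Class → (F1, F2), where a missing feature value is predicted from an observed one by marginalizing over the class. All class names and probability tables are invented for the sketch.

```python
# Hypothetical two-feature network: Class -> F1, Class -> F2.
P_class = {"quasar": 0.2, "star": 0.8}                # prior P(C)
P_f1 = {"quasar": {"blue": 0.7, "red": 0.3},          # P(F1 | C)
        "star":   {"blue": 0.2, "red": 0.8}}
P_f2 = {"quasar": {"bright": 0.6, "faint": 0.4},      # P(F2 | C)
        "star":   {"bright": 0.1, "faint": 0.9}}

def predict_missing_f2(f1_obs):
    """P(F2 | F1 = f1_obs) = sum_c P(F2 | c) P(c | f1_obs)."""
    # posterior over the class given the observed feature
    joint = {c: P_class[c] * P_f1[c][f1_obs] for c in P_class}
    z = sum(joint.values())
    post = {c: p / z for c, p in joint.items()}
    # marginalize the class out to get the missing-feature distribution
    return {v: sum(post[c] * P_f2[c][v] for c in post)
            for v in ("bright", "faint")}

dist = predict_missing_f2("blue")
```

The paper's actual method additionally learns the network structure and parameters from incomplete data via sampling and expectation maximization; this sketch shows only the prediction step once the conditional tables are known.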
Efficient Effects-Based Military Planning Final Report
2010-11-13
using probabilistic inference methods," in Proc. 8th Annu. Conf. Uncertainty Artificial Intelligence (UAI), Stanford, CA. San Mateo, CA: Morgan...Imprecise Probabilities, the 24th Conference on Uncertainty in Artificial Intelligence (UAI), 2008. 7. Yan Tong and Qiang Ji, Learning Bayesian Networks...Bayesian Networks using Constraints. Cassio P. de Campos cassiopc@acm.org Dalle Molle Institute for Artificial Intelligence Galleria 2, Manno 6928
Bayesian Networks Improve Causal Environmental Assessments for Evidence-Based Policy.
Carriger, John F; Barron, Mace G; Newman, Michael C
2016-12-20
Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on valued ecological resources. These aspects are demonstrated through hypothetical problem scenarios that explore some major benefits of using Bayesian networks for reasoning and making inferences in evidence-based policy.
Verification of Small Hole Theory for Application to Wire Chafing Resulting in Shield Faults
NASA Technical Reports Server (NTRS)
Schuet, Stefan R.; Timucin, Dogan A.; Wheeler, Kevin R.
2011-01-01
Our work focuses on developing methods for wire chafe fault detection through the use of reflectometry to assess shield integrity. When shielded electrical aircraft wiring first begins to chafe, the resulting evidence is typically small holes in the shielding. We are developing the algorithms and signal processing necessary to detect these small holes before damage occurs to the inner conductors. Our approach has been to develop a first-principles physics model combined with probabilistic inference, and to verify this model with laboratory experiments as well as through simulation. Previously we presented the electromagnetic small-hole theory and how it might be applied to coaxial cable. In this presentation, we describe our efforts to verify this theoretical approach with high-fidelity electromagnetic simulations (COMSOL). Laboratory observations are used to parameterize the computationally efficient theoretical model with probabilistic inference, resulting in quantification of hole size and location. Our efforts in characterizing faults in coaxial cable are subsequently leading to fault detection in shielded twisted pair, as well as analysis of intermittent faulty connectors using similar techniques.
Attention as Inference: Selection Is Probabilistic; Responses Are All-or-None Samples
ERIC Educational Resources Information Center
Vul, Edward; Hanus, Deborah; Kanwisher, Nancy
2009-01-01
Theories of probabilistic cognition postulate that internal representations are made up of multiple simultaneously held hypotheses, each with its own probability of being correct (henceforth, "probability distributions"). However, subjects make discrete responses and report the phenomenal contents of their mind to be all-or-none states rather than…
Fuller, Robert William; Wong, Tony E; Keller, Klaus
2017-01-01
The response of the Antarctic ice sheet (AIS) to changing global temperatures is a key component of sea-level projections. Current projections of the AIS contribution to sea-level changes are deeply uncertain. This deep uncertainty stems, in part, from (i) the inability of current models to fully resolve key processes and scales, (ii) the relatively sparse available data, and (iii) divergent expert assessments. One promising approach to characterizing the deep uncertainty stemming from divergent expert assessments is to combine expert assessments, observations, and simple models by coupling probabilistic inversion and Bayesian inversion. Here, we present a proof-of-concept study that uses probabilistic inversion to fuse a simple AIS model and diverse expert assessments. We demonstrate the ability of probabilistic inversion to infer joint prior probability distributions of model parameters that are consistent with expert assessments. We then confront these inferred expert priors with instrumental and paleoclimatic observational data in a Bayesian inversion. These additional constraints yield tighter hindcasts and projections. We use this approach to quantify how the deep uncertainty surrounding expert assessments affects the joint probability distributions of model parameters and future projections.
A Bayesian Developmental Approach to Robotic Goal-Based Imitation Learning
Chung, Michael Jae-Yoon; Friesen, Abram L.; Fox, Dieter; Meltzoff, Andrew N.; Rao, Rajesh P. N.
2015-01-01
A fundamental challenge in robotics today is building robots that can learn new skills by observing humans and imitating human actions. We propose a new Bayesian approach to robotic learning by imitation inspired by the developmental hypothesis that children use self-experience to bootstrap the process of intention recognition and goal-based imitation. Our approach allows an autonomous agent to: (i) learn probabilistic models of actions through self-discovery and experience, (ii) utilize these learned models for inferring the goals of human actions, and (iii) perform goal-based imitation for robotic learning and human-robot collaboration. Such an approach allows a robot to leverage its increasing repertoire of learned behaviors to interpret increasingly complex human actions and use the inferred goals for imitation, even when the robot has very different actuators from humans. We demonstrate our approach using two different scenarios: (i) a simulated robot that learns human-like gaze following behavior, and (ii) a robot that learns to imitate human actions in a tabletop organization task. In both cases, the agent learns a probabilistic model of its own actions, and uses this model for goal inference and goal-based imitation. We also show that the robotic agent can use its probabilistic model to seek human assistance when it recognizes that its inferred actions are too uncertain, risky, or impossible to perform, thereby opening the door to human-robot collaboration. PMID:26536366
A Framework for Thinking about Informal Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie; Rubin, Andee
2009-01-01
Informal inferential reasoning has shown some promise in developing students' deeper understanding of statistical processes. This paper presents a framework to think about three key principles of informal inference--generalizations "beyond the data," probabilistic language, and data as evidence. The authors use primary school classroom…
Using Stan for Item Response Theory Models
ERIC Educational Resources Information Center
Ames, Allison J.; Au, Chi Hang
2018-01-01
Stan is a flexible probabilistic programming language providing full Bayesian inference through Hamiltonian Monte Carlo algorithms. The benefits of Hamiltonian Monte Carlo include improved efficiency and faster inference, when compared to other MCMC software implementations. Users can interface with Stan through a variety of computing…
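To make the modeling target concrete, here is an illustrative plain-Python sketch (not the authors' Stan code) of the two-parameter logistic (2PL) IRT response function that Stan programs of this kind typically encode; in Stan, the same likelihood would be declared in a `model` block and sampled with Hamiltonian Monte Carlo. The item parameters below are invented.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response: logistic(a * (theta - b)).

    theta: person ability; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response vector for one person."""
    ll = 0.0
    for (a, b), y in zip(items, responses):
        p = p_correct(theta, a, b)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll

# hypothetical (discrimination, difficulty) pairs for three items
items = [(1.0, -0.5), (1.5, 0.0), (0.8, 1.0)]
```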
Liu, Jing; Li, Yongping; Huang, Guohe; Fu, Haiyan; Zhang, Junlong; Cheng, Guanhui
2017-06-01
In this study, a multi-level-factorial risk-inference-based possibilistic-probabilistic programming (MRPP) method is proposed for supporting water quality management under multiple uncertainties. The MRPP method can handle uncertainties expressed as fuzzy-random-boundary intervals, probability distributions, and interval numbers, and analyze the effects of uncertainties as well as their interactions on modeling outputs. It is applied to plan water quality management in the Xiangxihe watershed. Results reveal that a lower probability of satisfying the objective function (θ) as well as a higher probability of violating environmental constraints (q_i) would correspond to a higher system benefit with an increased risk of violating system feasibility. Chemical plants are the major contributors to biological oxygen demand (BOD) and total phosphorus (TP) discharges; total nitrogen (TN) would be mainly discharged by crop farming. It is also discovered that optimistic decision makers should pay more attention to the interactions between chemical plant and water supply, while decision makers who possess a risk-averse attitude would focus on the interactive effect of q_i and benefit of water supply. The findings can help enhance the model's applicability and identify a suitable water quality management policy for environmental sustainability according to the practical situations.
A Model-Based Probabilistic Inversion Framework for Wire Fault Detection Using TDR
NASA Technical Reports Server (NTRS)
Schuet, Stefan R.; Timucin, Dogan A.; Wheeler, Kevin R.
2010-01-01
Time-domain reflectometry (TDR) is one of the standard methods for diagnosing faults in electrical wiring and interconnect systems, with a long-standing history focused mainly on hardware development of both high-fidelity systems for laboratory use and portable hand-held devices for field deployment. While these devices can easily assess distance to hard faults such as sustained opens or shorts, their ability to assess subtle but important degradation such as chafing remains an open question. This paper presents a unified framework for TDR-based chafing fault detection in lossy coaxial cables by combining an S-parameter based forward modeling approach with a probabilistic (Bayesian) inference algorithm. Results are presented for the estimation of nominal and faulty cable parameters from laboratory data.
NASA Astrophysics Data System (ADS)
Dasher, D. H.; Lomax, T. J.; Bethe, A.; Jewett, S.; Hoberg, M.
2016-02-01
A regional probabilistic survey of 20 randomly selected stations, where water and sediments were sampled, was conducted over an area of Simpson Lagoon and Gwydyr Bay in the Beaufort Sea adjacent to Prudhoe Bay, Alaska, in 2014. Sampling parameters included water-column temperature, salinity, dissolved oxygen, chlorophyll a, and nutrients, and sediment macroinvertebrates, chemistry (i.e., trace metals and hydrocarbons), and grain size. The 2014 probabilistic survey design allows inferences to be made about environmental status, for instance the spatial or areal distribution of sediment trace metals within the design area sampled. Historically, since the 1970s, a number of monitoring studies have been conducted in this estuary area using a targeted rather than a regional probabilistic design. Targeted, non-random designs were utilized to assess specific points of interest and cannot be used to make inferences about the distributions of environmental parameters. Because the environmental monitoring objectives of probabilistic and targeted designs differ, there has been limited assessment of whether benefits exist to combining the two approaches. This study evaluates whether a combined approach using the 2014 probabilistic survey sediment trace metal and macroinvertebrate results and historical targeted monitoring data can provide a new perspective for better understanding the environmental status of these estuaries.
Decision-theoretic control of EUVE telescope scheduling
NASA Technical Reports Server (NTRS)
Hansson, Othar; Mayer, Andrew
1993-01-01
This paper describes a decision theoretic scheduler (DTS) designed to employ state-of-the-art probabilistic inference technology to speed the search for efficient solutions to constraint-satisfaction problems. Our approach involves assessing the performance of heuristic control strategies that are normally hard-coded into scheduling systems and using probabilistic inference to aggregate this information in light of the features of a given problem. The Bayesian Problem-Solver (BPS) introduced a similar approach to solving single-agent and adversarial graph search problems, yielding orders-of-magnitude improvement over traditional techniques. Initial efforts suggest that similar improvements will be realizable when applied to typical constraint-satisfaction scheduling problems.
Experiments with a decision-theoretic scheduler
NASA Technical Reports Server (NTRS)
Hansson, Othar; Holt, Gerhard; Mayer, Andrew
1992-01-01
This paper describes DTS, a decision-theoretic scheduler designed to employ state-of-the-art probabilistic inference technology to speed the search for efficient solutions to constraint-satisfaction problems. Our approach involves assessing the performance of heuristic control strategies that are normally hard-coded into scheduling systems, and using probabilistic inference to aggregate this information in light of features of a given problem. BPS, the Bayesian Problem-Solver, introduced a similar approach to solving single-agent and adversarial graph search problems, yielding orders-of-magnitude improvement over traditional techniques. Initial efforts suggest that similar improvements will be realizable when applied to typical constraint-satisfaction scheduling problems.
A Probabilistic Model of Meter Perception: Simulating Enculturation.
van der Weij, Bastiaan; Pearce, Marcus T; Honing, Henkjan
2017-01-01
Enculturation is known to shape the perception of meter in music but this is not explicitly accounted for by current cognitive models of meter perception. We hypothesize that the induction of meter is a result of predictive coding: interpreting onsets in a rhythm relative to a periodic meter facilitates prediction of future onsets. Such prediction, we hypothesize, is based on previous exposure to rhythms. As such, predictive coding provides a possible explanation for the way meter perception is shaped by the cultural environment. Based on this hypothesis, we present a probabilistic model of meter perception that uses statistical properties of the relation between rhythm and meter to infer meter from quantized rhythms. We show that our model can successfully predict annotated time signatures from quantized rhythmic patterns derived from folk melodies. Furthermore, we show that by inferring meter, our model improves prediction of the onsets of future events compared to a similar probabilistic model that does not infer meter. Finally, as a proof of concept, we demonstrate how our model can be used in a simulation of enculturation. From the results of this simulation, we derive a class of rhythms that are likely to be interpreted differently by enculturated listeners with different histories of exposure to rhythms.
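The inference step described above can be illustrated with a minimal Bayes sketch (this is not the authors' model): a meter is inferred from a quantized rhythm via P(meter | rhythm) ∝ P(rhythm | meter) P(meter), where the likelihood rewards onsets on metrically strong grid positions. The priors and per-position onset probabilities below are invented for illustration; in the paper these statistics would come from exposure to a corpus of rhythms.

```python
# Hypothetical priors and onset-probability profiles over a 12-step grid
# (quarter note = 3 steps in "4/4"; 4 steps per beat in "3/4").
priors = {"4/4": 0.6, "3/4": 0.4}
strong = {
    "4/4": [0.9, 0.2, 0.2, 0.6, 0.2, 0.2, 0.7, 0.2, 0.2, 0.6, 0.2, 0.2],
    "3/4": [0.9, 0.2, 0.2, 0.2, 0.6, 0.2, 0.2, 0.2, 0.6, 0.2, 0.2, 0.2],
}

def posterior(rhythm):
    """rhythm: 0/1 onset indicators on the 12-step grid."""
    scores = {}
    for m, prior in priors.items():
        like = 1.0
        for p, onset in zip(strong[m], rhythm):
            like *= p if onset else (1.0 - p)
        scores[m] = prior * like
    z = sum(scores.values())
    return {m: s / z for m, s in scores.items()}

# onsets every 3 steps, matching the strong positions of the "4/4" profile
post = posterior([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])
```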
Almahdi, Basil; Sultan, Pervez; Sohanpal, Imrat; Brandner, Brigitta; Collier, Tracey; Shergill, Sukhi S; Cregg, Roman; Averbeck, Bruno B
2012-01-01
Evidence suggests that some aspects of schizophrenia can be induced in healthy volunteers through acute administration of the non-competitive NMDA-receptor antagonist, ketamine. In probabilistic inference tasks, patients with schizophrenia have been shown to ‘jump to conclusions’ (JTC) when asked to make a decision. We aimed to test whether healthy participants receiving ketamine would adopt a JTC response pattern resembling that of patients. The paradigmatic task used to investigate JTC has been the ‘urn’ task, where participants are shown a sequence of beads drawn from one of two ‘urns’, each containing coloured beads in different proportions. Participants make a decision when they think they know the urn from which beads are being drawn. We compared performance on the urn task between controls receiving acute ketamine or placebo with that of patients with schizophrenia and another group of controls matched to the patient group. Patients were shown to exhibit a JTC response pattern relative to their matched controls, whereas JTC was not evident in controls receiving ketamine relative to placebo. Ketamine does not appear to promote JTC in healthy controls, suggesting that ketamine does not affect probabilistic inferences. PMID:22389244
Précis of bayesian rationality: The probabilistic approach to human reasoning.
Oaksford, Mike; Chater, Nick
2009-02-01
According to Aristotle, humans are the rational animal. The borderline between rationality and irrationality is fundamental to many aspects of human life including the law, mental health, and language interpretation. But what is it to be rational? One answer, deeply embedded in the Western intellectual tradition since ancient Greece, is that rationality concerns reasoning according to the rules of logic--the formal theory that specifies the inferential connections that hold with certainty between propositions. Piaget viewed logical reasoning as defining the end-point of cognitive development; and contemporary psychology of reasoning has focussed on comparing human reasoning against logical standards. Bayesian Rationality argues that rationality is defined instead by the ability to reason about uncertainty. Although people are typically poor at numerical reasoning about probability, human thought is sensitive to subtle patterns of qualitative Bayesian, probabilistic reasoning. In Chapters 1-4 of Bayesian Rationality (Oaksford & Chater 2007), the case is made that cognition in general, and human everyday reasoning in particular, is best viewed as solving probabilistic, rather than logical, inference problems. In Chapters 5-7 the psychology of "deductive" reasoning is tackled head-on: It is argued that purportedly "logical" reasoning problems, revealing apparently irrational behaviour, are better understood from a probabilistic point of view. Data from conditional reasoning, Wason's selection task, and syllogistic inference are captured by recasting these problems probabilistically. The probabilistic approach makes a variety of novel predictions which have been experimentally confirmed. The book considers the implications of this work, and the wider "probabilistic turn" in cognitive science and artificial intelligence, for understanding human rationality.
ERIC Educational Resources Information Center
Denison, Stephanie; Trikutam, Pallavi; Xu, Fei
2014-01-01
A rich tradition in developmental psychology explores physical reasoning in infancy. However, no research to date has investigated whether infants can reason about physical objects that behave probabilistically, rather than deterministically. Physical events are often quite variable, in that similar-looking objects can be placed in similar…
Inferred Eccentricity and Period Distributions of Kepler Eclipsing Binaries
NASA Astrophysics Data System (ADS)
Prsa, Andrej; Matijevic, G.
2014-01-01
Determining the underlying eccentricity and orbital period distributions from an observed sample of eclipsing binary stars is not a trivial task. Shen and Turner (2008) have shown that the commonly used maximum likelihood estimators are biased to larger eccentricities and they do not describe the underlying distribution correctly; orbital periods suffer from a similar bias. Hogg, Myers and Bovy (2010) proposed a hierarchical probabilistic method for inferring the true eccentricity distribution of exoplanet orbits that uses the likelihood functions for individual star eccentricities. The authors show that proper inference outperforms the simple histogramming of the best-fit eccentricity values. We apply this method to the complete sample of eclipsing binary stars observed by the Kepler mission (Prsa et al. 2011) to derive the unbiased underlying eccentricity and orbital period distributions. These distributions can be used for the studies of multiple star formation, dynamical evolution, and they can serve as a drop-in replacement to prior, ad-hoc distributions used in the exoplanet field for determining false positive occurrence rates.
Sampling studies to estimate the HIV prevalence rate in female commercial sex workers.
Pascom, Ana Roberta Pati; Szwarcwald, Célia Landmann; Barbosa Júnior, Aristides
2010-01-01
We investigated sampling methods being used to estimate the HIV prevalence rate among female commercial sex workers. The studies were classified according to the adequacy or not of the sample size to estimate HIV prevalence rate and according to the sampling method (probabilistic or convenience). We identified 75 studies that estimated the HIV prevalence rate among female sex workers. Most of the studies employed convenience samples. The sample size was not adequate to estimate HIV prevalence rate in 35 studies. The use of convenience samples limits statistical inference for the whole group. It was observed that there was an increase in the number of published studies since 2005, as well as in the number of studies that used probabilistic samples. This represents a large advance in the monitoring of risk behavior practices and HIV prevalence rate in this group.
A Tutorial in Bayesian Potential Outcomes Mediation Analysis.
Miočević, Milica; Gonzalez, Oscar; Valente, Matthew J; MacKinnon, David P
2018-01-01
Statistical mediation analysis is used to investigate intermediate variables in the relation between independent and dependent variables. Causal interpretation of mediation analyses is challenging because randomization of subjects to levels of the independent variable does not rule out the possibility of unmeasured confounders of the mediator to outcome relation. Furthermore, commonly used frequentist methods for mediation analysis compute the probability of the data given the null hypothesis, which is not the probability of a hypothesis given the data as in Bayesian analysis. Under certain assumptions, applying the potential outcomes framework to mediation analysis allows for the computation of causal effects, and statistical mediation in the Bayesian framework gives indirect effects probabilistic interpretations. This tutorial combines causal inference and Bayesian methods for mediation analysis so the indirect and direct effects have both causal and probabilistic interpretations. Steps in Bayesian causal mediation analysis are shown in the application to an empirical example.
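The probabilistic interpretation of the indirect effect can be sketched as follows (a hedged illustration, not the tutorial's code): given posterior draws of the a-path (X → M) and b-path (M → Y) coefficients, the indirect effect a·b inherits a full posterior distribution, and a credible interval can be read off directly from its quantiles. The posterior draws here are simulated from assumed normal posteriors rather than fit to data.

```python
import random

random.seed(0)
# simulated posterior draws for the two path coefficients
a_draws = [random.gauss(0.5, 0.1) for _ in range(20000)]  # posterior of a
b_draws = [random.gauss(0.4, 0.1) for _ in range(20000)]  # posterior of b
indirect = sorted(a * b for a, b in zip(a_draws, b_draws))

# 95% equal-tailed credible interval for the indirect effect a*b
lo = indirect[int(0.025 * len(indirect))]
hi = indirect[int(0.975 * len(indirect))]
mean_ab = sum(indirect) / len(indirect)
```

Because the interval is computed from the product's own posterior, no normality assumption on a·b is needed, which is one of the advantages the tutorial attributes to the Bayesian approach.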
Inferring probabilistic stellar rotation periods using Gaussian processes
NASA Astrophysics Data System (ADS)
Angus, Ruth; Morton, Timothy; Aigrain, Suzanne; Foreman-Mackey, Daniel; Rajpaul, Vinesh
2018-02-01
Variability in the light curves of spotted, rotating stars is often non-sinusoidal and quasi-periodic - spots move on the stellar surface and have finite lifetimes, causing stellar flux variations to slowly shift in phase. A strictly periodic sinusoid therefore cannot accurately model a rotationally modulated stellar light curve. Physical models of stellar surfaces have many drawbacks preventing effective inference, such as highly degenerate or high-dimensional parameter spaces. In this work, we test an appropriate effective model: a Gaussian Process with a quasi-periodic covariance kernel function. This highly flexible model allows sampling of the posterior probability density function of the periodic parameter, marginalizing over the other kernel hyperparameters using a Markov Chain Monte Carlo approach. To test the effectiveness of this method, we infer rotation periods from 333 simulated stellar light curves, demonstrating that the Gaussian process method produces periods that are more accurate than both a sine-fitting periodogram and an autocorrelation function method. We also demonstrate that it works well on real data, by inferring rotation periods for 275 Kepler stars with previously measured periods. We provide a table of rotation periods for these and many more, altogether 1102 Kepler objects of interest, and their posterior probability density function samples. Because this method delivers posterior probability density functions, it will enable hierarchical studies involving stellar rotation, particularly those involving population modelling, such as inferring stellar ages, obliquities in exoplanet systems, or characterizing star-planet interactions. The code used to implement this method is available online.
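The quasi-periodic covariance function at the heart of this method is commonly written k(t, t') = A · exp(−(t − t')² / (2 l²) − Γ sin²(π (t − t') / P)), where P is the rotation period. The sketch below builds such a kernel matrix over a grid of observation times; the hyperparameter names and values are illustrative, and the full method would additionally sample P and the other hyperparameters with MCMC.

```python
import math

def qp_kernel(t1, t2, A=1.0, l=10.0, gamma=1.0, period=3.5):
    """Quasi-periodic kernel: squared-exponential decay times a periodic term."""
    tau = t1 - t2
    return A * math.exp(-tau ** 2 / (2 * l ** 2)
                        - gamma * math.sin(math.pi * tau / period) ** 2)

# covariance matrix over a hypothetical grid of observation times
times = [0.5 * i for i in range(20)]
K = [[qp_kernel(ti, tj) for tj in times] for ti in times]
```

The periodic term makes covariance high again at lags near multiples of the period, while the squared-exponential envelope lets the correlation decay slowly, capturing spots that evolve over many rotations.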
Probabilistic risk analysis of building contamination.
Bolster, D T; Tartakovsky, D M
2008-10-01
We present a general framework for probabilistic risk assessment (PRA) of building contamination. PRA provides a powerful tool for the rigorous quantification of risk in contamination of building spaces. A typical PRA starts by identifying relevant components of a system (e.g. ventilation system components, potential sources of contaminants, remediation methods) and proceeds by using available information and statistical inference to estimate the probabilities of their failure. These probabilities are then combined by means of fault-tree analyses to yield probabilistic estimates of the risk of system failure (e.g. building contamination). A sensitivity study of PRAs can identify features and potential problems that need to be addressed with the most urgency. Often PRAs are amenable to approximations, which can significantly simplify the approach. All these features of PRA are presented in this paper via a simple illustrative example, which can be built upon in further studies. The tool presented here can be used to design and maintain adequate ventilation systems to minimize exposure of occupants to contaminants.
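The fault-tree combination step described above can be sketched in a few lines (component names and failure probabilities below are invented for illustration): independent component failure probabilities are combined through OR gates (any input fails) and AND gates (all inputs fail) to yield a top-level risk estimate.

```python
def p_or(*ps):
    """P(at least one of several independent events) = 1 - prod(1 - p_i)."""
    q = 1.0
    for p in ps:
        q *= (1.0 - p)
    return 1.0 - q

def p_and(*ps):
    """P(all of several independent events) = prod(p_i)."""
    q = 1.0
    for p in ps:
        q *= p
    return q

# hypothetical component failure probabilities
p_fan = 0.01       # ventilation fan failure
p_filter = 0.05    # filter failure
p_source = 0.002   # contaminant source present
# contamination requires a source AND a ventilation failure (fan OR filter)
p_contamination = p_and(p_source, p_or(p_fan, p_filter))
```

A sensitivity study of the kind the paper describes would then perturb each component probability and rank components by their effect on the top-level estimate.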
ERIC Educational Resources Information Center
Meiser, Thorsten; Rummel, Jan; Fleig, Hanna
2018-01-01
Pseudocontingencies are inferences about correlations in the environment that are formed on the basis of statistical regularities like skewed base rates or varying base rates across environmental contexts. Previous research has demonstrated that pseudocontingencies provide a pervasive mechanism of inductive inference in numerous social judgment…
Perspectives of Probabilistic Inferences: Reinforcement Learning and an Adaptive Network Compared
ERIC Educational Resources Information Center
Rieskamp, Jorg
2006-01-01
The assumption that people possess a strategy repertoire for inferences has been raised repeatedly. The strategy selection learning theory specifies how people select strategies from this repertoire. The theory assumes that individuals select strategies proportional to their subjective expectations of how well the strategies solve particular…
De novo identification of highly diverged protein repeats by probabilistic consistency.
Biegert, A; Söding, J
2008-03-15
An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
The Manhattan Frame Model-Manhattan World Inference in the Space of Surface Normals.
Straub, Julian; Freifeld, Oren; Rosman, Guy; Leonard, John J; Fisher, John W
2018-01-01
Objects and structures within man-made environments typically exhibit a high degree of organization in the form of orthogonal and parallel planes. Traditional approaches utilize these regularities via the restrictive, and rather local, Manhattan World (MW) assumption, which posits that every plane is perpendicular to one of the axes of a single coordinate system. The aforementioned regularities are especially evident in the surface normal distribution of a scene, where they manifest as orthogonally-coupled clusters. This motivates the introduction of the Manhattan-Frame (MF) model, which captures the notion of an MW in the space of surface normals (the unit sphere), and of two probabilistic MF models over this space. First, for a single MF we propose novel real-time MAP inference algorithms, evaluate their performance and their use in drift-free rotation estimation. Second, to capture the complexity of real-world scenes at a global scale, we extend the MF model to a probabilistic mixture of Manhattan Frames (MMF). For MMF inference we propose a simple MAP inference algorithm and an adaptive Markov-Chain Monte-Carlo sampling algorithm with Metropolis-Hastings split/merge moves that let us infer the unknown number of mixture components. We demonstrate the versatility of the MMF model and inference algorithm across several scales of man-made environments.
Segmentation of Image Ensembles via Latent Atlases
Van Leemput, Koen; Menze, Bjoern H.; Wells, William M.; Golland, Polina
2010-01-01
Spatial priors, such as probabilistic atlases, play an important role in MRI segmentation. However, the availability of comprehensive, reliable and suitable manual segmentations for atlas construction is limited. We therefore propose a method for joint segmentation of corresponding regions of interest in a collection of aligned images that does not require labeled training data. Instead, a latent atlas, initialized by at most a single manual segmentation, is inferred from the evolving segmentations of the ensemble. The algorithm is based on probabilistic principles but is solved using partial differential equations (PDEs) and energy minimization criteria. We evaluate the method on two datasets, segmenting subcortical and cortical structures in a multi-subject study and extracting brain tumors in a single-subject multi-modal longitudinal experiment. We compare the segmentation results to manual segmentations, when those exist, and to the results of a state-of-the-art atlas-based segmentation method. The quality of the results supports the latent atlas as a promising alternative when existing atlases are not compatible with the images to be segmented. PMID:20580305
Hierarchical Probabilistic Inference of Cosmic Shear
NASA Astrophysics Data System (ADS)
Schneider, Michael D.; Hogg, David W.; Marshall, Philip J.; Dawson, William A.; Meyers, Joshua; Bard, Deborah J.; Lang, Dustin
2015-07-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
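The importance-sampling step described above, which separates local image modeling from the global shear inference, rests on reweighting samples drawn from an interim distribution. A minimal, self-contained sketch of self-normalized importance sampling with a toy Gaussian target and proposal (all names and numbers are illustrative, not the authors' pipeline):

```python
import math
import random

def importance_mean(target_logpdf, proposal_logpdf, proposal_sample, f,
                    n=100_000, seed=0):
    """Estimate E_target[f(x)] by self-normalized importance sampling."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        # Importance weight: target density over proposal density.
        w = math.exp(target_logpdf(x) - proposal_logpdf(x))
        num += w * f(x)
        den += w
    return num / den

def log_normal_pdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

# Toy check: target N(1, 1), wider proposal N(0, 2); E[x] under the target is 1.
est = importance_mean(
    target_logpdf=lambda x: log_normal_pdf(x, 1.0, 1.0),
    proposal_logpdf=lambda x: log_normal_pdf(x, 0.0, 2.0),
    proposal_sample=lambda rng: rng.gauss(0.0, 2.0),
    f=lambda x: x,
)
```

A wider-than-target proposal keeps the weights bounded, which is the same design consideration that makes reweighting interim per-patch posteriors stable at survey scale.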
Wong, Tony E.; Keller, Klaus
2017-01-01
The response of the Antarctic ice sheet (AIS) to changing global temperatures is a key component of sea-level projections. Current projections of the AIS contribution to sea-level changes are deeply uncertain. This deep uncertainty stems, in part, from (i) the inability of current models to fully resolve key processes and scales, (ii) the relatively sparse available data, and (iii) divergent expert assessments. One promising approach to characterizing the deep uncertainty stemming from divergent expert assessments is to combine expert assessments, observations, and simple models by coupling probabilistic inversion and Bayesian inversion. Here, we present a proof-of-concept study that uses probabilistic inversion to fuse a simple AIS model and diverse expert assessments. We demonstrate the ability of probabilistic inversion to infer joint prior probability distributions of model parameters that are consistent with expert assessments. We then confront these inferred expert priors with instrumental and paleoclimatic observational data in a Bayesian inversion. These additional constraints yield tighter hindcasts and projections. We use this approach to quantify how the deep uncertainty surrounding expert assessments affects the joint probability distributions of model parameters and future projections. PMID:29287095
Bayesian networks improve causal environmental ...
Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Conventional weight of evidence approaches too often ignore sources of uncertainty that Bayesian networks can account for. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on value
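The core operation the abstract describes, combining multiple lines of evidence with probabilistic calculus, reduces in the simplest binary case to repeated Bayes updates. A toy sketch (the likelihood values are invented and assume conditionally independent lines of evidence; a real ecological-risk network would have many interacting nodes):

```python
def bayes_update(prior, p_evidence_if_cause, p_evidence_if_no_cause):
    """Posterior probability of the causal hypothesis after one line of evidence."""
    num = prior * p_evidence_if_cause
    return num / (num + (1 - prior) * p_evidence_if_no_cause)

# Two conditionally independent lines of evidence about a candidate stressor.
p = 0.5  # neutral prior on "stressor causes the observed impairment"
for p_e_cause, p_e_nocause in [(0.8, 0.3), (0.6, 0.4)]:
    p = bayes_update(p, p_e_cause, p_e_nocause)
```

Tracking `p` after each update is exactly the "evaluation of changes in uncertainty" the abstract mentions, here collapsed to a single node.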
Zeng, Jia; Hannenhalli, Sridhar
2013-01-01
Gene duplication, followed by functional evolution of duplicate genes, is a primary engine of evolutionary innovation. In turn, gene expression evolution is a critical component of overall functional evolution of paralogs. Inferring evolutionary history of gene expression among paralogs is therefore a problem of considerable interest. It also represents significant challenges. The standard approaches of evolutionary reconstruction assume that at an internal node of the duplication tree, the two duplicates evolve independently. However, because of various selection pressures functional evolution of the two paralogs may be coupled. The coupling of paralog evolution corresponds to three major fates of gene duplicates: subfunctionalization (SF), conserved function (CF) or neofunctionalization (NF). Quantitative analysis of these fates is of great interest and clearly influences evolutionary inference of expression. These two interrelated problems of inferring gene expression and evolutionary fates of gene duplicates have not been studied together previously and motivate the present study. Here we propose a novel probabilistic framework and algorithm to simultaneously infer (i) ancestral gene expression and (ii) the likely fate (SF, NF, CF) at each duplication event during the evolution of gene family. Using tissue-specific gene expression data, we develop a nonparametric belief propagation (NBP) algorithm to predict the ancestral expression level as a proxy for function, and describe a novel probabilistic model that relates the predicted and known expression levels to the possible evolutionary fates. We validate our model using simulation and then apply it to a genome-wide set of gene duplicates in human. Our results suggest that SF tends to be more frequent at the earlier stage of gene family expansion, while NF occurs more frequently later on.
Erdogan, Goker; Yildirim, Ilker; Jacobs, Robert A.
2015-01-01
People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models—that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model’s percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects’ ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception. PMID:26554704
Conceptual Challenges in Coordinating Theoretical and Data-Centered Estimates of Probability
ERIC Educational Resources Information Center
Konold, Cliff; Madden, Sandra; Pollatsek, Alexander; Pfannkuch, Maxine; Wild, Chris; Ziedins, Ilze; Finzer, William; Horton, Nicholas J.; Kazak, Sibel
2011-01-01
A core component of informal statistical inference is the recognition that judgments based on sample data are inherently uncertain. This implies that instruction aimed at developing informal inference needs to foster basic probabilistic reasoning. In this article, we analyze and critique the now-common practice of introducing students to both…
The Role of Probability in Developing Learners' Models of Simulation Approaches to Inference
ERIC Educational Resources Information Center
Lee, Hollylynne S.; Doerr, Helen M.; Tran, Dung; Lovett, Jennifer N.
2016-01-01
Repeated sampling approaches to inference that rely on simulations have recently gained prominence in statistics education, and probabilistic concepts are at the core of this approach. In this approach, learners need to develop a mapping among the problem situation, a physical enactment, computer representations, and the underlying randomization…
Functional mechanisms of probabilistic inference in feature- and space-based attentional systems.
Dombert, Pascasie L; Kuhns, Anna; Mengotti, Paola; Fink, Gereon R; Vossel, Simone
2016-11-15
Humans flexibly attend to features or locations, and these processes are influenced by the probability of sensory events. We combined computational modeling of response times with fMRI to compare the functional correlates of (re-)orienting, and the modulation by probabilistic inference, in spatial and feature-based attention systems. Twenty-four volunteers performed two task versions with spatial or color cues. The percentage of valid cues changed unpredictably. A hierarchical Bayesian model was used to derive trial-wise estimates of probability-dependent attention, entering the fMRI analysis as parametric regressors. Attentional orienting activated a dorsal frontoparietal network in both tasks, without significant parametric modulation. Spatially invalid trials activated a bilateral frontoparietal network and the precuneus, while invalid feature trials activated the left intraparietal sulcus (IPS). Probability-dependent attention modulated activity in the precuneus, left posterior IPS, middle occipital gyrus, and right temporoparietal junction for spatial attention, and in the left anterior IPS for feature-based and spatial attention. These findings provide novel insights into the generality and specificity of the functional basis of attentional control. They suggest that probabilistic inference can distinctively affect each attentional subsystem, but that there is an overlap in the left IPS, which responds to both spatial and feature-based expectancy violations.
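As a much-simplified stand-in for the hierarchical Bayesian model used in the study, trial-wise estimates of cue validity under changing contingencies can be illustrated with a Beta-Bernoulli observer that produces a prediction before each trial and updates after it (all values illustrative):

```python
def trialwise_validity(outcomes, a=1.0, b=1.0):
    """Posterior-mean estimate of cue validity *before* each trial,
    under a Beta(a, b) prior with Bernoulli trial outcomes."""
    estimates = []
    for valid in outcomes:
        estimates.append(a / (a + b))  # prediction prior to seeing this trial
        a += valid                     # Beta posterior update
        b += 1 - valid
    return estimates

est = trialwise_validity([1, 1, 0, 1])  # 1 = valid cue, 0 = invalid cue
```

Sequences like `est` are the kind of trial-wise quantity that can enter an fMRI analysis as a parametric regressor, though the paper's hierarchical model additionally tracks the volatility of the validity itself.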
The Role of Working Memory in the Probabilistic Inference of Future Sensory Events.
Cashdollar, Nathan; Ruhnau, Philipp; Weisz, Nathan; Hasson, Uri
2017-05-01
The ability to represent the emerging regularity of sensory information from the external environment has been thought to allow one to probabilistically infer future sensory occurrences and thus optimize behavior. However, the underlying neural implementation of this process is still not comprehensively understood. Through a convergence of behavioral and neurophysiological evidence, we establish that the probabilistic inference of future events is critically linked to people's ability to maintain the recent past in working memory. Magnetoencephalography recordings demonstrated that when visual stimuli occurring over an extended time series had a greater statistical regularity, individuals with higher working-memory capacity (WMC) displayed enhanced slow-wave neural oscillations in the θ frequency band (4-8 Hz) prior to, but not during, stimulus appearance. This prestimulus neural activity was specifically linked to contexts where information could be anticipated and influenced the preferential sensory processing for this visual information after its appearance. A separate behavioral study demonstrated that this process intrinsically emerges during continuous perception and underpins a realistic advantage for efficient behavioral responses. In this way, WMC optimizes the anticipation of higher level semantic concepts expected to occur in the near future.
Inference in the brain: Statistics flowing in redundant population codes
Pitkow, Xaq; Angelaki, Dora E
2017-01-01
It is widely believed that the brain performs approximate probabilistic inference to estimate causal variables in the world from ambiguous sensory data. To understand these computations, we need to analyze how information is represented and transformed by the actions of nonlinear recurrent neural networks. We propose that these probabilistic computations function by a message-passing algorithm operating at the level of redundant neural populations. To explain this framework, we review its underlying concepts, including graphical models, sufficient statistics, and message-passing, and then describe how these concepts could be implemented by recurrently connected probabilistic population codes. The relevant information flow in these networks will be most interpretable at the population level, particularly for redundant neural codes. We therefore outline a general approach to identify the essential features of a neural message-passing algorithm. Finally, we argue that to reveal the most important aspects of these neural computations, we must study large-scale activity patterns during moderately complex, naturalistic behaviors. PMID:28595050
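The message-passing computations the authors propose can be illustrated at their simplest: one sum-product message along a chain-structured graphical model. The sketch below uses a binary variable and invented potentials; it is the textbook operation, not the population-level neural implementation the paper describes:

```python
def sum_product_message(msg_in, unary, pairwise):
    """One sum-product message along a chain:
    m_out[j] = sum_i msg_in[i] * unary[i] * pairwise[i][j]."""
    k = len(pairwise[0])
    return [sum(msg_in[i] * unary[i] * pairwise[i][j] for i in range(len(msg_in)))
            for j in range(k)]

# Binary chain: local evidence favours state 0, coupling favours agreement
# between neighbouring variables.
msg = sum_product_message(msg_in=[1.0, 1.0],
                          unary=[0.8, 0.2],
                          pairwise=[[0.9, 0.1], [0.1, 0.9]])
```

In the paper's framing, the sufficient statistics carried by `msg` would be encoded redundantly across a neural population rather than held as two scalars.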
Reasoning in Reference Games: Individual- vs. Population-Level Probabilistic Modeling
Franke, Michael; Degen, Judith
2016-01-01
Recent advances in probabilistic pragmatics have achieved considerable success in modeling speakers’ and listeners’ pragmatic reasoning as probabilistic inference. However, these models are usually applied to population-level data, and so implicitly suggest a homogeneous population without individual differences. Here we investigate potential individual differences in Theory-of-Mind related depth of pragmatic reasoning in so-called reference games that require drawing ad hoc Quantity implicatures of varying complexity. We show by Bayesian model comparison that a model that assumes a heterogeneous population is a better predictor of our data, especially for comprehension. We discuss the implications for the treatment of individual differences in probabilistic models of language use. PMID:27149675
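Bayesian model comparison of the kind used here weighs marginal likelihoods, which integrate the likelihood over each model's prior. A toy sketch for binary response data, comparing a flexible-rate model (uniform prior on the rate) against a fixed guessing rate; the counts are invented, not the reference-game data:

```python
from math import comb

def marg_lik_uniform_prior(k, n, grid=2000):
    """Marginal likelihood of k successes in n trials under a uniform prior
    on the rate (midpoint-rule approximation; the exact value is 1/(n+1))."""
    total = sum(comb(n, k) * t ** k * (1 - t) ** (n - k)
                for t in ((i + 0.5) / grid for i in range(grid)))
    return total / grid

def lik_fixed_rate(k, n, theta):
    """Likelihood under a point hypothesis with a fixed response rate."""
    return comb(n, k) * theta ** k * (1 - theta) ** (n - k)

# 9 of 10 responses consistent with deeper pragmatic reasoning:
# Bayes factor for the flexible-rate model over guessing at 0.5.
k, n = 9, 10
bf = marg_lik_uniform_prior(k, n) / lik_fixed_rate(k, n, 0.5)
```

A Bayes factor near 9 would be moderate evidence for the flexible model; comparing a homogeneous against a heterogeneous population model works the same way, just with richer likelihoods.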
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sasahara, M; Arimura, H; Hirose, T
Purpose: The current image-guided radiotherapy (IGRT) procedure is bone-based patient positioning, followed by subjective manual correction using cone beam computed tomography (CBCT). This procedure might cause misalignment of the patient positioning. Automatic target-based patient positioning systems achieve better reproducibility of patient setup. The aim of this study was to develop an automatic target-based patient positioning framework for IGRT with CBCT images in prostate cancer treatment. Methods: Seventy-three CBCT images of 10 patients and 24 planning CT images with digital imaging and communications in medicine for radiotherapy (DICOM-RT) structures were used for this study. Our proposed framework started from the generation of probabilistic atlases of bone and prostate from 24 planning CT images and prostate contours, which were made in the treatment planning. Next, the gray-scale histograms of CBCT values within CTV regions in the planning CT images were obtained as the occurrence probability of the CBCT values. Then, CBCT images were registered to the atlases using a rigid registration with mutual information. Finally, prostate regions were estimated by applying Bayesian inference to CBCT images with the probabilistic atlases and CBCT value occurrence probability. The proposed framework was evaluated by calculating the Euclidean distance of errors between two centroids of prostate regions determined by our method and ground truths of manual delineations by a radiation oncologist and a medical physicist on CBCT images for 10 patients. Results: The average Euclidean distance between the centroids of extracted prostate regions determined by our proposed method and ground truths was 4.4 mm. The average errors for each direction were 1.8 mm in the anteroposterior direction, 0.6 mm in the lateral direction, and 2.1 mm in the craniocaudal direction.
Conclusion: Our proposed framework based on probabilistic atlases and Bayesian inference might be feasible to automatically determine prostate regions on CBCT images.
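The final inference step of the framework, combining a probabilistic atlas prior with the CBCT-value occurrence likelihood, amounts per voxel to a two-class Bayes rule. A schematic sketch with invented probabilities (the real pipeline operates on registered 3-D volumes and learned histograms):

```python
def voxel_posterior(atlas_prior, lik_target, lik_background):
    """Posterior that a voxel belongs to the target organ, combining an
    atlas prior with intensity likelihoods from CBCT-value histograms."""
    num = atlas_prior * lik_target
    return num / (num + (1 - atlas_prior) * lik_background)

# A voxel with a moderate atlas prior but strongly prostate-like intensity.
p = voxel_posterior(atlas_prior=0.3, lik_target=0.9, lik_background=0.2)
```

The intensity likelihood can override a lukewarm atlas prior, which is how the method refines the rigid bone-based registration toward the actual target.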
3D Traffic Scene Understanding From Movable Platforms.
Geiger, Andreas; Lauer, Martin; Wojek, Christian; Stiller, Christoph; Urtasun, Raquel
2014-05-01
In this paper, we present a novel probabilistic generative model for multi-object traffic scene understanding from movable platforms which reasons jointly about the 3D scene layout as well as the location and orientation of objects in the scene. In particular, the scene topology, geometry, and traffic activities are inferred from short video sequences. Inspired by the impressive driving capabilities of humans, our model does not rely on GPS, lidar, or map knowledge. Instead, it takes advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow, and occupancy grids. For each of these cues, we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that our approach successfully infers the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, experiments using different feature combinations are conducted. Furthermore, we show how by employing context derived from the proposed method we are able to improve over the state-of-the-art in terms of object detection and object orientation estimation in challenging and cluttered urban environments.
Buchsbaum, Daphna; Seiver, Elizabeth; Bridgers, Sophie; Gopnik, Alison
2012-01-01
A major challenge children face is uncovering the causal structure of the world around them. Previous research on children's causal inference has demonstrated their ability to learn about causal relationships in the physical environment using probabilistic evidence. However, children must also learn about causal relationships in the social environment, including discovering the causes of other people's behavior, and understanding the causal relationships between others' goal-directed actions and the outcomes of those actions. In this chapter, we argue that social reasoning and causal reasoning are deeply linked, both in the real world and in children's minds. Children use both types of information together and in fact reason about both physical and social causation in fundamentally similar ways. We suggest that children jointly construct and update causal theories about their social and physical environment and that this process is best captured by probabilistic models of cognition. We first present studies showing that adults are able to jointly infer causal structure and human action structure from videos of unsegmented human motion. Next, we describe how children use social information to make inferences about physical causes. We show that the pedagogical nature of a demonstrator influences children's choices of which actions to imitate from within a causal sequence and that this social information interacts with statistical causal evidence. We then discuss how children combine evidence from an informant's testimony and expressed confidence with evidence from their own causal observations to infer the efficacy of different potential causes. We also discuss how children use these same causal observations to make inferences about the knowledge state of the social informant. Finally, we suggest that psychological causation and attribution are part of the same causal system as physical causation. 
We present evidence that just as children use covariation between physical causes and their effects to learn physical causal relationships, they also use covariation between people's actions and the environment to make inferences about the causes of human behavior.
Cancer Evolution: Mathematical Models and Computational Inference
Beerenwinkel, Niko; Schwarz, Roland F.; Gerstung, Moritz; Markowetz, Florian
2015-01-01
Cancer is a somatic evolutionary process characterized by the accumulation of mutations, which contribute to tumor growth, clinical progression, immune escape, and drug resistance development. Evolutionary theory can be used to analyze the dynamics of tumor cell populations and to make inference about the evolutionary history of a tumor from molecular data. We review recent approaches to modeling the evolution of cancer, including population dynamics models of tumor initiation and progression, phylogenetic methods to model the evolutionary relationship between tumor subclones, and probabilistic graphical models to describe dependencies among mutations. Evolutionary modeling helps to understand how tumors arise and will also play an increasingly important prognostic role in predicting disease progression and the outcome of medical interventions, such as targeted therapy. PMID:25293804
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
ERIC Educational Resources Information Center
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models; it can be called from the command line, R, Python, Matlab, or Julia, and holds great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Bayesian Estimation of Small Effects in Exercise and Sports Science.
Mengersen, Kerrie L; Drovandi, Christopher C; Robert, Christian P; Pyne, David B; Gore, Christopher J
2016-01-01
The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and in a case study example provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL), and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and was able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a Placebo. The conclusions are consistent with those obtained using a 'magnitude-based inference' approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.
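Probabilistic statements such as "a 0.96 probability of a substantially greater increase" follow directly from the posterior once it is approximately normal: they are tail probabilities beyond the smallest worthwhile change. A sketch with invented posterior summaries (not the paper's estimates, which come from MCMC):

```python
import math

def prob_exceeds(post_mean, post_sd, threshold):
    """P(effect > threshold) under a normal posterior for the effect."""
    z = (post_mean - threshold) / post_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical posterior for a between-regimen difference, with 0.3 taken
# as the smallest substantial change (all numbers illustrative).
p = prob_exceeds(post_mean=2.0, post_sd=1.0, threshold=0.3)
```

With MCMC output the same quantity is even simpler: the fraction of posterior draws exceeding the threshold.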
A general probabilistic model for group independent component analysis and its estimation methods
Guo, Ying
2012-01-01
Independent component analysis (ICA) has become an important tool for analyzing data from functional magnetic resonance imaging (fMRI) studies. ICA has been successfully applied to single-subject fMRI data. The extension of ICA to group inferences in neuroimaging studies, however, is challenging due to the unavailability of a pre-specified group design matrix and the uncertainty in between-subjects variability in fMRI data. We present a general probabilistic ICA (PICA) model that can accommodate varying group structures of multi-subject spatio-temporal processes. An advantage of the proposed model is that it can flexibly model various types of group structures in different underlying neural source signals and under different experimental conditions in fMRI studies. A maximum likelihood method is used for estimating this general group ICA model. We propose two EM algorithms to obtain the ML estimates. The first method is an exact EM algorithm which provides an exact E-step and an explicit noniterative M-step. The second method is a variational approximation EM algorithm which is computationally more efficient than the exact EM. In simulation studies, we first compare the performance of the proposed general group PICA model and the existing probabilistic group ICA approach. We then compare the two proposed EM algorithms and show the variational approximation EM achieves comparable accuracy to the exact EM with significantly less computation time. An fMRI data example is used to illustrate application of the proposed methods. PMID:21517789
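The exact-versus-approximate EM comparison above concerns the group PICA model; the mechanics of an exact EM iteration are easier to see on a minimal latent-variable model. The sketch below runs EM on a two-component 1-D Gaussian mixture with known shared variance (a stand-in example with made-up data, not the PICA model):

```python
import math

def em_gmm_1d(xs, mu_init=(-1.0, 1.0), sigma=1.0, iters=50):
    """EM for a two-component 1-D Gaussian mixture with known, shared variance."""
    pi = [0.5, 0.5]
    mu = list(mu_init)
    for _ in range(iters):
        # E-step: responsibilities of each component for each point.
        resp = []
        for x in xs:
            w = [pi[k] * math.exp(-0.5 * ((x - mu[k]) / sigma) ** 2)
                 for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: closed-form updates of mixing weights and means.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
    return pi, mu

xs = [-2.1, -1.9, -2.0, 1.9, 2.1, 2.0]
pi, mu = em_gmm_1d(xs)
```

Here both steps are exact and in closed form; a variational EM replaces an intractable E-step with an optimized approximation, trading a little accuracy for speed, which is the trade-off the paper quantifies.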
Computational analysis of conserved RNA secondary structure in transcriptomes and genomes.
Eddy, Sean R
2014-01-01
Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
A Computationally-Efficient Inverse Approach to Probabilistic Strain-Based Damage Diagnosis
NASA Technical Reports Server (NTRS)
Warner, James E.; Hochhalter, Jacob D.; Leser, William P.; Leser, Patrick E.; Newman, John A
2016-01-01
This work presents a computationally-efficient inverse approach to probabilistic damage diagnosis. Given strain data at a limited number of measurement locations, Bayesian inference and Markov Chain Monte Carlo (MCMC) sampling are used to estimate probability distributions of the unknown location, size, and orientation of damage. Substantial computational speedup is obtained by replacing a three-dimensional finite element (FE) model with an efficient surrogate model. The approach is experimentally validated on cracked test specimens where full field strains are determined using digital image correlation (DIC). Access to full field DIC data allows for testing of different hypothetical sensor arrangements, facilitating the study of strain-based diagnosis effectiveness as the distance between damage and measurement locations increases. The ability of the framework to effectively perform both probabilistic damage localization and characterization in cracked plates is demonstrated and the impact of measurement location on uncertainty in the predictions is shown. Furthermore, the analysis time to produce these predictions is orders of magnitude less than a baseline Bayesian approach with the FE method by utilizing surrogate modeling and effective numerical sampling approaches.
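The core computational idea above, replacing an expensive forward model with a cheap surrogate inside an MCMC loop, can be sketched as follows. This is a minimal illustration, not the paper's actual model: the "surrogate" strain function, sensor layout, and all parameter values are invented for the example, and only a 1-D damage location is inferred.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate_strain(loc, sensors):
    # Cheap analytic stand-in for the expensive finite element model:
    # measured strain decays with distance from the damage location.
    return 1.0 / (1.0 + (sensors - loc) ** 2)

sensors = np.linspace(0.0, 10.0, 8)          # hypothetical sensor positions
true_loc = 4.0                               # "damage" location to recover
sigma = 0.02                                 # measurement noise std
data = surrogate_strain(true_loc, sensors) + rng.normal(0.0, sigma, sensors.size)

def log_post(loc):
    if not 0.0 <= loc <= 10.0:               # uniform prior on [0, 10]
        return -np.inf
    resid = data - surrogate_strain(loc, sensors)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

# Random-walk Metropolis-Hastings; each step costs only one surrogate call,
# which is what makes the Bayesian inversion tractable in practice
loc, lp = 5.0, log_post(5.0)
samples = []
for _ in range(20000):
    prop = loc + rng.normal(0.0, 0.3)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        loc, lp = prop, lp_prop
    samples.append(loc)
post = np.array(samples[5000:])              # discard burn-in
print(post.mean(), post.std())
```

The retained samples approximate the posterior over the damage location; its spread widens as sensors move farther from the damage, mirroring the uncertainty study described in the abstract.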
Graphical Models for Recovering Probabilistic and Causal Queries from Missing Data
2014-11-01
This report studies graphical models for recovering probabilistic and causal queries from missing data. The results are applied to problems of attrition, in which missingness is a severe obstacle to sound inference. Related work, including deletion-based methods such as listwise deletion, is also discussed.
Bivariate drought frequency analysis using the copula method
NASA Astrophysics Data System (ADS)
Mirabbasi, Rasoul; Fakheri-Fard, Ahmad; Dinpashoh, Yagob
2012-04-01
Droughts are major natural hazards with significant environmental and economic impacts. In this study, two-dimensional copulas were applied to the analysis of the meteorological drought characteristics of the Sharafkhaneh gauge station, located in the northwest of Iran. Two major drought characteristics, duration and severity, as defined by the standardized precipitation index, were abstracted from observed drought events. Since drought duration and severity exhibited a significant correlation and since they were modeled using different distributions, copulas were used to construct the joint distribution function of the drought characteristics. The parameter of copulas was estimated using the method of the Inference Function for Margins. Several copulas were tested in order to determine the best data fit. According to the error analysis and the tail dependence coefficient, the Galambos copula provided the best fit for the observed drought data. Some bivariate probabilistic properties of droughts, based on the derived copula-based joint distribution, were also investigated. These probabilistic properties can provide useful information for water resource planning and management.
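The Galambos copula selected above has a closed form, C(u,v) = uv·exp(((−ln u)^−θ + (−ln v)^−θ)^(−1/θ)), so joint drought probabilities follow directly from the marginal CDF values. A minimal sketch, with an assumed dependence parameter θ (the paper estimates it via the Inference Function for Margins, which is not shown here):

```python
import math

def galambos_cdf(u, v, theta):
    """Galambos copula CDF: C(u,v) = u*v*exp(((-ln u)**-theta + (-ln v)**-theta)**(-1/theta))."""
    if u == 0.0 or v == 0.0:
        return 0.0
    if u == 1.0:          # copula boundary conditions C(u,1) = u, C(1,v) = v
        return v
    if v == 1.0:
        return u
    x, y = -math.log(u), -math.log(v)
    return u * v * math.exp((x ** -theta + y ** -theta) ** (-1.0 / theta))

def joint_exceedance(u, v, theta):
    # P(D > d, S > s) for drought duration D and severity S, given the
    # marginal CDF values u = F_D(d) and v = F_S(s)
    return 1.0 - u - v + galambos_cdf(u, v, theta)

theta = 2.0               # assumed dependence parameter, for illustration only
p = joint_exceedance(0.9, 0.9, theta)
print(p)
```

Because the copula induces positive upper-tail dependence, this joint exceedance probability is noticeably larger than the 0.01 that independence of duration and severity would give.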
A Transferrable Belief Model Representation for Physical Security of Nuclear Materials
DOE Office of Scientific and Technical Information (OSTI.GOV)
David Gerts
This work analyzed various probabilistic methods such as classic statistics, Bayesian inference, possibilistic theory, and Dempster-Shafer theory of belief functions for the potential insight offered into the physical security of nuclear materials as well as broader application to nuclear non-proliferation automated decision making theory. A review of the fundamental heuristic and basic limitations of each of these methods suggested that the Dempster-Shafer theory of belief functions may offer significant capability. Further examination of the various interpretations of Dempster-Shafer theory, such as random set, generalized Bayesian, and upper/lower probability, demonstrates some limitations. Compared to the other heuristics, the transferrable belief model (TBM), one of the leading interpretations of Dempster-Shafer theory, can improve the automated detection of the violation of physical security using sensors and human judgment. The improvement is shown to give a significant heuristic advantage over other probabilistic options by demonstrating significant successes for several classic gedanken experiments.
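The belief-function machinery underlying the abstract above can be illustrated with Dempster's rule of combination, which fuses mass functions from independent sources. Note the TBM proper uses the unnormalized conjunctive rule (leaving mass on the empty set); the sketch below shows the classic normalized rule, and the sensor/observer numbers are invented:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset hypotheses to mass)
    with Dempster's rule: intersect focal elements, renormalize by 1 - conflict."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources fully contradict")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical fusion of a sensor and a human observer reporting on whether
# a breach ('B') occurred; the frame of discernment is {'B', 'N'}
theta = frozenset({'B', 'N'})
sensor   = {frozenset({'B'}): 0.6, theta: 0.4}   # 0.6 committed to breach
observer = {frozenset({'B'}): 0.7, theta: 0.3}   # 0.7 committed to breach
fused = dempster_combine(sensor, observer)
print(fused)
```

Fusing the two sources raises the mass committed to a breach to 0.88, illustrating how independent partial evidence strengthens an automated detection decision.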
Oh-Descher, Hanna; Beck, Jeffrey M; Ferrari, Silvia; Sommer, Marc A; Egner, Tobias
2017-11-15
Real-life decision-making often involves combining multiple probabilistic sources of information under finite time and cognitive resources. To mitigate these pressures, people "satisfice", foregoing a full evaluation of all available evidence to focus on a subset of cues that allow for fast and "good-enough" decisions. Although this form of decision-making likely mediates many of our everyday choices, very little is known about the way in which the neural encoding of cue information changes when we satisfice under time pressure. Here, we combined human functional magnetic resonance imaging (fMRI) with a probabilistic classification task to characterize neural substrates of multi-cue decision-making under low (1500 ms) and high (500 ms) time pressure. Using variational Bayesian inference, we analyzed participants' choices to track and quantify cue usage under each experimental condition, which was then applied to model the fMRI data. Under low time pressure, participants performed near-optimally, appropriately integrating all available cues to guide choices. Both cortical (prefrontal and parietal cortex) and subcortical (hippocampal and striatal) regions encoded individual cue weights, and activity linearly tracked trial-by-trial variations in the amount of evidence and decision uncertainty. Under increased time pressure, participants adaptively shifted to using a satisficing strategy by discounting the least informative cue in their decision process. This strategic change in decision-making was associated with an increased involvement of the dopaminergic midbrain, striatum, thalamus, and cerebellum in representing and integrating cue values. We conclude that satisficing the probabilistic inference process under time pressure leads to a cortical-to-subcortical shift in the neural drivers of decisions.
Identification of probabilities.
Vitányi, Paul M B; Chater, Nick
2017-02-01
Within psychology, neuroscience and artificial intelligence, there has been increasing interest in the proposal that the brain builds probabilistic models of sensory and linguistic input: that is, to infer a probabilistic model from a sample. The practical problems of such inference are substantial: the brain has limited data and restricted computational resources. But there is a more fundamental question: is the problem of inferring a probabilistic model from a sample solvable even in principle? We explore this question and find some surprisingly positive and general results. First, for a broad class of probability distributions characterized by computability restrictions, we specify a learning algorithm that will almost surely identify a probability distribution in the limit given a finite i.i.d. sample of sufficient but unknown length. This is similarly shown to hold for sequences generated by a broad class of Markov chains, subject to computability assumptions. The technical tool is the strong law of large numbers. Second, for a large class of dependent sequences, we specify an algorithm which identifies in the limit a computable measure for which the sequence is typical, in the sense of Martin-Löf (there may be more than one such measure). The technical tool is the theory of Kolmogorov complexity. We analyze the associated predictions in both cases. We also briefly consider special cases, including language learning, and wider theoretical implications for psychology.
Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks
Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A.; Stewart, Ron; Gasch, Audrey P.
2013-01-01
Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development. PMID:24146602
The Development of Adaptive Decision Making: Recognition-Based Inference in Children and Adolescents
ERIC Educational Resources Information Center
Horn, Sebastian S.; Ruggeri, Azzurra; Pachur, Thorsten
2016-01-01
Judgments about objects in the world are often based on probabilistic information (or cues). A frugal judgment strategy that utilizes memory (i.e., the ability to discriminate between known and unknown objects) as a cue for inference is the recognition heuristic (RH). The usefulness of the RH depends on the structure of the environment,…
Graph rigidity, cyclic belief propagation, and point pattern matching.
McAuley, Julian J; Caetano, Tibério S; Barbosa, Marconi S
2008-11-01
A recent paper [1] proposed a provably optimal polynomial time method for performing near-isometric point pattern matching by means of exact probabilistic inference in a chordal graphical model. Its fundamental result is that the chordal graph in question is shown to be globally rigid, implying that exact inference provides the same matching solution as exact inference in a complete graphical model. This implies that the algorithm is optimal when there is no noise in the point patterns. In this paper, we present a new graph that is also globally rigid but has an advantage over the graph proposed in [1]: Its maximal clique size is smaller, rendering inference significantly more efficient. However, this graph is not chordal, and thus, standard Junction Tree algorithms cannot be directly applied. Nevertheless, we show that loopy belief propagation in such a graph converges to the optimal solution. This allows us to retain the optimality guarantee in the noiseless case, while substantially reducing both memory requirements and processing time. Our experimental results show that the accuracy of the proposed solution is indistinguishable from that in [1] when there is noise in the point patterns.
Computational approaches to protein inference in shotgun proteomics
2012-01-01
Shotgun proteomics has recently emerged as a powerful approach to characterizing proteomes in biological samples. Its overall objective is to identify the form and quantity of each protein in a high-throughput manner by coupling liquid chromatography with tandem mass spectrometry. As a consequence of its high-throughput nature, shotgun proteomics faces challenges with respect to the analysis and interpretation of experimental data. Among such challenges, the identification of proteins present in a sample has been recognized as an important computational task. This task generally consists of (1) assigning experimental tandem mass spectra to peptides derived from a protein database, and (2) mapping assigned peptides to proteins and quantifying the confidence of identified proteins. Protein identification is fundamentally a statistical inference problem with a number of methods proposed to address its challenges. In this review we categorize current approaches into rule-based, combinatorial optimization and probabilistic inference techniques, and present them using integer programming and Bayesian inference frameworks. We also discuss the main challenges of protein identification and propose potential solutions with the goal of spurring innovative research in this area. PMID:23176300
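The second step described above, mapping assigned peptides to a minimal explaining set of proteins, is an instance of set cover, which is why the review can frame it as an integer program. A greedy sketch of that parsimony idea (the peptide-protein map is hypothetical, not a real search result):

```python
def parsimonious_proteins(peptide_to_proteins):
    """Greedy set cover: repeatedly pick the protein that explains the most
    still-unexplained peptides. A common parsimony heuristic; the exact
    version is an integer program."""
    prot_peps = {}
    for pep, prots in peptide_to_proteins.items():
        for prot in prots:
            prot_peps.setdefault(prot, set()).add(pep)
    uncovered = set(peptide_to_proteins)
    chosen = []
    while uncovered:
        # ties broken by protein name so the result is deterministic
        best = max(prot_peps, key=lambda p: (len(prot_peps[p] & uncovered), p))
        gain = prot_peps[best] & uncovered
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

# Hypothetical peptide -> candidate-protein map from a database search;
# P2 is a "subset protein" whose peptides are all explained by P1 and P3
pep_map = {
    "pep1": {"P1"}, "pep2": {"P1", "P2"},
    "pep3": {"P2", "P3"}, "pep4": {"P3"},
}
chosen = parsimonious_proteins(pep_map)
print(chosen)
```

Here the greedy cover explains all four peptides with P1 and P3 alone, discarding P2; probabilistic methods refine this by scoring how confident each such inclusion is rather than returning a hard set.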
Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M.; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A.; Gilks, C. Blake; Huntsman, David G.; McAlpine, Jessica N.; Aparicio, Samuel
2014-01-01
The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. PMID:25060187
Mezlini, Aziz M; Goldenberg, Anna
2017-10-01
Discovering genetic mechanisms driving complex diseases is a hard problem. Existing methods often lack power to identify the set of responsible genes. Protein-protein interaction networks have been shown to boost power when detecting gene-disease associations. We introduce a Bayesian framework, Conflux, to find disease-associated genes from exome sequencing data using networks as a prior. There are two main advantages to using networks within a probabilistic graphical model. First, networks are noisy and incomplete, a substantial impediment to gene discovery. Incorporating networks into the structure of a probabilistic model for gene inference has less impact on the solution than relying on the noisy network structure directly. Second, using a Bayesian framework we can keep track of the uncertainty of each gene being associated with the phenotype rather than returning a fixed list of genes. We first show that using networks clearly improves gene detection compared to individual gene testing. We then show consistently improved performance of Conflux compared to the state-of-the-art diffusion network-based method Hotnet2 and a variety of other network and variant aggregation methods, using randomly generated and literature-reported gene sets. We test Hotnet2 and Conflux on several network configurations to reveal biases and patterns of false positives and false negatives in each case. Our experiments show that our novel Bayesian framework Conflux incorporates many of the advantages of the current state-of-the-art methods, while offering more flexibility and improved power in many gene-disease association scenarios.
NASA Astrophysics Data System (ADS)
Wellons, Sarah; Torrey, Paul
2017-06-01
Galaxy populations at different cosmic epochs are often linked by cumulative comoving number density in observational studies. Many theoretical works, however, have shown that the cumulative number densities of tracked galaxy populations not only evolve in bulk, but also spread out over time. We present a method for linking progenitor and descendant galaxy populations which takes both of these effects into account. We define probability distribution functions that capture the evolution and dispersion of galaxy populations in number density space, and use these functions to assign galaxies at redshift zf probabilities of being progenitors/descendants of a galaxy population at another redshift z0. These probabilities are used as weights for calculating distributions of physical progenitor/descendant properties such as stellar mass, star formation rate or velocity dispersion. We demonstrate that this probabilistic method provides more accurate predictions for the evolution of physical properties than the assumption of either a constant number density or an evolving number density in a bin of fixed width by comparing predictions against galaxy populations directly tracked through a cosmological simulation. We find that the constant number density method performs least well at recovering galaxy properties, the evolving number density method slightly better and the probabilistic method best of all. The improvement is present for predictions of stellar mass as well as inferred quantities such as star formation rate and velocity dispersion. We demonstrate that this method can also be applied robustly and easily to observational data, and provide a code package for doing so.
Uncertain deduction and conditional reasoning.
Evans, Jonathan St B T; Thompson, Valerie A; Over, David E
2015-01-01
There has been a paradigm shift in the psychology of deductive reasoning. Many researchers no longer think it is appropriate to ask people to assume premises and decide what necessarily follows, with the results evaluated by binary extensional logic. Most everyday and scientific inference is made from more or less confidently held beliefs and not assumptions, and the relevant normative standard is Bayesian probability theory. We argue that the study of "uncertain deduction" should directly ask people to assign probabilities to both premises and conclusions, and report an experiment using this method. We assess this reasoning by two Bayesian metrics: probabilistic validity and coherence according to probability theory. On both measures, participants perform above chance in conditional reasoning, but they do much better when statements are grouped as inferences, rather than evaluated in separate tasks.
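The coherence metric mentioned above has a concrete form for modus ponens: given P(p) and P(q|p), any coherent assignment must satisfy P(q) ≥ P(p)P(q|p) (since q is implied by p∧q) and P(q) ≤ P(p)P(q|p) + P(¬p). A minimal checker built on that interval (a standard result from probability logic, sketched here rather than the authors' own scoring code):

```python
def mp_coherence_interval(p_p, p_q_given_p):
    """Coherent probability interval for the conclusion q of modus ponens,
    given P(p) and P(q|p): [P(p)P(q|p), P(p)P(q|p) + 1 - P(p)]."""
    lo = p_p * p_q_given_p
    hi = lo + (1.0 - p_p)
    return lo, hi

def coherent(p_q, p_p, p_q_given_p):
    # an assigned conclusion probability outside the interval is incoherent
    lo, hi = mp_coherence_interval(p_p, p_q_given_p)
    return lo <= p_q <= hi

lo, hi = mp_coherence_interval(0.9, 0.8)   # premises held with high confidence
print(lo, hi)                              # approximately 0.72 and 0.82
```

For example, a participant who rates P(p) = 0.9 and P(q|p) = 0.8 but then rates P(q) = 0.5 has answered incoherently, which is exactly the kind of violation the experiment scores.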
NASA Astrophysics Data System (ADS)
Wang, Q. J.; Robertson, D. E.; Chiew, F. H. S.
2009-05-01
Seasonal forecasting of streamflows can be highly valuable for water resources management. In this paper, a Bayesian joint probability (BJP) modeling approach for seasonal forecasting of streamflows at multiple sites is presented. A Box-Cox transformed multivariate normal distribution is proposed to model the joint distribution of future streamflows and their predictors such as antecedent streamflows and El Niño-Southern Oscillation indices and other climate indicators. Bayesian inference of model parameters and uncertainties is implemented using Markov chain Monte Carlo sampling, leading to joint probabilistic forecasts of streamflows at multiple sites. The model provides a parametric structure for quantifying relationships between variables, including intersite correlations. The Box-Cox transformed multivariate normal distribution has considerable flexibility for modeling a wide range of predictors and predictands. The Bayesian inference formulated allows the use of data that contain nonconcurrent and missing records. The model flexibility and data-handling ability means that the BJP modeling approach is potentially of wide practical application. The paper also presents a number of statistical measures and graphical methods for verification of probabilistic forecasts of continuous variables. Results for streamflows at three river gauges in the Murrumbidgee River catchment in southeast Australia show that the BJP modeling approach has good forecast quality and that the fitted model is consistent with observed data.
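The forecasting step of a BJP-style model can be sketched in miniature: Box-Cox transform the variables, fit a multivariate normal in the transformed space, then condition on the observed predictor. Everything below is a simplification with invented synthetic data and a fixed transform parameter; the actual BJP model infers the transform parameters by MCMC and handles multiple sites and missing records.

```python
import numpy as np

rng = np.random.default_rng(1)

def boxcox(x, lam):
    # Box-Cox transform; lam = 0 reduces to the log transform
    return (x ** lam - 1.0) / lam if lam != 0 else np.log(x)

# Synthetic predictor (antecedent flow) and predictand (seasonal flow):
# positively skewed and correlated, as streamflow data typically are
n = 2000
z1 = rng.normal(0, 1, n)
z2 = 0.7 * z1 + rng.normal(0, np.sqrt(1 - 0.49), n)
antecedent = np.exp(1.0 + 0.5 * z1)
flow = np.exp(2.0 + 0.6 * z2)

lam = 0.0                                  # assumed transform parameter
t1, t2 = boxcox(antecedent, lam), boxcox(flow, lam)
mu = np.array([t1.mean(), t2.mean()])
cov = np.cov(np.vstack([t1, t2]))

# Conditional normal: probabilistic forecast of transformed flow given an
# observed antecedent value (here one standard deviation above the mean)
obs = boxcox(np.exp(1.0 + 0.5 * 1.0), lam)
cmean = mu[1] + cov[0, 1] / cov[0, 0] * (obs - mu[0])
cvar = cov[1, 1] - cov[0, 1] ** 2 / cov[0, 0]
print(cmean, np.sqrt(cvar))
```

The conditional mean and standard deviation define the full forecast distribution in transformed space; back-transforming its quantiles yields the probabilistic streamflow forecast.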
The Sense of Confidence during Probabilistic Learning: A Normative Account.
Meyniel, Florent; Schlunegger, Daniel; Dehaene, Stanislas
2015-06-01
Learning in a stochastic environment consists of estimating a model from a limited amount of noisy data, and is therefore inherently uncertain. However, many classical models reduce the learning process to the updating of parameter estimates and neglect the fact that learning is also frequently accompanied by a variable "feeling of knowing" or confidence. The characteristics and the origin of these subjective confidence estimates thus remain largely unknown. Here we investigate whether, during learning, humans not only infer a model of their environment, but also derive an accurate sense of confidence from their inferences. In our experiment, humans estimated the transition probabilities between two visual or auditory stimuli in a changing environment, and reported their mean estimate and their confidence in this report. To formalize the link between both kinds of estimate and assess their accuracy in comparison to a normative reference, we derive the optimal inference strategy for our task. Our results indicate that subjects accurately track the likelihood that their inferences are correct. Learning and estimating confidence in what has been learned appear to be two intimately related abilities, suggesting that they arise from a single inference process. We show that human performance matches several properties of the optimal probabilistic inference. In particular, subjective confidence is impacted by environmental uncertainty, both at the first level (uncertainty in stimulus occurrence given the inferred stochastic characteristics) and at the second level (uncertainty due to unexpected changes in these stochastic characteristics). Confidence also increases appropriately with the number of observations within stable periods. Our results support the idea that humans possess a quantitative sense of confidence in their inferences about abstract non-sensory parameters of the environment. 
This ability cannot be reduced to simple heuristics; rather, it appears to be a core property of the learning process.
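The joint emergence of estimate and confidence from a single inference process can be sketched for the stationary case: a Beta posterior over a transition probability delivers both a mean estimate and a confidence readout (posterior precision) from the same update. This is a minimal ideal-observer illustration with invented parameters; the paper's normative model additionally handles unexpected changes in the environment.

```python
import numpy as np

rng = np.random.default_rng(2)

p_true = 0.7                 # hidden probability that stimulus A is followed by B
a, b = 1.0, 1.0              # uniform Beta prior over the transition probability
means, precisions = [], []
for x in rng.uniform(size=200) < p_true:     # sequence of observed transitions
    a, b = a + x, b + (1 - x)                # conjugate Beta update
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    means.append(a / (a + b))                # current probability estimate
    precisions.append(1.0 / var)             # "confidence" as posterior precision
print(means[-1], precisions[-1])
```

As in the behavioral data, confidence rises with the number of observations within a stable period, and no separate heuristic is needed to produce it: it falls out of the same posterior that carries the estimate.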
Szöllősi, Gergely J.; Boussau, Bastien; Abby, Sophie S.; Tannier, Eric; Daubin, Vincent
2012-01-01
The timing of the evolution of microbial life has largely remained elusive due to the scarcity of prokaryotic fossil record and the confounding effects of the exchange of genes among possibly distant species. The history of gene transfer events, however, is not a series of individual oddities; it records which lineages were concurrent and thus provides information on the timing of species diversification. Here, we use a probabilistic model of genome evolution that accounts for differences between gene phylogenies and the species tree as series of duplication, transfer, and loss events to reconstruct chronologically ordered species phylogenies. Using simulations we show that we can robustly recover accurate chronologically ordered species phylogenies in the presence of gene tree reconstruction errors and realistic rates of duplication, transfer, and loss. Using genomic data we demonstrate that we can infer rooted species phylogenies using homologous gene families from complete genomes of 10 bacterial and archaeal groups. Focusing on cyanobacteria, distinguished among prokaryotes by a relative abundance of fossils, we infer the maximum likelihood chronologically ordered species phylogeny based on 36 genomes with 8,332 homologous gene families. We find the order of speciation events to be in full agreement with the fossil record and the inferred phylogeny of cyanobacteria to be consistent with the phylogeny recovered from established phylogenomics methods. Our results demonstrate that lateral gene transfers, detected by probabilistic models of genome evolution, can be used as a source of information on the timing of evolution, providing a valuable complement to the limited prokaryotic fossil record. PMID:23043116
bnstruct: an R package for Bayesian Network structure learning in the presence of missing data.
Franzin, Alberto; Sambo, Francesco; Di Camillo, Barbara
2017-04-15
A Bayesian Network is a probabilistic graphical model that encodes probabilistic dependencies between a set of random variables. We introduce bnstruct, an open source R package to (i) learn the structure and the parameters of a Bayesian Network from data in the presence of missing values and (ii) perform reasoning and inference on the learned Bayesian Networks. To the best of our knowledge, there is no other open source software that provides methods for all of these tasks, particularly the manipulation of missing data, which is a common situation in practice. The software is implemented in R and C and is available on CRAN under a GPL licence. francesco.sambo@unipd.it. Supplementary data are available at Bioinformatics online.
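bnstruct itself is an R package; as a language-neutral illustration of the kind of inference performed on a learned network, here is exact inference by enumeration on the textbook rain/sprinkler/grass-wet network (standard CPT values, not bnstruct's API):

```python
from itertools import product

# Conditional probability tables for the textbook example
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},        # keyed by rain, then sprinkler
               False: {True: 0.4, False: 0.6}}
P_wet_true = {(True, True): 0.99, (True, False): 0.9,  # keyed by (sprinkler, rain)
              (False, True): 0.8, (False, False): 0.0}

def joint(r, s, w):
    # full joint factorizes along the network structure
    pw = P_wet_true[(s, r)]
    return P_rain[r] * P_sprinkler[r][s] * (pw if w else 1.0 - pw)

# Query P(rain | grass wet) by summing out the hidden sprinkler variable
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print(num / den)
```

Enumeration is exponential in the number of hidden variables, which is why packages like bnstruct ship more scalable inference routines, but on small networks it gives the exact answer to check against.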
A computational framework to empower probabilistic protein design
Fromer, Menachem; Yanover, Chen
2008-01-01
Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult. Results: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future. Contact: fromer@cs.huji.ac.il PMID:18586717
Palmer, Cameron; Pe’er, Itsik
2016-01-01
Missing data are an unavoidable component of modern statistical genetics. Different array or sequencing technologies cover different single nucleotide polymorphisms (SNPs), leading to a complicated mosaic pattern of missingness where both individual genotypes and entire SNPs are sporadically absent. Such missing data patterns cannot be ignored without introducing bias, yet cannot be inferred exclusively from nonmissing data. In genome-wide association studies, the accepted solution to missingness is to impute missing data using external reference haplotypes. The resulting probabilistic genotypes may be analyzed in the place of genotype calls. A general-purpose paradigm, called Multiple Imputation (MI), is known to model uncertainty in many contexts, yet it is not widely used in association studies. Here, we undertake a systematic evaluation of existing imputed data analysis methods and MI. We characterize biases related to uncertainty in association studies, and find that bias is introduced both at the imputation level, when imputation algorithms generate inconsistent genotype probabilities, and at the association level, when analysis methods inadequately model genotype uncertainty. We find that MI performs at least as well as existing methods or in some cases much better, and provides a straightforward paradigm for adapting existing genotype association methods to uncertain data. PMID:27310603
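The MI pooling step referred to above is conventionally done with Rubin's rules, which combine per-imputation estimates and propagate both within- and between-imputation uncertainty. A sketch with hypothetical effect estimates (the numbers are invented for illustration):

```python
import statistics

def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate is the mean of the per-imputation
    estimates; total variance = within-variance + (1 + 1/m) * between-variance."""
    m = len(estimates)
    qbar = sum(estimates) / m
    wbar = sum(variances) / m                 # within-imputation variance
    b = statistics.variance(estimates)        # between-imputation variance
    return qbar, wbar + (1.0 + 1.0 / m) * b

# Hypothetical effect estimates and variances from m = 5 imputed datasets
est = [0.31, 0.28, 0.35, 0.30, 0.33]
var = [0.010, 0.011, 0.009, 0.010, 0.012]
qbar, vtotal = pool_rubin(est, var)
print(qbar, vtotal)
```

The between-imputation term is what single-imputation analyses omit, which is one source of the bias the study characterizes: ignoring it understates the variance of uncertain genotypes.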
An empirical approach to symmetry and probability
NASA Astrophysics Data System (ADS)
North, Jill
We often rely on symmetries to infer outcomes' probabilities, as when we infer that each side of a fair coin is equally likely to come up on a given toss. Why are these inferences successful? I argue against answering this question with an a priori indifference principle. Reasons to reject such a principle are familiar, yet instructive. They point to a new, empirical explanation for the success of our probabilistic predictions. This has implications for indifference reasoning generally. I argue that a priori symmetries need never constrain our probability attributions, even for initial credences.
Kang, Shuli; Li, Qingjiao; Chen, Quan; Zhou, Yonggang; Park, Stacy; Lee, Gina; Grimes, Brandon; Krysan, Kostyantyn; Yu, Min; Wang, Wei; Alber, Frank; Sun, Fengzhu; Dubinett, Steven M; Li, Wenyuan; Zhou, Xianghong Jasmine
2017-03-24
We propose a probabilistic method, CancerLocator, which exploits the diagnostic potential of cell-free DNA by determining not only the presence but also the location of tumors. CancerLocator simultaneously infers the proportions and the tissue-of-origin of tumor-derived cell-free DNA in a blood sample using genome-wide DNA methylation data. CancerLocator outperforms two established multi-class classification methods on simulations and real data, even in scenarios where the proportion of tumor-derived DNA in the cell-free DNA is low. CancerLocator also achieves promising results on patient plasma samples with low DNA methylation sequencing coverage.
Hierarchical probabilistic Gabor and MRF segmentation of brain tumours in MRI volumes.
Subbanna, Nagesh K; Precup, Doina; Collins, D Louis; Arbel, Tal
2013-01-01
In this paper, we present a fully automated hierarchical probabilistic framework for segmenting brain tumours from multispectral human brain magnetic resonance images (MRIs) using multiwindow Gabor filters and an adapted Markov Random Field (MRF) framework. In the first stage, a customised Gabor decomposition is developed, based on the combined-space characteristics of the two classes (tumour and non-tumour) in multispectral brain MRIs in order to optimally separate tumour (including edema) from healthy brain tissues. A Bayesian framework then provides a coarse probabilistic texture-based segmentation of tumours (including edema) whose boundaries are then refined at the voxel level through a modified MRF framework that carefully separates the edema from the main tumour. This customised MRF is not only built on the voxel intensities and class labels as in traditional MRFs, but also models the intensity differences between neighbouring voxels in the likelihood model, along with employing a prior based on local tissue class transition probabilities. The second inference stage is shown to resolve local inhomogeneities and impose a smoothing constraint, while also maintaining the appropriate boundaries as supported by the local intensity difference observations. The method was trained and tested on the publicly available MICCAI 2012 Brain Tumour Segmentation Challenge (BRATS) Database [1] on both synthetic and clinical volumes (low grade and high grade tumours). Our method performs well compared to state-of-the-art techniques, outperforming the results of the top methods in cases of clinical high grade and low grade tumour core segmentation by 40% and 45% respectively.
Significance testing as perverse probabilistic reasoning
2011-01-01
Truth claims in the medical literature rely heavily on statistical significance testing. Unfortunately, most physicians misunderstand the underlying probabilistic logic of significance tests and consequently often misinterpret their results. This near-universal misunderstanding is highlighted by means of a simple quiz which we administered to 246 physicians at two major academic hospitals, on which the proportion of incorrect responses exceeded 90%. A solid understanding of the fundamental concepts of probability theory is becoming essential to the rational interpretation of medical information. This essay provides a technically sound review of these concepts that is accessible to a medical audience. We also briefly review the debate in the cognitive sciences regarding physicians' aptitude for probabilistic inference. PMID:21356064
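The probabilistic logic at issue can be made concrete with Bayes' rule: the probability that a "significant" result reflects a true effect depends on the prior probability and the power, not just the 5% significance level. A small sketch with illustrative numbers:

```python
def prob_effect_given_significant(prior, power, alpha=0.05):
    """Bayes' rule: P(real effect | significant result)."""
    true_pos = prior * power          # real effects that reach significance
    false_pos = (1 - prior) * alpha   # null effects that reach significance
    return true_pos / (true_pos + false_pos)

# With 1 in 10 tested hypotheses true and 80% power, a significant result
# reflects a real effect only about 64% of the time.
ppv = prob_effect_given_significant(prior=0.10, power=0.80)
```

This is the calculation that the quiz respondents typically get wrong: p < 0.05 does not mean a 95% chance the finding is real.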
A unified probabilistic framework for spontaneous facial action modeling and understanding.
Tong, Yan; Chen, Jixu; Ji, Qiang
2010-02-01
Facial expression is a natural and powerful means of human communication. Recognizing spontaneous facial actions, however, is very challenging due to subtle facial deformation, frequent head movements, and ambiguous and uncertain facial motion measurements. Because of these challenges, current research in facial expression recognition is limited to posed expressions and often in frontal view. A spontaneous facial expression is characterized by rigid head movements and nonrigid facial muscular movements. More importantly, it is the coherent and consistent spatiotemporal interactions among rigid and nonrigid facial motions that produce a meaningful facial expression. Recognizing this fact, we introduce a unified probabilistic facial action model based on the Dynamic Bayesian network (DBN) to simultaneously and coherently represent rigid and nonrigid facial motions, their spatiotemporal dependencies, and their image measurements. Advanced machine learning methods are introduced to learn the model based on both training data and subjective prior knowledge. Given the model and the measurements of facial motions, facial action recognition is accomplished through probabilistic inference by systematically integrating visual measurements with the facial action model. Experiments show that compared to the state-of-the-art techniques, the proposed system yields significant improvements in recognizing both rigid and nonrigid facial motions, especially for spontaneous facial expressions.
Uncertain deduction and conditional reasoning
Evans, Jonathan St. B. T.; Thompson, Valerie A.; Over, David E.
2015-01-01
There has been a paradigm shift in the psychology of deductive reasoning. Many researchers no longer think it is appropriate to ask people to assume premises and decide what necessarily follows, with the results evaluated by binary extensional logic. Most everyday and scientific inference is made from more or less confidently held beliefs and not assumptions, and the relevant normative standard is Bayesian probability theory. We argue that the study of “uncertain deduction” should directly ask people to assign probabilities to both premises and conclusions, and report an experiment using this method. We assess this reasoning by two Bayesian metrics: probabilistic validity and coherence according to probability theory. On both measures, participants perform above chance in conditional reasoning, but they do much better when statements are grouped as inferences, rather than evaluated in separate tasks. PMID:25904888

Cancer evolution: mathematical models and computational inference.
Beerenwinkel, Niko; Schwarz, Roland F; Gerstung, Moritz; Markowetz, Florian
2015-01-01
Cancer is a somatic evolutionary process characterized by the accumulation of mutations, which contribute to tumor growth, clinical progression, immune escape, and drug resistance development. Evolutionary theory can be used to analyze the dynamics of tumor cell populations and to make inference about the evolutionary history of a tumor from molecular data. We review recent approaches to modeling the evolution of cancer, including population dynamics models of tumor initiation and progression, phylogenetic methods to model the evolutionary relationship between tumor subclones, and probabilistic graphical models to describe dependencies among mutations. Evolutionary modeling helps to understand how tumors arise and will also play an increasingly important prognostic role in predicting disease progression and the outcome of medical interventions, such as targeted therapy. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
Legenstein, Robert; Maass, Wolfgang
2014-01-01
It has recently been shown that networks of spiking neurons with noise can emulate simple forms of probabilistic inference through “neural sampling”, i.e., by treating spikes as samples from a probability distribution of network states that is encoded in the network. Deficiencies of the existing model are its reliance on single neurons for sampling from each random variable, and the resulting limitation in representing quickly varying probabilistic information. We show that both deficiencies can be overcome by moving to a biologically more realistic encoding of each salient random variable through the stochastic firing activity of an ensemble of neurons. The resulting model demonstrates that networks of spiking neurons with noise can easily track and carry out basic computational operations on rapidly varying probability distributions, such as the odds of getting rewarded for a specific behavior. We demonstrate the viability of this new approach towards neural coding and computation, which makes use of the inherent parallelism of generic neural circuits, by showing that this model can explain experimentally observed firing activity of cortical neurons for a variety of tasks that require rapid temporal integration of sensory information. PMID:25340749
Zhang, Qin; Yao, Quanying
2018-05-01
The dynamic uncertain causality graph (DUCG) is a newly presented framework for uncertain causality representation and probabilistic reasoning. It has been successfully applied to online fault diagnosis of large, complex industrial systems and to disease diagnosis. This paper extends the DUCG to model more complex cases than could previously be modeled, e.g., the case in which statistical data are in different groups with or without overlap, and some domain knowledge and actions (new variables with uncertain causalities) are introduced. In other words, this paper proposes to use -mode, -mode, and -mode of the DUCG to model such complex cases and then transform them into either the standard -mode or the standard -mode. In the former situation, if no directed cyclic graph is involved, the transformed result is simply a Bayesian network (BN), and existing inference methods for BNs can be applied. In the latter situation, an inference method based on the DUCG is proposed. Examples are provided to illustrate the methodology.
INFERRING THE ECCENTRICITY DISTRIBUTION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogg, David W.; Bovy, Jo; Myers, Adam D., E-mail: david.hogg@nyu.ed
2010-12-20
Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementation of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision (other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts) so long as the measurements have been communicated as a likelihood function or a posterior sampling.
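A minimal sketch of this kind of hierarchical inference with simulated "stars": each object contributes posterior samples (here, noisy draws around a true eccentricity), and the population likelihood averages the population density over each object's samples. All numbers and the one-parameter Beta family are illustrative, not taken from the paper.

```python
import math
import random

random.seed(1)

# Simulated population: eccentricities drawn from Beta(1, 3); each "star"
# yields 50 posterior samples (truth plus Gaussian noise, clipped to [0, 1]).
stars = []
for _ in range(200):
    e_true = 1 - (1 - random.random()) ** (1 / 3)   # inverse CDF of Beta(1, 3)
    stars.append([min(1.0, max(0.0, random.gauss(e_true, 0.05)))
                  for _ in range(50)])

def log_like(b):
    """Hierarchical log-likelihood of a Beta(1, b) population model:
    average the population density over each star's posterior samples."""
    ll = 0.0
    for samples in stars:
        avg = sum(b * (1 - e) ** (b - 1) for e in samples) / len(samples)
        ll += math.log(avg)
    return ll

# Grid search over b; the maximum should land near the true value b = 3,
# despite each individual star's eccentricity being noisy.
best_b = max([1 + 0.25 * i for i in range(17)], key=log_like)
```

Averaging the population density over each object's posterior samples is what turns per-object uncertainty into a consistent population-level estimate, rather than histogramming noisy point estimates.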
A comparison of algorithms for inference and learning in probabilistic graphical models.
Frey, Brendan J; Jojic, Nebojsa
2005-09-01
Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
Ha, Gavin; Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A; Gilks, C Blake; Huntsman, David G; McAlpine, Jessica N; Aparicio, Samuel; Shah, Sohrab P
2014-11-01
The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. © 2014 Ha et al.; Published by Cold Spring Harbor Laboratory Press.
Scalable Probabilistic Inference for Global Seismic Monitoring
NASA Astrophysics Data System (ADS)
Arora, N. S.; Dear, T.; Russell, S.
2011-12-01
We describe a probabilistic generative model for seismic events, their transmission through the earth, and their detection (or mis-detection) at seismic stations. We also describe an inference algorithm that constructs the most probable event bulletin explaining the observed set of detections. The model and inference are called NET-VISA (network processing vertically integrated seismic analysis) and are designed to replace the current automated network processing at the IDC, the SEL3 bulletin. Our results (attached table) demonstrate that NET-VISA significantly outperforms SEL3 by reducing the missed events from 30.3% down to 12.5%. The difference is even more dramatic for smaller magnitude events. NET-VISA has no difficulty in locating nuclear explosions as well. The attached figure demonstrates the location predicted by NET-VISA versus other bulletins for the second DPRK event. Further evaluation on dense regional networks demonstrates that NET-VISA finds many events missed in the LEB bulletin, which is produced by the human analysts. Large aftershock sequences, as produced by the 2004 December Sumatra earthquake and the 2011 March Tohoku earthquake, can pose a significant load for automated processing, often delaying the IDC bulletins by weeks or months. Indeed, these sequences can overload the serial NET-VISA inference as well. We describe an enhancement to NET-VISA to make it multi-threaded, and hence take full advantage of the processing power of multi-core and -cpu machines. Our experiments show that the new inference algorithm is able to achieve 80% efficiency in parallel speedup.
Spiking neuron network Helmholtz machine
Sountsov, Pavel; Miller, Paul
2015-01-01
An increasing amount of behavioral and neurophysiological data suggests that the brain performs optimal (or near-optimal) probabilistic inference and learning during perception and other tasks. Although many machine learning algorithms exist that perform inference and learning in an optimal way, the complete description of how one of those algorithms (or a novel algorithm) can be implemented in the brain is currently incomplete. There have been many proposed solutions that address how neurons can perform optimal inference but the question of how synaptic plasticity can implement optimal learning is rarely addressed. This paper aims to unify the two fields of probabilistic inference and synaptic plasticity by using a neuronal network of realistic model spiking neurons to implement a well-studied computational model called the Helmholtz Machine. The Helmholtz Machine is amenable to neural implementation as the algorithm it uses to learn its parameters, called the wake-sleep algorithm, uses a local delta learning rule. Our spiking-neuron network implements both the delta rule and a small example of a Helmholtz machine. This neuronal network can learn an internal model of continuous-valued training data sets without supervision. The network can also perform inference on the learned internal models. We show how various biophysical features of the neural implementation constrain the parameters of the wake-sleep algorithm, such as the duration of the wake and sleep phases of learning and the minimal sample duration. We examine the deviations from optimal performance and tie them to the properties of the synaptic plasticity rule. PMID:25954191
Active Inference, homeostatic regulation and adaptive behavioural control
Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl
2015-01-01
We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference. PMID:26365173
[Methodological design of the National Health and Nutrition Survey 2016].
Romero-Martínez, Martín; Shamah-Levy, Teresa; Cuevas-Nasu, Lucía; Gómez-Humarán, Ignacio Méndez; Gaona-Pineda, Elsa Berenice; Gómez-Acosta, Luz María; Rivera-Dommarco, Juan Ángel; Hernández-Ávila, Mauricio
2017-01-01
We describe the design methodology of the 2016 halfway National Health and Nutrition Survey (Ensanut-MC). The Ensanut-MC is a national probabilistic survey whose target population is the inhabitants of private households in Mexico. The sample size was determined to support inferences on urban and rural areas in four regions. We describe the main design elements: target population, topics of study, sampling procedure, measurement procedure, and logistics organization. The final sample comprised 9 479 completed household interviews and 16 591 individual interviews. The response rate was 77.9% for households and 91.9% for individuals. The Ensanut-MC probabilistic design allows valid statistical inferences about parameters of interest for Mexico's public health and nutrition, specifically overweight, obesity, and diabetes mellitus. The updated information also supports the monitoring, updating, and formulation of new policies and priority programs.
Cardinal rules: Visual orientation perception reflects knowledge of environmental statistics
Girshick, Ahna R.; Landy, Michael S.; Simoncelli, Eero P.
2011-01-01
Humans are remarkably good at performing visual tasks, but experimental measurements reveal substantial biases in the perception of basic visual attributes. An appealing hypothesis is that these biases arise through a process of statistical inference, in which information from noisy measurements is fused with a probabilistic model of the environment. But such inference is optimal only if the observer’s internal model matches the environment. Here, we provide evidence that this is the case. We measured performance in an orientation-estimation task, demonstrating the well-known fact that orientation judgements are more accurate at cardinal (horizontal and vertical) orientations, along with a new observation that judgements made under conditions of uncertainty are strongly biased toward cardinal orientations. We estimate observers’ internal models for orientation and find that they match the local orientation distribution measured in photographs. We also show how a neural population could embed probabilistic information responsible for such biases. PMID:21642976
Modeling urban air pollution with optimized hierarchical fuzzy inference system.
Tashayo, Behnam; Alimohammadi, Abbas
2016-10-01
Environmental exposure assessments (EEA) and epidemiological studies require urban air pollution models with appropriate spatial and temporal resolutions. Uncertain available data and inflexible models can limit air pollution modeling techniques, particularly in developing countries. This paper develops a hierarchical fuzzy inference system (HFIS) to model air pollution under different land use, transportation, and meteorological conditions. To improve performance, the system treats the issue as a large-scale and high-dimensional problem and develops the proposed model using a three-step approach. In the first step, a geospatial information system (GIS) and probabilistic methods are used to preprocess the data. In the second step, a hierarchical structure is generated based on the problem. In the third step, the accuracy and complexity of the model are simultaneously optimized with a multiple objective particle swarm optimization (MOPSO) algorithm. We examine the capabilities of the proposed model for predicting daily and annual mean PM2.5 and NO2 and compare the accuracy of the results with representative models from existing literature. The benefits provided by the model features, including probabilistic preprocessing, multi-objective optimization, and hierarchical structure, are precisely evaluated by comparing five different consecutive models in terms of accuracy and complexity criteria. Fivefold cross validation is used to assess the performance of the generated models. The respective average RMSEs and coefficients of determination (R²) for the test datasets using the proposed model are as follows: daily PM2.5 = (8.13, 0.78), annual mean PM2.5 = (4.96, 0.80), daily NO2 = (5.63, 0.79), and annual mean NO2 = (2.89, 0.83). The obtained results demonstrate that the developed hierarchical fuzzy inference system can be utilized for modeling air pollution in EEA and epidemiological studies.
Internal validation of STRmix™ for the interpretation of single source and mixed DNA profiles.
Moretti, Tamyra R; Just, Rebecca S; Kehl, Susannah C; Willis, Leah E; Buckleton, John S; Bright, Jo-Anne; Taylor, Duncan A; Onorato, Anthony J
2017-07-01
The interpretation of DNA evidence can entail analysis of challenging STR typing results. Genotypes inferred from low quality or quantity specimens, or mixed DNA samples originating from multiple contributors, can result in weak or inconclusive match probabilities when a binary interpretation method and necessary thresholds (such as a stochastic threshold) are employed. Probabilistic genotyping approaches, such as fully continuous methods that incorporate empirically determined biological parameter models, enable usage of more of the profile information and reduce subjectivity in interpretation. As a result, software-based probabilistic analyses tend to produce more consistent and more informative results regarding potential contributors to DNA evidence. Studies to assess and internally validate the probabilistic genotyping software STRmix™ for casework usage at the Federal Bureau of Investigation Laboratory were conducted using lab-specific parameters and more than 300 single-source and mixed contributor profiles. Simulated forensic specimens, including constructed mixtures that included DNA from two to five donors across a broad range of template amounts and contributor proportions, were used to examine the sensitivity and specificity of the system via more than 60,000 tests comparing hundreds of known contributors and non-contributors to the specimens. Conditioned analyses, concurrent interpretation of amplification replicates, and application of an incorrect contributor number were also performed to further investigate software performance and probe the limitations of the system. In addition, the results from manual and probabilistic interpretation of both prepared and evidentiary mixtures were compared. 
The findings support that STRmix™ is sufficiently robust for implementation in forensic laboratories, offering numerous advantages over historical methods of DNA profile analysis and greater statistical power for the estimation of evidentiary weight, and can be used reliably in human identification testing. With few exceptions, likelihood ratio results reflected intuitively correct estimates of the weight of the genotype possibilities and known contributor genotypes. This comprehensive evaluation provides a model in accordance with SWGDAM recommendations for internal validation of a probabilistic genotyping system for DNA evidence interpretation. Copyright © 2017. Published by Elsevier B.V.
Inductive Reasoning about Causally Transmitted Properties
ERIC Educational Resources Information Center
Shafto, Patrick; Kemp, Charles; Bonawitz, Elizabeth Baraff; Coley, John D.; Tenenbaum, Joshua B.
2008-01-01
Different intuitive theories constrain and guide inferences in different contexts. Formalizing simple intuitive theories as probabilistic processes operating over structured representations, we present a new computational model of category-based induction about causally transmitted properties. A first experiment demonstrates undergraduates'…
Le Pape, Sylvain; Dimitrova, Elena; Hannaert, Patrick; Konovalov, Alexander; Volmer, Romain; Ron, David; Thuillier, Raphaël; Hauet, Thierry
2014-08-25
The unfolded protein response (UPR), the endoplasmic reticulum stress response, is found in various pathologies including ischemia-reperfusion injury (IRI). However, its role during IRI is still unclear. Here, by combining two different bioinformatics methods, one based on ordinary differential equations (Time Series Network Inference) and one algebraic (probabilistic polynomial dynamical systems), we identified the IRE1α-XBP1 and the ATF6 pathways as the main UPR effectors involved in the cell's adaptation to IRI. We validated these findings experimentally by assessing the impact of their knock-out and knock-down on cell survival during IRI. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.
Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G
2012-10-02
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
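The key point that a given CDR3 sequence can be produced by multiple hidden recombination events can be sketched with a toy model (hypothetical segment sets and probabilities; the real inference uses maximum likelihood over a far richer event space): the generation probability sums over every consistent (V, insert, J) decomposition.

```python
# Toy recombination model: sequence = V-prefix + random insert + J-suffix,
# with hypothetical segment probabilities and a uniform 4-letter insert
# alphabet (probability 0.25 per inserted letter).
V = {"CAS": 0.6, "CASS": 0.4}
J = {"EQF": 0.5, "QF": 0.5}
P_INS = 0.25

def gen_prob(seq):
    """P(seq): sum over all hidden (V, insert, J) events that produce it."""
    total = 0.0
    for v, pv in V.items():
        for j, pj in J.items():
            if (seq.startswith(v) and seq.endswith(j)
                    and len(v) + len(j) <= len(seq)):
                n_ins = len(seq) - len(v) - len(j)   # length of the insert
                total += pv * pj * P_INS ** n_ins    # prob of that exact insert
    return total

# "CASSEQF" has four distinct generation events; their probabilities add up.
p = gen_prob("CASSEQF")
```

Because hidden events are summed out, the event probabilities cannot be read off observed sequences directly, which is why the paper resorts to maximum-likelihood inference over the hidden decompositions.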
Probabilistic graphs as a conceptual and computational tool in hydrology and water management
NASA Astrophysics Data System (ADS)
Schoups, Gerrit
2014-05-01
Originally developed in the fields of machine learning and artificial intelligence, probabilistic graphs constitute a general framework for modeling complex systems in the presence of uncertainty. The framework consists of three components: 1. Representation of the model as a graph (or network), with nodes depicting random variables in the model (e.g. parameters, states, etc), which are joined together by factors. Factors are local probabilistic or deterministic relations between subsets of variables, which, when multiplied together, yield the joint distribution over all variables. 2. Consistent use of probability theory for quantifying uncertainty, relying on basic rules of probability for assimilating data into the model and expressing unknown variables as a function of observations (via the posterior distribution). 3. Efficient, distributed approximation of the posterior distribution using general-purpose algorithms that exploit model structure encoded in the graph. These attributes make probabilistic graphs potentially useful as a conceptual and computational tool in hydrology and water management (and beyond). Conceptually, they can provide a common framework for existing and new probabilistic modeling approaches (e.g. by drawing inspiration from other fields of application), while computationally they can make probabilistic inference feasible in larger hydrological models. The presentation explores, via examples, some of these benefits.
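The three components listed above can be sketched concretely. The toy below uses two invented binary hydrological variables (rain driving runoff and soil moisture; all probabilities are made up for illustration): local factors are multiplied into a joint distribution, and inference follows from the basic rules of probability.

```python
import numpy as np

# Toy probabilistic graph with hypothetical binary variables (index 0 = no,
# 1 = yes). Factors are local conditional probability tables; their product
# is the joint distribution over all variables.
p_rain = np.array([0.7, 0.3])                      # P(rain)
p_runoff_given_rain = np.array([[0.9, 0.1],        # P(runoff | rain)
                                [0.4, 0.6]])
p_soil_given_rain = np.array([[0.8, 0.2],          # P(soil wet | rain)
                              [0.3, 0.7]])

# Joint over (rain, runoff, soil) = product of the factors.
joint = (p_rain[:, None, None]
         * p_runoff_given_rain[:, :, None]
         * p_soil_given_rain[:, None, :])

# Assimilating an observation: P(rain | runoff = 1)
# = sum_soil P(rain, runoff=1, soil) / P(runoff=1).
posterior_rain = joint[:, 1, :].sum(axis=1)
posterior_rain /= posterior_rain.sum()
```

Observing runoff raises the probability of rain from the prior 0.3 to 0.72; general-purpose message-passing algorithms perform exactly this kind of marginalization while exploiting the graph structure.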
Systematic parameter inference in stochastic mesoscopic modeling
NASA Astrophysics Data System (ADS)
Lei, Huan; Yang, Xiu; Li, Zhen; Karniadakis, George Em
2017-02-01
We propose a method to efficiently determine the optimal coarse-grained force field in mesoscopic stochastic simulations of Newtonian fluid and polymer melt systems modeled by dissipative particle dynamics (DPD) and energy conserving dissipative particle dynamics (eDPD). The response surfaces of various target properties (viscosity, diffusivity, pressure, etc.) with respect to model parameters are constructed based on the generalized polynomial chaos (gPC) expansion using simulation results on sampling points (e.g., individual parameter sets). To alleviate the computational cost of evaluating the target properties, we employ the compressive sensing method to compute the coefficients of the dominant gPC terms given the prior knowledge that the coefficients are "sparse". The proposed method shows comparable accuracy with the standard probabilistic collocation method (PCM) while it imposes a much weaker restriction on the number of the simulation samples, especially for systems with a high-dimensional parametric space. Full access to the response surfaces within the confidence range enables us to infer the optimal force parameters given the desirable values of target properties at the macroscopic scale. Moreover, it enables us to investigate the intrinsic relationship between the model parameters, identify possible degeneracies in the parameter space, and optimize the model by eliminating model redundancies. The proposed method provides an efficient alternative approach for constructing mesoscopic models by inferring model parameters to recover target properties of the physical systems (e.g., from experimental measurements), where those force field parameters and formulation cannot be derived from the microscopic level in a straightforward way.
Using Approximate Bayesian Computation to infer sex ratios from acoustic data.
Lehnen, Lisa; Schorcht, Wigbert; Karst, Inken; Biedermann, Martin; Kerth, Gerald; Puechmaille, Sebastien J
2018-01-01
Population sex ratios are of high ecological relevance, but are challenging to determine in species lacking conspicuous external cues indicating their sex. Acoustic sexing is an option if vocalizations differ between sexes, but is precluded by overlapping distributions of the values of male and female vocalizations in many species. A method allowing the inference of sex ratios despite such an overlap will therefore greatly increase the information extractable from acoustic data. To meet this demand, we developed a novel approach using Approximate Bayesian Computation (ABC) to infer the sex ratio of populations from acoustic data. Additionally, parameters characterizing the male and female distribution of acoustic values (mean and standard deviation) are inferred. This information is then used to probabilistically assign a sex to a single acoustic signal. We furthermore develop a simpler means of sex ratio estimation based on the exclusion of calls from the overlap zone. Applying our methods to simulated data demonstrates that sex ratio and acoustic parameter characteristics of males and females are reliably inferred by the ABC approach. Applying both the ABC and the exclusion method to empirical datasets (echolocation calls recorded in colonies of lesser horseshoe bats, Rhinolophus hipposideros) yields sex ratios similar to those obtained by molecular sexing. Our methods aim to facilitate evidence-based conservation, and to benefit scientists investigating ecological or conservation questions related to sex- or group-specific behaviour across a wide range of organisms emitting acoustic signals. The developed methodology is non-invasive, low-cost and time-efficient, thus allowing the study of many sites and individuals. We provide an R-script for the easy application of the method and discuss potential future extensions and fields of applications.
The script can be easily adapted to account for numerous biological systems by adjusting the type and number of groups to be distinguished (e.g. age, social rank, cryptic species) and the acoustic parameters investigated.
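The ABC logic described in this abstract can be sketched in a few lines. Everything below is invented for illustration (call frequencies, standard deviations, summary statistics, and tolerance are simplistic stand-ins, not the paper's bat data or R-script):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: male and female call frequencies (kHz) overlap.
true_ratio, mu_m, mu_f, sd = 0.3, 108.0, 111.0, 2.0
n_calls = 500
sex = rng.random(n_calls) < true_ratio            # True = male
calls = np.where(sex, rng.normal(mu_m, sd, n_calls),
                      rng.normal(mu_f, sd, n_calls))

def summarize(x):
    return np.array([x.mean(), x.std()])

obs_stats = summarize(calls)

# ABC rejection: draw candidate sex ratios from a uniform prior, simulate a
# dataset under each candidate, and keep candidates whose summary statistics
# fall close enough to the observed ones.
accepted = []
for _ in range(20000):
    r = rng.random()
    s = rng.random(n_calls) < r
    sim = np.where(s, rng.normal(mu_m, sd, n_calls),
                      rng.normal(mu_f, sd, n_calls))
    if np.linalg.norm(summarize(sim) - obs_stats) < 0.15:
        accepted.append(r)

posterior_ratio = np.mean(accepted)
```

The accepted draws approximate the posterior over the sex ratio even though no individual call can be sexed with certainty; the paper's full method additionally infers the group means and standard deviations rather than fixing them.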
Generating probabilistic Boolean networks from a prescribed transition probability matrix.
Ching, W-K; Chen, X; Tsing, N-K
2009-11-01
Probabilistic Boolean networks (PBNs) have received much attention in modeling genetic regulatory networks. A PBN can be regarded as a Markov chain process and is characterised by a transition probability matrix. In this study, the authors propose efficient algorithms for constructing a PBN when its transition probability matrix is given. The complexities of the algorithms are also analysed. This is an interesting inverse problem in network inference using steady-state data. The problem is important as most microarray data sets are assumed to be obtained from sampling the steady-state.
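The forward direction of the relationship exploited here (the paper solves the inverse problem) can be sketched with a toy two-gene PBN: assembling the transition probability matrix from candidate Boolean predictors and their selection probabilities, then reading off the steady state. The predictor functions and probabilities below are arbitrary illustrations.

```python
import numpy as np

# A 2-gene PBN: states (x1, x2) in {00, 01, 10, 11}. Each gene has candidate
# Boolean predictor functions chosen independently with given probabilities
# (toy functions; the paper recovers such a network from a prescribed matrix).
states = [(a, b) for a in (0, 1) for b in (0, 1)]

f1 = [lambda x1, x2: x2,            # predictors for gene 1
      lambda x1, x2: x1 & x2]
c1 = [0.6, 0.4]                     # their selection probabilities
f2 = [lambda x1, x2: 1 - x1,        # predictors for gene 2
      lambda x1, x2: x1 | x2]
c2 = [0.7, 0.3]

# Transition probability matrix: P[i, j] = P(state_i -> state_j), summing
# over every combination of predictor choices.
P = np.zeros((4, 4))
for i, (x1, x2) in enumerate(states):
    for g1, p1 in zip(f1, c1):
        for g2, p2 in zip(f2, c2):
            j = states.index((g1(x1, x2), g2(x1, x2)))
            P[i, j] += p1 * p2

# Steady-state distribution: left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()
```

The inverse problem in the abstract runs this construction backwards: given a transition matrix (or its steady state, estimated from microarray data), find predictor sets and selection probabilities consistent with it.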
A building block for hardware belief networks.
Behin-Aein, Behtash; Diep, Vinh; Datta, Supriyo
2016-07-21
Belief networks represent a powerful approach to problems involving probabilistic inference, but much of the work in this area is software-based, utilizing standard deterministic hardware built on the transistor, which provides the gain and directionality needed to interconnect billions of devices into useful networks. This paper proposes a transistor-like device that could provide an analogous building block for probabilistic networks. We present two proof-of-concept examples of belief networks, one reciprocal and one non-reciprocal, implemented using the proposed device, which is simulated using experimentally benchmarked models.
Wang, Zhuo; Danziger, Samuel A; Heavner, Benjamin D; Ma, Shuyi; Smith, Jennifer J; Li, Song; Herricks, Thurston; Simeonidis, Evangelos; Baliga, Nitin S; Aitchison, John D; Price, Nathan D
2017-05-01
Gene regulatory and metabolic network models have been used successfully in many organisms, but inherent differences between them make networks difficult to integrate. Probabilistic Regulation Of Metabolism (PROM) provides a partial solution, but it does not incorporate network inference and underperforms in eukaryotes. We present an Integrated Deduced REgulation And Metabolism (IDREAM) method that combines statistically inferred Environment and Gene Regulatory Influence Network (EGRIN) models with the PROM framework to create enhanced metabolic-regulatory network models. We used IDREAM to predict phenotypes and genetic interactions between transcription factors and genes encoding metabolic activities in the eukaryote Saccharomyces cerevisiae. IDREAM models contain many fewer interactions than PROM and yet produce significantly more accurate growth predictions. IDREAM consistently outperformed PROM using any of three popular yeast metabolic models and across three experimental growth conditions. Importantly, IDREAM's enhanced accuracy makes it possible to identify subtle synthetic growth defects. With experimental validation, these novel genetic interactions involving the pyruvate dehydrogenase complex suggested a new role for fatty acid-responsive factor Oaf1 in regulating acetyl-CoA production in glucose grown cells.
Probabilistic grammatical model for helix‐helix contact site classification
2013-01-01
Background Hidden Markov Models power many state‐of‐the‐art tools in the field of protein bioinformatics. While excelling in their tasks, these methods of protein analysis do not directly convey information on medium‐ and long‐range residue‐residue interactions. Capturing them requires at least the expressive power of context‐free grammars. However, application of more powerful grammar formalisms to protein analysis has been surprisingly limited. Results In this work, we present a probabilistic grammatical framework for problem‐specific protein languages and apply it to classification of transmembrane helix‐helix pair configurations. The core of the model consists of a probabilistic context‐free grammar, automatically inferred by a genetic algorithm from only a generic set of expert‐based rules and positive training samples. The model was applied to produce sequence based descriptors of four classes of transmembrane helix‐helix contact site configurations. The highest performance of the classifiers reached an AUCROC of 0.70. The analysis of grammar parse trees revealed their ability to represent structural features of helix‐helix contact sites. Conclusions We demonstrated that our probabilistic context‐free framework for analysis of protein sequences outperforms the state of the art in the task of helix‐helix contact site classification. However, this is achieved without necessarily requiring modeling of long range dependencies between interacting residues. A significant feature of our approach is that grammar rules and parse trees are human‐readable. Thus they could provide biologically meaningful information for molecular biologists. PMID:24350601
NASA Astrophysics Data System (ADS)
Schmidt, S.; Heyns, P. S.; de Villiers, J. P.
2018-02-01
In this paper, a fault diagnostic methodology is developed which is able to detect, locate and trend gear faults under fluctuating operating conditions when only vibration data from a single transducer, measured on a healthy gearbox, are available. A two-phase feature extraction and modelling process is proposed to infer the operating condition and, based on the operating condition, to detect changes in the machine condition. Information from optimised machine and operating condition hidden Markov models is statistically combined to generate a discrepancy signal which is post-processed to infer the condition of the gearbox. The discrepancy signal is processed and combined with statistical methods for automatic fault detection and localisation and to perform fault trending over time. The proposed methodology is validated on experimental data and a tacholess order tracking methodology is used to enhance the cost-effectiveness of the diagnostic methodology.
Goldstein, Mary K; Asch, Steven M; Mackey, Lester; Altman, Russ B
2017-01-01
Objective: Build probabilistic topic model representations of hospital admissions processes and compare the ability of such models to predict clinical order patterns as compared to preconstructed order sets. Materials and Methods: The authors evaluated the first 24 hours of structured electronic health record data for > 10 K inpatients. Drawing an analogy between structured items (e.g., clinical orders) to words in a text document, the authors performed latent Dirichlet allocation probabilistic topic modeling. These topic models use initial clinical information to predict clinical orders for a separate validation set of > 4 K patients. The authors evaluated these topic model-based predictions vs existing human-authored order sets by area under the receiver operating characteristic curve, precision, and recall for subsequent clinical orders. Results: Existing order sets predict clinical orders used within 24 hours with area under the receiver operating characteristic curve 0.81, precision 16%, and recall 35%. This can be improved to 0.90, 24%, and 47% (P < 10^-20) by using probabilistic topic models to summarize clinical data into up to 32 topics. Many of these latent topics yield natural clinical interpretations (e.g., "critical care," "pneumonia," "neurologic evaluation"). Discussion: Existing order sets tend to provide nonspecific, process-oriented aid, with usability limitations impairing more precise, patient-focused support. Algorithmic summarization has the potential to breach this usability barrier by automatically inferring patient context, but with potential tradeoffs in interpretability. Conclusion: Probabilistic topic modeling provides an automated approach to detect thematic trends in patient care and generate decision support content. A potential use case finds related clinical orders for decision support. PMID:27655861
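The orders-as-words analogy above can be made concrete with a small collapsed Gibbs sampler for latent Dirichlet allocation. The "order codes" and record corpus below are invented toys (the paper applies standard LDA to >10 K real inpatient records, not this sampler):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "clinical order" corpus: each record is a bag of order-code indices
# into a hypothetical vocabulary.
vocab = ["cbc", "chest_xray", "blood_culture", "abx", "head_ct", "eeg"]
docs = [[0, 1, 2, 3], [2, 3, 1, 0], [4, 5, 4], [5, 4, 5], [0, 1, 3], [4, 5, 0]]
K, V = 2, len(vocab)
alpha, beta = 0.1, 0.01

# Collapsed Gibbs sampling for LDA: resample each token's topic conditioned
# on all other assignments, tracked via count matrices.
z = [[int(rng.integers(K)) for _ in d] for d in docs]
ndk = np.zeros((len(docs), K)); nkv = np.zeros((K, V)); nk = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        ndk[d, z[d][i]] += 1; nkv[z[d][i], w] += 1; nk[z[d][i]] += 1

for _ in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] -= 1; nkv[k, w] -= 1; nk[k] -= 1
            p = (ndk[d] + alpha) * (nkv[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))
            z[d][i] = k
            ndk[d, k] += 1; nkv[k, w] += 1; nk[k] += 1

theta = (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)  # record-topic
phi = (nkv + beta) / (nkv + beta).sum(axis=1, keepdims=True)      # topic-order
```

The record-topic mixtures `theta` play the role of the inferred patient context, and the topic-order distributions `phi` supply the ranked order predictions evaluated against human-authored order sets.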
A probabilistic NF2 relational algebra for integrated information retrieval and database systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fuhr, N.; Roelleke, T.
The integration of information retrieval (IR) and database systems requires a data model which allows for modelling documents as entities, representing uncertainty and vagueness and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document are represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relations such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers like in most IR models. This effect also can be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.
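The weighted-tuple idea can be sketched minimally. The sketch below assumes independent tuple events, a simplification of the full NF2 model; documents, terms, and weights are hypothetical:

```python
# Each document's index terms form a probabilistic subrelation: term -> weight,
# where the weight is the probability that the tuple belongs to the relation.
doc_terms = {
    "d1": {"retrieval": 0.9, "database": 0.5},
    "d2": {"retrieval": 0.4, "inference": 0.8},
}

def select(relation, term):
    """Probabilistic selection: each answer tuple keeps the probability
    that the document is indexed with the given term."""
    return {doc: terms[term] for doc, terms in relation.items() if term in terms}

def conjunction(r1, r2):
    """Intersection-style combination; assuming independence, weights
    multiply, so the result is again a weighted relation."""
    return {d: r1[d] * r2[d] for d in r1.keys() & r2.keys()}

hits = select(doc_terms, "retrieval")
ranked = sorted(hits.items(), key=lambda kv: -kv[1])   # IR-style ranking
both = conjunction(select(doc_terms, "retrieval"),
                   select(doc_terms, "database"))
```

Sorting by decreasing weight reproduces the IR-style ranked answer list described in the abstract, and the same machinery handles imprecise attribute values in ordinary database queries.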
Influence diagrams as oil spill decision science tools
Inferences on risks to ecosystem services (ES) from ecological crises can be handled more reliably using decision science tools. Influence diagrams (IDs) are probabilistic networks that explicitly represent the decisions related to a problem and evidence of their influence...
DOE Office of Scientific and Technical Information (OSTI.GOV)
La Russa, D
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10^7 cm^-3. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
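A stripped-down version of this inference can be written without PyMC3. The sketch below drops the proliferation term, fixes the clonogen number, and uses a grid posterior rather than MCMC; doses, cohort sizes, and the true α are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simplified Poisson TCP model: TCP(D) = exp(-N0 * exp(-alpha * D)),
# with clonogen number N0 fixed and radiosensitivity alpha to be inferred.
N0, true_alpha = 1e7, 0.30
doses = np.array([40.0, 50.0, 60.0, 70.0, 80.0])   # EQD2, hypothetical cohort
n_pat = np.full(5, 200)

def tcp(alpha, d):
    return np.exp(-N0 * np.exp(-alpha * d))

# Simulate locally controlled patients at each dose level.
controlled = rng.binomial(n_pat, tcp(true_alpha, doses))

# Posterior over alpha on a grid (flat prior), binomial likelihood.
alphas = np.linspace(0.2, 0.4, 801)
logpost = np.zeros_like(alphas)
for i, a in enumerate(alphas):
    p = np.clip(tcp(a, doses), 1e-12, 1 - 1e-12)
    logpost[i] = np.sum(controlled * np.log(p)
                        + (n_pat - controlled) * np.log(1 - p))
post = np.exp(logpost - logpost.max())
post /= post.sum()
alpha_mean = np.sum(alphas * post)
```

The grid posterior concentrates tightly around the generating α; the paper's full model additionally marginalizes over inter-patient variability in α and the repopulation parameters, which is where MCMC and the quadrature over α become necessary.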
The Sapir-Whorf Hypothesis and Probabilistic Inference: Evidence from the Domain of Color
Austerweil, Joseph L.; Griffiths, Thomas L.; Regier, Terry
2016-01-01
The Sapir-Whorf hypothesis holds that our thoughts are shaped by our native language, and that speakers of different languages therefore think differently. This hypothesis is controversial in part because it appears to deny the possibility of a universal groundwork for human cognition, and in part because some findings taken to support it have not reliably replicated. We argue that considering this hypothesis through the lens of probabilistic inference has the potential to resolve both issues, at least with respect to certain prominent findings in the domain of color cognition. We explore a probabilistic model that is grounded in a presumed universal perceptual color space and in language-specific categories over that space. The model predicts that categories will most clearly affect color memory when perceptual information is uncertain. In line with earlier studies, we show that this model accounts for language-consistent biases in color reconstruction from memory in English speakers, modulated by uncertainty. We also show, to our knowledge for the first time, that such a model accounts for influential existing data on cross-language differences in color discrimination from memory, both within and across categories. We suggest that these ideas may help to clarify the debate over the Sapir-Whorf hypothesis. PMID:27434643
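The model's central prediction has a simple conjugate-Gaussian form: the reconstruction is a precision-weighted average of the noisy perceptual trace and the language-specific category prior, so the category's pull grows with perceptual uncertainty. The hue values and variances below are illustrative, not the paper's stimuli:

```python
# Bayesian reconstruction of a color from memory: combine a noisy perceptual
# trace with a Gaussian category prior (hypothetical hue units).
def reconstruct(stimulus_hue, sigma_percept, cat_mean=200.0, cat_sd=15.0):
    trace = stimulus_hue  # take the noisy trace at its expected value
    w = (1 / sigma_percept**2) / (1 / sigma_percept**2 + 1 / cat_sd**2)
    return w * trace + (1 - w) * cat_mean  # posterior mean hue

hue = 180.0  # stimulus away from the category center
low_uncertainty = reconstruct(hue, sigma_percept=5.0)    # -> 182.0
high_uncertainty = reconstruct(hue, sigma_percept=20.0)  # biased further
```

When the percept is reliable the reconstruction stays near the stimulus (182 vs 180); when it is uncertain the category mean dominates, producing the language-consistent memory biases the paper reports.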
NASA Astrophysics Data System (ADS)
Leistedt, Boris; Hogg, David W.
2017-12-01
We present a hierarchical probabilistic model for improving geometric stellar distance estimates using color-magnitude information. This is achieved with a data-driven model of the color-magnitude diagram, not relying on stellar models but instead on the relative abundances of stars in color-magnitude cells, which are inferred from very noisy magnitudes and parallaxes. While the resulting noise-deconvolved color-magnitude diagram can be useful for a range of applications, we focus on deriving improved stellar distance estimates relying on both parallax and photometric information. We demonstrate the efficiency of this approach on the 1.4 million stars of the Gaia TGAS sample that also have AAVSO Photometric All Sky Survey magnitudes. Our hierarchical model has 4 million parameters in total, most of which are marginalized out numerically or analytically. We find that distance estimates are significantly improved for the noisiest parallaxes and densest regions of the color-magnitude diagram. In particular, the average distance signal-to-noise ratio (S/N) and uncertainty improve by 19% and 36%, respectively, with 8% of the objects improving in S/N by a factor greater than 2. This computationally efficient approach fully accounts for both parallax and photometric noise and is a first step toward a full hierarchical probabilistic model of the Gaia data.
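The shrinkage at the heart of this approach can be sketched for a single star. The hand-specified Gaussian prior below stands in for the inferred color-magnitude abundances, and all numbers (distance, parallax error) are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# One star: noisy parallax likelihood combined with a distance prior that
# stands in for the color-magnitude-based abundances inferred in the paper.
# Units: parallax in mas, distance in kpc.
true_distance = 1.0
parallax_err = 0.4                       # noisy TGAS-like parallax
parallax_obs = rng.normal(1.0 / true_distance, parallax_err)

d_grid = np.linspace(0.05, 5.0, 2000)
likelihood = np.exp(-0.5 * ((parallax_obs - 1.0 / d_grid) / parallax_err) ** 2)
prior = np.exp(-0.5 * ((d_grid - 1.1) / 0.3) ** 2)  # CMD-informed stand-in

posterior = likelihood * prior
posterior /= posterior.sum()
d_mean = np.sum(d_grid * posterior)
d_sd = np.sqrt(np.sum((d_grid - d_mean) ** 2 * posterior))
```

For low signal-to-noise parallaxes the likelihood alone is nearly uninformative (and naive inversion 1/ϖ is badly biased), so the photometrically informed prior drives the improvement in distance S/N; the hierarchical model learns that prior from the data instead of fixing it.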
Justice, N. B.; Sczesnak, A.; Hazen, T. C.; ...
2017-08-04
A central goal of microbial ecology is to identify and quantify the forces that lead to observed population distributions and dynamics. However, these forces, which include environmental selection, dispersal, and organism interactions, are often difficult to assess in natural environments. Here we present a method that links microbial community structures with selective and stochastic forces through highly replicated subsampling and enrichment of a single environmental inoculum. Specifically, groundwater from a well-studied natural aquifer was serially diluted and inoculated into nearly 1,000 aerobic and anaerobic nitrate-reducing cultures, and the final community structures were evaluated with 16S rRNA gene amplicon sequencing. We analyzed the frequency and abundance of individual operational taxonomic units (OTUs) to understand how probabilistic immigration, relative fitness differences, environmental factors, and organismal interactions contributed to divergent distributions of community structures. We further used a most probable number (MPN) method to estimate the natural condition-dependent cultivable abundance of each of the nearly 400 OTUs cultivated in our study and infer the relative fitness of each. Additionally, we infer condition-specific organism interactions and discuss how this high-replicate culturing approach is essential in dissecting the interplay between overlapping ecological forces and taxon-specific attributes that underpin microbial community assembly. IMPORTANCE Through highly replicated culturing, in which inocula are subsampled from a single environmental sample, we empirically determine how selective forces, interspecific interactions, relative fitness, and probabilistic dispersal shape bacterial communities.
These methods offer a novel approach to untangle not only interspecific interactions but also taxon-specific fitness differences that manifest across different cultivation conditions and lead to the selection and enrichment of specific organisms. Additionally, we provide a method for estimating the number of cultivable units of each OTU in the original sample through the MPN approach.
Bayesian networks improve causal environmental assessments for evidence-based policy
Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the p...
Active Inference, homeostatic regulation and adaptive behavioural control.
Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl
2015-11-01
We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Bayesian outcome-based strategy classification.
Lee, Michael D
2016-03-01
Hilbig and Moshagen (Psychonomic Bulletin & Review, 21, 1431-1443, 2014) recently developed a method for making inferences about the decision processes people use in multi-attribute forced choice tasks. Their paper makes a number of worthwhile theoretical and methodological contributions. Theoretically, they provide an insightful psychological motivation for a probabilistic extension of the widely-used "weighted additive" (WADD) model, and show how this model, as well as other important models like "take-the-best" (TTB), can and should be expressed in terms of meaningful priors. Methodologically, they develop an inference approach based on the Minimum Description Length (MDL) principles that balances both the goodness-of-fit and complexity of the decision models they consider. This paper aims to preserve these useful contributions, but provide a complementary Bayesian approach with some theoretical and methodological advantages. We develop a simple graphical model, implemented in JAGS, that allows for fully Bayesian inferences about which models people use to make decisions. To demonstrate the Bayesian approach, we apply it to the models and data considered by Hilbig and Moshagen (Psychonomic Bulletin & Review, 21, 1431-1443, 2014), showing how a prior predictive analysis of the models, and posterior inferences about which models people use and the parameter settings at which they use them, can contribute to our understanding of human decision making.
Ocampo-Duque, William; Osorio, Carolina; Piamba, Christian; Schuhmacher, Marta; Domingo, José L
2013-02-01
The integration of water quality monitoring variables is essential in environmental decision making. Nowadays, advanced techniques to manage subjectivity, imprecision, uncertainty, vagueness, and variability are required in such a complex evaluation process. We here propose a probabilistic fuzzy hybrid model to assess river water quality. Fuzzy logic reasoning has been used to compute a water quality integrative index. By applying a Monte Carlo technique, based on non-parametric probability distributions, the randomness of model inputs was estimated. Annual histograms of nine water quality variables were built with monitoring data systematically collected in the Colombian Cauca River, and probability density estimations using the kernel smoothing method were applied to fit data. Several years were assessed, and river sectors upstream and downstream of the city of Santiago de Cali, a big city with basic wastewater treatment and high industrial activity, were analyzed. The probabilistic fuzzy water quality index was able to explain the reduction in water quality as the river receives a larger number of agriculture, domestic, and industrial effluents. The results of the hybrid model were compared to traditional water quality indexes. The main advantage of the proposed method is that it considers flexible boundaries between the linguistic qualifiers used to define the water status; the belongingness of water quality to the diverse output fuzzy sets or classes is provided with percentiles and histograms, which allows a better classification of the real water condition. The results of this study show that fuzzy inference systems integrated with stochastic non-parametric techniques may be used as complementary tools in water quality indexing methodologies. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zunino, Andrea; Mosegaard, Klaus
2017-04-01
Sought-after reservoir properties are linked only indirectly to the observable geophysical data which are recorded at the earth's surface. In this framework, seismic data represent one of the most reliable tools to study the structure and properties of the subsurface for natural resources. Nonetheless, seismic analysis is not an end in itself, as physical properties such as porosity are often of more interest for reservoir characterization. As such, inference of those properties implies also taking into account rock physics models linking porosity and other physical properties to elastic parameters. In the framework of seismic reflection data, we address this challenge for a reservoir target zone employing a probabilistic method characterized by a multi-step complex nonlinear forward modeling that combines: 1) a rock physics model with 2) the solution of full Zoeppritz equations and 3) a convolutional seismic forward modeling. The target property of this work is porosity, which is inferred using a Monte Carlo approach where porosity models, i.e., solutions to the inverse problem, are directly sampled from the posterior distribution. From a theoretical point of view, the Monte Carlo strategy can be particularly useful in the presence of nonlinear forward models, which is often the case when employing sophisticated rock physics models and full Zoeppritz equations, and to estimate related uncertainty. However, the resulting computational challenge is huge. We propose to alleviate this computational burden by assuming some smoothness of the subsurface parameters and consequently parameterizing the model in terms of spline bases. This allows a certain flexibility, in that the number of spline bases, and hence the resolution in each spatial direction, can be controlled. The method is tested on a 3-D synthetic case and on a 2-D real data set.
Wen, Shihua; Zhang, Lanju; Yang, Bo
2014-07-01
The Problem formulation, Objectives, Alternatives, Consequences, Trade-offs, Uncertainties, Risk attitude, and Linked decisions (PrOACT-URL) framework and multiple criteria decision analysis (MCDA) have been recommended by the European Medicines Agency for structured benefit-risk assessment of medicinal products undergoing regulatory review. The objective of this article was to provide solutions to incorporate the uncertainty from clinical data into the MCDA model when evaluating the overall benefit-risk profiles among different treatment options. Two statistical approaches, the δ-method approach and the Monte-Carlo approach, were proposed to construct the confidence interval of the overall benefit-risk score from the MCDA model as well as other probabilistic measures for comparing the benefit-risk profiles between treatment options. Both approaches can incorporate the correlation structure between clinical parameters (criteria) in the MCDA model and are straightforward to implement. The two proposed approaches were applied to a case study to evaluate the benefit-risk profile of an add-on therapy for rheumatoid arthritis (drug X) relative to placebo. It demonstrated a straightforward way to quantify the impact of the uncertainty from clinical data on the benefit-risk assessment and enabled statistical inference on evaluating the overall benefit-risk profiles among different treatment options. The δ-method approach provides a closed form to quantify the variability of the overall benefit-risk score in the MCDA model, whereas the Monte-Carlo approach is more computationally intensive but can yield its true sampling distribution for statistical inference. The obtained confidence intervals and other probabilistic measures from the two approaches enhance the benefit-risk decision making of medicinal products. Copyright © 2014 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
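The Monte-Carlo approach can be sketched with made-up numbers (the weights, normalized criterion scores, and covariance below are illustrative, not drug X data): criterion estimates are drawn from a multivariate normal reflecting their sampling covariance, and the MCDA overall score is a fixed weighted sum.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical MCDA setup: three normalized benefit/risk criteria with
# elicited weights, and a covariance matrix capturing the correlated
# sampling uncertainty of the clinical estimates.
weights = np.array([0.5, 0.3, 0.2])
estimates = np.array([0.62, 0.80, 0.55])
cov = np.array([[0.0025, 0.0010, 0.0],
                [0.0010, 0.0016, 0.0],
                [0.0,    0.0,    0.0009]])

# Monte-Carlo: propagate the criterion uncertainty through the weighted sum.
draws = rng.multivariate_normal(estimates, cov, size=20000)
scores = draws @ weights

point = estimates @ weights                    # point estimate of overall score
ci = np.percentile(scores, [2.5, 97.5])        # 95% interval of overall score
p_above = (scores > 0.6).mean()                # e.g., P(score exceeds threshold)
```

The δ-method alternative replaces the sampling step with the closed-form variance wᵀΣw of the linear score; the Monte-Carlo version generalizes more easily to nonlinear value functions and probabilistic comparisons between treatments.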
LEGO-MM: LEarning structured model by probabilistic loGic Ontology tree for MultiMedia.
Tang, Jinhui; Chang, Shiyu; Qi, Guo-Jun; Tian, Qi; Rui, Yong; Huang, Thomas S
2016-09-22
Recent advances in multimedia ontology have resulted in a number of concept models, e.g., LSCOM and Mediamill 101, which are public and accessible to other researchers. However, most current research effort still focuses on building new concepts from scratch; very little work explores how to construct new concepts upon the existing models already in the warehouse. To address this issue, we propose a new framework in this paper, termed LEGO-MM, which can seamlessly integrate both new target training examples and existing primitive concept models to infer more complex concept models. LEGO-MM treats the primitive concept models as Lego bricks from which to construct a potentially unlimited vocabulary of new concepts. Specifically, we first formulate logic operations as the Lego connectors that combine existing concept models hierarchically in probabilistic logic ontology trees. Then, we incorporate new target training information simultaneously to efficiently disambiguate the underlying logic tree and correct error propagation. Extensive experiments are conducted on a large vehicle-domain data set from ImageNet. The results demonstrate that LEGO-MM has significantly superior performance over existing state-of-the-art methods, which build new concept models from scratch.
Fast and Scalable Gaussian Process Modeling with Applications to Astronomical Time Series
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel; Agol, Eric; Ambikasaran, Sivaram; Angus, Ruth
2017-12-01
The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large data sets. Gaussian processes (GPs) are a popular class of models used for this purpose, but since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small data sets. In this paper, we present a novel method for GP modeling in one dimension where the computational requirements scale linearly with the size of the data set. We demonstrate the method by applying it to simulated and real astronomical time series data sets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically driven damped harmonic oscillators—providing a physical motivation for and interpretation of this choice—but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable GP methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.
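The covariance form the abstract describes can be sketched directly. The snippet below builds a dense covariance matrix from a single exponential-times-cosine term (one stochastically driven damped oscillator, with illustrative parameters) and checks that it is a valid kernel; the paper's contribution is the O(N) solver for exactly this structure, which the naive dense construction here does not attempt.

```python
import numpy as np

def mixture_kernel(tau, terms):
    """k(tau) = sum_j a_j * exp(-c_j * |tau|) * cos(d_j * tau)
    (the real part of a mixture of complex exponentials)."""
    tau = np.abs(tau)
    return sum(a * np.exp(-c * tau) * np.cos(d * tau) for a, c, d in terms)

# One damped-oscillator term with invented (a, c, d) parameters.
terms = [(1.0, 0.3, 2.0)]

t = np.linspace(0, 20, 200)            # unevenly spaced points would work too
K = mixture_kernel(t[:, None] - t[None, :], terms)
K[np.diag_indices_from(K)] += 1e-6     # small white-noise jitter for stability

# Positive definiteness check and one GP sample path from this covariance:
eigmin = np.linalg.eigvalsh(K).min()
rng = np.random.default_rng(1)
y = rng.multivariate_normal(np.zeros(len(t)), K)
print(eigmin > 0, y.shape)
```

The dense approach above costs O(N³) to factorize; the paper's method evaluates the same model with cost linear in N.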
NASA Astrophysics Data System (ADS)
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness, we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that, at the scale of observation, can be described by a linear reservoir. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with growing number of data points.
Hamiltonian Monte Carlo algorithms allow us to translate this inference problem into the problem of simulating the dynamics of a statistical mechanics system, giving us access to the most sophisticated methods developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automated manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
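The Hamiltonian Monte Carlo machinery the abstract refers to can be sketched in miniature. The code below runs leapfrog-based HMC on a toy one-dimensional Gaussian target rather than an SDE likelihood; the step size and trajectory length are illustrative choices, not tuned values from the paper.

```python
import numpy as np

def hmc_sample(logp_grad, x0, n_samples, eps=0.1, n_leap=20, seed=0):
    """Minimal Hamiltonian Monte Carlo for a differentiable log-density.
    logp_grad(x) must return (log p(x), d log p / dx)."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, float))
    logp, grad = logp_grad(x)
    out = []
    for _ in range(n_samples):
        p = rng.standard_normal(x.shape)            # resample momentum
        x_new, grad_new = x.copy(), grad
        h0 = -logp + 0.5 * p @ p                    # initial Hamiltonian
        # Leapfrog integration of the Hamiltonian dynamics:
        p = p + 0.5 * eps * grad_new
        for i in range(n_leap):
            x_new = x_new + eps * p
            logp_new, grad_new = logp_grad(x_new)
            if i < n_leap - 1:
                p = p + eps * grad_new
        p = p + 0.5 * eps * grad_new
        h1 = -logp_new + 0.5 * p @ p
        if rng.random() < np.exp(min(0.0, h0 - h1)):  # Metropolis correction
            x, logp, grad = x_new, logp_new, grad_new
        out.append(x.copy())
    return np.array(out)

# Toy target: standard normal, log p = -x^2/2, gradient = -x.
draws = hmc_sample(lambda x: (-0.5 * x @ x, -x), x0=[3.0], n_samples=3000)
print(draws.mean(), draws.std())
```

In the paper's setting the target is the joint posterior over parameters and the discretized SDE path, and the gradients come from automated differentiation rather than a hand-written lambda.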
NASA Astrophysics Data System (ADS)
Closas, Pau; Guillamon, Antoni
2017-12-01
This paper deals with the problem of inferring the signals and parameters that cause neural activity to occur. While the ultimate challenge is to unveil the brain's connectivity, here we focus on a microscopic vision of the problem, where single neurons (potentially connected to a network of peers) are at the core of our study. The sole observations available are noisy, sampled voltage traces obtained from intracellular recordings. We design algorithms and inference methods using the tools provided by stochastic filtering that allow a probabilistic interpretation and treatment of the problem. Using particle filtering, we are able to reconstruct traces of voltages and estimate the time course of auxiliary variables. By extending the algorithm through PMCMC methodology, we are able to estimate hidden physiological parameters as well, such as intrinsic conductances or reversal potentials. Last, but not least, the method is applied to estimate synaptic conductances arriving at a target cell, thus reconstructing the synaptic excitatory/inhibitory input traces. Notably, these estimates achieve the theoretical lower bounds even in spiking regimes.
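The stochastic-filtering idea can be sketched with a bootstrap particle filter on a toy linear-Gaussian surrogate for a voltage trace. All model constants below are invented for illustration; the paper's models are nonlinear conductance-based neuron models, not this AR(1) stand-in.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy state-space model standing in for a membrane-voltage trace:
#   x_t = 0.95 * x_{t-1} + process noise,   y_t = x_t + measurement noise
T, a, q, r = 200, 0.95, 0.5, 1.0
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = a * x_true[t - 1] + rng.normal(0, np.sqrt(q))
y = x_true + rng.normal(0, np.sqrt(r), T)

# Bootstrap particle filter: propagate, weight by likelihood, resample.
N = 500
particles = rng.normal(0, 1, N)
x_est = np.zeros(T)
for t in range(T):
    particles = a * particles + rng.normal(0, np.sqrt(q), N)  # propagate
    logw = -0.5 * (y[t] - particles) ** 2 / r                 # log-likelihood
    w = np.exp(logw - logw.max())
    w /= w.sum()
    x_est[t] = w @ particles                                  # posterior mean
    particles = rng.choice(particles, size=N, p=w)            # resample

rmse = np.sqrt(np.mean((x_est - x_true) ** 2))
print(rmse)
```

The filtered estimate should track the hidden state more closely than the raw observations do; wrapping a sampler over static parameters around this filter is the essence of the PMCMC extension mentioned above.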
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect of developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn, in an unsupervised manner, the behavior of individuals and of the crowd as a single entity, and to explore the relation between behavior and emotions in order to infer emotional states. Information about the motion patterns of individuals is described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states, yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance to existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
Perceptual Decision-Making as Probabilistic Inference by Neural Sampling.
Haefner, Ralf M; Berkes, Pietro; Fiser, József
2016-05-04
We address two main challenges facing systems neuroscience today: understanding the nature and function of cortical feedback between sensory areas and of correlated variability. Starting from the old idea of perception as probabilistic inference, we show how to use knowledge of the psychophysical task to make testable predictions for the influence of feedback signals on early sensory representations. Applying our framework to a two-alternative forced choice task paradigm, we can explain multiple empirical findings that have been hard to account for by the traditional feedforward model of sensory processing, including the task dependence of neural response correlations and the diverging time courses of choice probabilities and psychophysical kernels. Our model makes new predictions and characterizes a component of correlated variability that represents task-related information rather than performance-degrading noise. It demonstrates a normative way to integrate sensory and cognitive components into physiologically testable models of perceptual decision-making. Copyright © 2016 Elsevier Inc. All rights reserved.
An improved probabilistic account of counterfactual reasoning.
Lucas, Christopher G; Kemp, Charles
2015-10-01
When people want to identify the causes of an event, assign credit or blame, or learn from their mistakes, they often reflect on how things could have gone differently. In this kind of reasoning, one considers a counterfactual world in which some events are different from their real-world counterparts and considers what else would have changed. Researchers have recently proposed several probabilistic models that aim to capture how people do (or should) reason about counterfactuals. We present a new model and show that it accounts better for human inferences than several alternative models. Our model builds on the work of Pearl (2000), and extends his approach in a way that accommodates backtracking inferences and that acknowledges the difference between counterfactual interventions and counterfactual observations. We present 6 new experiments and analyze data from 4 experiments carried out by Rips (2010), and the results suggest that the new model provides an accurate account of both mean human judgments and the judgments of individuals. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
The probability heuristics model of syllogistic reasoning.
Chater, N; Oaksford, M
1999-03-01
A probability heuristics model (PHM) for syllogistic reasoning is proposed. An informational ordering over quantified statements suggests simple probability-based heuristics for syllogistic reasoning. The most important is the "min-heuristic": choose the type of the least informative premise as the type of the conclusion. The rationality of this heuristic is confirmed by an analysis of the probabilistic validity of syllogistic reasoning which treats logical inference as a limiting case of probabilistic inference. A meta-analysis of past experiments reveals close fits with PHM. PHM also compares favorably with alternative accounts, including mental logics, mental models, and deduction as verbal reasoning. Crucially, PHM extends naturally to generalized quantifiers, such as Most and Few, which have not been characterized logically and are, consequently, beyond the scope of current mental logic and mental model theories. Two experiments confirm the novel predictions of PHM when generalized quantifiers are used in syllogistic arguments. PHM suggests that syllogistic reasoning performance may be determined by simple but rational informational strategies justified by probability theory rather than by logic. Copyright 1999 Academic Press.
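The min-heuristic is simple enough to state as code. Assuming the informativeness ordering PHM reports, including the generalized quantifiers (All > Most > Few > Some > None > Some...not), a sketch:

```python
# Informativeness ordering over quantifiers, most to least informative,
# as assumed by the probability heuristics model (PHM).
INFORMATIVENESS = ["All", "Most", "Few", "Some", "None", "Some...not"]

def min_conclusion(premise1: str, premise2: str) -> str:
    """The min-heuristic: the conclusion takes the type of the LEAST
    informative premise (i.e., the one latest in the ordering)."""
    return max(premise1, premise2, key=INFORMATIVENESS.index)

print(min_conclusion("All", "Some"))   # least informative of the two: Some
print(min_conclusion("Most", "None"))  # least informative of the two: None
```

This is only the headline heuristic; PHM also includes attachment and max-heuristics that this sketch omits.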
Relevance and Reason Relations.
Skovgaard-Olsen, Niels; Singmann, Henrik; Klauer, Karl Christoph
2017-05-01
This paper examines precursors and consequents of perceived relevance of a proposition A for a proposition C. In Experiment 1, we test Spohn's (2012) assumption that ∆P = P(C|A) - P(C|~A) is a good predictor of ratings of perceived relevance and reason relations, and we examine whether it is a better predictor than the difference measure (P(C|A) - P(C)). In Experiment 2, we examine the effects of relevance on probabilistic coherence in Cruz, Baratgin, Oaksford, and Over's (2015) uncertain "and-to-if" inferences. The results suggest that ∆P predicts perceived relevance and reason relations better than the difference measure and that participants are either less probabilistically coherent in "and-to-if" inferences than initially assumed or that they do not follow P(if A, then C) = P(C|A) ("the Equation"). Results are discussed in light of recent results suggesting that the Equation may not hold under conditions of irrelevance or negative relevance. Copyright © 2016 Cognitive Science Society, Inc.
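The two competing relevance measures compared above are straightforward to compute from a joint probability table; the cell values below are invented for illustration.

```python
import numpy as np

# Joint distribution over A and C with illustrative cell probabilities.
# Rows: A vs ~A; columns: C vs ~C.
joint = np.array([[0.30, 0.10],    # P(A, C),  P(A, ~C)
                  [0.15, 0.45]])   # P(~A, C), P(~A, ~C)

p_c_given_a    = joint[0, 0] / joint[0].sum()   # P(C|A)
p_c_given_nota = joint[1, 0] / joint[1].sum()   # P(C|~A)
p_c            = joint[:, 0].sum()              # P(C)

delta_p    = p_c_given_a - p_c_given_nota       # ∆P (Spohn's measure)
difference = p_c_given_a - p_c                  # the difference measure

print(delta_p, difference)
```

For this table ∆P = 0.5 while the difference measure gives 0.3; the two diverge whenever P(A) differs from 1/2, which is what makes them empirically distinguishable as predictors of relevance judgments.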
Bayesian inference of radiation belt loss timescales.
NASA Astrophysics Data System (ADS)
Camporeale, E.; Chandorkar, M.
2017-12-01
Electron fluxes in the Earth's radiation belts are routinely studied using the classical quasi-linear radial diffusion model. Although this simplified linear equation has proven to be an indispensable tool in understanding the dynamics of the radiation belt, it requires specification of quantities such as the diffusion coefficient and electron loss timescales that are never directly measured. Researchers have so far assumed a priori parameterisations for radiation belt quantities and derived the best fit using satellite data. The state of the art in this domain lacks a coherent formulation of this problem in a probabilistic framework. We present some recent progress that we have made in performing Bayesian inference of radial diffusion parameters. We achieve this by making extensive use of the theory connecting Gaussian processes and linear partial differential equations, and performing Markov chain Monte Carlo sampling of radial diffusion parameters. These results are important for understanding the role and the propagation of uncertainties in radiation belt simulations and, eventually, for providing a probabilistic forecast of energetic electron fluxes in a space weather context.
Systematic parameter inference in stochastic mesoscopic modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lei, Huan; Yang, Xiu; Li, Zhen
2017-02-01
We propose a method to efficiently determine the optimal coarse-grained force field in mesoscopic stochastic simulations of Newtonian fluid and polymer melt systems modeled by dissipative particle dynamics (DPD) and energy-conserving dissipative particle dynamics (eDPD). The response surfaces of various target properties (viscosity, diffusivity, pressure, etc.) with respect to model parameters are constructed based on the generalized polynomial chaos (gPC) expansion using simulation results on sampling points (e.g., individual parameter sets). To alleviate the computational cost of evaluating the target properties, we employ the compressive sensing method to compute the coefficients of the dominant gPC terms, given the prior knowledge that the coefficients are "sparse". The proposed method shows comparable accuracy with the standard probabilistic collocation method (PCM) while imposing a much weaker restriction on the number of simulation samples, especially for systems with a high-dimensional parametric space. Full access to the response surfaces within the confidence range enables us to infer the optimal force parameters given the desirable values of target properties at the macroscopic scale. Moreover, it enables us to investigate the intrinsic relationship between the model parameters, identify possible degeneracies in the parameter space, and optimize the model by eliminating model redundancies. The proposed method provides an efficient alternative approach for constructing mesoscopic models by inferring model parameters to recover target properties of the physical systems (e.g., from experimental measurements), where those force field parameters and formulation cannot be derived at the microscopic level in a straightforward way.
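The compressive-sensing step, recovering sparse expansion coefficients from fewer samples than unknowns, can be sketched with a basic iterative soft-thresholding (ISTA) solver. The problem sizes, sparsity pattern, and coefficient values below are illustrative stand-ins for a gPC system, not the paper's setup.

```python
import numpy as np

# Sparse recovery: find sparse c with y ≈ Phi @ c from fewer samples than
# unknowns (here 40 samples, 100 gPC-like basis terms, 3 dominant terms).
rng = np.random.default_rng(3)
n_terms, n_samples = 100, 40
c_true = np.zeros(n_terms)
c_true[[3, 17, 42]] = [1.5, -2.0, 0.8]          # the "sparse" coefficients

Phi = rng.standard_normal((n_samples, n_terms)) / np.sqrt(n_samples)
y = Phi @ c_true

def ista(Phi, y, lam=0.01, n_iter=5000):
    """Iterative soft-thresholding for the l1-regularized least squares
    problem min_c 0.5*||Phi c - y||^2 + lam*||c||_1."""
    L = np.linalg.norm(Phi, 2) ** 2              # Lipschitz constant of grad
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        g = c - (Phi.T @ (Phi @ c - y)) / L      # gradient step
        c = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
    return c

c_hat = ista(Phi, y)
support = np.flatnonzero(np.abs(c_hat) > 0.1)
print(support, np.round(c_hat[support], 2))
```

With 40 samples and only 3 active terms, the l1 solver recovers the dominant coefficients, which is the weaker sample-count restriction the abstract contrasts with standard PCM.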
Proceedings, Seminar on Probabilistic Methods in Geotechnical Engineering
NASA Astrophysics Data System (ADS)
Hynes-Griffin, M. E.; Buege, L. L.
1983-09-01
Contents: Applications of Probabilistic Methods in Geotechnical Engineering; Probabilistic Seismic and Geotechnical Evaluation at a Dam Site; Probabilistic Slope Stability Methodology; Probability of Liquefaction in a 3-D Soil Deposit; Probabilistic Design of Flood Levees; Probabilistic and Statistical Methods for Determining Rock Mass Deformability Beneath Foundations: An Overview; Simple Statistical Methodology for Evaluating Rock Mechanics Exploration Data; New Developments in Statistical Techniques for Analyzing Rock Slope Stability.
The Emergence of Organizing Structure in Conceptual Representation
ERIC Educational Resources Information Center
Lake, Brenden M.; Lawrence, Neil D.; Tenenbaum, Joshua B.
2018-01-01
Both scientists and children make important structural discoveries, yet their computational underpinnings are not well understood. Structure discovery has previously been formalized as probabilistic inference about the right structural form--where form could be a tree, ring, chain, grid, etc. (Kemp & Tenenbaum, 2008). Although this approach…
RANGE AND DENSITY OF ALIEN FISH IN WESTERN STREAMS AND RIVERS, US
Alien fish have become increasingly prevalent in Western U.S. waters. The EPA Environmental Monitoring and Assessment Program's Western Pilot (12 western states), which is based upon a probabilistic design, provides an opportunity to make inferences about the range and density of...
On Modeling of If-Then Rules for Probabilistic Inference
1993-02-01
conditionals b -- a. This space contains A strictly. Contrary to a statement in Gilio and Spezzaferri (1992), these conditionals are equivalent to...Wiley, N.Y. [4] Gilio, A. and Spezzaferri, F. (1992). Knowledge integration for conditional probability assessments. Proceedings 8th Conf. Uncertainty
A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures
Shiraishi, Yuichi; Tremmel, Georg; Miyano, Satoru; Stephens, Matthew
2015-01-01
Recent advances in sequencing technologies have enabled the production of massive amounts of data on somatic mutations from cancer genomes. These data have led to the detection of characteristic patterns of somatic mutations or “mutation signatures” at an unprecedented resolution, with the potential for new insights into the causes and mechanisms of tumorigenesis. Here we present new methods for modelling, identifying and visualizing such mutation signatures. Our methods greatly simplify mutation signature models compared with existing approaches, reducing the number of parameters by orders of magnitude even while increasing the contextual factors (e.g. the number of flanking bases) that are accounted for. This improves both sensitivity and robustness of inferred signatures. We also provide a new intuitive way to visualize the signatures, analogous to the use of sequence logos to visualize transcription factor binding sites. We illustrate our new method on somatic mutation data from urothelial carcinoma of the upper urinary tract, and a larger dataset from 30 diverse cancer types. The results illustrate several important features of our methods, including the ability of our new visualization tool to clearly highlight the key features of each signature, the improved robustness of signature inferences from small sample sizes, and more detailed inference of signature characteristics such as strand biases and sequence context effects at the base two positions 5′ to the mutated site. The overall framework of our work is based on probabilistic models that are closely connected with “mixed-membership models” which are widely used in population genetic admixture analysis, and in machine learning for document clustering. We argue that recognizing these relationships should help improve understanding of mutation signature extraction problems, and suggests ways to further improve the statistical methods. 
Our methods are implemented in an R package pmsignature (https://github.com/friend1ws/pmsignature) and a web application available at https://friend1ws.shinyapps.io/pmsignature_shiny/. PMID:26630308
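One concrete sense in which the model above "reduces the number of parameters by orders of magnitude" can be sketched by counting free parameters in a full categorical model over every mutation-context combination versus an independent-position factorization. The 6 substitution types and 4 bases per flanking position are standard; the counting itself is a simplified reading of the paper's model, ignoring strand and mixture-weight parameters.

```python
# Free parameters per signature, for `flank` bases on each side of the
# mutated site.

def full_model_params(flank: int) -> int:
    """Full categorical model: one probability per context
    (6 substitution types x 4 bases per flanking position), minus 1
    for normalization."""
    return 6 * 4 ** (2 * flank) - 1

def independent_model_params(flank: int) -> int:
    """Independent-position model: a separate distribution for the
    substitution type (6-1 free parameters) and for each flanking
    position (4-1 each)."""
    return (6 - 1) + 2 * flank * (4 - 1)

for flank in (1, 2, 3):
    print(flank, full_model_params(flank), independent_model_params(flank))
```

With one flanking base the full model already needs 95 parameters (the familiar 96 categories) against 11 for the factorized model, and the gap widens exponentially as more context is added, which is what makes wider contexts tractable.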
Natural parameter values for generalized gene adjacency.
Yang, Zhenyu; Sankoff, David
2010-09-01
Given the gene orders in two modern genomes, it may be difficult to decide if some genes are close enough in both genomes to infer some ancestral proximity or some functional relationship. Current methods all depend on arbitrary parameters. We explore a class of gene proximity criteria and find two kinds of natural values for their parameters. One kind has to do with the parameter value where the expected information contained in two genomes about each other is maximized. The other kind of natural value has to do with parameter values beyond which all genes are clustered. We analyze these using combinatorial and probabilistic arguments as well as simulations.
Yu, Yun; Degnan, James H.; Nakhleh, Luay
2012-01-01
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. PMID:22536161
Using speakers' referential intentions to model early cross-situational word learning.
Frank, Michael C; Goodman, Noah D; Tenenbaum, Joshua B
2009-05-01
Word learning is a "chicken and egg" problem. If a child could understand speakers' utterances, it would be easy to learn the meanings of individual words, and once a child knows what many words mean, it is easy to infer speakers' intended meanings. To the beginning learner, however, both individual word meanings and speakers' intentions are unknown. We describe a computational model of word learning that solves these two inference problems in parallel, rather than relying exclusively on either the inferred meanings of utterances or cross-situational word-meaning associations. We tested our model using annotated corpus data and found that it inferred pairings between words and object concepts with higher precision than comparison models. Moreover, as the result of making probabilistic inferences about speakers' intentions, our model explains a variety of behavioral phenomena described in the word-learning literature. These phenomena include mutual exclusivity, one-trial learning, cross-situational learning, the role of words in object individuation, and the use of inferred intentions to disambiguate reference.
Essays on inference in economics, competition, and the rate of profit
NASA Astrophysics Data System (ADS)
Scharfenaker, Ellis S.
This dissertation comprises three papers that demonstrate the role of Bayesian methods of inference and Shannon's information theory in classical political economy. The first chapter explores the empirical distribution of profit rate data from North American firms from 1962-2012. This chapter addresses the fact that existing methods for sample selection from noisy profit rate data in the industrial organization field of economics tend to be conditional on a covariate's value, which risks discarding information. Conditioning sample selection instead on the profit rate data's structure, by means of a two-component (signal and noise) Bayesian mixture model, we find the profit rate sample to be time-stationary Laplace distributed, corroborating earlier estimates of cross-section distributions. The second chapter compares alternative probabilistic approaches to discrete (quantal) choice analysis and examines the various ways in which they overlap. In particular, the work on individual choice behavior by Duncan Luce, and the extension of this work to quantal response problems by game theorists, is shown to be related both to the rational inattention work of Christopher Sims through Shannon's information theory and to the maximum entropy principle of inference proposed by the physicist Edwin T. Jaynes. In the third chapter I propose a model of "classically" competitive firms facing informational entropy constraints in their decisions to enter or exit markets based on profit rate differentials. The result is a three-parameter logit quantal response distribution for firm entry and exit decisions. Bayesian methods are used for inference on the distribution of entry and exit decisions conditional on profit rate deviations, and firm-level data from Compustat are used to test these predictions.
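The signal-plus-noise mixture idea in the first chapter can be sketched with a deliberately simplified stand-in: a Gaussian signal component plus diffuse uniform noise, separated by plain EM rather than the dissertation's Bayesian treatment of a Laplace-shaped signal. All numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative "profit rate" data: a concentrated signal component plus
# diffuse noise observations.
signal = rng.normal(0.10, 0.05, 900)
noise = rng.uniform(-1.0, 1.0, 100)
x = np.concatenate([signal, noise])

# EM for the mixture: x ~ pi * Normal(mu, sigma) + (1 - pi) * Uniform(-1, 1)
pi_, mu, sigma = 0.5, 0.0, 0.3
for _ in range(200):
    # E-step: responsibility of the signal component for each observation.
    norm_pdf = (np.exp(-0.5 * ((x - mu) / sigma) ** 2)
                / (sigma * np.sqrt(2 * np.pi)))
    resp = pi_ * norm_pdf / (pi_ * norm_pdf + (1 - pi_) * 0.5)
    # M-step: responsibility-weighted parameter updates.
    pi_ = resp.mean()
    mu = (resp * x).sum() / resp.sum()
    sigma = np.sqrt((resp * (x - mu) ** 2).sum() / resp.sum())

print(round(pi_, 2), round(mu, 3), round(sigma, 3))
```

The point of the construction is that sample selection is conditioned on the data's own structure: observations are weighted by their posterior probability of belonging to the signal component rather than filtered on a covariate threshold.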
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morton, April M; Piburn, Jesse O; McManamay, Ryan A
2017-01-01
Monte Carlo simulation is a popular numerical experimentation technique used in a range of scientific fields to obtain the statistics of unknown random output variables. Despite its widespread applicability, it can be difficult to infer required input probability distributions when they are related to population counts unknown at desired spatial resolutions. To overcome this challenge, we propose a framework that uses a dasymetric model to infer the probability distributions needed for a specific class of Monte Carlo simulations which depend on population counts.
Inferring social ties from geographic coincidences.
Crandall, David J; Backstrom, Lars; Cosley, Dan; Suri, Siddharth; Huttenlocher, Daniel; Kleinberg, Jon
2010-12-28
We investigate the extent to which social ties between people can be inferred from co-occurrence in time and space: Given that two people have been in approximately the same geographic locale at approximately the same time, on multiple occasions, how likely are they to know each other? Furthermore, how does this likelihood depend on the spatial and temporal proximity of the co-occurrences? Such issues arise in data originating in both online and offline domains as well as settings that capture interfaces between online and offline behavior. Here we develop a framework for quantifying the answers to such questions, and we apply this framework to publicly available data from a social media site, finding that even a very small number of co-occurrences can result in a high empirical likelihood of a social tie. We then present probabilistic models showing how such large probabilities can arise from a natural model of proximity and co-occurrence in the presence of social ties. In addition to providing a method for establishing some of the first quantifiable estimates of these measures, our findings have potential privacy implications, particularly for the ways in which social structures can be inferred from public online records that capture individuals' physical locations over time.
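The way a handful of co-occurrences can produce a large posterior probability of a tie can be sketched with a toy Bayes calculation. The prior, opportunity count, and per-opportunity likelihoods below are invented for illustration, not estimated from the paper's data; the only assumption is that a co-occurrence is far more likely per opportunity between friends than between strangers.

```python
from math import comb

prior = 1e-4                        # base rate of a tie between two users
p_friend, p_stranger = 5e-3, 1e-4   # per-opportunity co-occurrence prob.
n = 100                             # shared opportunities to co-occur

def tie_probability(k: int) -> float:
    """Posterior P(tie | k co-occurrences out of n opportunities)."""
    def binom_pmf(p: float) -> float:
        return comb(n, k) * p**k * (1 - p) ** (n - k)
    like_f, like_s = binom_pmf(p_friend), binom_pmf(p_stranger)
    return prior * like_f / (prior * like_f + (1 - prior) * like_s)

for k in range(4):
    print(k, round(tie_probability(k), 6))
```

Each co-occurrence multiplies the posterior odds by roughly the likelihood ratio p_friend/p_stranger, so even with a tiny prior a few co-occurrences push the posterior probability of a tie close to 1, which is the qualitative effect the paper reports.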
SIG-VISA: Signal-based Vertically Integrated Seismic Monitoring
NASA Astrophysics Data System (ADS)
Moore, D.; Mayeda, K. M.; Myers, S. C.; Russell, S.
2013-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software; however, while such detections may constitute a useful summary of station activity, they discard large amounts of information present in the original recorded signal. We present SIG-VISA (Signal-based Vertically Integrated Seismic Analysis), a system for seismic monitoring through Bayesian inference on seismic signals. By directly modeling the recorded signal, our approach incorporates additional information unavailable to detection-based methods, enabling higher sensitivity and more accurate localization using techniques such as waveform matching. SIG-VISA's Bayesian forward model of seismic signal envelopes includes physically-derived models of travel times and source characteristics as well as Gaussian process (kriging) statistical models of signal properties that combine interpolation of historical data with extrapolation of learned physical trends. Applying Bayesian inference, we evaluate the model on earthquakes as well as the 2009 DPRK test event, demonstrating a waveform matching effect as part of the probabilistic inference, along with results on event localization and sensitivity. In particular, we demonstrate increased sensitivity from signal-based modeling, in which the SIG-VISA signal model finds statistical evidence for arrivals even at stations for which the IMS station processing failed to register any detection.
NASA Astrophysics Data System (ADS)
Neri, Augusto; Bevilacqua, Andrea; Esposti Ongaro, Tomaso; Isaia, Roberto; Aspinall, Willy P.; Bisson, Marina; Flandoli, Franco; Baxter, Peter J.; Bertagnini, Antonella; Iannuzzi, Enrico; Orsucci, Simone; Pistolesi, Marco; Rosi, Mauro; Vitale, Stefano
2015-04-01
Campi Flegrei (CF) is an example of an active caldera containing densely populated settlements at very high risk of pyroclastic density currents (PDCs). We present here an innovative method for assessing background spatial PDC hazard in a caldera setting with probabilistic invasion maps conditional on the occurrence of an explosive event. The method encompasses the probabilistic assessment of potential vent opening positions, derived in the companion paper, combined with inferences about the spatial density distribution of PDC invasion areas from a simplified flow model, informed by reconstruction of deposits from eruptions in the last 15 ka. The flow model describes the PDC kinematics and accounts for main effects of topography on flow propagation. Structured expert elicitation is used to incorporate certain sources of epistemic uncertainty, and a Monte Carlo approach is adopted to produce a set of probabilistic hazard maps for the whole CF area. Our findings show that, in case of eruption, almost the entire caldera is exposed to invasion with a mean probability of at least 5%, with peaks greater than 50% in some central areas. Some areas outside the caldera are also exposed to this danger, with mean probabilities of invasion of the order of 5-10%. Our analysis suggests that these probability estimates have location-specific uncertainties which can be substantial. The results prove to be robust with respect to alternative elicitation models and allow the influence on hazard mapping of different sources of uncertainty, and of theoretical and numerical assumptions, to be quantified.
ERIC Educational Resources Information Center
Gorman, Kyle
2013-01-01
This dissertation outlines a program for the theory of phonotactics--the theory of speakers' knowledge of possible and impossible (or likely and unlikely) words--and argues that the alternative view of phonotactics as stochastic, and of phonotactic learning as probabilistic inference, is not capable of accounting for the facts of this domain.…
Statistical Knowledge and Learning in Phonology
ERIC Educational Resources Information Center
Dunbar, Ewan Michael
2013-01-01
This dissertation deals with the theory of the phonetic component of grammar in a formal probabilistic inference framework: (1) it has been recognized since the beginning of generative phonology that some language-specific phonetic implementation is actually context-dependent, and thus it can be said that there are gradient "phonetic…
A Tool Chain for the V and V of NASA Cryogenic Fuel Loading Health Management
2014-10-02
ERIC Educational Resources Information Center
Hourigan, Mairéad; Leavy, Aisling
2016-01-01
As part of Japanese Lesson study research focusing on "comparing and describing likelihoods", fifth grade elementary students used real-world data in decision-making. Sporting statistics facilitated opportunities for informal inference, where data were used to make and justify predictions.
Probabilistic and spatially variable niches inferred from demography
Jeffrey M. Diez; Itamar Giladi; Robert Warren; H. Ronald Pulliam
2014-01-01
Mismatches between species distributions and habitat suitability are predicted by niche theory and have important implications for forecasting how species may respond to environmental changes. Quantifying these mismatches is challenging, however, due to the high dimensionality of species niches and the large spatial and temporal variability in population...
Making inferences on risks to ecosystem services (ES) from ecological crises may be improved using decision science tools. Influence diagrams (IDs) are probabilistic networks that explicitly represent the decisions related to a problem and evidence of their influence on desired o...
NASA Technical Reports Server (NTRS)
Singhal, Surendra N.
2003-01-01
The SAE G-11 RMSL Division and Probabilistic Methods Committee meeting, sponsored by the Picatinny Arsenal during March 1-3, 2004 at the Westin Morristown, will report progress on projects for probabilistic assessment of Army systems and launch an initiative for probabilistic education. The meeting features several Army and industry senior executives and an Ivy League professor to provide an industry/government/academia forum to review RMSL technology; reliability and probabilistic technology; reliability-based design methods; software reliability; and maintainability standards. With over 100 members, including members of national and international standing, the mission of the G-11's Probabilistic Methods Committee is to enable and facilitate rapid deployment of probabilistic technology to enhance the competitiveness of our industries through better, faster, greener, smarter, affordable, and reliable product development.
NASA Astrophysics Data System (ADS)
Laloy, Eric; Beerten, Koen; Vanacker, Veerle; Christl, Marcus; Rogiers, Bart; Wouters, Laurent
2017-07-01
The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium), have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN) data, our approach has the following two innovative components: it (1) uses Markov chain Monte Carlo (MCMC) sampling and (2) accounts (under certain assumptions) for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP) parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc is moderately underestimated. A simple way of producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8.9 mm kyr-1 (1σ) is relatively large in comparison with landforms that erode under comparable (paleo-)climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better resolution of the inferred erosion rate and would permit including more uncertain parameters in the MCMC inversion.
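The MCMC inversion idea can be sketched with a toy random-walk Metropolis sampler. The forward model below is a made-up exponential depth profile, not the CRN production model used in the study; only the inference pattern (propose, compare log-posteriors, accept/reject) is the point:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy forward model: concentration decays with depth; a larger erosion-rate
# parameter lowers the accumulated concentration (hypothetical functional form).
def forward(erosion, depth, inherit=5e4, prod=1e5, attenuation=60.0):
    return inherit + prod / (1.0 + erosion) * np.exp(-depth / attenuation)

depth = np.array([10.0, 50.0, 100.0, 150.0])
true_rate = 0.4
obs = forward(true_rate, depth) * rng.normal(1.0, 0.02, size=depth.size)

def log_post(rate):
    # Gaussian likelihood with 2% relative error; flat prior on rate > 0.
    if rate <= 0:
        return -np.inf
    resid = (obs - forward(rate, depth)) / (0.02 * obs)
    return -0.5 * np.sum(resid ** 2)

# Random-walk Metropolis over the single erosion-rate parameter.
rate, samples = 1.0, []
for _ in range(5000):
    prop = rate + rng.normal(0, 0.05)
    if np.log(rng.uniform()) < log_post(prop) - log_post(rate):
        rate = prop
    samples.append(rate)

posterior_mean = np.mean(samples[1000:])  # discard burn-in
```

The chain's post-burn-in samples approximate the posterior over the erosion rate; in the real inversion the state would also include exposure age and inherited concentration.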
Cheng, Feon W; Gao, Xiang; Bao, Le; Mitchell, Diane C; Wood, Craig; Sliwinski, Martin J; Smiciklas-Wright, Helen; Still, Christopher D; Rolston, David D K; Jensen, Gordon L
2017-07-01
To examine the risk factors of developing functional decline and make probabilistic predictions by using a tree-based method that allows higher order polynomials and interactions of the risk factors. The conditional inference tree analysis, a data mining approach, was used to construct a risk stratification algorithm for developing functional limitation based on BMI and other potential risk factors for disability in 1,951 older adults without functional limitations at baseline (baseline age 73.1 ± 4.2 y). We also analyzed the data with multivariate stepwise logistic regression and compared the two approaches (e.g., cross-validation). Over a mean of 9.2 ± 1.7 years of follow-up, 221 individuals developed functional limitation. Higher BMI, age, and comorbidity were consistently identified as significant risk factors for functional decline using both methods. Based on these factors, individuals were stratified into four risk groups via the conditional inference tree analysis. Compared to the low-risk group, all other groups had a significantly higher risk of developing functional limitation. The odds ratio comparing two extreme categories was 9.09 (95% confidence interval: 4.68, 17.6). Higher BMI, age, and comorbid disease were consistently identified as significant risk factors for functional decline among older individuals across all approaches and analyses. © 2017 The Obesity Society.
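The risk-stratification idea can be sketched with a crude two-variable split on simulated data. The cohort, risk function, and split criterion below are all invented; in particular, the maximum-gap split is only a stand-in for the permutation tests that true conditional inference trees use:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated cohort: BMI, age, and a binary functional-limitation outcome
# whose probability rises with both predictors (purely illustrative).
n = 2000
bmi = rng.normal(27, 4, n)
age = rng.normal(73, 4, n)
risk = 1 / (1 + np.exp(-(0.15 * (bmi - 27) + 0.1 * (age - 73) - 2)))
outcome = rng.uniform(size=n) < risk

def best_split(x, y):
    # Pick the threshold maximizing the outcome-rate gap between sides.
    best_t, best_gap = None, 0.0
    for t in np.quantile(x, np.linspace(0.1, 0.9, 17)):
        gap = abs(y[x <= t].mean() - y[x > t].mean())
        if gap > best_gap:
            best_t, best_gap = t, gap
    return best_t

bmi_cut = best_split(bmi, outcome)
age_cut = best_split(age, outcome)
# Four risk strata from the two cuts, as in a depth-2 tree.
high = outcome[(bmi > bmi_cut) & (age > age_cut)].mean()
low = outcome[(bmi <= bmi_cut) & (age <= age_cut)].mean()
```

Comparing observed outcome rates between the extreme strata mirrors the odds-ratio comparison between the study's lowest- and highest-risk groups.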
Long, Chengjiang; Hua, Gang; Kapoor, Ashish
2015-01-01
We present a noise resilient probabilistic model for active learning of a Gaussian process classifier from crowds, i.e., a set of noisy labelers. It explicitly models both the overall label noise and the expertise level of each individual labeler with two levels of flip models. Expectation propagation is adopted for efficient approximate Bayesian inference of our probabilistic model for classification, based on which, a generalized EM algorithm is derived to estimate both the global label noise and the expertise of each individual labeler. The probabilistic nature of our model immediately allows the adoption of the prediction entropy for active selection of data samples to be labeled, and active selection of high quality labelers based on their estimated expertise to label the data. We apply the proposed model for four visual recognition tasks, i.e., object category recognition, multi-modal activity recognition, gender recognition, and fine-grained classification, on four datasets with real crowd-sourced labels from the Amazon Mechanical Turk. The experiments clearly demonstrate the efficacy of the proposed model. In addition, we extend the proposed model with the Predictive Active Set Selection Method to speed up the active learning system, whose efficacy is verified by conducting experiments on the first three datasets. The results show our extended model can not only preserve a higher accuracy, but also achieve a higher efficiency. PMID:26924892
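The entropy-based active selection used by the model can be shown in a few lines. The predictive probabilities below are hypothetical placeholders for a classifier's outputs on an unlabeled pool:

```python
import numpy as np

def entropy(p):
    # Binary predictive entropy; maximal when p is closest to 0.5.
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

# Hypothetical predictive probabilities on five unlabeled samples.
probs = np.array([0.95, 0.51, 0.10, 0.70, 0.48])
query = int(np.argmax(entropy(probs)))  # most uncertain sample is queried next
```

Here the sample with probability 0.51 is selected, since it sits closest to the decision boundary; the full system additionally routes the query to labelers with high estimated expertise.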
Inferring the gravitational potential of the Milky Way with a few precisely measured stars
DOE Office of Scientific and Technical Information (OSTI.GOV)
Price-Whelan, Adrian M.; Johnston, Kathryn V.; Hendel, David
2014-10-10
The dark matter halo of the Milky Way is expected to be triaxial and filled with substructure. It is hoped that streams or shells of stars produced by tidal disruption of stellar systems will provide precise measures of the gravitational potential to test these predictions. We develop a method for inferring the Galactic potential with tidal streams based on the idea that the stream stars were once close in phase space. Our method can flexibly adapt to any form for the Galactic potential: it works in phase-space rather than action-space and hence relies neither on our ability to derive actions nor on the integrability of the potential. Our model is probabilistic, with a likelihood function and priors on the parameters. The method can properly account for finite observational uncertainties and missing data dimensions. We test our method on synthetic data sets generated from N-body simulations of satellite disruption in a static, multi-component Milky Way, including a triaxial dark matter halo with observational uncertainties chosen to mimic current and near-future surveys of various stars. We find that with just eight well-measured stream stars, we can infer properties of a triaxial potential with precisions of the order of 5%-7%. Without proper motions, we obtain 10% constraints on most potential parameters and precisions around 5%-10% for recovering missing phase-space coordinates. These results are encouraging for the goal of using flexible, time-dependent potential models combined with larger data sets to unravel the detailed shape of the dark matter distribution around the Milky Way.
Balancing the stochastic description of uncertainties as a function of hydrologic model complexity
NASA Astrophysics Data System (ADS)
Del Giudice, D.; Reichert, P.; Albert, C.; Kalcic, M.; Logsdon Muenich, R.; Scavia, D.; Bosch, N. S.; Michalak, A. M.
2016-12-01
Uncertainty analysis is becoming an important component of forecasting water and pollutant fluxes in urban and rural environments. Properly accounting for errors in the modeling process can help to robustly assess the uncertainties associated with the inputs (e.g. precipitation) and outputs (e.g. runoff) of hydrological models. In recent years we have investigated several Bayesian methods to infer the parameters of a mechanistic hydrological model along with those of the stochastic error component. The latter describes the uncertainties of model outputs and possibly inputs. We have adapted our framework to a variety of applications, ranging from predicting floods in small stormwater systems to nutrient loads in large agricultural watersheds. Given practical constraints, we discuss how in general the number of quantities to infer probabilistically varies inversely with the complexity of the mechanistic model. Most often, when evaluating a hydrological model of intermediate complexity, we can infer the parameters of the model as well as of the output error model. Describing the output errors as a first order autoregressive process can realistically capture the "downstream" effect of inaccurate inputs and structure. With simpler runoff models we can additionally quantify input uncertainty by using a stochastic rainfall process. For complex hydrologic transport models, instead, we show that keeping model parameters fixed and just estimating time-dependent output uncertainties could be a viable option. The common goal across all these applications is to create time-dependent prediction intervals which are both reliable (cover the nominal amount of validation data) and precise (are as narrow as possible). In conclusion, we recommend focusing both on the choice of the hydrological model and of the probabilistic error description. 
The latter can include output uncertainty only, if the model is computationally expensive, or, with simpler models, it can separately account for different sources of errors, such as those in the inputs and in the structure of the model.
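The first-order autoregressive description of output errors mentioned above can be written down concretely. This is a generic AR(1) log-likelihood for model-output residuals, with made-up parameter values, not the authors' specific error model:

```python
import numpy as np

def ar1_loglik(residuals, phi, sigma):
    # Log-likelihood of residuals under e_t = phi * e_{t-1} + eta_t,
    # eta_t ~ N(0, sigma^2) (conditioning on the first residual).
    r = np.asarray(residuals)
    innov = r[1:] - phi * r[:-1]
    n = innov.size
    return (-0.5 * n * np.log(2 * np.pi * sigma ** 2)
            - 0.5 * np.sum(innov ** 2) / sigma ** 2)

# Simulated autocorrelated residuals, as produced by inaccurate inputs
# propagating "downstream" through a hydrological model.
rng = np.random.default_rng(3)
e = np.zeros(200)
for t in range(1, 200):
    e[t] = 0.8 * e[t - 1] + rng.normal(0, 0.1)

ll_ar = ar1_loglik(e, 0.8, 0.1)   # autocorrelated error description
ll_iid = ar1_loglik(e, 0.0, 0.1)  # independent-error assumption
```

For residuals like these, the AR(1) description scores far higher than an independent-error assumption, which is why it yields more reliable prediction intervals.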
Learning topic models by belief propagation.
Zeng, Jia; Cheung, William K; Liu, Jiming
2013-05-01
Latent Dirichlet allocation (LDA) is an important hierarchical Bayesian model for probabilistic topic modeling, which attracts worldwide interest and touches on many important applications in text mining, computer vision and computational biology. This paper represents the collapsed LDA as a factor graph, which enables the classic loopy belief propagation (BP) algorithm for approximate inference and parameter estimation. Although the two commonly used approximate inference methods, variational Bayes (VB) and collapsed Gibbs sampling (GS), have gained great success in learning LDA, the proposed BP is competitive in both speed and accuracy, as validated by encouraging experimental results on four large-scale document datasets. Furthermore, the BP algorithm has the potential to become a generic scheme for learning variants of LDA-based topic models in the collapsed space. To this end, we show how to learn two typical variants of LDA-based topic models, the author-topic model (ATM) and the relational topic model (RTM), using BP based on the factor graph representations.
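For context, the collapsed Gibbs sampling baseline that the paper's BP algorithm is compared against can be sketched on a toy corpus. The corpus, hyperparameters, and sweep count below are arbitrary; this illustrates inference in the collapsed space, not the proposed BP method itself:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy corpus of word ids; documents split into two obvious vocabularies.
docs = [[0, 1, 2, 0, 1], [0, 2, 1, 2], [3, 4, 5, 3, 4], [4, 5, 3, 5]]
K, V, alpha, beta = 2, 6, 0.1, 0.1

# Random initial topic assignment for every token, plus count tables.
z = [[rng.integers(K) for _ in d] for d in docs]
ndk = np.zeros((len(docs), K)); nkw = np.zeros((K, V)); nk = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]; ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

for _ in range(200):                      # collapsed Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # Collapsed conditional: doc-topic times topic-word factors.
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

phi = (nkw + beta) / (nk[:, None] + V * beta)  # smoothed topic-word estimates
```

BP replaces the token-by-token sampling with message passing over the same factor-graph structure, which is where its speed and accuracy gains come from.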
Bullock, Theodore Holmes
1970-01-01
The prevalent probabilistic view is virtually untestable; it remains a plausible belief. The cases usually cited cannot be taken as evidence for it. Several grounds for this conclusion are developed. Three issues are distinguished in an attempt to clarify a murky debate: (a) the utility of probabilistic methods in data reduction, (b) the value of models that assume indeterminacy, and (c) the validity of the inference that the nervous system is largely indeterministic at the neuronal level. No exception is taken to the first two; the second is a private heuristic question. The third is the issue to which the assertion in the first two sentences is addressed. Of the two kinds of uncertainty, statistical mechanical (= practical unpredictability) as in a gas, and Heisenbergian indeterminacy, the first certainly exists, the second is moot at the neuronal level. It would contribute to discussion to recognize that neurons perform with a degree of reliability. Although unreliability is difficult to establish, to say nothing of measure, evidence is increasing greatly that some neurons have a high degree of reliability in both connections and activity. An example is given from sternarchine electric fish. PMID:5462670
Application of dynamic uncertain causality graph in spacecraft fault diagnosis: Logic cycle
NASA Astrophysics Data System (ADS)
Yao, Quanying; Zhang, Qin; Liu, Peng; Yang, Ping; Zhu, Ma; Wang, Xiaochen
2017-04-01
Intelligent diagnosis systems are applied to fault diagnosis in spacecraft. The Dynamic Uncertain Causality Graph (DUCG) is a new probabilistic graphical model with many advantages. In the knowledge representation of spacecraft fault diagnosis, feedback among variables is frequently encountered, which may produce directed cyclic graphs (DCGs). Probabilistic graphical models (PGMs) such as the Bayesian network (BN) have been widely applied in uncertain causality representation and probabilistic reasoning, but BNs do not allow DCGs. In this paper, DUCG is applied to fault diagnosis in spacecraft, introducing an inference algorithm for the DUCG that handles feedback. DUCG has now been tested on 16 typical faults with 100% diagnosis accuracy.
Plasticity in probabilistic reaction norms for maturation in a salmonid fish.
Morita, Kentaro; Tsuboi, Jun-ichi; Nagasawa, Toru
2009-10-23
The relationship between body size and the probability of maturing, often referred to as the probabilistic maturation reaction norm (PMRN), has been increasingly used to infer genetic variation in maturation schedule. Despite this trend, few studies have directly evaluated plasticity in the PMRN. A transplant experiment using white-spotted charr demonstrated that the PMRN for precocious males exhibited plasticity. A smaller threshold size at maturity occurred in charr inhabiting narrow streams where more refuges are probably available for small charr, which in turn might enhance the reproductive success of sneaker precocious males. Our findings suggested that plastic effects should clearly be included in investigations of variation in PMRNs.
Moltke, Ida; Albrechtsen, Anders; Hansen, Thomas v.O.; Nielsen, Finn C.; Nielsen, Rasmus
2011-01-01
All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications—from answering questions about human evolution to locating regions in the human genome containing disease-causing variants. However, IBD regions can be difficult to detect, especially in the common case where no pedigree information is available. In particular, all existing non-pedigree based methods can only infer IBD sharing between two individuals. Here, we present a new Markov Chain Monte Carlo method for detection of IBD regions, which does not rely on any pedigree information. It is based on a probabilistic model applicable to unphased SNP data. It can take inbreeding, allele frequencies, genotyping errors, and genomic distances into account. And most importantly, it can simultaneously infer IBD sharing among multiple individuals. Through simulations, we show that the simultaneous modeling of multiple individuals makes the method more powerful and accurate than several other non-pedigree based methods. We illustrate the potential of the method by applying it to data from individuals with breast and/or ovarian cancer, and show that a known disease-causing mutation can be mapped to a 2.2-Mb region using SNP data from only five seemingly unrelated affected individuals. This would not be possible using classical linkage mapping or association mapping. PMID:21493780
NASA Technical Reports Server (NTRS)
Townsend, J.; Meyers, C.; Ortega, R.; Peck, J.; Rheinfurth, M.; Weinstock, B.
1993-01-01
Probabilistic structural analyses and design methods are steadily gaining acceptance within the aerospace industry. The safety factor approach to design has long been the industry standard, and it is believed by many to be overly conservative and thus, costly. A probabilistic approach to design may offer substantial cost savings. This report summarizes several probabilistic approaches: the probabilistic failure analysis (PFA) methodology developed by Jet Propulsion Laboratory, fast probability integration (FPI) methods, the NESSUS finite element code, and response surface methods. Example problems are provided to help identify the advantages and disadvantages of each method.
Gene network analysis: from heart development to cardiac therapy.
Ferrazzi, Fulvia; Bellazzi, Riccardo; Engel, Felix B
2015-03-01
Networks offer a flexible framework to represent and analyse the complex interactions between components of cellular systems. In particular gene networks inferred from expression data can support the identification of novel hypotheses on regulatory processes. In this review we focus on the use of gene network analysis in the study of heart development. Understanding heart development will promote the elucidation of the aetiology of congenital heart disease and thus possibly improve diagnostics. Moreover, it will help to establish cardiac therapies. For example, understanding cardiac differentiation during development will help to guide stem cell differentiation required for cardiac tissue engineering or to enhance endogenous repair mechanisms. We introduce different methodological frameworks to infer networks from expression data such as Boolean and Bayesian networks. Then we present currently available temporal expression data in heart development and discuss the use of network-based approaches in published studies. Collectively, our literature-based analysis indicates that gene network analysis constitutes a promising opportunity to infer therapy-relevant regulatory processes in heart development. However, the use of network-based approaches has so far been limited by the small amount of samples in available datasets. Thus, we propose to acquire high-resolution temporal expression data to improve the mathematical descriptions of regulatory processes obtained with gene network inference methodologies. Especially probabilistic methods that accommodate the intrinsic variability of biological systems have the potential to contribute to a deeper understanding of heart development.
NASA Astrophysics Data System (ADS)
Mandel, Kaisey; Kirshner, R. P.; Narayan, G.; Wood-Vasey, W. M.; Friedman, A. S.; Hicken, M.
2010-01-01
I have constructed a comprehensive statistical model for Type Ia supernova light curves spanning optical through near infrared data simultaneously. The near infrared light curves are found to be excellent standard candles (sigma(MH) = 0.11 +/- 0.03 mag) that are less vulnerable to systematic error from dust extinction, a major confounding factor for cosmological studies. A hierarchical statistical framework incorporates coherently multiple sources of randomness and uncertainty, including photometric error, intrinsic supernova light curve variations and correlations, dust extinction and reddening, peculiar velocity dispersion and distances, for probabilistic inference with Type Ia SN light curves. Inferences are drawn from the full probability density over individual supernovae and the SN Ia and dust populations, conditioned on a dataset of SN Ia light curves and redshifts. To compute probabilistic inferences with hierarchical models, I have developed BayeSN, a Markov Chain Monte Carlo algorithm based on Gibbs sampling. This code explores and samples the global probability density of parameters describing individual supernovae and the population. I have applied this hierarchical model to optical and near infrared data of over 100 nearby Type Ia SN from PAIRITEL, the CfA3 sample, and the literature. Using this statistical model, I find that SN with optical and NIR data have a smaller residual scatter in the Hubble diagram than SN with only optical data. The continued study of Type Ia SN in the near infrared will be important for improving their utility as precise and accurate cosmological distance indicators.
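The Gibbs-sampling strategy behind BayeSN can be illustrated on a drastically simplified hierarchical model. Everything here is a toy: one population mean for "absolute magnitudes" plus per-object measurement noise, with invented values, in place of the full model's dust, color, and distance structure:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated absolute magnitudes: population mean mu, intrinsic scatter tau,
# plus per-object measurement noise sig (all numbers hypothetical).
true_mu, tau, sig = -18.0, 0.11, 0.05
m_true = rng.normal(true_mu, tau, size=100)
m_obs = m_true + rng.normal(0, sig, size=100)

# Gibbs sampling alternates between per-object latents and the population mean.
mu = 0.0
mu_samples = []
for _ in range(2000):
    # Conditional of each latent magnitude given mu and its observation.
    prec = 1 / tau**2 + 1 / sig**2
    mean = (mu / tau**2 + m_obs / sig**2) / prec
    lat = rng.normal(mean, 1 / np.sqrt(prec))
    # Conditional of mu given the latents (flat prior on mu).
    mu = rng.normal(lat.mean(), tau / np.sqrt(lat.size))
    mu_samples.append(mu)

mu_hat = np.mean(mu_samples[500:])  # posterior mean after burn-in
```

Each update draws from an exact conditional distribution, which is what makes Gibbs sampling natural for hierarchical models whose individual-level and population-level parameters are conditionally conjugate.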
Phylogenetic inference under varying proportions of indel-induced alignment gaps
Dwivedi, Bhakti; Gadagkar, Sudhindra R
2009-01-01
Background: The effect of alignment gaps on phylogenetic accuracy has been the subject of numerous studies. In this study, we investigated the relationship between the total number of gapped sites and phylogenetic accuracy, when the gaps were introduced (by means of computer simulation) to reflect indel (insertion/deletion) events during the evolution of DNA sequences. The resulting (true) alignments were subjected to commonly used gap treatment and phylogenetic inference methods. Results: (1) In general, there was a strong – almost deterministic – relationship between the amount of gap in the data and the level of phylogenetic accuracy when the alignments were very "gappy", (2) gaps resulting from deletions (as opposed to insertions) contributed more to the inaccuracy of phylogenetic inference, (3) the probabilistic methods (Bayesian, PhyML and "MLε," a method implemented in DNAML in PHYLIP) performed better at most levels of gap percentage when compared to parsimony (MP) and distance (NJ) methods, with Bayesian analysis being clearly the best, (4) methods that treat gapped sites as missing data yielded less accurate trees when compared to those that attribute phylogenetic signal to the gapped sites (by coding them as binary character data – presence/absence, or as in the MLε method), and (5) in general, the accuracy of phylogenetic inference depended upon the amount of available data when the gaps resulted from mainly deletion events, and the amount of missing data when insertion events were equally likely to have caused the alignment gaps.
Conclusion When gaps in an alignment are a consequence of indel events in the evolution of the sequences, the accuracy of phylogenetic analysis is likely to improve if: (1) alignment gaps are categorized as arising from insertion events or deletion events and then treated separately in the analysis, (2) the evolutionary signal provided by indels is harnessed in the phylogenetic analysis, and (3) methods that utilize the phylogenetic signal in indels are developed for distance methods too. When the true homology is known and the amount of gaps is 20 percent of the alignment length or less, the methods used in this study are likely to yield trees with 90–100 percent accuracy. PMID:19698168
A Prize-Collecting Steiner Tree Approach for Transduction Network Inference
NASA Astrophysics Data System (ADS)
Bailly-Bechet, Marc; Braunstein, Alfredo; Zecchina, Riccardo
Inside the cell, information from the environment is mainly propagated via signaling pathways, which form a transduction network. Here we propose a new algorithm to infer transduction networks from heterogeneous data, using both the protein interaction network and expression datasets. We formulate the inference problem as an optimization task and develop a message-passing, probabilistic and distributed formalism to solve it. We apply our algorithm to the pheromone response in the baker's yeast S. cerevisiae. We are able to find the backbone of the known structure of the MAPK cascade of pheromone response, validating our algorithm. More importantly, we make biological predictions about some proteins whose role could be at the interface between pheromone response and other cellular functions.
Nichols, J.D.; Sauer, J.R.; Hines, J.E.; Boulinier, T.; Pollock, K.H.; Therres, Glenn D.
2001-01-01
Although many ecological monitoring programs are now in place, the use of resulting data to draw inferences about changes in biodiversity is problematic. The difficulty arises because of the inability to count all animals present in any sampled area. This inability results not only in underestimation of species richness but also in potentially misleading comparisons of species richness over time and space. We recommend the use of probabilistic estimators for estimating species richness and related parameters (e.g., rate of change in species richness, local extinction probability, local turnover, local colonization) when animal detection probabilities are <1. We illustrate these methods using data from the North American Breeding Bird Survey obtained along survey routes in Maryland. We also introduce software to implement these estimation methods.
A Computational Approach to Qualitative Analysis in Large Textual Datasets
Evans, Michael S.
2014-01-01
In this paper I introduce computational techniques to extend qualitative analysis into the study of large textual datasets. I demonstrate these techniques by using probabilistic topic modeling to analyze a broad sample of 14,952 documents published in major American newspapers from 1980 through 2012. I show how computational data mining techniques can identify and evaluate the significance of qualitatively distinct subjects of discussion across a wide range of public discourse. I also show how examining large textual datasets with computational methods can overcome methodological limitations of conventional qualitative methods, such as how to measure the impact of particular cases on broader discourse, how to validate substantive inferences from small samples of textual data, and how to determine if identified cases are part of a consistent temporal pattern. PMID:24498398
Buesing, Lars; Bill, Johannes; Nessler, Bernhard; Maass, Wolfgang
2011-01-01
The organization of computations in networks of spiking neurons in the brain is still largely unknown, in particular in view of the inherently stochastic features of their firing activity and the experimentally observed trial-to-trial variability of neural systems in the brain. In principle there exists a powerful computational framework for stochastic computations, probabilistic inference by sampling, which can explain a large number of macroscopic experimental data in neuroscience and cognitive science. But it has turned out to be surprisingly difficult to create a link between these abstract models for stochastic computations and more detailed models of the dynamics of networks of spiking neurons. Here we create such a link and show that under some conditions the stochastic firing activity of networks of spiking neurons can be interpreted as probabilistic inference via Markov chain Monte Carlo (MCMC) sampling. Since common methods for MCMC sampling in distributed systems, such as Gibbs sampling, are inconsistent with the dynamics of spiking neurons, we introduce a different approach based on non-reversible Markov chains that is able to reflect inherent temporal processes of spiking neuronal activity through a suitable choice of random variables. We propose a neural network model and show by a rigorous theoretical analysis that its neural activity implements MCMC sampling of a given distribution, both for the case of discrete and continuous time. This provides a step towards closing the gap between abstract functional models of cortical computation and more detailed models of networks of spiking neurons. PMID:22096452
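The baseline idea that a network of stochastic binary units can implement MCMC sampling can be sketched with Gibbs sampling in a tiny Boltzmann-style network. Note this is the conventional Gibbs scheme that the paper argues is inconsistent with spiking dynamics (their contribution is a non-reversible alternative); weights and biases below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)

# Symmetric weights and biases define a Boltzmann distribution over
# binary "neuron" states; sequential stochastic updates implement Gibbs
# sampling from that distribution.
W = np.array([[0.0, 2.0, -1.0],
              [2.0, 0.0, -1.0],
              [-1.0, -1.0, 0.0]])
b = np.array([0.0, 0.0, 0.5])

def gibbs_sweep(s):
    for i in range(len(s)):
        # Each unit "fires" with a sigmoid probability of its net input.
        p = 1 / (1 + np.exp(-(W[i] @ s + b[i])))
        s[i] = rng.uniform() < p
    return s

s = np.zeros(3)
counts = np.zeros(3)
for t in range(20000):
    s = gibbs_sweep(s)
    if t >= 1000:          # discard burn-in sweeps
        counts += s

rates = counts / 19000     # empirical firing rates approximate the marginals
```

The long-run firing rates converge to the marginal probabilities of the Boltzmann distribution, giving a concrete sense in which stochastic neural activity "is" sampling-based probabilistic inference.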
What's in a Name? Probabilistic Inference of Religious Community from South Asian Names
ERIC Educational Resources Information Center
Susewind, Raphael
2015-01-01
Fine-grained data on religious communities are often considered sensitive in South Asia and consequently remain inaccessible. Yet without such data, statistical research on communal relations and group-based inequality remains superficial, hampering the development of appropriate policy measures to prevent further social exclusion on the basis of…
ERIC Educational Resources Information Center
Pajak, Bozena; Fine, Alex B.; Kleinschmidt, Dave F.; Jaeger, T. Florian
2016-01-01
We present a framework of second and additional language (L2/L"n") acquisition motivated by recent work on socio-indexical knowledge in first language (L1) processing. The distribution of linguistic categories covaries with socio-indexical variables (e.g., talker identity, gender, dialects). We summarize evidence that implicit…
The Ultimate Sampling Dilemma in Experience-Based Decision Making
ERIC Educational Resources Information Center
Fiedler, Klaus
2008-01-01
Computer simulations and 2 experiments demonstrate the ultimate sampling dilemma, which constitutes a serious obstacle to inductive inferences in a probabilistic world. Participants were asked to take the role of a manager who is to make purchasing decisions based on positive versus negative feedback about 3 providers in 2 different product…
The Generation and Resemblance Heuristics in Face Recognition: Cooperation and Competition
ERIC Educational Resources Information Center
Kleider, Heather M.; Goldinger, Stephen D.
2006-01-01
Like all probabilistic decisions, recognition memory judgments are based on inferences about the strength and quality of stimulus familiarity. In recent articles, B. W. A. Whittlesea and J. Leboe (2000; J. Leboe & B. W. A. Whittlesea, 2002) proposed that such memory decisions entail various heuristics, similar to well-known heuristics in overt…
Probabilistic Structural Analysis Methods (PSAM) for select space propulsion system components
NASA Technical Reports Server (NTRS)
1991-01-01
The fourth year of technical developments on the Numerical Evaluation of Stochastic Structures Under Stress (NESSUS) system for Probabilistic Structural Analysis Methods is summarized. The effort focused on the continued expansion of the Probabilistic Finite Element Method (PFEM) code, the implementation of the Probabilistic Boundary Element Method (PBEM), and the implementation of the Probabilistic Approximate Methods (PAppM) code. The principal focus for the PFEM code is the addition of a multilevel structural dynamics capability. The strategy includes probabilistic loads, treatment of material, geometry uncertainty, and full probabilistic variables. Enhancements are included for the Fast Probability Integration (FPI) algorithms and the addition of Monte Carlo simulation as an alternate. Work on the expert system and boundary element developments continues. The enhanced capability in the computer codes is validated by applications to a turbine blade and to an oxidizer duct.
Quantum-like model of unconscious–conscious dynamics
Khrennikov, Andrei
2015-01-01
We present a quantum-like model of sensation–perception dynamics (originating in Helmholtz's theory of unconscious inference) based on the theory of quantum apparatuses and instruments. We illustrate our approach with the model of bistable perception of a particular ambiguous figure, the Schröder stair. This is a concrete model for unconscious and conscious processing of information and their interaction. The starting point of our quantum-like journey was the observation that perception dynamics is essentially contextual, which implies the impossibility of (straightforwardly) embedding experimental statistical data in the classical (Kolmogorov, 1933) framework of probability theory. This motivates the application of nonclassical probabilistic schemes. And the quantum formalism provides a variety of well-approved and mathematically elegant probabilistic schemes to handle the results of measurements. The theory of quantum apparatuses and instruments is the most general quantum scheme describing measurements, and it is natural to explore it to model the sensation–perception dynamics. In particular, this theory provides the scheme of indirect quantum measurements, which we apply to model unconscious inference leading to the transition from sensations to perceptions. PMID:26283979
Extracting Databases from Dark Data with DeepDive.
Zhang, Ce; Shin, Jaeho; Ré, Christopher; Cafarella, Michael; Niu, Feng
2016-01-01
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and so on - were instead in a relational database, it would give analysts a massive and valuable new set of "big data." DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontology, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference.
A probabilistic model for detecting rigid domains in protein structures.
Nguyen, Thach; Habeck, Michael
2016-09-01
Large-scale conformational changes in proteins are implicated in many important biological functions. These structural transitions can often be rationalized in terms of relative movements of rigid domains. There is a need for objective and automated methods that identify rigid domains in sets of protein structures showing alternative conformational states. We present a probabilistic model for detecting rigid-body movements in protein structures. Our model aims to approximate alternative conformational states by a few structural parts that are rigidly transformed under the action of a rotation and a translation. By using Bayesian inference and Markov chain Monte Carlo sampling, we estimate all parameters of the model, including a segmentation of the protein into rigid domains, the structures of the domains themselves, and the rigid transformations that generate the observed structures. We find that our Gibbs sampling algorithm can also estimate the optimal number of rigid domains with high efficiency and accuracy. We assess the power of our method on several thousand entries of the DynDom database and discuss applications to various complex biomolecular systems. The Python source code for protein ensemble analysis is available at: https://github.com/thachnguyen/motion_detection. Contact: mhabeck@gwdg.de. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
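The rigid transformations at the core of this model, a rotation plus a translation fitted to corresponding coordinates, can be estimated in closed form with the Kabsch algorithm. The sketch below shows only that building block on synthetic points; the paper's actual contribution, inferring the segmentation itself by Gibbs sampling, is not reproduced here.

```python
import numpy as np

# Kabsch algorithm: least-squares rotation R and translation t mapping
# point set X onto Y (rows are corresponding 3D coordinates). This is the
# rigid-transformation subroutine of domain detection, not the paper's
# full probabilistic segmentation model.

def kabsch(X, Y):
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    U, _, Vt = np.linalg.svd(Xc.T @ Yc)
    d = np.sign(np.linalg.det(U @ Vt))      # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = (U @ D @ Vt).T                      # proper rotation (det = +1)
    t = Y.mean(axis=0) - R @ X.mean(axis=0)
    return R, t

# Synthetic test: rotate and translate random points, then recover the motion.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Y = X @ R_true.T + np.array([1.0, -2.0, 0.5])
R, t = kabsch(X, Y)
rmsd = np.sqrt(((X @ R.T + t - Y) ** 2).sum(axis=1).mean())
```

On noise-free correspondences like these, the recovered rotation matches the true one and the residual RMSD is numerically zero.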
NASA Astrophysics Data System (ADS)
Wang, S.; Huang, G. H.; Baetz, B. W.; Huang, W.
2015-11-01
This paper presents a polynomial chaos ensemble hydrologic prediction system (PCEHPS) for an efficient and robust uncertainty assessment of model parameters and predictions, in which possibilistic reasoning is infused into probabilistic parameter inference with simultaneous consideration of randomness and fuzziness. The PCEHPS is developed through a two-stage factorial polynomial chaos expansion (PCE) framework, which consists of an ensemble of PCEs to approximate the behavior of the hydrologic model, significantly speeding up the exhaustive sampling of the parameter space. Multiple hypothesis testing is then conducted to construct an ensemble of reduced-dimensionality PCEs with only the most influential terms, which is meaningful for achieving uncertainty reduction and further acceleration of parameter inference. The PCEHPS is applied to the Xiangxi River watershed in China to demonstrate its validity and applicability. A detailed comparison between the HYMOD hydrologic model, the ensemble of PCEs, and the ensemble of reduced PCEs is performed in terms of accuracy and efficiency. Results reveal temporal and spatial variations in parameter sensitivities due to the dynamic behavior of hydrologic systems, and the effects (magnitude and direction) of parametric interactions depending on different hydrological metrics. The case study demonstrates that the PCEHPS is capable not only of capturing both expert knowledge and probabilistic information in the calibration process, but also of running more than 10 times faster than the hydrologic model without compromising predictive accuracy.
Terminal-Area Aircraft Intent Inference Approach Based on Online Trajectory Clustering.
Yang, Yang; Zhang, Jun; Cai, Kai-quan
2015-01-01
Terminal-area aircraft intent inference (T-AII) is a prerequisite to detecting and avoiding potential aircraft conflicts in the terminal airspace. T-AII challenges state-of-the-art AII approaches due to the uncertainties of the air traffic situation, in particular the undefined flight routes and frequent maneuvers. In this paper, a novel T-AII approach is introduced to address these limitations by solving the problem in two steps: intent modeling and intent inference. In the modeling step, an online trajectory clustering procedure is designed to recognize the real-time available routes in place of the missing planned routes. In the inference step, we then present a probabilistic T-AII approach based on multiple flight attributes to improve the inference performance in maneuvering scenarios. The proposed approach is validated with real radar trajectory and flight attribute data of 34 days collected from the Chengdu terminal area in China. Preliminary results show the efficacy of the presented approach.
NASA Astrophysics Data System (ADS)
Beucler, E.; Haugmard, M.; Mocquet, A.
2016-12-01
The most widely used inversion schemes to locate earthquakes are based on iterative linearized least-squares algorithms and use a priori knowledge of the propagation medium. When only a small number of observations is available, for moderate events for instance, these methods may lead to large trade-offs between the outputs and both the velocity model and the initial set of hypocentral parameters. We present a joint structure-source determination approach using Bayesian inference. Monte Carlo continuous samplings, using Markov chains, generate models within a broad range of parameters, distributed according to the unknown posterior distributions. The non-linear exploration of both the seismic structure (velocity and thickness) and the source parameters relies on a fast forward problem using 1-D travel time computations. The a posteriori covariances between parameters (hypocentre depth, origin time and seismic structure among others) are computed and explicitly documented. This method reduces the influence of a sparse and/or azimuthally inhomogeneous seismic network geometry and of an overly constrained velocity structure by inferring realistic distributions of hypocentral parameters. Our algorithm is successfully used to accurately locate events of the Armorican Massif (western France), which is characterized by moderate and apparently diffuse local seismicity.
Gopnik, Alison
2012-09-28
New theoretical ideas and empirical research show that very young children's learning and thinking are strikingly similar to much learning and thinking in science. Preschoolers test hypotheses against data and make causal inferences; they learn from statistics and informal experimentation, and from watching and listening to others. The mathematical framework of probabilistic models and Bayesian inference can describe this learning in precise ways. These discoveries have implications for early childhood education and policy. In particular, they suggest both that early childhood experience is extremely important and that the trend toward more structured and academic early childhood programs is misguided.
IC-Finder: inferring robustly the hierarchical organization of chromatin folding
Haddad, Noelle
2017-01-01
The spatial organization of the genome plays a crucial role in the regulation of gene expression. Recent experimental techniques like Hi-C have emphasized the segmentation of genomes into interaction compartments that constitute conserved functional domains participating in the maintenance of a proper cell identity. Here, we propose a novel method, IC-Finder, to identify interaction compartments (IC) from experimental Hi-C maps. IC-Finder is based on a hierarchical clustering approach that we adapted to account for the polymeric nature of chromatin. Based on a benchmark of realistic in silico Hi-C maps, we show that IC-Finder is one of the best methods in terms of reliability and is the most efficient numerically. IC-Finder offers two original options: a probabilistic description of the inferred compartments and the possibility of exploring the various hierarchies of chromatin organization. Applying the method to experimental data in fly and human, we show how the predicted segmentation may depend on the normalization scheme and how 3D compartmentalization is tightly associated with epigenomic information. IC-Finder provides a robust and generic 'all-in-one' tool to uncover the general principles of 3D chromatin folding and their influence on gene regulation. The software is available at http://membres-timc.imag.fr/Daniel.Jost/DJ-TIMC/Software.html. PMID:28130423
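The adaptation that distinguishes this kind of clustering from generic hierarchical clustering is the contiguity constraint: because compartments are contiguous along the genome, only neighbouring clusters may merge. A minimal sketch of that idea follows; the toy similarity matrix is invented, and IC-Finder's actual merging criterion is more elaborate.

```python
import numpy as np

# Adjacency-constrained agglomerative clustering: segments of loci 0..n-1
# are greedily merged, but only adjacent clusters may fuse, so clusters
# stay contiguous along the genome. Toy data, not a real Hi-C map.

def constrained_clustering(S, n_clusters):
    """Greedily merge adjacent segments by mean inter-cluster similarity."""
    clusters = [[i] for i in range(S.shape[0])]
    while len(clusters) > n_clusters:
        # Score each adjacent pair of clusters; merge the most similar pair.
        scores = [S[np.ix_(clusters[k], clusters[k + 1])].mean()
                  for k in range(len(clusters) - 1)]
        k = int(np.argmax(scores))
        clusters[k:k + 2] = [clusters[k] + clusters[k + 1]]
    return clusters

# Block-diagonal similarity: two 'compartments' of three loci each.
S = np.array([[1.0, 0.9, 0.8, 0.1, 0.1, 0.1],
              [0.9, 1.0, 0.9, 0.1, 0.1, 0.1],
              [0.8, 0.9, 1.0, 0.1, 0.1, 0.1],
              [0.1, 0.1, 0.1, 1.0, 0.9, 0.8],
              [0.1, 0.1, 0.1, 0.9, 1.0, 0.9],
              [0.1, 0.1, 0.1, 0.8, 0.9, 1.0]])
clusters = constrained_clustering(S, 2)
```

On this toy matrix the procedure recovers the two contiguous blocks exactly; stopping at different cluster counts exposes the hierarchy of nested compartments.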
NASA Astrophysics Data System (ADS)
Sobradelo, Rosa; Martí, Joan
2015-01-01
One of the most challenging aspects of managing a volcanic crisis is the interpretation of the monitoring data, so as to anticipate the evolution of the unrest and implement timely mitigation actions. An unrest episode may include different stages or time intervals of increasing activity that may or may not precede a volcanic eruption, depending on the causes of the unrest (magmatic, geothermal or tectonic). Therefore, one of the main goals in monitoring volcanic unrest is to forecast whether or not such an increase in activity will end in an eruption, and if this is the case, how, when, and where this eruption will take place. As an alternative to expert elicitation for assessing and merging monitoring data and relevant past information, we present a probabilistic method to transform precursory activity into the probability of experiencing a significant variation by the next time interval (i.e. the next step in the unrest), given its preceding evolution, and by further estimating the probability of the occurrence of a particular eruptive scenario combining monitoring and past data. With the 1991 Pinatubo volcanic crisis as a reference, we have developed such a method to assess short-term volcanic hazard using Bayesian inference.
Epidemiologic research using probabilistic outcome definitions.
Cai, Bing; Hennessy, Sean; Lo Re, Vincent; Small, Dylan S
2015-01-01
Epidemiologic studies using electronic healthcare data often define the presence or absence of binary clinical outcomes by using algorithms with imperfect specificity, sensitivity, and positive predictive value. This results in misclassification and bias in study results. We describe and evaluate a new method called probabilistic outcome definition (POD) that uses logistic regression to estimate the probability of a clinical outcome using multiple potential algorithms and then uses multiple imputation to make valid inferences about the risk ratio or other epidemiologic parameters of interest. We conducted a simulation to evaluate the performance of the POD method with two variables that can predict the true outcome and compared the POD method with the conventional method. The simulation results showed that when the true risk ratio is equal to 1.0 (null), the conventional method based on a binary outcome provides unbiased estimates. However, when the risk ratio is not equal to 1.0, the traditional method, either using one predictive variable or both predictive variables to define the outcome, is biased when the positive predictive value is <100%, and the bias is very severe when the sensitivity or positive predictive value is poor (less than 0.75 in our simulation). In contrast, the POD method provides unbiased estimates of the risk ratio both when this measure of effect is equal to 1.0 and when it is not. Even when the sensitivity and positive predictive value are low, the POD method continues to provide unbiased estimates of the risk ratio. The POD method provides an improved way to define outcomes in database research. This method has a major advantage over the conventional method in that it provides unbiased estimates of risk ratios and is easy to use. Copyright © 2014 John Wiley & Sons, Ltd.
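The multiple-imputation step of the POD idea can be sketched as follows, assuming per-subject outcome probabilities have already been estimated by some model (here they are simulated rather than fitted by logistic regression, and all data are synthetic):

```python
import numpy as np

# Sketch of the imputation step of a probabilistic outcome definition (POD)
# analysis. We assume a fitted outcome model has already produced, for each
# subject, the probability that the true outcome is present; here those
# probabilities are simulated, not estimated from real data.

rng = np.random.default_rng(42)
n = 20000
exposed = rng.integers(0, 2, size=n).astype(bool)
# True outcome risks: 10% in unexposed, 20% in exposed (true RR = 2.0).
true_p = np.where(exposed, 0.20, 0.10)
# Pretend the outcome model recovers these probabilities with some noise.
p_outcome = np.clip(true_p + rng.normal(0, 0.01, size=n), 0.001, 0.999)

M = 20                         # number of imputed datasets
log_rrs = []
for _ in range(M):
    y = rng.random(n) < p_outcome     # impute the outcome from its probability
    risk1 = y[exposed].mean()
    risk0 = y[~exposed].mean()
    log_rrs.append(np.log(risk1 / risk0))
rr_pooled = float(np.exp(np.mean(log_rrs)))   # pool estimates on the log scale
```

Pooling on the log scale follows the usual multiple-imputation convention; a full analysis would also combine within- and between-imputation variances to get valid confidence intervals.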
Raffelt, David A.; Smith, Robert E.; Ridgway, Gerard R.; Tournier, J-Donald; Vaughan, David N.; Rose, Stephen; Henderson, Robert; Connelly, Alan
2015-01-01
In brain regions containing crossing fibre bundles, voxel-average diffusion MRI measures such as fractional anisotropy (FA) are difficult to interpret, and lack within-voxel single fibre population specificity. Recent work has focused on the development of more interpretable quantitative measures that can be associated with a specific fibre population within a voxel containing crossing fibres (herein we use fixel to refer to a specific fibre population within a single voxel). Unfortunately, traditional 3D methods for smoothing and cluster-based statistical inference cannot be used for voxel-based analysis of these measures, since the local neighbourhood for smoothing and cluster formation can be ambiguous when adjacent voxels may have different numbers of fixels, or ill-defined when they belong to different tracts. Here we introduce a novel statistical method to perform whole-brain fixel-based analysis called connectivity-based fixel enhancement (CFE). CFE uses probabilistic tractography to identify structurally connected fixels that are likely to share underlying anatomy and pathology. Probabilistic connectivity information is then used for tract-specific smoothing (prior to the statistical analysis) and enhancement of the statistical map (using a threshold-free cluster enhancement-like approach). To investigate the characteristics of the CFE method, we assessed sensitivity and specificity using a large number of combinations of CFE enhancement parameters and smoothing extents, using simulated pathology generated with a range of test-statistic signal-to-noise ratios in five different white matter regions (chosen to cover a broad range of fibre bundle features). The results suggest that CFE input parameters are relatively insensitive to the characteristics of the simulated pathology. We therefore recommend a single set of CFE parameters that should give near optimal results in future studies where the group effect is unknown. 
We then demonstrate the proposed method by comparing apparent fibre density between motor neurone disease (MND) patients and control subjects. The MND results illustrate the benefit of fixel-specific statistical inference in white matter regions that contain crossing fibres. PMID:26004503
Forward and backward inference in spatial cognition.
Penny, Will D; Zeidman, Peter; Burgess, Neil
2013-01-01
This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
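The first computation named in the abstract, optimally combining a path-integration estimate of location with a sensory estimate, has a closed form when both estimates are Gaussian: a precision-weighted average. The numbers below are invented for illustration.

```python
# Forward inference for self-localization, sketched for the Gaussian case:
# two independent noisy estimates of position x are fused by weighting each
# by its precision (inverse variance). Values are illustrative only.

def fuse(mu_a, var_a, mu_b, var_b):
    """Posterior mean/variance of x given two independent Gaussian estimates."""
    prec_a, prec_b = 1.0 / var_a, 1.0 / var_b
    var_post = 1.0 / (prec_a + prec_b)
    mu_post = var_post * (prec_a * mu_a + prec_b * mu_b)
    return mu_post, var_post

# Path integration says x ≈ 4.0 but has drifted (variance 4.0);
# a landmark observation says x ≈ 6.0 and is sharper (variance 1.0).
mu, var = fuse(4.0, 4.0, 6.0, 1.0)
# The fused estimate sits closer to the sharper sensory cue: mu = 5.6, var = 0.8.
```

The fused variance is always smaller than either input variance, which is why combining path integration with sensory input yields a better location estimate than either source alone.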
Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
Kolaczkowski, Bryan; Thornton, Joseph W.
2009-01-01
Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias—which is apparent under both controlled simulation conditions and in analyses of empirical sequence data—also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages—that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis. PMID:20011052
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.
The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine-learning-based, data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as examples of widely used data-driven classification/modeling strategies.
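A minimal version of the Hidden Markov Model machinery mentioned here is the forward algorithm, which computes the likelihood of an observation sequence and the filtered distribution over hidden states. The two states ('benign', 'compromised') and the alert observations below are invented for illustration, not drawn from the chapter.

```python
# HMM forward algorithm: the filtering computation behind Hidden Markov
# Models. State 0 = benign, state 1 = compromised; observations are
# 0 = no alert, 1 = alert. All probabilities here are made up.

def forward(pi, A, B, obs):
    """Return p(observations) and the final filtered state distribution."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [B[s][o] * sum(alpha[r] * A[r][s] for r in range(n))
                 for s in range(n)]
    total = sum(alpha)
    return total, [a / total for a in alpha]

pi = [0.9, 0.1]                       # prior: mostly benign
A = [[0.95, 0.05], [0.10, 0.90]]      # state transition probabilities
B = [[0.8, 0.2], [0.3, 0.7]]          # p(no-alert, alert | state)
likelihood, filtered = forward(pi, A, B, obs=[1, 1, 1])   # three alerts in a row
# Repeated alerts shift posterior mass toward the 'compromised' state.
```

Each new observation reweights and propagates the state distribution, which is exactly the "infer system properties" and "adapt based on observations" behavior the chapter asks of operational models.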
Plasticity in probabilistic reaction norms for maturation in a salmonid fish
Morita, Kentaro; Tsuboi, Jun-ichi; Nagasawa, Toru
2009-01-01
The relationship between body size and the probability of maturing, often referred to as the probabilistic maturation reaction norm (PMRN), has been increasingly used to infer genetic variation in maturation schedule. Despite this trend, few studies have directly evaluated plasticity in the PMRN. A transplant experiment using white-spotted charr demonstrated that the PMRN for precocious males exhibited plasticity. A smaller threshold size at maturity occurred in charr inhabiting narrow streams where more refuges are probably available for small charr, which in turn might enhance the reproductive success of sneaker precocious males. Our findings suggested that plastic effects should clearly be included in investigations of variation in PMRNs. PMID:19493875
Recent developments of the NESSUS probabilistic structural analysis computer program
NASA Technical Reports Server (NTRS)
Millwater, H.; Wu, Y.-T.; Torng, T.; Thacker, B.; Riha, D.; Leung, C. P.
1992-01-01
The NESSUS probabilistic structural analysis computer program combines state-of-the-art probabilistic algorithms with general purpose structural analysis methods to compute the probabilistic response and the reliability of engineering structures. Uncertainty in loading, material properties, geometry, boundary conditions and initial conditions can be simulated. The structural analysis methods include nonlinear finite element and boundary element methods. Several probabilistic algorithms are available such as the advanced mean value method and the adaptive importance sampling method. The scope of the code has recently been expanded to include probabilistic life and fatigue prediction of structures in terms of component and system reliability and risk analysis of structures considering cost of failure. The code is currently being extended to structural reliability considering progressive crack propagation. Several examples are presented to demonstrate the new capabilities.
The Sapir-Whorf hypothesis and inference under uncertainty.
Regier, Terry; Xu, Yang
2017-11-01
The Sapir-Whorf hypothesis holds that human thought is shaped by language, leading speakers of different languages to think differently. This hypothesis has sparked both enthusiasm and controversy, but despite its prominence it has only occasionally been addressed in computational terms. Recent developments support a view of the Sapir-Whorf hypothesis in terms of probabilistic inference. This view may resolve some of the controversy surrounding the Sapir-Whorf hypothesis, and may help to normalize the hypothesis by linking it to established principles that also explain other phenomena. On this view, effects of language on nonlinguistic cognition or perception reflect standard principles of inference under uncertainty. WIREs Cogn Sci 2017, 8:e1440. doi: 10.1002/wcs.1440 For further resources related to this article, please visit the WIREs website. © 2017 Wiley Periodicals, Inc.
On the Psychology of the Recognition Heuristic: Retrieval Primacy as a Key Determinant of Its Use
ERIC Educational Resources Information Center
Pachur, Thorsten; Hertwig, Ralph
2006-01-01
The recognition heuristic is a prime example of a boundedly rational mind tool that rests on an evolved capacity, recognition, and exploits environmental structures. When originally proposed, it was conjectured that no other probabilistic cue reverses the recognition-based inference (D. G. Goldstein & G. Gigerenzer, 2002). More recent studies…
ERIC Educational Resources Information Center
Bergert, F. Bryan; Nosofsky, Robert M.
2007-01-01
The authors develop and test generalized versions of take-the-best (TTB) and rational (RAT) models of multiattribute paired-comparison inference. The generalized models make allowances for subjective attribute weighting, probabilistic orders of attribute inspection, and noisy decision making. A key new test involves a response-time (RT)…
XID+: Next generation XID development
NASA Astrophysics Data System (ADS)
Hurley, Peter
2017-04-01
XID+ is a prior-based source extraction tool which carries out photometry in the Herschel SPIRE (Spectral and Photometric Imaging Receiver) maps at the positions of known sources. It uses a probabilistic Bayesian framework that provides a natural way to include prior information, and uses the Bayesian inference tool Stan to obtain the full posterior probability distribution on flux estimates.
Naive Probability: Model-based Estimates of Unique Events
2014-05-04
Building Scalable Knowledge Graphs for Earth Science
NASA Technical Reports Server (NTRS)
Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian
2017-01-01
Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.
Wang, Junbai; Wu, Qianqian; Hu, Xiaohua Tony; Tian, Tianhai
2016-11-01
Investigating the dynamics of genetic regulatory networks through high throughput experimental data, such as microarray gene expression profiles, is a very important but challenging task. One of the major hindrances in building detailed mathematical models for genetic regulation is the large number of unknown model parameters. To tackle this challenge, a new integrated method is proposed by combining a top-down approach and a bottom-up approach. First, the top-down approach uses probabilistic graphical models to predict the network structure of DNA repair pathway that is regulated by the p53 protein. Two networks are predicted, namely a network of eight genes with eight inferred interactions and an extended network of 21 genes with 17 interactions. Then, the bottom-up approach using differential equation models is developed to study the detailed genetic regulations based on either a fully connected regulatory network or a gene network obtained by the top-down approach. Model simulation error, parameter identifiability and robustness property are used as criteria to select the optimal network. Simulation results together with permutation tests of input gene network structures indicate that the prediction accuracy and robustness property of the two predicted networks using the top-down approach are better than those of the corresponding fully connected networks. In particular, the proposed approach reduces computational cost significantly for inferring model parameters. Overall, the new integrated method is a promising approach for investigating the dynamics of genetic regulation. Copyright © 2016 Elsevier Inc. All rights reserved.
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
NASA Astrophysics Data System (ADS)
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
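The foundational identities this tutorial covers (Bayes's theorem, the law of total probability, and marginalization) fit in a few lines of Python. The two-hypothesis setup and all numbers below are invented purely for illustration:

```python
# Hypothetical two-model example: is a source a "flare" or "quiescent",
# given some observed data? All probabilities here are made up.

priors = {"flare": 0.1, "quiescent": 0.9}          # P(model)
likelihoods = {"flare": 0.60, "quiescent": 0.05}   # P(data | model)

# Law of total probability (marginalization): P(data) = sum_m P(data | m) P(m)
evidence = sum(likelihoods[m] * priors[m] for m in priors)

# Bayes's theorem: P(m | data) = P(data | m) P(m) / P(data)
posterior = {m: likelihoods[m] * priors[m] / evidence for m in priors}
```

The same structure generalizes to the multilevel (hierarchical) models the talk emphasizes: priors themselves acquire parameters, and marginalization integrates them out.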
Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.
2011-01-01
Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated, whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348
Vertically Integrated Seismological Analysis II: Inference
NASA Astrophysics Data System (ADS)
Arora, N. S.; Russell, S.; Sudderth, E.
2009-12-01
Methods for automatically associating detected waveform features with hypothesized seismic events, and localizing those events, are a critical component of efforts to verify the Comprehensive Test Ban Treaty (CTBT). As outlined in our companion abstract, we have developed a hierarchical model which views detection, association, and localization as an integrated probabilistic inference problem. In this abstract, we provide more details on the Markov chain Monte Carlo (MCMC) methods used to solve this inference task. MCMC generates samples from a posterior distribution π(x) over possible worlds x by defining a Markov chain whose states are the worlds x, and whose stationary distribution is π(x). In the Metropolis-Hastings (M-H) method, transitions in the Markov chain are constructed in two steps. First, given the current state x, a candidate next state x′ is generated from a proposal distribution q(x′ | x), which may be (more or less) arbitrary. Second, the transition to x′ is not automatic, but occurs with an acceptance probability α(x′ | x) = min(1, π(x′)q(x | x′)/π(x)q(x′ | x)). The seismic event model outlined in our companion abstract is quite similar to those used in multitarget tracking, for which MCMC has proved very effective. In this model, each world x is defined by a collection of events, a list of properties characterizing those events (times, locations, magnitudes, and types), and the association of each event to a set of observed detections. The target distribution is π(x) = P(x | y), the posterior distribution over worlds x given the observed waveform data y at all stations. Proposal distributions then implement several types of moves between worlds. For example, birth moves create new events; death moves delete existing events; split moves partition the detections for an event into two new events; merge moves combine event pairs; swap moves modify the properties and associations for pairs of events.
Importantly, the rules for accepting such complex moves need not be hand-designed. Instead, they are automatically determined by the underlying probabilistic model, which is in turn calibrated via historical data and scientific knowledge. Consider a small seismic event which generates weak signals at several different stations, which might independently be mistaken for noise. A birth move may nevertheless hypothesize an event jointly explaining these detections. If the corresponding waveform data then aligns with the seismological knowledge encoded in the probabilistic model, the event may be detected even though no single station observes it unambiguously. Alternatively, if a large outlier reading is produced at a single station, moves which instantiate a corresponding (false) event would be rejected because of the absence of plausible detections at other sensors. More broadly, one of the main advantages of our MCMC approach is its consistent handling of the relative uncertainties in different information sources. By avoiding low-level thresholds, we expect to improve accuracy and robustness. At the conference, we will present results quantitatively validating our approach, using ground-truth associations and locations provided either by simulation or human analysts.
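The M-H acceptance rule described above can be sketched for a simple scalar target. This is a generic random-walk sampler with a symmetric proposal (so the q terms cancel in the acceptance ratio), not the seismic event model itself; the target and tuning values are illustrative:

```python
import math
import random

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings. With a symmetric proposal,
    alpha(x' | x) = min(1, pi(x') / pi(x)), computed in log space."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        x_new = x + rng.gauss(0.0, step)        # propose x' ~ q(x' | x)
        log_alpha = log_target(x_new) - log_target(x)
        # Accept with probability min(1, exp(log_alpha))
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = x_new
        samples.append(x)
    return samples

# Unnormalized standard normal target: log pi(x) = -x^2 / 2 + const
samples = metropolis_hastings(lambda x: -0.5 * x * x, 50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The same skeleton underlies the birth/death/split/merge/swap moves in the abstract: each move type is just a different proposal q, and the acceptance probability keeps the chain's stationary distribution equal to the posterior.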
NASA Technical Reports Server (NTRS)
Cruse, T. A.
1987-01-01
The objective is the development of several modular structural analysis packages capable of predicting the probabilistic response distribution for key structural variables such as maximum stress, natural frequencies, transient response, etc. The structural analysis packages are to include stochastic modeling of loads, material properties, geometry (tolerances), and boundary conditions. The solution is to be in terms of the cumulative probability of exceedance distribution (CDF) and confidence bounds. Two methods of probability modeling are to be included as well as three types of structural models - probabilistic finite-element method (PFEM); probabilistic approximate analysis methods (PAAM); and probabilistic boundary element methods (PBEM). The purpose in doing probabilistic structural analysis is to provide the designer with a more realistic ability to assess the importance of uncertainty in the response of a high performance structure. Probabilistic Structural Analysis Method (PSAM) tools will estimate structural safety and reliability, while providing the engineer with information on the confidence that should be given to the predicted behavior. Perhaps most critically, the PSAM results will directly provide information on the sensitivity of the design response to those variables which are seen to be uncertain.
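The notion of a cumulative probability-of-exceedance output can be illustrated with plain Monte Carlo sampling over stochastic loads and geometry. This sketch is not the PFEM/PAAM/PBEM machinery the report develops, and the load, tolerance, and threshold values are hypothetical:

```python
import random

def exceedance_probability(n=100_000, seed=42):
    """Monte Carlo sketch: stress = load / area with uncertain inputs;
    estimate P(stress > allowable). All distributions are illustrative."""
    rng = random.Random(seed)
    threshold = 300.0                           # hypothetical allowable stress (MPa)
    count = 0
    for _ in range(n):
        load = rng.gauss(50_000.0, 5_000.0)     # applied load (N), stochastic
        area = rng.gauss(200.0, 10.0)           # cross-section (mm^2), with tolerance
        stress = load / area                    # resulting stress (MPa)
        if stress > threshold:
            count += 1
    return count / n

p_exceed = exceedance_probability()
```

Sweeping the threshold traces out the exceedance CDF; repeating with bootstrapped batches would give the confidence bounds the report calls for.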
NASA Technical Reports Server (NTRS)
Cruse, T. A.; Burnside, O. H.; Wu, Y.-T.; Polch, E. Z.; Dias, J. B.
1988-01-01
The objective is the development of several modular structural analysis packages capable of predicting the probabilistic response distribution for key structural variables such as maximum stress, natural frequencies, transient response, etc. The structural analysis packages are to include stochastic modeling of loads, material properties, geometry (tolerances), and boundary conditions. The solution is to be in terms of the cumulative probability of exceedance distribution (CDF) and confidence bounds. Two methods of probability modeling are to be included as well as three types of structural models - probabilistic finite-element method (PFEM); probabilistic approximate analysis methods (PAAM); and probabilistic boundary element methods (PBEM). The purpose in doing probabilistic structural analysis is to provide the designer with a more realistic ability to assess the importance of uncertainty in the response of a high performance structure. Probabilistic Structural Analysis Method (PSAM) tools will estimate structural safety and reliability, while providing the engineer with information on the confidence that should be given to the predicted behavior. Perhaps most critically, the PSAM results will directly provide information on the sensitivity of the design response to those variables which are seen to be uncertain.
Probabilistic structural analysis methods for space propulsion system components
NASA Technical Reports Server (NTRS)
Chamis, C. C.
1986-01-01
The development of a three-dimensional inelastic analysis methodology for the Space Shuttle main engine (SSME) structural components is described. The methodology is composed of: (1) composite load spectra, (2) probabilistic structural analysis methods, (3) the probabilistic finite element theory, and (4) probabilistic structural analysis. The methodology has led to significant technical progress in several important aspects of probabilistic structural analysis. The program and accomplishments to date are summarized.
McClelland, James L.
2013-01-01
This article seeks to establish a rapprochement between explicitly Bayesian models of contextual effects in perception and neural network models of such effects, particularly the connectionist interactive activation (IA) model of perception. The article is in part an historical review and in part a tutorial, reviewing the probabilistic Bayesian approach to understanding perception and how it may be shaped by context, and also reviewing ideas about how such probabilistic computations may be carried out in neural networks, focusing on the role of context in interactive neural networks, in which both bottom-up and top-down signals affect the interpretation of sensory inputs. It is pointed out that connectionist units that use the logistic or softmax activation functions can exactly compute Bayesian posterior probabilities when the bias terms and connection weights affecting such units are set to the logarithms of appropriate probabilistic quantities. Bayesian concepts such as the prior, likelihood, (joint and marginal) posterior, probability matching and maximizing, and calculating vs. sampling from the posterior are all reviewed and linked to neural network computations. Probabilistic and neural network models are explicitly linked to the concept of a probabilistic generative model that describes the relationship between the underlying target of perception (e.g., the word intended by a speaker or other source of sensory stimuli) and the sensory input that reaches the perceiver for use in inferring the underlying target. It is shown how a new version of the IA model called the multinomial interactive activation (MIA) model can sample correctly from the joint posterior of a proposed generative model for perception of letters in words, indicating that interactive processing is fully consistent with principled probabilistic computation. Ways in which these computations might be realized in real neural systems are also considered. PMID:23970868
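The claim that softmax units compute exact Bayesian posteriors when their inputs are log probabilities can be checked directly. The two-hypothesis numbers below are invented, not taken from the MIA model:

```python
import math

def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

# Toy generative model: two word hypotheses, with P(input | word).
log_prior = [math.log(0.7), math.log(0.3)]   # bias terms = log P(word)
log_lik = [math.log(0.2), math.log(0.9)]     # weighted input = log P(input | word)

# Softmax over (log prior + log likelihood) yields the Bayesian posterior
net_input = [p + l for p, l in zip(log_prior, log_lik)]
posterior = softmax(net_input)

# Direct Bayes computation for comparison
joint = [0.7 * 0.2, 0.3 * 0.9]
direct = [j / sum(joint) for j in joint]
```

The two routes agree exactly, which is the article's point: setting biases and weights to log-probabilistic quantities makes the unit's activation a posterior probability.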
Yang, Jihong; Li, Zheng; Fan, Xiaohui; Cheng, Yiyu
2014-09-22
The high incidence of complex diseases has become a worldwide threat to human health. Multiple targets and pathways are perturbed during the pathological process of complex diseases. Systematic investigation of the complex relationship between drugs and diseases is necessary for new association discovery and drug repurposing. For this purpose, three causal networks were constructed herein for cardiovascular diseases, diabetes mellitus, and neoplasms, respectively. A causal inference-probabilistic matrix factorization (CI-PMF) approach was proposed to predict and classify drug-disease associations, and further used for drug-repositioning predictions. First, multilevel systematic relations between drugs and diseases were integrated from heterogeneous databases to construct causal networks connecting drug-target-pathway-gene-disease. Then, the association scores between drugs and diseases were assessed by evaluating a drug's effects on multiple targets and pathways. Furthermore, PMF models were learned based on known interactions, and associations were then classified into three types by the trained models. Finally, therapeutic associations were predicted based upon the ranking of association scores and predicted association types. In terms of drug-disease association prediction, the modified causal inference included in CI-PMF outperformed existing causal inference with a higher AUC (area under receiver operating characteristic curve) score and greater precision. Moreover, CI-PMF performed better than single modified causal inference in predicting therapeutic drug-disease associations. In the top 30% of predicted associations, 58.6% (136/232), 50.8% (31/61), and 39.8% (140/352) hit known therapeutic associations, while the precisions obtained by single modified causal inference were only 10.2% (231/2264), 8.8% (36/411), and 9.7% (189/1948). Clinical verifications were further conducted for the top 100 newly predicted therapeutic associations.
As a result, 21, 12, and 32 associations have been studied and many treatment effects of drugs on diseases were investigated for cardiovascular diseases, diabetes mellitus, and neoplasms, respectively. Related chains in causal networks were extracted for these 65 clinical-verified associations, and we further illustrated the therapeutic role of etodolac in breast cancer by inferred chains. Overall, CI-PMF is a useful approach for associating drugs with complex diseases and provides potential values for drug repositioning.
Incorporating seismic phase correlations into a probabilistic model of global-scale seismology
NASA Astrophysics Data System (ADS)
Arora, Nimar
2013-04-01
We present a probabilistic model of seismic phases whereby the attributes of the body-wave phases are correlated to those of the first-arriving P phase. This model has been incorporated into NET-VISA (Network processing Vertically Integrated Seismic Analysis), a probabilistic generative model of seismic events, their transmission, and detection on a global seismic network. In the earlier version of NET-VISA, seismic phases were assumed to be independent of each other. Although this didn't, for the most part, affect the quality of the inferred seismic bulletin, it did result in a few instances of anomalous phase association, for example an S phase with a smaller slowness than the corresponding P phase. We demonstrate that the phase attributes are indeed highly correlated; for example, the uncertainty in the S phase travel time is significantly reduced given the P phase travel time. Our new model exploits these correlations to produce better calibrated probabilities for the events, as well as fewer anomalous associations.
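The variance-reduction effect described here (S-phase travel-time uncertainty shrinking once the P phase is known) can be illustrated with a bivariate-Gaussian sketch. The standard deviations and correlation below are invented, not NET-VISA's calibrated values:

```python
import math

# Hypothetical jointly Gaussian P and S travel-time residuals (seconds).
sigma_p, sigma_s, rho = 1.2, 2.0, 0.85

def s_given_p(p_residual, mu_p=0.0, mu_s=0.0):
    """Conditional mean and std of the S residual given the P residual.
    Conditioning shrinks the S std by a factor sqrt(1 - rho^2)."""
    mean = mu_s + rho * (sigma_s / sigma_p) * (p_residual - mu_p)
    std = sigma_s * math.sqrt(1.0 - rho ** 2)
    return mean, std

cond_mean, cond_std = s_given_p(1.0)
```

With a correlation of 0.85 the conditional S standard deviation drops to roughly half the marginal value, which is the kind of sharpening the model exploits to rule out anomalous associations.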
Diard, Julien; Rynik, Vincent; Lorenceau, Jean
2013-01-01
This research involves a novel apparatus, in which the user is presented with an illusion-inducing visual stimulus. The user perceives illusory movement that can be followed by the eye, so that smooth pursuit eye movements can be sustained in arbitrary directions. Thus, free-flow trajectories of any shape can be traced. In other words, coupled with an eye-tracking device, this apparatus enables "eye writing," which appears to be an original object of study. We adapt a previous model of reading and writing to this context. We describe a probabilistic model called the Bayesian Action-Perception for Eye On-Line model (BAP-EOL). It encodes probabilistic knowledge about isolated letter trajectories, their size, high-frequency components of the produced trajectory, and pupil diameter. We show how Bayesian inference, in this single model, can be used to solve several tasks, like letter recognition and novelty detection (i.e., recognizing when a presented character is not part of the learned database). We are interested in the potential use of the eye writing apparatus by motor impaired patients: the final task we solve by Bayesian inference is disability assessment (i.e., measuring and tracking the evolution of motor characteristics of produced trajectories). Preliminary experimental results are presented, which illustrate the method, showing the feasibility of character recognition in the context of eye writing. We then show experimentally how a model of the unknown character can be used to detect trajectories that are likely to be new symbols, and how disability assessment can be performed by opportunistically observing characteristics of fine motor control, as letters are being traced. Experimental analyses also help identify specificities of eye writing, as compared to handwriting, and the resulting technical challenges.
Probabilistic Structural Analysis Theory Development
NASA Technical Reports Server (NTRS)
Burnside, O. H.
1985-01-01
The objective of the Probabilistic Structural Analysis Methods (PSAM) project is to develop analysis techniques and computer programs for predicting the probabilistic response of critical structural components for current and future space propulsion systems. This technology will play a central role in establishing system performance and durability. The first year's technical activity is concentrating on probabilistic finite element formulation strategy and code development. Work is also in progress to survey critical materials and Space Shuttle main engine components. The probabilistic finite element computer program NESSUS (Numerical Evaluation of Stochastic Structures Under Stress) is being developed. The final probabilistic code will have, in the general case, the capability of performing nonlinear dynamic analysis of stochastic structures. It is the goal of the approximate methods effort to increase problem-solving efficiency relative to finite element methods by using energy methods to generate trial solutions which satisfy the structural boundary conditions. These approximate methods will be less computer intensive than the finite element approach.
Probabilistic Structural Analysis Program
NASA Technical Reports Server (NTRS)
Pai, Shantaram S.; Chamis, Christos C.; Murthy, Pappu L. N.; Stefko, George L.; Riha, David S.; Thacker, Ben H.; Nagpal, Vinod K.; Mital, Subodh K.
2010-01-01
NASA/NESSUS 6.2c is a general-purpose, probabilistic analysis program that computes probability of failure and probabilistic sensitivity measures of engineered systems. Because NASA/NESSUS uses highly computationally efficient and accurate analysis techniques, probabilistic solutions can be obtained even for extremely large and complex models. Once the probabilistic response is quantified, the results can be used to support risk-informed decisions regarding reliability for safety-critical and one-of-a-kind systems, as well as for maintaining a level of quality while reducing manufacturing costs for larger-quantity products. NASA/NESSUS has been successfully applied to a diverse range of problems in aerospace, gas turbine engines, biomechanics, pipelines, defense, weaponry, and infrastructure. This program combines state-of-the-art probabilistic algorithms with general-purpose structural analysis and life-prediction methods to compute the probabilistic response and reliability of engineered structures. Uncertainties in load, material properties, geometry, boundary conditions, and initial conditions can be simulated. The structural analysis methods include non-linear finite-element methods, heat-transfer analysis, polymer/ceramic matrix composite analysis, monolithic (conventional metallic) materials life-prediction methodologies, boundary element methods, and user-written subroutines. Several probabilistic algorithms are available such as the advanced mean value method and the adaptive importance sampling method. NASA/NESSUS 6.2c is structured in a modular format with 15 elements.
Dinov, Martin; Leech, Robert
2017-01-01
Part of the process of EEG microstate estimation involves clustering EEG channel data at the global field power (GFP) maxima, very commonly using a modified K-means (KM) approach. Clustering has also been done deterministically, despite there being uncertainties in multiple stages of the microstate analysis, including the GFP peak definition, the clustering itself and the post-clustering assignment of microstates back onto the EEG timecourse of interest. We perform fully probabilistic microstate clustering and labeling, to account for these sources of uncertainty using the closest probabilistic analog to KM, Fuzzy C-means (FCM). We train softmax multi-layer perceptrons (MLPs) using the KM- and FCM-inferred cluster assignments as target labels, to then allow for probabilistic labeling of the full EEG data instead of the usual correlation-based deterministic microstate label assignment typically used. We assess the merits of the probabilistic analysis vs. the deterministic approaches in EEG data recorded while participants perform real or imagined motor movements from a publicly available data set of 109 subjects. Though FCM group template maps that are almost topographically identical to KM were found, there is considerable uncertainty in the subsequent assignment of microstate labels. In general, imagined motor movements are less predictable on a time point-by-time point basis, possibly reflecting the more exploratory nature of the brain state during imagined, compared to during real motor movements. We find that some relationships may be more evident using FCM than using KM and propose that future microstate analysis should preferably be performed probabilistically rather than deterministically, especially in situations such as with brain computer interfaces, where both training and applying models of microstates need to account for uncertainty.
Probabilistic neural network-driven microstate assignment has a number of advantages that we have discussed, which are likely to be further developed and exploited in future studies. In conclusion, probabilistic clustering and a probabilistic neural network-driven approach to microstate analysis is likely to better model and reveal details and the variability hidden in current deterministic and binarized microstate assignment and analyses.
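The Fuzzy C-means updates underlying this approach can be sketched on toy 1-D data. This is generic FCM with fuzzifier m, not the authors' EEG pipeline; the data and initialization are hand-picked for illustration:

```python
def fcm_1d(xs, centers, m=2.0, n_iter=50):
    """Fuzzy C-means on 1-D data. Alternates membership and center updates:
    u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)),
    c_j  = sum_i u_ij^m x_i / sum_i u_ij^m."""
    for _ in range(n_iter):
        u = []
        for x in xs:
            # Small epsilon avoids division by zero when a point sits on a center
            d = [abs(x - c) + 1e-12 for c in centers]
            row = []
            for j in range(len(centers)):
                denom = sum((d[j] / d[k]) ** (2.0 / (m - 1.0))
                            for k in range(len(centers)))
                row.append(1.0 / denom)
            u.append(row)
        centers = [sum(u[i][j] ** m * xs[i] for i in range(len(xs)))
                   / sum(u[i][j] ** m for i in range(len(xs)))
                   for j in range(len(centers))]
    return centers, u

# Two well-separated toy clusters; initialization is hand-picked
xs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, u = fcm_1d(xs, [1.0, 4.0])
```

Unlike hard K-means labels, each row of `u` is a graded membership vector summing to one, which is exactly the uncertainty the probabilistic microstate labeling retains.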
Location contexts of user check-ins to model urban geo life-style patterns.
Hasan, Samiul; Ukkusuri, Satish V
2015-01-01
Geo-location data from social media offers new ways to understand people's attitudes and interests through their activity choices. In this paper, we explore the idea of inferring individual life-style patterns from activity-location choices revealed in social media. We present a model to understand life-style patterns using the contextual information (e.g., location categories) of user check-ins. Probabilistic topic models are developed to infer individual geo life-style patterns from two perspectives: i) to characterize the patterns of user interests in different types of places and ii) to characterize the patterns of user visits to different neighborhoods. The method is applied to a dataset of Foursquare check-ins of users from New York City. The co-existence of several location contexts and the corresponding probabilities in a given pattern provide useful information about user interests and choices. It is found that geo life-style patterns contain similar items: either nearby neighborhoods or similar location categories. The semantic and geographic proximity of the items in a pattern reflects the hidden regularity in user preferences and location choice behavior.
Reasoning about Causal Relationships: Inferences on Causal Networks
Rottman, Benjamin Margolin; Hastie, Reid
2013-01-01
Over the last decade, a normative framework for making causal inferences, Bayesian Probabilistic Causal Networks, has come to dominate psychological studies of inference based on causal relationships. The following causal networks—[X→Y→Z, X←Y→Z, X→Y←Z]—supply answers for questions like, “Suppose both X and Y occur, what is the probability Z occurs?” or “Suppose you intervene and make Y occur, what is the probability Z occurs?” In this review, we provide a tutorial for how normatively to calculate these inferences. Then, we systematically detail the results of behavioral studies comparing human qualitative and quantitative judgments to the normative calculations for many network structures and for several types of inferences on those networks. Overall, when the normative calculations imply that an inference should increase, judgments usually go up; when calculations imply a decrease, judgments usually go down. However, two systematic deviations appear. First, people’s inferences violate the Markov assumption. For example, when inferring Z from the structure X→Y→Z, people think that X is relevant even when Y completely mediates the relationship between X and Z. Second, even when people’s inferences are directionally consistent with the normative calculations, they are often not as sensitive to the parameters and the structure of the network as they should be. We conclude with a discussion of productive directions for future research. PMID:23544658
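The chain X→Y→Z and the Markov property this review discusses can be made concrete with hypothetical conditional probability tables:

```python
# Toy causal chain X -> Y -> Z with invented CPTs, illustrating the
# Markov assumption: given Y, Z is independent of X.
p_y_given_x = {1: 0.8, 0: 0.2}  # P(Y=1 | X=x)
p_z_given_y = {1: 0.9, 0: 0.1}  # P(Z=1 | Y=y)

def p_z_given_xy(x, y):
    """Normative answer: once Y is known, X adds nothing about Z."""
    return p_z_given_y[y]

def p_z_given_x(x):
    """Marginalize over Y: P(Z=1 | X=x) = sum_y P(Z=1 | y) P(y | x)."""
    py1 = p_y_given_x[x]
    return p_z_given_y[1] * py1 + p_z_given_y[0] * (1 - py1)
```

The review's first deviation is visible here: people judge X relevant to Z even when Y is fixed, whereas `p_z_given_xy` returns the same value for either value of X.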
ERIC Educational Resources Information Center
Solway, Alec; Botvinick, Matthew M.
2012-01-01
Recent work has given rise to the view that reward-based decision making is governed by two key controllers: a habit system, which stores stimulus-response associations shaped by past reward, and a goal-oriented system that selects actions based on their anticipated outcomes. The current literature provides a rich body of computational theory…
A Tutorial Introduction to Bayesian Models of Cognitive Development
ERIC Educational Resources Information Center
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2011-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…
Inductive reasoning about causally transmitted properties.
Shafto, Patrick; Kemp, Charles; Bonawitz, Elizabeth Baraff; Coley, John D; Tenenbaum, Joshua B
2008-11-01
Different intuitive theories constrain and guide inferences in different contexts. Formalizing simple intuitive theories as probabilistic processes operating over structured representations, we present a new computational model of category-based induction about causally transmitted properties. A first experiment demonstrates undergraduates' context-sensitive use of taxonomic and food web knowledge to guide reasoning about causal transmission and shows good qualitative agreement between model predictions and human inferences. A second experiment demonstrates strong quantitative and qualitative fits to inferences about a more complex artificial food web. A third experiment investigates human reasoning about complex novel food webs where species have known taxonomic relations. Results demonstrate a double-dissociation between the predictions of our causal model and a related taxonomic model [Kemp, C., & Tenenbaum, J. B. (2003). Learning domain structures. In Proceedings of the 25th annual conference of the cognitive science society]: the causal model predicts human inferences about diseases but not genes, while the taxonomic model predicts human inferences about genes but not diseases. We contrast our framework with previous models of category-based induction and previous formal instantiations of intuitive theories, and outline challenges in developing a complete model of context-sensitive reasoning.
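A minimal sketch of causal transmission over a food web, assuming a noisy-OR combination of independent prey-to-predator transmissions; the species names, transmission probability `t`, and `background` rate below are all hypothetical, not parameters from the experiments.

```python
# Noisy-OR sketch of causal transmission up a food web: each infected prey
# independently transmits with probability t, on top of a background rate.
def p_infected(infected_prey, t=0.6, background=0.05):
    p_not = 1 - background
    for _ in infected_prey:
        p_not *= (1 - t)          # each infected prey fails to transmit w.p. 1 - t
    return 1 - p_not

# Toy food web: a predator eats both a rabbit and a mouse.
p_one = p_infected(["rabbit"])             # one infected prey species
p_two = p_infected(["rabbit", "mouse"])    # two infected prey species
assert p_two > p_one   # more infected prey -> higher inferred disease risk
```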
Extracting Databases from Dark Data with DeepDive
Zhang, Ce; Shin, Jaeho; Ré, Christopher; Cafarella, Michael; Niu, Feng
2016-01-01
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data — scientific papers, Web classified ads, customer service notes, and so on — were instead in a relational database, it would give analysts a massive and valuable new set of “big data.” DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontology, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference. PMID:28316365
Deep Unfolding for Topic Models.
Chien, Jen-Tzung; Lee, Chao-Hsi
2018-02-01
Deep unfolding provides an approach to integrating probabilistic generative models and deterministic neural networks. Such an approach benefits from deep representation, easy interpretation, flexible learning, and stochastic modeling. This study develops the unsupervised and supervised learning of deep unfolded topic models for document representation and classification. Conventionally, unsupervised and supervised topic models are inferred via the variational inference algorithm, where the model parameters are estimated by maximizing the lower bound of the logarithm of the marginal likelihood using input documents without and with class labels, respectively. The representation capability or classification accuracy is constrained by the variational lower bound and the tied model parameters across the inference procedure. This paper aims to relax these constraints by directly maximizing the end performance criterion and continuously untying the parameters in the learning process via deep unfolding inference (DUI). The inference procedure is treated as layer-wise learning in a deep neural network. The end performance is iteratively improved by using the estimated topic parameters according to the exponentiated updates. Deep learning of topic models is therefore implemented through a back-propagation procedure. Experimental results show the merits of DUI with an increasing number of layers compared with variational inference in unsupervised as well as supervised topic models.
Probabilistic Structural Analysis Methods (PSAM) for Select Space Propulsion System Components
NASA Technical Reports Server (NTRS)
1999-01-01
Probabilistic Structural Analysis Methods (PSAM) are described for the probabilistic structural analysis of engine components for current and future space propulsion systems. Components for these systems are subjected to stochastic thermomechanical launch loads. Uncertainties or randomness also occur in material properties, structural geometry, and boundary conditions. Material property stochasticity, such as in modulus of elasticity or yield strength, exists in every structure and is a consequence of variations in material composition and manufacturing processes. Procedures are outlined for computing the probabilistic structural response or reliability of the structural components. The response variables include static or dynamic deflections, strains, and stresses at one or several locations, natural frequencies, fatigue or creep life, etc. Sample cases illustrate how the PSAM methods and codes simulate input uncertainties and compute probabilistic response or reliability using a finite element model with probabilistic methods.
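The probabilistic-response idea can be illustrated with a plain Monte Carlo sketch: sample uncertain material and load inputs, push each sample through a deterministic response formula, and read off an exceedance probability. The cantilever formula and all numbers below are illustrative stand-ins, not outputs of the PSAM codes.

```python
import random

# Monte Carlo sketch of a probabilistic structural response: tip deflection
# of a cantilever, d = P L^3 / (3 E I), with random modulus E and load P.
# All values are illustrative.
random.seed(0)
L, I = 2.0, 8e-6                       # length [m], second moment of area [m^4]
samples = []
for _ in range(20000):
    E = random.gauss(200e9, 10e9)      # Young's modulus [Pa], ~5% scatter
    P = random.gauss(1000.0, 100.0)    # applied load [N], ~10% scatter
    samples.append(P * L**3 / (3 * E * I))

mean_d = sum(samples) / len(samples)                         # mean deflection [m]
p_exceed = sum(d > 0.002 for d in samples) / len(samples)    # P(deflection > 2 mm)
```

The same pattern extends to any response variable (stress, frequency, life) by swapping in the corresponding deterministic model.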
Structured statistical models of inductive reasoning.
Kemp, Charles; Tenenbaum, Joshua B
2009-01-01
Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. This article presents a Bayesian framework that attempts to meet both goals and describes 4 applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the 4 models are defined over different kinds of structures that capture different relationships between the categories in a domain. The framework therefore shows how statistical inference can operate over structured background knowledge, and the authors argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.
Probabilistic data integration and computational complexity
NASA Astrophysics Data System (ADS)
Hansen, T. M.; Cordua, K. S.; Mosegaard, K.
2016-12-01
Inverse problems in Earth Sciences typically refer to the problem of inferring information about properties of the Earth from observations of geophysical data (the result of nature's solution to the `forward' problem). This problem can be formulated more generally as a problem of `integration of information'. A probabilistic formulation of data integration is in principle simple: if all available information (from e.g. geology, geophysics, remote sensing, chemistry…) can be quantified probabilistically, then different algorithms exist that allow solving the data integration problem, either through an analytical description of the combined probability function or by sampling the probability function. In practice, however, probabilistic data integration may not be easy to apply successfully. This may be related to the use of sampling methods, which are known to be computationally costly. But another source of computational complexity is related to how the individual types of information are quantified. In one case, a data integration problem is demonstrated where the goal is to determine the existence of buried channels in Denmark, based on multiple sources of geo-information. Because one type of information is too informative (and hence conflicting), this leads to a difficult sampling problem with unrealistic uncertainty. Resolving this conflict prior to data integration leads to an easy data integration problem with no biases. In another case, it is demonstrated how imperfections in the description of the geophysical forward model (related to solving the wave equation) can lead to a difficult data integration problem with severe bias in the results. If the modeling error is accounted for, the data integration problem becomes relatively easy, with no apparent biases.
Both examples demonstrate that biased information can have a dramatic effect on the computational efficiency of solving a data integration problem and can lead to biased results and under-estimation of uncertainty. However, in both examples, one can also analyze the performance of the sampling methods used to solve the data integration problem to indicate the existence of biased information. This can be used actively to avoid biases in the available information and, subsequently, in the final uncertainty evaluation.
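For the linear-Gaussian case, the core of probabilistic data integration is a precision-weighted product of densities. The sketch below (with made-up numbers) shows the pathology the abstract warns about: combining two strongly conflicting sources still yields a spread smaller than either input, i.e. unresolved conflict produces under-estimated uncertainty rather than a warning.

```python
import math

# Precision-weighted combination of two Gaussian information sources
# (product of the two densities, renormalised): the analytic core of
# probabilistic data integration for linear-Gaussian problems.
def combine(mu1, s1, mu2, s2):
    w1, w2 = 1 / s1**2, 1 / s2**2           # precisions
    mu = (w1 * mu1 + w2 * mu2) / (w1 + w2)  # precision-weighted mean
    s = math.sqrt(1 / (w1 + w2))            # combined standard deviation
    return mu, s

mu, s = combine(0.0, 1.0, 4.0, 1.0)  # two sources that disagree by 4 sigma
# The combined spread (~0.71) is *smaller* than either input's, even though
# the sources conflict: the formalism shrinks uncertainty instead of
# flagging the bias, which is exactly why conflicts must be resolved first.
```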
Bayes in biological anthropology.
Konigsberg, Lyle W; Frankenberg, Susan R
2013-12-01
In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as a primer on Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.
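A minimal beta-binomial sketch of the kind of Bayesian analysis advocated here, assuming a hypothetical trait-frequency question: k occurrences observed in n skeletons, with a Beta prior encoding prior knowledge. The quantile routine uses a plain grid so no statistics library is required.

```python
import math

# Beta-binomial sketch: posterior for a trait frequency after observing
# k occurrences in n skeletons, with a Beta(a, b) prior.
def posterior_mean(k, n, a=1.0, b=1.0):
    return (a + k) / (a + b + n)

def posterior_interval(k, n, a=1.0, b=1.0, lo=0.05, hi=0.95, grid=100001):
    # numeric quantiles of the Beta(a + k, b + n - k) posterior on a grid
    aa, bb = a + k, b + n - k
    xs = [i / (grid - 1) for i in range(1, grid - 1)]
    logpdf = [(aa - 1) * math.log(x) + (bb - 1) * math.log(1 - x) for x in xs]
    m = max(logpdf)
    pdf = [math.exp(v - m) for v in logpdf]
    total = sum(pdf)
    cum, out, targets = 0.0, [], [lo, hi]
    for x, p in zip(xs, pdf):
        cum += p / total
        while targets and cum >= targets[0]:
            out.append(x)
            targets.pop(0)
    return tuple(out)

print(posterior_mean(3, 10))      # posterior mean with a flat Beta(1, 1) prior
print(posterior_interval(3, 10))  # 90% equal-tailed credible interval
```

Unlike a frequentist confidence interval, the credible interval directly answers "given the data and prior, where is the trait frequency likely to lie?"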
Probabilistic structural analysis methods of hot engine structures
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Hopkins, D. A.
1989-01-01
Development of probabilistic structural analysis methods for hot engine structures is a major activity at Lewis Research Center. Recent activities have focused on extending the methods to include the combined uncertainties in several factors on structural response. This paper briefly describes recent progress on composite load spectra models, probabilistic finite element structural analysis, and probabilistic strength degradation modeling. Progress is described in terms of fundamental concepts, computer code development, and representative numerical results.
Objectively combining AR5 instrumental period and paleoclimate climate sensitivity evidence
NASA Astrophysics Data System (ADS)
Lewis, Nicholas; Grünwald, Peter
2018-03-01
Combining instrumental period evidence regarding equilibrium climate sensitivity with largely independent paleoclimate proxy evidence should enable a more constrained sensitivity estimate to be obtained. Previous, subjective Bayesian approaches involved selection of a prior probability distribution reflecting the investigators' beliefs about climate sensitivity. Here a recently developed approach employing two different statistical methods—objective Bayesian and frequentist likelihood-ratio—is used to combine instrumental period and paleoclimate evidence based on data presented and assessments made in the IPCC Fifth Assessment Report. Probabilistic estimates from each source of evidence are represented by posterior probability density functions (PDFs) of physically-appropriate form that can be uniquely factored into a likelihood function and a noninformative prior distribution. The three-parameter form is shown to accurately fit a wide range of estimated climate sensitivity PDFs. The likelihood functions relating to the probabilistic estimates from the two sources are multiplicatively combined and a prior is derived that is noninformative for inference from the combined evidence. A posterior PDF that incorporates the evidence from both sources is produced using a single-step approach, which avoids the order-dependency that would arise if Bayesian updating were used. Results are compared with an alternative approach using the frequentist signed root likelihood ratio method. Results from these two methods are effectively identical, and provide a 5-95% range for climate sensitivity of 1.1-4.05 K (median 1.87 K).
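The single-step combination can be sketched on a grid: multiply the two likelihood functions, apply one prior, and normalise once, which avoids the order dependence of sequential updating. The Gaussian likelihoods and flat prior below are illustrative stand-ins, not the physically-motivated forms derived in the paper.

```python
import math

# Single-step combination of two independent likelihoods on a sensitivity grid.
# Likelihood shapes and the flat prior are illustrative, not the paper's.
S = [1.0 + 0.01 * i for i in range(900)]        # sensitivity grid [K]

def gauss_lik(s, mu, sd):
    return math.exp(-0.5 * ((s - mu) / sd) ** 2)

lik_inst = [gauss_lik(s, 2.0, 0.8) for s in S]  # "instrumental" evidence
lik_paleo = [gauss_lik(s, 2.8, 1.2) for s in S] # "paleoclimate" evidence
prior = [1.0 for _ in S]                        # flat prior for simplicity

post = [a * b * p for a, b, p in zip(lik_inst, lik_paleo, prior)]
z = sum(post)
post = [p / z for p in post]                    # normalise once, at the end

# 5-95% range from the combined posterior
cum, lo, hi = 0.0, None, None
for s, p in zip(S, post):
    cum += p
    if lo is None and cum >= 0.05:
        lo = s
    if hi is None and cum >= 0.95:
        hi = s
```

Because the two likelihoods are multiplied before any normalisation, the result does not depend on which evidence source is "first".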
Chen, Yi; Pouillot, Régis; Burall, Laurel S; Strain, Errol A; Van Doren, Jane M; De Jesus, Antonio J; Laasri, Anna; Wang, Hua; Ali, Laila; Tatavarthy, Aparna; Zhang, Guodong; Hu, Lijun; Day, James; Sheth, Ishani; Kang, Jihun; Sahu, Surasri; Srinivasan, Devayani; Brown, Eric W; Parish, Mickey; Zink, Donald L; Datta, Atin R; Hammack, Thomas S; Macarisin, Dumitru
2017-01-16
A precise and accurate method for enumeration of low levels of Listeria monocytogenes in foods is critical to a variety of studies. In this study, paired comparison of most probable number (MPN) and direct plating enumeration of L. monocytogenes was conducted on a total of 1730 outbreak-associated ice cream samples that were naturally contaminated with low levels of L. monocytogenes. MPN was performed on all 1730 samples. Direct plating was performed on all samples using the RAPID'L.mono (RLM) agar (1600 samples) and agar Listeria Ottaviani and Agosti (ALOA; 130 samples). Probabilistic analysis with a Bayesian inference model was used to compare paired direct plating and MPN estimates of L. monocytogenes in ice cream samples because assumptions implicit in ordinary least squares (OLS) linear regression analyses were not met for such a comparison. The probabilistic analysis revealed good agreement between the MPN and direct plating estimates, and this agreement showed that the MPN schemes and direct plating schemes using ALOA or RLM evaluated in the present study were suitable for enumerating low levels of L. monocytogenes in these ice cream samples. The statistical analysis further revealed that OLS linear regression analyses of direct plating and MPN data did introduce bias that incorrectly characterized systematic differences between estimates from the two methods. Published by Elsevier B.V.
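An MPN estimate is the concentration that maximizes the likelihood of the observed positive/negative tube pattern, assuming Poisson-distributed organisms (a tube of volume v is positive with probability 1 - exp(-c*v)). The dilution series below is a hypothetical illustration, not data from the ice cream study.

```python
import math

# Maximum-likelihood MPN from a dilution series (hypothetical data):
# tubes[i] = (volume_in_grams, n_tubes, n_positive).
def log_lik(c, tubes):
    # log-likelihood of the positive counts given concentration c (organisms/g)
    ll = 0.0
    for v, n, k in tubes:
        p = 1 - math.exp(-c * v)                 # P(tube positive)
        ll += k * math.log(max(p, 1e-300)) + (n - k) * (-c * v)
    return ll

def mpn(tubes, lo=1e-6, hi=1e3, iters=200):
    # golden-section search over log-concentration for the ML estimate
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if log_lik(math.exp(c), tubes) < log_lik(math.exp(d), tubes):
            a = c
        else:
            b = d
    return math.exp((a + b) / 2)

# three-tube series at 10 g, 1 g and 0.1 g per tube, with 3/1/0 positives
tubes = [(10.0, 3, 3), (1.0, 3, 1), (0.1, 3, 0)]
print(round(mpn(tubes), 2))   # MPN estimate in organisms per gram
```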
Causal Inference and Explaining Away in a Spiking Network
Moreno-Bote, Rubén; Drugowitsch, Jan
2015-01-01
While the brain uses spiking neurons for communication, theoretical research on brain computations has mostly focused on non-spiking networks. The nature of spike-based algorithms that achieve complex computations, such as object probabilistic inference, is largely unknown. Here we demonstrate that a family of high-dimensional quadratic optimization problems with non-negativity constraints can be solved exactly and efficiently by a network of spiking neurons. The network naturally imposes the non-negativity of causal contributions that is fundamental to causal inference, and uses simple operations, such as linear synapses with realistic time constants, and neural spike generation and reset non-linearities. The network infers the set of most likely causes from an observation using explaining away, which is dynamically implemented by spike-based, tuned inhibition. The algorithm performs remarkably well even when the network intrinsically generates variable spike trains, the timing of spikes is scrambled by external sources of noise, or the network is mistuned. This type of network might underlie tasks such as odor identification and classification. PMID:26621426
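A conventional (non-spiking) sketch of the optimization the network is shown to solve: non-negative least squares via projected gradient descent, with a toy two-cause observation in which the second cause is explained away. The matrix, learning rate, and step count are illustrative choices, not parameters from the paper.

```python
# Non-negative least squares by projected gradient descent: a plain sketch of
# the problem class min ||y - A x||^2 subject to x >= 0 that the spiking
# network is shown to solve.
def nnls(A, y, steps=2000, lr=0.3):
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        # residual r = A x - y; gradient of 0.5*||y - A x||^2 is A^T r
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        for j in range(n):
            g = sum(A[i][j] * r[i] for i in range(m))
            x[j] = max(0.0, x[j] - lr * g)   # project back onto x >= 0
    return x

# Two overlapping "causes" (e.g. odor sources) and an observation generated
# by the first cause alone: the weight on the second cause is explained away.
A = [[1.0, 0.8],
     [0.8, 1.0]]
y = [1.0, 0.8]           # equals A applied to [1, 0]
x = nnls(A, y)           # converges to approximately [1, 0]
```

The non-negativity constraint is what forces a sparse, explaining-away solution; unconstrained least squares would be free to trade the two causes off against each other.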
Friston, Karl J.; Dolan, Raymond J.
2017-01-01
Normative models of human cognition often appeal to Bayesian filtering, which provides optimal online estimates of unknown or hidden states of the world, based on previous observations. However, in many cases it is necessary to optimise beliefs about sequences of states rather than just the current state. Importantly, Bayesian filtering and sequential inference strategies make different predictions about beliefs and subsequent choices, rendering them behaviourally dissociable. Taking data from a probabilistic reversal task we show that subjects’ choices provide strong evidence that they are representing short sequences of states. Between-subject measures of this implicit sequential inference strategy had a neurobiological underpinning and correlated with grey matter density in prefrontal and parietal cortex, as well as the hippocampus. Our findings provide, to our knowledge, the first evidence for sequential inference in human cognition, and by exploiting between-subject variation in this measure we provide pointers to its neuronal substrates. PMID:28486504
Inferring brain-computational mechanisms with models of activity measurements
Diedrichsen, Jörn
2016-01-01
High-resolution functional imaging is providing increasingly rich measurements of brain activity in animals and humans. A major challenge is to leverage such data to gain insight into the brain's computational mechanisms. The first step is to define candidate brain-computational models (BCMs) that can perform the behavioural task in question. We would then like to infer which of the candidate BCMs best accounts for measured brain-activity data. Here we describe a method that complements each BCM by a measurement model (MM), which simulates the way the brain-activity measurements reflect neuronal activity (e.g. local averaging in functional magnetic resonance imaging (fMRI) voxels or sparse sampling in array recordings). The resulting generative model (BCM-MM) produces simulated measurements. To avoid having to fit the MM to predict each individual measurement channel of the brain-activity data, we compare the measured and predicted data at the level of summary statistics. We describe a novel particular implementation of this approach, called probabilistic representational similarity analysis (pRSA) with MMs, which uses representational dissimilarity matrices (RDMs) as the summary statistics. We validate this method by simulations of fMRI measurements (locally averaging voxels) based on a deep convolutional neural network for visual object recognition. Results indicate that the way the measurements sample the activity patterns strongly affects the apparent representational dissimilarities. However, modelling of the measurement process can account for these effects, and different BCMs remain distinguishable even under substantial noise. The pRSA method enables us to perform Bayesian inference on the set of BCMs and to recognize the data-generating model in each case. This article is part of the themed issue ‘Interpreting BOLD: a dialogue between cognitive and cellular neuroscience’. PMID:27574316
A hierarchical model for probabilistic independent component analysis of multi-subject fMRI studies
Tang, Li
2014-01-01
An important goal in fMRI studies is to decompose the observed series of brain images to identify and characterize underlying brain functional networks. Independent component analysis (ICA) has been shown to be a powerful computational tool for this purpose. Classic ICA has been successfully applied to single-subject fMRI data. The extension of ICA to group inferences in neuroimaging studies, however, is challenging due to the unavailability of a pre-specified group design matrix. Existing group ICA methods generally concatenate observed fMRI data across subjects on the temporal domain and then decompose multi-subject data in a similar manner to single-subject ICA. The major limitation of existing methods is that they ignore between-subject variability in spatial distributions of brain functional networks in group ICA. In this paper, we propose a new hierarchical probabilistic group ICA method to formally model subject-specific effects in both temporal and spatial domains when decomposing multi-subject fMRI data. The proposed method provides model-based estimation of brain functional networks at both the population and subject level. An important advantage of the hierarchical model is that it provides a formal statistical framework to investigate similarities and differences in brain functional networks across subjects, e.g., subjects with mental disorders or neurodegenerative diseases such as Parkinson’s as compared to normal subjects. We develop an EM algorithm for model estimation where both the E-step and M-step have explicit forms. We compare the performance of the proposed hierarchical model with that of two popular group ICA methods via simulation studies. We illustrate our method with application to an fMRI study of Zen meditation. PMID:24033125
Patel, Ameera X; Bullmore, Edward T
2016-11-15
Connectome mapping using techniques such as functional magnetic resonance imaging (fMRI) has become a focus of systems neuroscience. There remain many statistical challenges in analysis of functional connectivity and network architecture from BOLD fMRI multivariate time series. One key statistic for any time series is its (effective) degrees of freedom, df, which will generally be less than the number of time points (or nominal degrees of freedom, N). If we know the df, then probabilistic inference on other fMRI statistics, such as the correlation between two voxel or regional time series, is feasible. However, we currently lack good estimators of df in fMRI time series, especially after the degrees of freedom of the "raw" data have been modified substantially by denoising algorithms for head movement. Here, we used a wavelet-based method both to denoise fMRI data and to estimate the (effective) df of the denoised process. We show that seed voxel correlations corrected for locally variable df could be tested for false positive connectivity with better control over Type I error and greater specificity of anatomical mapping than probabilistic connectivity maps using the nominal degrees of freedom. We also show that wavelet despiked statistics can be used to estimate all pairwise correlations between a set of regional nodes, assign a P value to each edge, and then iteratively add edges to the graph in order of increasing P. These probabilistically thresholded graphs are likely more robust to regional variation in head movement effects than comparable graphs constructed by thresholding correlations. Finally, we show that time-windowed estimates of df can be used for probabilistic connectivity testing or dynamic network analysis so that apparent changes in the functional connectome are appropriately corrected for the effects of transient noise bursts. 
Wavelet despiking is both an algorithm for fMRI time series denoising and an estimator of the (effective) df of denoised fMRI time series. Accurate estimation of df offers many potential advantages for probabilistically thresholding functional connectivity and network statistics tested in the context of spatially variant and non-stationary noise. Code for wavelet despiking, seed correlational testing and probabilistic graph construction is freely available to download as part of the BrainWavelet Toolbox at www.brainwavelet.org. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
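The role of effective degrees of freedom can be sketched directly: the same correlation tested with fewer df yields a larger p-value. The routine below converts r to a t statistic and integrates the t density numerically, so no statistics library is needed; the df values are hypothetical, not estimates produced by wavelet despiking.

```python
import math

# Two-sided p-value for a correlation r tested with df degrees of freedom,
# via t = r * sqrt(df / (1 - r^2)) and numerical integration of the t density.
# For denoised fMRI series, the *effective* df can be far below N.
def t_pdf(x, df):
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def corr_pvalue(r, df, grid=20000, span=60.0):
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # trapezoid rule over the upper tail [t, span]
    h = (span - t) / grid
    xs = [t + i * h for i in range(grid + 1)]
    tail = h * (sum(t_pdf(x, df) for x in xs)
                - 0.5 * (t_pdf(xs[0], df) + t_pdf(xs[-1], df)))
    return 2 * tail

p_nominal = corr_pvalue(0.3, df=198)   # nominal df: N - 2 with N = 200
p_effective = corr_pvalue(0.3, df=80)  # hypothetical effective df after denoising
assert p_effective > p_nominal          # same r, fewer df -> less significant
```

Using the nominal df where the effective df is much smaller inflates apparent significance, which is the Type I error problem the article addresses.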
Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination.
Zhao, Qibin; Zhang, Liqing; Cichocki, Andrzej
2015-09-01
CANDECOMP/PARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified; however, the determination of tensor rank remains a challenging problem, especially for CP rank. In addition, existing approaches do not take into account uncertainty information of latent factors or of missing entries. To address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. To learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth of CP rank and prevent the overfitting problem, even when a large number of entries are missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance.
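For orientation, the underlying CP model can be sketched with a plain alternating-least-squares fit of a rank-1 three-way tensor (numpy). The paper's actual contribution, automatic rank determination via Bayesian sparsity priors, is not reproduced here; this only shows the factorization the priors are placed over.

```python
import numpy as np

# Alternating least squares for a rank-1 CP factorization of a 3-way tensor.
# Fixed rank, no missing data: a bare-bones sketch of the CP model class.
rng = np.random.default_rng(0)
a, b, c = rng.standard_normal(4), rng.standard_normal(5), rng.standard_normal(6)
T = np.einsum('i,j,k->ijk', a, b, c)           # ground-truth rank-1 tensor

u = rng.standard_normal(4)
v = rng.standard_normal(5)
w = rng.standard_normal(6)
for _ in range(50):                             # update one factor at a time
    u = np.einsum('ijk,j,k->i', T, v, w) / ((v @ v) * (w @ w))
    v = np.einsum('ijk,i,k->j', T, u, w) / ((u @ u) * (w @ w))
    w = np.einsum('ijk,i,j->k', T, u, v) / ((u @ u) * (v @ v))

err = np.linalg.norm(T - np.einsum('i,j,k->ijk', u, v, w)) / np.linalg.norm(T)
```

With an exact rank-1 target the ALS sweeps align each factor with the truth almost immediately; the Bayesian treatment replaces the fixed rank with shrinkage priors that switch unneeded components off.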
Quantum-Like Bayesian Networks for Modeling Decision Making
Moreira, Catarina; Wichert, Andreas
2016-01-01
In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists of replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive, in contrast to the current state-of-the-art models, which cannot be generalized for more complex decision scenarios and only provide an explanatory account of the observed paradoxes. In the end, the model that we propose consists of a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
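The quantum-like replacement of probabilities by amplitudes can be sketched for the two-path law of total probability: squaring a sum of amplitudes adds an interference term 2*sqrt(p1*p2)*cos(theta) to the classical sum, and a quarter-turn phase recovers the classical value exactly. The probabilities below are illustrative, not fitted parameters from the paper.

```python
import cmath

# Quantum-like total probability: classical terms plus an interference term
# controlled by a phase theta. With theta = pi/2 the interference vanishes
# and the classical law of total probability is recovered.
def quantum_total_prob(p_b, p_a_given_b, p_a_given_notb, theta):
    amp = (cmath.sqrt(p_b * p_a_given_b)
           + cmath.exp(1j * theta) * cmath.sqrt((1 - p_b) * p_a_given_notb))
    return abs(amp) ** 2

p_classical = 0.6 * 0.7 + 0.4 * 0.3   # classical law of total probability
assert abs(quantum_total_prob(0.6, 0.7, 0.3, cmath.pi / 2) - p_classical) < 1e-9
```

Phases other than pi/2 raise or lower the total probability, which is how such models accommodate Sure-Thing-Principle violations that classical probability cannot.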
Association of earthquakes and faults in the San Francisco Bay area using Bayesian inference
Wesson, R.L.; Bakun, W.H.; Perkins, D.M.
2003-01-01
Bayesian inference provides a method to use seismic intensity data or instrumental locations, together with geologic and seismologic data, to make quantitative estimates of the probabilities that specific past earthquakes are associated with specific faults. Probability density functions are constructed for the location of each earthquake, and these are combined with prior probabilities through Bayes' theorem to estimate the probability that an earthquake is associated with a specific fault. Results using this method are presented here for large, preinstrumental, historical earthquakes and for recent earthquakes with instrumental locations in the San Francisco Bay region. The probabilities for individual earthquakes can be summed to construct a probabilistic frequency-magnitude relationship for a fault segment. Other applications of the technique include the estimation of the probability of background earthquakes, that is, earthquakes not associated with known or considered faults, and the estimation of the fraction of the total seismic moment associated with earthquakes less than the characteristic magnitude. Results for the San Francisco Bay region suggest that potentially damaging earthquakes with magnitudes less than the characteristic magnitudes should be expected. Comparisons of earthquake locations and the surface traces of active faults as determined from geologic data show significant disparities, indicating that a complete understanding of the relationship between earthquakes and faults remains elusive.
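The association step reduces to Bayes' theorem over a discrete set of candidate faults plus a background category: a prior weight per fault is multiplied by the likelihood of the earthquake's location under that fault, then normalised. The priors and location likelihoods below are invented for illustration, not values from the study.

```python
# Bayes' theorem over candidate faults: prior weights (e.g. reflecting
# geologic slip-rate data) combined with a location likelihood for one
# earthquake. All numbers are illustrative.
def fault_posteriors(priors, likelihoods):
    joint = {f: priors[f] * likelihoods[f] for f in priors}
    z = sum(joint.values())                      # normalising constant
    return {f: p / z for f, p in joint.items()}

priors = {"San Andreas": 0.5, "Hayward": 0.3, "background": 0.2}
likelihoods = {"San Andreas": 0.02, "Hayward": 0.10, "background": 0.01}
post = fault_posteriors(priors, likelihoods)
# The Hayward fault dominates despite its lower prior, because the location
# likelihood there is five times higher than on the San Andreas.
```

Summing such per-event posteriors over many earthquakes yields the probabilistic frequency-magnitude relation for a fault segment described in the abstract.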
NASA Technical Reports Server (NTRS)
Singhal, Surendra N.
2003-01-01
The SAE G-11 RMSL Division and Probabilistic Methods Committee meeting during October 6-8 at the Best Western Sterling Inn, Sterling Heights (Detroit), Michigan is co-sponsored by US Army Tank-automotive & Armaments Command (TACOM). The meeting will provide an industry/government/academia forum to review RMSL technology; reliability and probabilistic technology; reliability-based design methods; software reliability; and maintainability standards. With over 100 members including members with national/international standing, the mission of the G-11's Probabilistic Methods Committee is to "enable/facilitate rapid deployment of probabilistic technology to enhance the competitiveness of our industries by better, faster, greener, smarter, affordable and reliable product development."
Bayesian tomography and integrated data analysis in fusion diagnostics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Dong, E-mail: lid@swip.ac.cn; Dong, Y. B.; Deng, Wei
2016-11-15
In this article, a Bayesian tomography method using a non-stationary Gaussian process for a prior has been introduced. The Bayesian formalism allows quantities which bear uncertainty to be expressed in probabilistic form so that the uncertainty of a final solution can be fully resolved from the confidence interval of a posterior probability. Moreover, a consistency check of that solution can be performed by checking whether the misfits between predicted and measured data are reasonably within an assumed data error. In particular, the accuracy of reconstructions is significantly improved by using the non-stationary Gaussian process that can adapt to the varying smoothness of the emission distribution. The implementation of this method for a soft X-ray diagnostic on HL-2A has been used to explore relevant physics in equilibrium and MHD instability modes. This project is carried out within a large-scale inference framework, aiming at an integrated analysis of heterogeneous diagnostics.
Probabilistic Signal Recovery and Random Matrices
2016-12-08
Ambiguity and Uncertainty in Probabilistic Inference.
1984-06-01
An All-Fragments Grammar for Simple and Accurate Parsing
2012-03-21
NASA Astrophysics Data System (ADS)
Tonini, R.; Anita, G.
2011-12-01
In both worldwide and regional historical catalogues, most tsunamis are caused by earthquakes, with a minor percentage represented by all other non-seismic sources. On the other hand, tsunami hazard and risk studies are often applied to very specific areas, where this global trend can be different or even inverted, depending on the kind of potential tsunamigenic sources which characterize the case study. So far, few probabilistic approaches consider the contribution of landslides and/or phenomena derived from volcanic activity, i.e. pyroclastic flows and flank collapses, as predominant in the PTHA, partly because of the difficulty of estimating the corresponding recurrence times. These considerations are valid, for example, for the city of Naples, Italy, which is surrounded by a complex active volcanic system (Vesuvio, Campi Flegrei, Ischia) that presents a significant number of potential tsunami sources of non-seismic origin compared to the seismic ones. In this work we present the preliminary results of a probabilistic multi-source tsunami hazard assessment applied to Naples. The method to estimate the uncertainties will be based on Bayesian inference. This is the first step towards a more comprehensive task which will provide a tsunami risk quantification for this town in the frame of the Italian national project ByMuR (http://bymur.bo.ingv.it). This ongoing three-year project has the final objective of developing a Bayesian multi-risk methodology to quantify the risk related to different natural hazards (volcanoes, earthquakes and tsunamis) applied to the city of Naples.
Clustering network layers with the strata multilayer stochastic block model.
Stanley, Natalie; Shai, Saray; Taylor, Dane; Mucha, Peter J
2016-01-01
Multilayer networks are a useful data structure for simultaneously capturing multiple types of relationships between a set of nodes. In such networks, each relational definition gives rise to a layer. While each layer provides its own set of information, community structure across layers can be collectively utilized to discover and quantify underlying relational patterns between nodes. To concisely extract information from a multilayer network, we propose to identify and combine sets of layers with meaningful similarities in community structure. In this paper, we describe the "strata multilayer stochastic block model" (sMLSBM), a probabilistic model for multilayer community structure. The central extension of the model is that there exist groups of layers, called "strata", which are defined such that all layers in a given stratum have community structure described by a common stochastic block model (SBM). That is, layers in a stratum exhibit similar node-to-community assignments and SBM probability parameters. Fitting the sMLSBM to a multilayer network provides a joint clustering that yields node-to-community and layer-to-stratum assignments, which cooperatively aid one another during inference. We describe an algorithm for separating layers into their appropriate strata and an inference technique for estimating the SBM parameters for each stratum. We demonstrate our method using synthetic networks and a multilayer network inferred from data collected in the Human Microbiome Project.
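The layer-to-stratum assignment step described above can be sketched as follows. This is a toy illustration, not the authors' implementation: a layer's adjacency matrix is assigned to whichever stratum's SBM gives it the highest Bernoulli log-likelihood. The block matrices, community labels, and four-node network are invented.

```python
import numpy as np

def sbm_loglik(A, labels, B):
    """Log-likelihood of undirected adjacency A (no self-loops) under an
    SBM with node-to-community labels and block probability matrix B."""
    n = len(labels)
    ll = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            p = B[labels[i], labels[j]]
            ll += np.log(p if A[i, j] else 1.0 - p)
    return ll

labels = np.array([0, 0, 1, 1])          # node-to-community assignments
strata = {                               # candidate stratum SBM parameters
    "assortative": np.array([[0.9, 0.1], [0.1, 0.9]]),
    "unstructured": np.array([[0.5, 0.5], [0.5, 0.5]]),
}
A = np.array([[0, 1, 0, 0],              # a layer with two clear communities
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
best = max(strata, key=lambda s: sbm_loglik(A, labels, strata[s]))
```

In the full sMLSBM this assignment would alternate with re-estimation of each stratum's block matrix and labels, since the two steps cooperatively aid one another during inference.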
The penumbra of learning: a statistical theory of synaptic tagging and capture.
Gershman, Samuel J
2014-01-01
Learning in humans and animals is accompanied by a penumbra: Learning one task benefits from learning an unrelated task shortly before or after. At the cellular level, the penumbra of learning appears when weak potentiation of one synapse is amplified by strong potentiation of another synapse on the same neuron during a critical time window. Weak potentiation sets a molecular tag that enables the synapse to capture plasticity-related proteins synthesized in response to strong potentiation at another synapse. This paper describes a computational model which formalizes synaptic tagging and capture in terms of statistical learning mechanisms. According to this model, synaptic strength encodes a probabilistic inference about the dynamically changing association between pre- and post-synaptic firing rates. The rate of change is itself inferred, coupling together different synapses on the same neuron. When the inputs to one synapse change rapidly, the inferred rate of change increases, amplifying learning at other synapses.
Can circular inference relate the neuropathological and behavioral aspects of schizophrenia?
Leptourgos, Pantelis; Denève, Sophie; Jardri, Renaud
2017-10-01
Schizophrenia is a complex and heterogeneous mental disorder, and researchers have only recently begun to understand its neuropathology. However, since the time of Kraepelin and Bleuler, much information has been accumulated regarding the behavioral abnormalities usually encountered in patients suffering from schizophrenia. Despite recent progress, how the latter are caused by the former is still debated. Here, we argue that circular inference, a computational framework proposed as a potential explanation for various schizophrenia symptoms, could help end this debate. Based on Marr's three levels of analysis, we discuss how impairments in local and more global neural circuits could generate aberrant beliefs, with far-ranging consequences from probabilistic decision making to high-level visual perception in conditions of ambiguity. Interestingly, the circular inference framework appears to be compatible with a variety of pathophysiological theories of schizophrenia while simulating the behavioral symptoms. Copyright © 2017 Elsevier Ltd. All rights reserved.
Probabilistic measures of persistence and extinction in measles (meta)populations.
Gunning, Christian E; Wearing, Helen J
2013-08-01
Persistence and extinction are fundamental processes in ecological systems that are difficult to accurately measure due to stochasticity and incomplete observation. Moreover, these processes operate on multiple scales, from individual populations to metapopulations. Here, we examine an extensive new data set of measles case reports and associated demographics in pre-vaccine era US cities, alongside a classic England & Wales data set. We first infer the per-population quasi-continuous distribution of log incidence. We then use stochastic, spatially implicit metapopulation models to explore the frequency of rescue events and apparent extinctions. We show that, unlike critical community size, the inferred distributions account for observational processes, allowing direct comparisons between metapopulations. The inferred distributions scale with population size. We use these scalings to estimate extinction boundary probabilities. We compare these predictions with measurements in individual populations and random aggregates of populations, highlighting the importance of medium-sized populations in metapopulation persistence. © 2013 John Wiley & Sons Ltd/CNRS.
Probabilistic finite elements for fracture mechanics
NASA Technical Reports Server (NTRS)
Besterfield, Glen
1988-01-01
The probabilistic finite element method (PFEM) is developed for probabilistic fracture mechanics (PFM). A finite element which has the near-crack-tip singular strain embedded in the element is used. Probabilistic quantities, such as the expectation, covariance and correlation of stress intensity factors, are calculated for random load, random material and random crack length. The method is computationally quite efficient and can be expected to determine the probability of fracture or reliability.
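The quantities the abstract names can be illustrated with a crude Monte Carlo sketch (PFEM itself propagates randomness through the finite element equations by perturbation, not by sampling). The stress intensity factor formula K = Y·σ·√(πa) is standard for an edge crack; all parameter values below are hypothetical.

```python
import math
import random

random.seed(0)
Y = 1.12            # geometry factor for an edge crack (assumed)
K_IC = 60.0         # fracture toughness, MPa*sqrt(m) (assumed)

samples = []
for _ in range(100_000):
    sigma = random.gauss(200.0, 20.0)   # random load, MPa
    a = random.gauss(0.02, 0.004)       # random crack length, m
    samples.append(Y * sigma * math.sqrt(math.pi * max(a, 1e-6)))

mean_K = sum(samples) / len(samples)                          # expectation
var_K = sum((k - mean_K) ** 2 for k in samples) / len(samples)  # covariance (1-D)
p_fracture = sum(k > K_IC for k in samples) / len(samples)    # P(K > K_IC)
```

The probability of fracture is estimated as the fraction of samples whose stress intensity factor exceeds the assumed toughness; a perturbation method obtains the same first two moments without any sampling.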
Probabilistic Structural Analysis Methods (PSAM) for select space propulsion systems components
NASA Technical Reports Server (NTRS)
1991-01-01
Summarized here is the technical effort and computer code developed during the five year duration of the program for probabilistic structural analysis methods. The summary includes a brief description of the computer code manuals and a detailed description of code validation demonstration cases for random vibrations of a discharge duct, probabilistic material nonlinearities of a liquid oxygen post, and probabilistic buckling of a transfer tube liner.
Potthoff, Denise; Seitz, Rüdiger J
2015-12-01
Humans typically make probabilistic inferences about another person's affective state based on her/his bodily movements such as emotional facial expressions, emblematic gestures and whole body movements. Furthermore, humans deduce tentative predictions about the other person's intentions. Thus, the first person perspective of a subject is supplemented by the second person perspective involving theory of mind and empathy. Neuroimaging investigations have shown that the medial and lateral frontal cortex are critical nodes in the circuits underlying theory of mind, empathy, as well as intention of action. It is suggested that personal perspective taking in social interactions is paradigmatic for the capability of humans to generate probabilistic accounts of the outside world that underlie a person's control of behaviour. Copyright © 2015 Elsevier Ltd. All rights reserved.
Perspective: Stochastic magnetic devices for cognitive computing
NASA Astrophysics Data System (ADS)
Roy, Kaushik; Sengupta, Abhronil; Shim, Yong
2018-06-01
Stochastic switching of nanomagnets can potentially enable probabilistic cognitive hardware consisting of noisy neural and synaptic components. Furthermore, computational paradigms inspired by the Ising computing model require stochasticity for achieving near-optimality in solutions to various types of combinatorial optimization problems such as the Graph Coloring Problem or the Travelling Salesman Problem. Finding exact optima for such problems is computationally exhaustive, and natural annealing is required to arrive at near-optimal solutions. Stochastic switching of devices also finds use in applications involving Deep Belief Networks and Bayesian inference. In this article, we provide a multi-disciplinary perspective across the stack of devices, circuits, and algorithms to illustrate how the stochastic switching dynamics of spintronic devices in the presence of thermal noise can provide a direct mapping to the computational units of such probabilistic intelligent systems.
NASA Astrophysics Data System (ADS)
Fei, Cheng-Wei; Bai, Guang-Chen
2014-12-01
To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assemblies such as the blade-tip radial running clearance (BTRRC) of a gas turbine, a distributed collaborative probabilistic design method based on support vector regression (called DCSRM) is proposed, integrating the distributed collaborative response surface method with a support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea behind DCSRM is introduced. The dynamic assembly probabilistic design of the aeroengine high-pressure turbine (HPT) BTRRC is carried out to verify the proposed DCSRM. The analysis results show that an optimal static blade-tip clearance of the HPT is obtained for designing the BTRRC, improving the performance and reliability of the aeroengine. The comparison of methods shows that DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assemblies and enriches mechanical reliability theory and methods.
Probabilistic Common Spatial Patterns for Multichannel EEG Analysis
Chen, Zhe; Gao, Xiaorong; Li, Yuanqing; Brown, Emery N.; Gao, Shangkai
2015-01-01
Common spatial patterns (CSP) is a well-known spatial filtering algorithm for multichannel electroencephalogram (EEG) analysis. In this paper, we cast the CSP algorithm in a probabilistic modeling setting. Specifically, probabilistic CSP (P-CSP) is proposed as a generic EEG spatio-temporal modeling framework that subsumes the CSP and regularized CSP algorithms. The proposed framework enables us to resolve the overfitting issue of CSP in a principled manner. We derive statistical inference algorithms that can alleviate the issue of local optima. In particular, an efficient algorithm based on eigendecomposition is developed for maximum a posteriori (MAP) estimation in the case of isotropic noise. For more general cases, a variational algorithm is developed for group-wise sparse Bayesian learning for the P-CSP model and for automatically determining the model size. The two proposed algorithms are validated on a simulated data set. Their practical efficacy is also demonstrated by successful applications to single-trial classifications of three motor imagery EEG data sets and by the spatio-temporal pattern analysis of one EEG data set recorded in a Stroop color naming task. PMID:26005228
NASA Technical Reports Server (NTRS)
Abbey, Craig K.; Eckstein, Miguel P.
2002-01-01
We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.
The Dopaminergic Midbrain Encodes the Expected Certainty about Desired Outcomes.
Schwartenbeck, Philipp; FitzGerald, Thomas H B; Mathys, Christoph; Dolan, Ray; Friston, Karl
2015-10-01
Dopamine plays a key role in learning; however, its exact function in decision making and choice remains unclear. Recently, we proposed a generic model based on active (Bayesian) inference wherein dopamine encodes the precision of beliefs about optimal policies. Put simply, dopamine discharges reflect the confidence that a chosen policy will lead to desired outcomes. We designed a novel task to test this hypothesis, where subjects played a "limited offer" game in a functional magnetic resonance imaging experiment. Subjects had to decide how long to wait for a high offer before accepting a low offer, with the risk of losing everything if they waited too long. Bayesian model comparison showed that behavior strongly supported active inference, based on surprise minimization, over classical utility maximization schemes. Furthermore, midbrain activity, encompassing dopamine projection neurons, was accurately predicted by trial-by-trial variations in model-based estimates of precision. Our findings demonstrate that human subjects infer both optimal policies and the precision of those inferences, and thus support the notion that humans perform hierarchical probabilistic Bayesian inference. In other words, subjects have to infer both what they should do as well as how confident they are in their choices, where confidence may be encoded by dopaminergic firing. © The Author 2014. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Song, Lu-Kai; Wen, Jie; Fei, Cheng-Wei; Bai, Guang-Chen
2018-05-01
To improve the computing efficiency and precision of probabilistic design for multi-failure structures, a distributed collaborative probabilistic design method based on fuzzy neural network regression (called DCFRM) is proposed, integrating the distributed collaborative response surface method with a fuzzy neural network regression model. The mathematical model of DCFRM is established and the probabilistic design idea behind DCFRM is introduced. The probabilistic analysis of a turbine blisk involving multiple failure modes (deformation failure, stress failure and strain failure) was investigated by considering fluid-structure interaction with the proposed method. The distribution characteristics, reliability degree, and sensitivity degree of each failure mode and of the overall failure mode on the turbine blisk are obtained, which provides a useful reference for improving the performance and reliability of aeroengines. The comparison of methods shows that DCFRM improves the computing efficiency of probabilistic analysis for multi-failure structures while keeping acceptable computational precision. Moreover, the proposed method offers a useful insight for reliability-based design optimization of multi-failure structures and thereby enriches the theory and methods of mechanical reliability design.
Causal learning and inference as a rational process: the new synthesis.
Holyoak, Keith J; Cheng, Patricia W
2011-01-01
Over the past decade, an active line of research within the field of human causal learning and inference has converged on a general representational framework: causal models integrated with bayesian probabilistic inference. We describe this new synthesis, which views causal learning and inference as a fundamentally rational process, and review a sample of the empirical findings that support the causal framework over associative alternatives. Causal events, like all events in the distal world as opposed to our proximal perceptual input, are inherently unobservable. A central assumption of the causal approach is that humans (and potentially nonhuman animals) have been designed in such a way as to infer the most invariant causal relations for achieving their goals based on observed events. In contrast, the associative approach assumes that learners only acquire associations among important observed events, omitting the representation of the distal relations. By incorporating bayesian inference over distributions of causal strength and causal structures, along with noisy-logical (i.e., causal) functions for integrating the influences of multiple causes on a single effect, human judgments about causal strength and structure can be predicted accurately for relatively simple causal structures. Dynamic models of learning based on the causal framework can explain patterns of acquisition observed with serial presentation of contingency data and are consistent with available neuroimaging data. The approach has been extended to a diverse range of inductive tasks, including category-based and analogical inferences.
A Framework for Inferring Taxonomic Class of Asteroids.
NASA Technical Reports Server (NTRS)
Dotson, J. L.; Mathias, D. L.
2017-01-01
Introduction: Taxonomic classification of asteroids based on their visible / near-infrared spectra or multi band photometry has proven to be a useful tool to infer other properties about asteroids. Meteorite analogs have been identified for several taxonomic classes, permitting detailed inference about asteroid composition. Trends have been identified between taxonomy and measured asteroid density. Thanks to NEOWise (Near-Earth-Object Wide-field Infrared Survey Explorer) and Spitzer (Spitzer Space Telescope), approximately twice as many asteroids have measured albedos than the number with taxonomic classifications. (If one only considers spectroscopically determined classifications, the ratio is greater than 40.) We present a Bayesian framework that provides probabilistic estimates of the taxonomic class of an asteroid based on its albedo. Although probabilistic estimates of taxonomic classes are not a replacement for spectroscopic or photometric determinations, they can be a useful tool for identifying objects for further study or for asteroid threat assessment models. Inputs and Framework: The framework relies upon two inputs: the expected fraction of each taxonomic class in the population and the albedo distribution of each class. Luckily, numerous authors have addressed both of these questions. For example, the taxonomic distribution by number, surface area and mass of the main belt has been estimated and a diameter limited estimate of fractional abundances of the near earth asteroid population was made. Similarly, the albedo distributions for taxonomic classes have been estimated for the combined main belt and NEA (Near Earth Asteroid) populations in different taxonomic systems and for the NEA population specifically. The framework utilizes a Bayesian inference appropriate for categorical data. 
The population fractions provide the prior, while the albedo distributions allow calculation of the likelihood that an albedo measurement is consistent with a given taxonomic class. These inputs allow calculation of the probability that an asteroid with a specified albedo belongs to any given taxonomic class.
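The categorical Bayesian update described above fits in a few lines. The class fractions and albedo distributions below are invented placeholders for illustration, not the population statistics used by the framework.

```python
import math

# Hypothetical prior class fractions and per-class albedo (mean, sd) pairs.
population_fraction = {"C": 0.45, "S": 0.40, "X": 0.15}
albedo_dist = {"C": (0.06, 0.02), "S": (0.20, 0.05), "X": (0.12, 0.06)}

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def class_posterior(albedo):
    """P(class | albedo) is proportional to P(albedo | class) * P(class)."""
    unnorm = {c: population_fraction[c] * normal_pdf(albedo, *albedo_dist[c])
              for c in population_fraction}
    z = sum(unnorm.values())
    return {c: p / z for c, p in unnorm.items()}

post = class_posterior(0.07)   # a dark asteroid
```

With these assumed distributions, a measured albedo of 0.07 yields a posterior heavily concentrated on the dark C class, which is the kind of probabilistic (rather than definitive) classification the framework is designed to deliver for threat-assessment screening.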
VizieR Online Data Catalog: A catalog of exoplanet physical parameters (Foreman-Mackey+, 2014)
NASA Astrophysics Data System (ADS)
Foreman-Mackey, D.; Hogg, D. W.; Morton, T. D.
2017-05-01
The first ingredient for any probabilistic inference is a likelihood function, a description of the probability of observing a specific data set given a set of model parameters. In this particular project, the data set is a catalog of exoplanet measurements and the model parameters are the values that set the shape and normalization of the occurrence rate density. (2 data files).
Deductive and inductive reasoning in obsessive-compulsive disorder.
Pélissier, Marie-Claude; O'Connor, Kieron P
2002-03-01
This study tested the hypothesis that people with obsessive-compulsive disorder (OCD) show an inductive reasoning style distinct from people with generalized anxiety disorder (GAD) and from participants in a non-anxious (NA) control group. The experimental procedure consisted of administering a range of six deductive and inductive tasks and a probabilistic task in order to compare reasoning processes between groups. Recruitment was in the Montreal area within a French-speaking population. The participants were 12 people with OCD, 12 NA controls and 10 people with GAD. Participants completed a series of written and oral reasoning tasks including the Wason Selection Task, a Bayesian probability task and other inductive tasks, designed by the authors. There were no differences between groups in deductive reasoning. On an inductive "bridging task", the participants with OCD always took longer than the NA control and GAD groups to infer a link between two statements and to elaborate on this possible link. The OCD group alone showed a significant decrease in their degree of conviction about an arbitrary statement after inductively generating reasons to support this statement. Differences in probabilistic reasoning replicated those of previous authors. The results pinpoint the importance of examining inference processes in people with OCD in order to further refine the clinical applications of behavioural-cognitive therapy for this disorder.
Hamiltonian Monte Carlo acceleration using surrogate functions with random bases.
Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai
2017-11-01
For big data analysis, the high computational cost of Bayesian methods often limits their applications in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo method, namely, Hamiltonian Monte Carlo. The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.
Word-level language modeling for P300 spellers based on discriminative graphical models
NASA Astrophysics Data System (ADS)
Delgado Saa, Jaime F.; de Pesters, Adriana; McFarland, Dennis; Çetin, Müjdat
2015-04-01
Objective. In this work we propose a probabilistic graphical model framework that uses language priors at the level of words as a mechanism to increase the performance of P300-based spellers. Approach. This paper is concerned with brain-computer interfaces based on P300 spellers. Motivated by P300 spelling scenarios involving communication based on a limited vocabulary, we propose a probabilistic graphical model framework and an associated classification algorithm that uses learned statistical models of language at the level of words. Exploiting such high-level contextual information helps reduce the error rate of the speller. Main results. Our experimental results demonstrate that the proposed approach offers several advantages over existing methods. Most importantly, it increases the classification accuracy while reducing the number of times the letters need to be flashed, increasing the communication rate of the system. Significance. The proposed approach models all the variables in the P300 speller in a unified framework and has the capability to correct errors in previous letters in a word, given the data for the current one. The structure of the model we propose allows the use of efficient inference algorithms, which in turn makes it possible to use this approach in real-time applications.
Yu, Ron X.; Liu, Jie; True, Nick; Wang, Wei
2008-01-01
A major challenge in the post-genome era is to reconstruct regulatory networks from the biological knowledge accumulated to date. The development of tools for identifying direct target genes of transcription factors (TFs) is critical to this endeavor. Given a set of microarray experiments, a probabilistic model called TRANSMODIS has been developed which can infer the direct targets of a TF by integrating sequence motif, gene expression, and ChIP-chip data. The performance of TRANSMODIS was first validated on a set of transcription factor perturbation experiments (TFPEs) involving Pho4p, a well-studied TF in Saccharomyces cerevisiae. TRANSMODIS removed elements of arbitrariness from the manual target gene selection process and produced results that concur with one's intuition. TRANSMODIS was further validated on a genome-wide scale by comparing it with two other methods in Saccharomyces cerevisiae. Its usefulness was then demonstrated by applying it to the identification of direct targets of DAF-16, a critical TF regulating ageing in Caenorhabditis elegans. We found that 189 genes were tightly regulated by DAF-16. In addition, DAF-16 has differential preference for motifs when acting as an activator or repressor, which awaits experimental verification. TRANSMODIS is computationally efficient and robust, making it a useful probabilistic framework for finding immediate targets. PMID:18350157
Probabilistic inference in discrete spaces can be implemented into networks of LIF neurons.
Probst, Dimitri; Petrovici, Mihai A; Bytschok, Ilja; Bill, Johannes; Pecevski, Dejan; Schemmel, Johannes; Meier, Karlheinz
2015-01-01
The means by which cortical neural networks are able to efficiently solve inference problems remains an open question in computational neuroscience. Recently, abstract models of Bayesian computation in neural circuits have been proposed, but they lack a mechanistic interpretation at the single-cell level. In this article, we describe a complete theoretical framework for building networks of leaky integrate-and-fire neurons that can sample from arbitrary probability distributions over binary random variables. We test our framework for a model inference task based on a psychophysical phenomenon (the Knill-Kersten optical illusion) and further assess its performance when applied to randomly generated distributions. As the local computations performed by the network strongly depend on the interaction between neurons, we compare several types of couplings mediated by either single synapses or interneuron chains. Due to its robustness to substrate imperfections such as parameter noise and background noise correlations, our model is particularly interesting for implementation on novel, neuro-inspired computing architectures, which can thereby serve as a fast, low-power substrate for solving real-world inference problems.
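At the level of abstraction the paper builds on, such networks sample from Boltzmann distributions over binary variables. A minimal non-neural sketch of that target computation, plain Gibbs sampling in which each unit's total input plays the role of a membrane potential, might look like this (weights and biases are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Boltzmann distribution over binary z in {0,1}^3:
# p(z) ∝ exp(0.5 * z^T W z + b^T z), W symmetric with zero diagonal
W = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, 0.5],
              [-1.0, 0.5, 0.0]])
b = np.array([-0.5, 0.2, 0.1])

def gibbs(n_sweeps=10000):
    """Estimate the marginals p(z_k = 1) by Gibbs sampling; each unit's
    conditional is a logistic function of its total input, the analogue of
    a membrane potential in the spiking-network implementation."""
    z = rng.integers(0, 2, size=3).astype(float)
    totals = np.zeros(3)
    for _ in range(n_sweeps):
        for k in range(3):
            u = W[k] @ z + b[k]          # total input to unit k
            z[k] = float(rng.uniform() < 1.0 / (1.0 + np.exp(-u)))
        totals += z
    return totals / n_sweeps

print(gibbs())  # estimated marginals p(z_k = 1)
```

The paper's contribution is showing how deterministic leaky integrate-and-fire dynamics can approximate exactly this sampling process; the sketch only states the computational target.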
Jang, Sumin; Choubey, Sandeep; Furchtgott, Leon; Zou, Ling-Nan; Doyle, Adele; Menon, Vilas; Loew, Ethan B; Krostag, Anne-Rachel; Martinez, Refugio A; Madisen, Linda; Levi, Boaz P; Ramanathan, Sharad
2017-01-01
The complexity of gene regulatory networks that lead multipotent cells to acquire different cell fates makes a quantitative understanding of differentiation challenging. Using a statistical framework to analyze single-cell transcriptomics data, we infer the gene expression dynamics of early mouse embryonic stem (mES) cell differentiation, uncovering discrete transitions across nine cell states. We validate the predicted transitions across discrete states using flow cytometry. Moreover, using live-cell microscopy, we show that individual cells undergo abrupt transitions from a naïve to primed pluripotent state. Using the inferred discrete cell states to build a probabilistic model for the underlying gene regulatory network, we further predict and experimentally verify that these states have unique responses to perturbations, thus defining them functionally. Our study provides a framework to infer the dynamics of differentiation from single cell transcriptomics data and to build predictive models of the gene regulatory networks that drive the sequence of cell fate decisions during development. DOI: http://dx.doi.org/10.7554/eLife.20487.001 PMID:28296635
Truccolo, Wilson
2017-01-01
This review presents a perspective on capturing collective dynamics in recorded neuronal ensembles based on multivariate point process models, inference of low-dimensional dynamics and coarse graining of spatiotemporal measurements. A general probabilistic framework for continuous time point processes is reviewed, with an emphasis on multivariate nonlinear Hawkes processes with exogenous inputs. A point process generalized linear model (PP-GLM) framework for the estimation of discrete time multivariate nonlinear Hawkes processes is described. The approach is illustrated with the modeling of collective dynamics in neocortical neuronal ensembles recorded in human and non-human primates, and prediction of single-neuron spiking. A complementary approach to capture collective dynamics based on low-dimensional dynamics (“order parameters”) inferred via latent state-space models with point process observations is presented. The approach is illustrated by inferring and decoding low-dimensional dynamics in primate motor cortex during naturalistic reach and grasp movements. Finally, we briefly review hypothesis tests based on conditional inference and spatiotemporal coarse graining for assessing collective dynamics in recorded neuronal ensembles. PMID:28336305
Inferring Social Influence of Anti-Tobacco Mass Media Campaign.
Zhan, Qianyi; Zhang, Jiawei; Yu, Philip S; Emery, Sherry; Xie, Junyuan
2017-07-01
Anti-tobacco mass media campaigns are designed to influence tobacco users. Such campaigns have been shown to change audience awareness, knowledge, and attitudes, and to produce meaningful behavior change. Anti-smoking television advertising is the most important component of these campaigns. Meanwhile, successful online social networks are creating a new media environment, yet little is known about the relation between social conversations and anti-tobacco campaigns. This paper aims to infer the social influence of these campaigns; the problem is formally referred to as the Social Influence inference of anti-Tobacco mass mEdia campaigns (Site) problem. To address the Site problem, a novel influence inference framework, TV advertising social influence estimation (Asie), is proposed based on our analysis of two real anti-tobacco campaigns. Asie divides audience attitudes toward TV ads into three distinct stages: 1) cognitive; 2) affective; and 3) conative. Audience online reactions at each of these three stages are modeled by Asie with specific probabilistic models based on the synergistic influences of both online social friends and offline TV ads. Extensive experiments demonstrate the effectiveness of Asie.
NASA Technical Reports Server (NTRS)
Thacker, B. H.; Mcclung, R. C.; Millwater, H. R.
1990-01-01
An eigenvalue analysis of a typical space propulsion system turbopump blade is presented using an approximate probabilistic analysis methodology. The methodology was developed originally to investigate the feasibility of computing probabilistic structural response using closed-form approximate models. This paper extends the methodology to structures for which simple closed-form solutions do not exist. The finite element method is used for this demonstration, but the concepts apply to any numerical method. The results agree with detailed analysis results and indicate the usefulness of probabilistic approximate analysis in determining efficient solution strategies.
Using expected sequence features to improve basecalling accuracy of amplicon pyrosequencing data.
Rask, Thomas S; Petersen, Bent; Chen, Donald S; Day, Karen P; Pedersen, Anders Gorm
2016-04-22
Amplicon pyrosequencing targets a known genetic region and thus inherently produces reads highly anticipated to have certain features, such as conserved nucleotide sequence and, in the case of protein-coding DNA, an open reading frame. Pyrosequencing errors, consisting mainly of nucleotide insertions and deletions, are on the other hand likely to disrupt open reading frames. Such an inverse relationship between errors and expectation based on prior knowledge can be used advantageously to guide the process known as basecalling, i.e., the inference of nucleotide sequence from raw sequencing data. The new basecalling method described here, named Multipass, implements a probabilistic framework for working with the raw flowgrams obtained by pyrosequencing. For each sequence variant, Multipass calculates the likelihood and nucleotide sequence of the several most likely sequences given the flowgram data. This probabilistic approach enables integration of basecalling into a larger model where other parameters can be incorporated, such as the likelihood of observing a full-length open reading frame at the targeted region. We apply the method to 454 amplicon pyrosequencing data obtained from a malaria virulence gene family, where Multipass generates 20% more error-free sequences than current state-of-the-art methods, and provides sequence characteristics that allow generation of a set of high-confidence error-free sequences. This novel method can be used to increase the accuracy of existing and future amplicon sequencing data, particularly where extensive prior knowledge is available about the obtained sequences, for example in analysis of the immunoglobulin VDJ region, where Multipass can be combined with a model for the known recombining germline genes. Multipass is available for Roche 454 data at http://www.cbs.dtu.dk/services/MultiPass-1.0 , and the concept can potentially be implemented for other sequencing technologies as well.
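A toy version of the underlying idea, scoring candidate sequences by a Gaussian flowgram likelihood and combining it with a reading-frame prior, could look like the following. This is not the Multipass algorithm itself; the flow order, noise level, and candidate reads are invented for illustration:

```python
import numpy as np

flow_order = "TACG"
sigma = 0.25

def flowgram_loglik(seq, signals):
    """Log-likelihood of a candidate read: each flow's signal is modeled as
    Normal(homopolymer run length, sigma) for the flowed nucleotide."""
    expected, i = [], 0
    for base in (flow_order * len(signals))[:len(signals)]:
        run = 0
        while i < len(seq) and seq[i] == base:
            run += 1
            i += 1
        expected.append(run)
    if i < len(seq):
        return float("-inf")  # candidate needs more flows than were observed
    resid = np.asarray(signals, float) - np.asarray(expected, float)
    return float(-0.5 * np.sum(resid ** 2) / sigma ** 2)

def frame_logprior(seq, bonus=2.0):
    """Toy prior rewarding sequences whose length preserves the reading frame."""
    return bonus if len(seq) % 3 == 0 else 0.0

# observed flow signals for flows T A C G T A; an insertion or deletion
# mostly shows up as one flow's run length shifted by about 1
signals = [2.1, 1.0, 0.0, 1.1, 0.1, 0.9]
candidates = ["TTAGA", "TAGA", "TTAG", "TTAGAA"]
scores = {s: flowgram_loglik(s, signals) + frame_logprior(s) for s in candidates}
print(max(scores, key=scores.get))  # → TTAGA
```

In the paper's setting the prior term is far richer (full open-reading-frame or germline-gene models), but the additive combination of flowgram likelihood and sequence prior is the same.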
Probabilistic classifiers with high-dimensional data
Kim, Kyung In; Simon, Richard
2011-01-01
For medical classification problems, it is often desirable to have a probability associated with each class. Probabilistic classifiers have received relatively little attention for small n, large p classification problems despite their importance in medical decision making. In this paper, we introduce two criteria for the assessment of probabilistic classifiers, well-calibratedness and refinement, and develop corresponding evaluation measures. We evaluated several published high-dimensional probabilistic classifiers and developed two extensions of the Bayesian compound covariate classifier. Based on simulation studies and analysis of gene expression microarray data, we found that proper probabilistic classification is more difficult than deterministic classification. It is important to ensure that a probabilistic classifier is well calibrated, or at least not “anticonservative,” using the methods developed here. We provide this evaluation for several probabilistic classifiers and also evaluate their refinement as a function of sample size under weak and strong signal conditions. We also present a cross-validation method for evaluating the calibration and refinement of any probabilistic classifier on any data set. PMID:21087946
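A minimal sketch of the well-calibratedness check described here: bin the predicted probabilities and compare each bin's mean prediction with its observed class frequency. The synthetic data and the overconfidence distortion below are assumptions for illustration:

```python
import numpy as np

def calibration_table(probs, labels, n_bins=10):
    """Per bin: (mean predicted probability, observed class-1 frequency, count)."""
    probs, labels = np.asarray(probs), np.asarray(labels)
    edges = np.linspace(0, 1, n_bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs >= lo) & ((probs < hi) if hi < 1 else (probs <= hi))
        if mask.sum() == 0:
            continue
        rows.append((probs[mask].mean(), labels[mask].mean(), int(mask.sum())))
    return rows

rng = np.random.default_rng(0)
p_true = rng.uniform(size=5000)
y = (rng.uniform(size=5000) < p_true).astype(int)

# a well-calibrated classifier reports p_true; an "anticonservative" one
# pushes its probabilities toward the extremes (overconfident)
overconfident = np.clip(p_true + 0.3 * np.sign(p_true - 0.5), 0, 1)

for name, p in [("calibrated", p_true), ("overconfident", overconfident)]:
    gap = max(abs(pred - obs) for pred, obs, _ in calibration_table(p, y))
    print(name, round(gap, 3))
```

Large per-bin gaps, with predictions more extreme than observed frequencies, are the anticonservative signature the paper warns about.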
Failed rib region prediction in a human body model during crash events with precrash braking.
Guleyupoglu, B; Koya, B; Barnard, R; Gayzik, F S
2018-02-28
The objective of this study is twofold. We used a validated human body finite element model to study the predicted chest injury (focusing on rib fracture as a function of element strain) under varying levels of simulated precrash braking. Furthermore, we compare deterministic and probabilistic methods of rib injury prediction in the computational model. The Global Human Body Models Consortium (GHBMC) M50-O model was gravity settled in the driver position of a generic interior equipped with an advanced 3-point belt and airbag. Twelve cases were investigated with permutations for failure, precrash braking system, and crash severity. The severities used were median (17 kph), severe (34 kph), and New Car Assessment Program (NCAP; 56.4 kph). Cases with failure enabled removed rib cortical bone elements once 1.8% effective plastic strain was exceeded. Alternatively, a probabilistic framework from the literature was used to predict rib failure. Both the probabilistic and deterministic methods take location (anterior, lateral, and posterior) into consideration. The deterministic method is based on a rubric that defines failed rib regions by a threshold for contiguous failed elements. The probabilistic method depends on age-based strain and failure functions. Kinematics for the two methods were similar (peak max deviation: ΔX head = 17 mm; ΔZ head = 4 mm; ΔX thorax = 5 mm; ΔZ thorax = 1 mm). Seat belt forces at the time of probabilistic failed region initiation were lower than those at deterministic failed region initiation. The probabilistic method predicted more failed rib regions (an analog for fracture) than the deterministic method in all but one case, where they were equal. The failed region patterns between models are similar; however, differences arise because element elimination reduces stress, causing probabilistic failed regions to continue accumulating after no further deterministic failed regions would be predicted. Both the probabilistic and deterministic methods indicate similar trends with regard to the effect of precrash braking; however, there are tradeoffs. The deterministic failed region method is more spatially sensitive to failure and more sensitive to belt loads. The probabilistic failed region method allows for increased postprocessing capability with respect to age, and it predicted more failed regions than the deterministic method due to differences in force distribution.
Location Contexts of User Check-Ins to Model Urban Geo Life-Style Patterns
Hasan, Samiul; Ukkusuri, Satish V.
2015-01-01
Geo-location data from social media offers us information, in new ways, to understand people's attitudes and interests through their activity choices. In this paper, we explore the idea of inferring individual life-style patterns from activity-location choices revealed in social media. We present a model to understand life-style patterns using the contextual information (e.g., location categories) of user check-ins. Probabilistic topic models are developed to infer individual geo life-style patterns from two perspectives: (i) characterizing the patterns of user interest in different types of places and (ii) characterizing the patterns of user visits to different neighborhoods. The method is applied to a dataset of Foursquare check-ins of users from New York City. The co-existence of several location contexts and the corresponding probabilities in a given pattern provide useful information about user interests and choices. It is found that geo life-style patterns contain similar items: either nearby neighborhoods or similar location categories. The semantic and geographic proximity of the items in a pattern reflects the hidden regularity in user preferences and location choice behavior. PMID:25970430
A Bayesian state-space approach for damage detection and classification
NASA Astrophysics Data System (ADS)
Dzunic, Zoran; Chen, Justin G.; Mobahi, Hossein; Büyüköztürk, Oral; Fisher, John W.
2017-11-01
The problem of automatic damage detection in civil structures is complex and requires a system that can interpret collected sensor data into meaningful information. We apply our recently developed switching Bayesian model for dependency analysis to the problems of damage detection and classification. The model relies on a state-space approach that accounts for noisy measurement processes and missing data, which also infers the statistical temporal dependency between measurement locations signifying the potential flow of information within the structure. A Gibbs sampling algorithm is used to simultaneously infer the latent states, parameters of the state dynamics, the dependence graph, and any changes in behavior. By employing a fully Bayesian approach, we are able to characterize uncertainty in these variables via their posterior distribution and provide probabilistic estimates of the occurrence of damage or a specific damage scenario. We also implement a single class classification method which is more realistic for most real world situations where training data for a damaged structure is not available. We demonstrate the methodology with experimental test data from a laboratory model structure and accelerometer data from a real world structure during different environmental and excitation conditions.
Morabia, Alfredo
2005-01-01
Epidemiological methods, which combine population thinking and group comparisons, can primarily identify causes of disease in populations. There is therefore a tension between our intuitive notion of a cause, which we want to be deterministic and invariant at the individual level, and the epidemiological notion of causes, which are invariant only at the population level. Epidemiologists have heretofore given a pragmatic solution to this tension. Causal inference in epidemiology consists in checking the logical coherence of a causality statement and determining whether what has been found grossly contradicts what we think we already know: How strong is the association? Is there a dose-response relationship? Does the cause precede the effect? Is the effect biologically plausible? Etc. This approach to causal inference can be traced back to the English philosophers David Hume and John Stuart Mill. On the other hand, the mode of establishing causality devised by Jakob Henle and Robert Koch, which has been fruitful in bacteriology, requires that in every instance the effect invariably follows the cause (e.g., inoculation of the Koch bacillus and tuberculosis). This is incompatible with epidemiological causality, which has to deal with probabilistic effects (e.g., smoking and lung cancer) and is therefore invariant only for the population.
Phylogenetic Invariants for Metazoan Mitochondrial Genome Evolution.
Sankoff; Blanchette
1998-01-01
The method of phylogenetic invariants was developed to apply to aligned sequence data generated, according to a stochastic substitution model, for N species related through an unknown phylogenetic tree. The invariants are functions of the probabilities of the observable N-tuples, which are identically zero, over all choices of branch length, for some trees. Evaluating the invariants associated with all possible trees, using observed N-tuple frequencies over all sequence positions, enables us to rapidly infer the generating tree. An aspect of evolution at the genomic level much studied recently is the rearrangement of gene order along the chromosome from one species to another. Instead of the substitutions responsible for sequence evolution, we examine the non-local processes responsible for genome rearrangements, such as inversion of arbitrarily long segments of chromosomes. By treating the potential adjacency of each possible pair of genes as a "position," an appropriate "substitution" model can be recognized as governing the rearrangement process, and a probabilistically principled phylogenetic inference can be set up. We calculate the invariants for this process for N=5, and apply them to mitochondrial genome data from coelomate metazoans, showing how they resolve key aspects of branching order.
Robson, Barry
2016-12-01
The Q-UEL language of XML-like tags and the associated software applications are providing a valuable toolkit for Evidence Based Medicine (EBM). In this paper the already existing applications, data bases, and tags are brought together with new ones. The particular Q-UEL embodiment used here is the BioIngine. The main challenge is one of bringing together the methods of symbolic reasoning and calculative probabilistic inference that underlie EBM and medical decision making. Some space is taken to review this background. The unification is greatly facilitated by Q-UEL's roots in the notation and algebra of Dirac, and by extending Q-UEL into the Wolfram programming environment. Further, the overall problem of integration is also a relatively simple one because of the nature of Q-UEL as a language for interoperability in healthcare and biomedicine, while the notion of workflow is facilitated because of the EBM best practice known as PICO. What remains difficult is achieving a high degree of overall automation because of a well-known difficulty in capturing human expertise in computers: the Feigenbaum bottleneck. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dominating Scale-Free Networks Using Generalized Probabilistic Methods
Molnár, F.; Derzsy, N.; Czabarka, É.; Székely, L.; Szymanski, B. K.; Korniss, G.
2014-01-01
We study ensemble-based graph-theoretical methods aiming to approximate the size of the minimum dominating set (MDS) in scale-free networks. We analyze both analytical upper bounds of dominating sets and numerical realizations for applications. We propose two novel probabilistic dominating set selection strategies that are applicable to heterogeneous networks. One of them obtains the smallest probabilistic dominating set and also outperforms the deterministic degree-ranked method. We show that a degree-dependent probabilistic selection method becomes optimal in its deterministic limit. In addition, we also find the precise limit where selecting high-degree nodes exclusively becomes inefficient for network domination. We validate our results on several real-world networks, and provide highly accurate analytical estimates for our methods. PMID:25200937
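One plausible reading of a degree-dependent probabilistic selection strategy (a sketch, not the authors' exact rule) is to include each node with a probability increasing in its degree, then repair any node left undominated. The graph and parameters below are invented for illustration:

```python
import random

def probabilistic_dominating_set(adj, alpha=1.0, seed=0):
    """Degree-dependent probabilistic heuristic for a small dominating set:
    sample nodes with probability proportional to degree, then patch any
    node that ends up neither selected nor adjacent to a selected node."""
    rng = random.Random(seed)
    max_deg = max(len(nbrs) for nbrs in adj.values())
    dom = {u for u in adj if rng.random() < alpha * len(adj[u]) / max_deg}
    # repair pass: guarantees the dominating property
    for u in adj:
        if u not in dom and not dom & set(adj[u]):
            # add u's highest-degree neighbor (or u itself if isolated)
            dom.add(max(adj[u], key=lambda v: len(adj[v]), default=u))
    return dom

# toy graph: a hub (node 0) attached to a short path
adj = {
    0: [1, 2, 3, 4],
    1: [0], 2: [0], 3: [0],
    4: [0, 5], 5: [4, 6], 6: [5],
}
dom = probabilistic_dominating_set(adj)
print(sorted(dom))
```

Biasing selection toward high-degree nodes mirrors the degree-ranked deterministic method the paper compares against; the repair pass makes the sampled set a valid dominating set.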
Is probabilistic bias analysis approximately Bayesian?
MacLehose, Richard F.; Gustafson, Paul
2011-01-01
Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. While the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well. PMID:22157311
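The iterative sampling procedure the abstract refers to can be sketched as follows: draw sensitivity and specificity from prior distributions, back-correct the observed exposure counts, and summarize the resulting odds-ratio distribution. The counts and prior ranges below are invented for illustration, with differential misclassification assumed (cases recalled better than controls):

```python
import numpy as np

rng = np.random.default_rng(0)

# observed (possibly misclassified) exposure counts, hypothetical study
case_exp, case_n = 215, 1664
ctrl_exp, ctrl_n = 668, 4964

def true_exposed(obs_exp, n, se, sp):
    # invert the misclassification model: obs = true*se + (n - true)*(1 - sp)
    return (obs_exp - n * (1 - sp)) / (se + sp - 1)

ors = []
for _ in range(5000):
    # draw classification parameters from (assumed) prior distributions
    se_ca, sp_ca = rng.uniform(0.85, 0.99), rng.uniform(0.90, 0.99)
    se_co, sp_co = rng.uniform(0.75, 0.95), rng.uniform(0.90, 0.99)
    a = true_exposed(case_exp, case_n, se_ca, sp_ca)
    c = true_exposed(ctrl_exp, ctrl_n, se_co, sp_co)
    b, d = case_n - a, ctrl_n - c
    if min(a, b, c, d) <= 0:
        continue  # discard draws implying impossible corrected counts
    ors.append((a * d) / (b * c))

ors = np.array(ors)
print(np.percentile(ors, [2.5, 50, 97.5]).round(2))
```

The resemblance to Bayesian adjustment that the paper analyzes comes from exactly this structure: priors on bias parameters propagated through a deterministic correction, without a formal posterior update.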
Global/local methods for probabilistic structural analysis
NASA Technical Reports Server (NTRS)
Millwater, H. R.; Wu, Y.-T.
1993-01-01
A probabilistic global/local method is proposed to reduce the computational requirements of probabilistic structural analysis. A coarser global model is used for most of the computations, with a more refined local model used only at key probabilistic conditions. The global model is used to establish the cumulative distribution function (cdf) and the Most Probable Point (MPP). The local model then uses the predicted MPP to adjust the cdf value. The global/local method is used within the advanced mean value probabilistic algorithm. The local model can be more refined with respect to the global model in terms of finer mesh, smaller time step, tighter tolerances, etc., and can be used with linear or nonlinear models. The basis for this approach is described in terms of the correlation between the global and local models, which can be estimated from the global and local MPPs. A numerical example is presented using the NESSUS probabilistic structural analysis program, with the finite element method used for the structural modeling. The results clearly indicate significant computer savings with minimal loss in accuracy.
A decision network account of reasoning about other people's choices
Jern, Alan; Kemp, Charles
2015-01-01
The ability to predict and reason about other people's choices is fundamental to social interaction. We propose that people reason about other people's choices using mental models that are similar to decision networks. Decision networks are extensions of Bayesian networks that incorporate the idea that choices are made in order to achieve goals. In our first experiment, we explore how people predict the choices of others. Our remaining three experiments explore how people infer the goals and knowledge of others by observing the choices that they make. We show that decision networks account for our data better than alternative computational accounts that do not incorporate the notion of goal-directed choice or that do not rely on probabilistic inference. PMID:26010559
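A minimal decision-network-flavored sketch: infer a hidden goal from observed choices, then predict the next choice by marginalizing over that goal. The options, utilities, and softmax choice rule are illustrative assumptions, not the paper's experimental materials:

```python
import numpy as np

# options and the utility each one provides toward each possible goal
options = {"salad": {"healthy": 1.0, "tasty": 0.3},
           "burger": {"healthy": 0.1, "tasty": 1.0},
           "soup":   {"healthy": 0.7, "tasty": 0.5}}
goals = ["healthy", "tasty"]
beta = 3.0  # choice determinism: higher means more reliably goal-directed

def choice_probs(goal):
    """p(choice | goal): softmax over the utility each option gives the goal."""
    u = np.array([options[o][goal] for o in options])
    e = np.exp(beta * u)
    return dict(zip(options, e / e.sum()))

def infer_goal(observed_choices):
    """p(goal | choices) ∝ p(goal) * prod_t p(choice_t | goal)."""
    post = np.ones(len(goals)) / len(goals)
    for c in observed_choices:
        post = post * np.array([choice_probs(g)[c] for g in goals])
    return dict(zip(goals, post / post.sum()))

def predict_next(observed_choices):
    """Predict the next choice by marginalizing over the inferred goal."""
    post = infer_goal(observed_choices)
    return {o: sum(post[g] * choice_probs(g)[o] for g in goals) for o in options}

print(infer_goal(["salad", "soup"]))  # posterior should favor "healthy"
```

The goal node is what makes this a decision network rather than a plain Bayesian network: choices are generated to achieve goals, and observing choices licenses inference back to the goal.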
Statistically optimal perception and learning: from behavior to neural representations
Fiser, József; Berkes, Pietro; Orbán, Gergő; Lengyel, Máté
2010-01-01
Human perception has recently been characterized as statistical inference based on noisy and ambiguous sensory inputs. Moreover, suitable neural representations of uncertainty have been identified that could underlie such probabilistic computations. In this review, we argue that learning an internal model of the sensory environment is another key aspect of the same statistical inference procedure and thus perception and learning need to be treated jointly. We review evidence for statistically optimal learning in humans and animals, and reevaluate possible neural representations of uncertainty based on their potential to support statistically optimal learning. We propose that spontaneous activity can have a functional role in such representations leading to a new, sampling-based, framework of how the cortex represents information and uncertainty. PMID:20153683
Action understanding as inverse planning.
Baker, Chris L; Saxe, Rebecca; Tenenbaum, Joshua B
2009-12-01
Humans are adept at inferring the mental states underlying other agents' actions, such as goals, beliefs, desires, emotions and other thoughts. We propose a computational framework based on Bayesian inverse planning for modeling human action understanding. The framework represents an intuitive theory of intentional agents' behavior based on the principle of rationality: the expectation that agents will plan approximately rationally to achieve their goals, given their beliefs about the world. The mental states that caused an agent's behavior are inferred by inverting this model of rational planning using Bayesian inference, integrating the likelihood of the observed actions with the prior over mental states. This approach formalizes in precise probabilistic terms the essence of previous qualitative approaches to action understanding based on an "intentional stance" [Dennett, D. C. (1987). The intentional stance. Cambridge, MA: MIT Press] or a "teleological stance" [Gergely, G., Nádasdy, Z., Csibra, G., & Biró, S. (1995). Taking the intentional stance at 12 months of age. Cognition, 56, 165-193]. In three psychophysical experiments using animated stimuli of agents moving in simple mazes, we assess how well different inverse planning models based on different goal priors can predict human goal inferences. The results provide quantitative evidence for an approximately rational inference mechanism in human goal inference within our simplified stimulus paradigm, and for the flexible nature of goal representations that human observers can adopt. We discuss the implications of our experimental results for human action understanding in real-world contexts, and suggest how our framework might be extended to capture other kinds of mental state inferences, such as inferences about beliefs, or inferring whether an entity is an intentional agent.
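The inverse-planning recipe (assume a softmax-rational planner for each candidate goal, then invert it with Bayes' rule) can be sketched in a one-dimensional toy corridor; the paper's stimuli are 2-D mazes, and all settings here are illustrative:

```python
import numpy as np

# 1-D corridor world; the possible goals are the two endpoints
goals = {"left": 0, "right": 10}
beta = 2.0  # rationality parameter: higher means more deterministic planning

def action_probs(state, goal):
    """Softmax-rational agent: prefers steps that reduce distance to the goal."""
    moves = {"L": -1, "R": +1}
    utils = {a: -abs((state + d) - goal) for a, d in moves.items()}
    z = sum(np.exp(beta * u) for u in utils.values())
    return {a: np.exp(beta * u) / z for a, u in utils.items()}

def infer_goal(start, observed_actions, prior=None):
    """Invert the planner: p(goal | actions) ∝ p(actions | goal) p(goal)."""
    post = dict(prior or {g: 1 / len(goals) for g in goals})
    state = start
    for a in observed_actions:
        for g, loc in goals.items():
            post[g] *= action_probs(state, loc)[a]
        state += {"L": -1, "R": +1}[a]
    z = sum(post.values())
    return {g: p / z for g, p in post.items()}

post = infer_goal(start=5, observed_actions=["R", "R", "R"])
print({g: round(p, 3) for g, p in post.items()})
```

Each observed rightward step multiplies the likelihood ratio in favor of the "right" goal, so a short action sequence suffices for a confident goal inference, mirroring the rapid inferences human observers make.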
[National Health and Nutrition Survey 2012: design and coverage].
Romero-Martínez, Martín; Shamah-Levy, Teresa; Franco-Núñez, Aurora; Villalpando, Salvador; Cuevas-Nasu, Lucía; Gutiérrez, Juan Pablo; Rivera-Dommarco, Juan Ángel
2013-01-01
To describe the design and population coverage of the National Health and Nutrition Survey 2012 (NHNS 2012). The design of the NHNS 2012 is reported as a probabilistic, population-based survey with multi-stage, stratified sampling, together with the sample's inferential properties, the logistical procedures, and the coverage obtained. The household response rate for the NHNS 2012 was 87%, yielding complete data from 50,528 households, 96,031 individual interviews selected by age, and 14,104 interviews of ambulatory health service users. The probabilistic design of the NHNS 2012 and its coverage make it possible to draw inferences about health and nutrition conditions, health program coverage, and access to health services. Because of the complex design, all estimations from the NHNS 2012 must use the survey design variables: weights, primary sampling units, and strata.
Improved Point-source Detection in Crowded Fields Using Probabilistic Cataloging
NASA Astrophysics Data System (ADS)
Portillo, Stephen K. N.; Lee, Benjamin C. G.; Daylan, Tansu; Finkbeiner, Douglas P.
2017-10-01
Cataloging is challenging in crowded fields because sources are extremely covariant with their neighbors and blending makes even the number of sources ambiguous. We present the first optical probabilistic catalog, cataloging a crowded (~0.1 sources per pixel brighter than 22nd mag in F606W) Sloan Digital Sky Survey r-band image from M2. Probabilistic cataloging returns an ensemble of catalogs inferred from the image and thus can capture source-source covariance and deblending ambiguities. By comparing to a traditional catalog of the same image and a Hubble Space Telescope catalog of the same region, we show that our catalog ensemble better recovers sources from the image. It goes more than a magnitude deeper than the traditional catalog while having a lower false-discovery rate brighter than 20th mag. We also present an algorithm for reducing this catalog ensemble to a condensed catalog that is similar to a traditional catalog, except that it explicitly marginalizes over source-source covariances and nuisance parameters. We show that this condensed catalog has a similar completeness and false-discovery rate to the catalog ensemble. Future telescopes will be more sensitive, and thus more of their images will be crowded. Probabilistic cataloging performs better than existing software in crowded fields and so should be considered when creating photometric pipelines in the Large Synoptic Survey Telescope era.
Probabilistic Learning in Junior High School: Investigation of Student Probabilistic Thinking Levels
NASA Astrophysics Data System (ADS)
Kurniasih, R.; Sujadi, I.
2017-09-01
This paper investigates students' levels of probabilistic thinking, i.e. their reasoning about uncertainty in probability material. The subjects were grade 8 junior high school students. The main instrument was the researcher, supported by a probabilistic thinking skills test and interview guidelines. Data were analyzed using method triangulation. The results showed that before instruction the students' probabilistic thinking was at the subjective and transitional levels, and that it changed after instruction: some 8th-grade students reached the numerical level, the highest of the probabilistic thinking levels. Students' probabilistic thinking levels can serve as a reference for designing learning materials and strategies.
Inferences about nested subsets structure when not all species are detected
Cam, E.; Nichols, J.D.; Hines, J.E.; Sauer, J.R.
2000-01-01
Comparisons of species composition among ecological communities of different size have often provided evidence that the species in communities with lower species richness form nested subsets of the species in larger communities. In the vast majority of studies, the question of nested subsets has been addressed using information on presence-absence, where a '0' is interpreted as the absence of a given species from a given location. Most of the methodological discussion in earlier studies investigating nestedness concerns the approach to generation of model-based matrices. However, it is most likely that in many situations investigators cannot detect all the species present in the location sampled. The possibility that zeros in incidence matrices reflect nondetection rather than absence of species has not been considered in studies addressing nested subsets, even though the position of zeros in these matrices forms the basis of earlier inference methods. These sampling artifacts are likely to lead to erroneous conclusions about both variation over space in species richness and the degree of similarity of the various locations. Here we propose an approach to investigation of nestedness, based on statistical inference methods explicitly incorporating species detection probability, that take into account the probabilistic nature of the sampling process. We use presence-absence data collected under Pollock's robust capture-recapture design, and resort to an estimator of species richness originally developed for closed populations to assess the proportion of species shared by different locations. We develop testable predictions corresponding to the null hypothesis of a nonnested pattern, and an alternative hypothesis of perfect nestedness. We also present an index for assessing the degree of nestedness of a system of ecological communities. We illustrate our approach using avian data from the North American Breeding Bird Survey collected in Florida Keys.
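The abstract's capture-recapture estimator is not reproduced here; as a minimal illustration of how repeated-visit incidence data can correct richness estimates for imperfect detection, the sketch below uses the classical Chao2 incidence-based estimator, which may differ from the closed-population estimator the authors actually employ. The visit matrix is invented.

```python
def chao2(incidence_rows):
    """Chao2 lower-bound estimate of species richness from an
    incidence (visit x species) 0/1 matrix; it adds a correction for
    species that were present but never detected."""
    # number of visits in which each species was detected
    counts = [sum(col) for col in zip(*incidence_rows)]
    s_obs = sum(1 for c in counts if c > 0)
    q1 = sum(1 for c in counts if c == 1)  # species seen in exactly 1 visit
    q2 = sum(1 for c in counts if c == 2)  # species seen in exactly 2 visits
    if q2 == 0:
        return s_obs + q1 * (q1 - 1) / 2.0  # bias-corrected fallback
    return s_obs + q1 * q1 / (2.0 * q2)

# 3 visits x 5 species; two singletons, one doubleton, one never detected
visits = [
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
est = chao2(visits)  # exceeds the 4 species actually observed
```

Comparing estimated (rather than raw observed) richness across locations is what removes the nondetection artifact the abstract warns about.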
Eaton, Jeffrey W.; Bao, Le
2017-01-01
Objectives: The aim of the study was to propose and demonstrate an approach that allows for additional nonsampling uncertainty about HIV prevalence measured at antenatal clinic sentinel surveillance (ANC-SS) in model-based inferences about trends in HIV incidence and prevalence. Design: Mathematical model fitted to surveillance data with Bayesian inference. Methods: We introduce a variance inflation parameter σ²infl that accounts for the uncertainty of nonsampling errors in ANC-SS prevalence; it is additive to the sampling error variance. Three approaches are tested for estimating σ²infl using ANC-SS and household survey data from 40 subnational regions in nine countries in sub-Saharan Africa, as defined in the UNAIDS 2016 estimates. Methods were compared using in-sample fit and out-of-sample prediction of ANC-SS data, fit to household survey prevalence data, and the computational implications. Results: Introducing the additional variance parameter σ²infl increased the error variance around ANC-SS prevalence observations by a median factor of 2.7 (interquartile range 1.9-3.8). Using only sampling error in ANC-SS prevalence (σ²infl = 0), coverage of 95% prediction intervals was 69% in out-of-sample prediction tests; this increased to 90% after introducing the additional variance parameter. The revised probabilistic model improved model fit to household survey prevalence and increased epidemic uncertainty intervals most during the early epidemic period before 2005. Estimating σ²infl did not increase the computational cost of model fitting. Conclusions: We recommend estimating nonsampling error in ANC-SS as an additional parameter in Bayesian inference using the Estimation and Projection Package model. This approach may prove useful for incorporating other data sources, such as routine prevalence from prevention of mother-to-child transmission testing, into future epidemic estimates. PMID:28296801
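The additive variance-inflation idea is simple enough to sketch: the likelihood of each prevalence observation uses a total variance equal to the sampling variance plus the nonsampling term. The Gaussian form and all numbers below are invented for illustration; the EPP model's actual likelihood is more elaborate.

```python
import math

# Minimal sketch of additive variance inflation: total error variance around
# an ANC-SS prevalence observation = sampling variance + sigma2_infl.
# All numeric values are invented.

def log_likelihood(obs_prev, model_prev, sampling_var, sigma2_infl=0.0):
    """Gaussian log-likelihood with total variance = sampling + nonsampling."""
    total_var = sampling_var + sigma2_infl
    return -0.5 * (math.log(2 * math.pi * total_var)
                   + (obs_prev - model_prev) ** 2 / total_var)

# With inflation, a discrepant observation is penalized less harshly,
# which widens the posterior uncertainty intervals around the epidemic curve.
ll_no_infl = log_likelihood(0.12, 0.08, sampling_var=0.0002)
ll_infl = log_likelihood(0.12, 0.08, sampling_var=0.0002, sigma2_infl=0.0006)
```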
Probabilistic reconstruction of GPS vertical ground motion and comparison with GIA models
NASA Astrophysics Data System (ADS)
Husson, Laurent; Bodin, Thomas; Choblet, Gael; Kreemer, Corné
2017-04-01
The vertical position time-series of GPS stations have become long enough for many parts of the world to infer modern rates of vertical ground motion. We use the worldwide compilation of GPS trend velocities of the Nevada Geodetic Laboratory. Those rates are inferred by applying the MIDAS algorithm (Blewitt et al., 2016) to time-series obtained from publicly available data from permanent stations. Because MIDAS filters out seasonality and discontinuities, regardless of their causes, it gives robust long-term rates of vertical ground motion (except where there is significant postseismic deformation). As the stations are unevenly distributed, and because data errors are also highly variable, sometimes to an unknown degree, we use a Bayesian inference method to reconstruct 2D maps of vertical ground motion. Our models are based on a Voronoi tessellation and self-adapt to the spatially variable level of information provided by the data. Instead of providing a unique interpolated surface, each point of the reconstructed surface is defined through a probability density function. We apply our method to a series of vast regions covering entire continents. Not surprisingly, the reconstructed surface at a long wavelength is dominated by the GIA. This result can be exploited to evaluate whether forward models of GIA reproduce geodetic rates within the uncertainties derived from our interpolation, not only at high latitudes where postglacial rebound is fast, but also in more temperate latitudes where, for instance, such rates may compete with modern sea level rise. At shorter wavelengths, the reconstructed surface of vertical ground motion features a variety of identifiable patterns, whose geometries and rates can be mapped. Examples are transient dynamic topography over the convecting mantle, actively deforming domains (mountain belts and active margins), volcanic areas, or anthropogenic contributions.
Bayesian quantitative precipitation forecasts in terms of quantiles
NASA Astrophysics Data System (ADS)
Bentzien, Sabrina; Friederichs, Petra
2014-05-01
Ensemble prediction systems (EPS) for numerical weather prediction on the mesoscale are developed in particular to obtain probabilistic guidance for high-impact weather. An EPS issues not only a deterministic future state of the atmosphere but a sample of possible future states. Ensemble postprocessing then translates such a sample of forecasts into probabilistic measures. This study focuses on probabilistic quantitative precipitation forecasts in terms of quantiles. Quantiles are particularly suitable for describing precipitation at various locations, since no assumption is required on the distribution of precipitation. The focus is on prediction during high-impact events, related to the Volkswagen Stiftung funded project WEX-MOP (Mesoscale Weather Extremes - Theory, Spatial Modeling and Prediction). Quantile forecasts are derived from the raw ensemble and via quantile regression. The neighborhood method and time-lagging are effective tools to inexpensively increase the ensemble spread, which results in more reliable forecasts, especially for extreme precipitation events. Since an EPS provides a large number of potentially informative predictors, variable selection is required in order to obtain a stable statistical model. A Bayesian formulation of quantile regression allows for inference about the selection of predictive covariates through the use of appropriate prior distributions. Moreover, the implementation of an additional process layer for the regression parameters accounts for spatial variation of the parameters. Bayesian quantile regression and its spatially adaptive extension are illustrated for the German-focused mesoscale weather prediction ensemble COSMO-DE-EPS, which has run (pre)operationally since December 2010 at the German Meteorological Service (DWD). Objective out-of-sample verification uses the quantile score (QS), a weighted absolute error between quantile forecasts and observations.
The QS is a proper scoring function and can be decomposed into reliability, resolution, and uncertainty parts. A quantile reliability plot gives detailed insight into the predictive performance of the quantile forecasts.
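The quantile score used for verification above is the standard pinball loss: an asymmetrically weighted absolute error that is minimized in expectation by the true quantile. The precipitation values below are invented.

```python
# Quantile (pinball) score for a quantile forecast q of level tau against
# an observation y. Non-negative, zero iff q == y; a proper scoring function.

def quantile_score(q, y, tau):
    """QS_tau(q, y) = (y - q) * (tau - 1{y < q})."""
    return (y - q) * (tau - (1.0 if y < q else 0.0))

# Verifying a 0.9-quantile precipitation forecast of 12 mm (invented values):
qs_under = quantile_score(q=12.0, y=20.0, tau=0.9)  # observation above quantile
qs_over = quantile_score(q=12.0, y=5.0, tau=0.9)    # observation below quantile
```

For tau = 0.9, exceedances of the forecast quantile are penalized nine times more heavily per millimetre than shortfalls, which is what makes the score sensitive to the upper tail relevant for extreme precipitation.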
A Parallel and Incremental Approach for Data-Intensive Learning of Bayesian Networks.
Yue, Kun; Fang, Qiyu; Wang, Xiaoling; Li, Jin; Liu, Weiyi
2015-12-01
Bayesian network (BN) has been adopted as the underlying model for representing and inferring uncertain knowledge. As the basis of realistic applications centered on probabilistic inferences, learning a BN from data is a critical subject of machine learning, artificial intelligence, and big data paradigms. Currently, it is necessary to extend the classical methods for learning BNs with respect to data-intensive computing or in cloud environments. In this paper, we propose a parallel and incremental approach for data-intensive learning of BNs from massive, distributed, and dynamically changing data by extending the classical scoring and search algorithm and using MapReduce. First, we adopt the minimum description length as the scoring metric and give the two-pass MapReduce-based algorithms for computing the required marginal probabilities and scoring the candidate graphical model from sample data. Then, we give the corresponding strategy for extending the classical hill-climbing algorithm to obtain the optimal structure, as well as that for storing a BN by
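The MDL scoring metric that the abstract's two-pass MapReduce jobs compute from marginal counts can be sketched on a single machine: the score of a node given a candidate parent set is the negative log-likelihood plus a (log N)/2 penalty per free parameter. The data, variable names, and arities below are invented; the distributed count aggregation is omitted.

```python
import math
from collections import Counter

# Single-machine sketch of MDL scoring for one BN node given a parent set,
# from complete discrete data (list of dicts: variable name -> value).

def mdl_score(data, child, parents, arity):
    """MDL = -log-likelihood + (log N)/2 * #free parameters; lower is better."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    log_lik = sum(c * math.log(c / parent_counts[pa])
                  for (pa, _), c in joint.items())
    n_params = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return -log_lik + 0.5 * math.log(n) * n_params

data = [{"A": a, "B": a} for a in (0, 1)] * 8       # B copies A exactly
arity = {"A": 2, "B": 2}
with_parent = mdl_score(data, "B", ["A"], arity)    # model: A -> B
no_parent = mdl_score(data, "B", [], arity)         # model: B independent
```

A hill-climbing search like the one the paper extends would prefer the structure with the lower score, here correctly adding the edge A -> B.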
Life beyond MSE and R2 — improving validation of predictive models with observations
NASA Astrophysics Data System (ADS)
Papritz, Andreas; Nussbaum, Madlene
2017-04-01
Machine learning and statistical predictive methods are evaluated by the closeness of predictions to observations of a test dataset. Common criteria for rating predictive methods are bias and mean square error (MSE), characterizing systematic and random prediction errors. Many studies also report R2-values, but their meaning is not always clear (correlation between observations and predictions, or MSE skill score; Wilks, 2011). The same criteria are also used for choosing tuning parameters of predictive procedures by cross-validation and bagging (e.g. Hastie et al., 2009). For evident reasons, the atmospheric sciences have developed a rich toolbox for forecast verification. Specific criteria have been proposed for evaluating deterministic and probabilistic predictions of binary, multinomial, ordinal and continuous responses (see reviews by Wilks, 2011, Jolliffe and Stephenson, 2012 and Gneiting et al., 2007). It appears that these techniques are not very well known in the part of the geosciences community interested in machine learning. In our presentation we review techniques that offer more insight into the proximity of data and predictions than bias, MSE and R2 alone. We mention only two examples here: (i) Graphing observations vs. predictions is usually more appropriate than the reverse (Piñeiro et al., 2008). (ii) The decomposition of the Brier score (= MSE for probabilistic predictions of binary yes/no data) into reliability and resolution reveals (conditional) bias and the capability of discriminating yes/no observations by the predictions. We illustrate the approaches with applications from digital soil mapping studies. Gneiting, T., Balabdaoui, F., and Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society Series B, 69, 243-268. Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, New York, second edition. Jolliffe, I. T.
and Stephenson, D. B., editors (2012). Forecast Verification: A Practitioner's Guide in Atmospheric Science. Wiley-Blackwell, second edition. Piñeiro, G., Perelman, S., Guerschman, J., and Paruelo, J. (2008). How to evaluate models: Observed vs. predicted or predicted vs. observed? Ecological Modelling, 216, 316-322. Wilks, D. S. (2011). Statistical Methods in the Atmospheric Sciences. Academic Press, third edition.
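The Brier score decomposition mentioned above can be computed directly by pooling forecasts with equal issued probability (Murphy's decomposition, Brier = reliability - resolution + uncertainty). The toy forecast/outcome pairs are invented.

```python
# Murphy's decomposition of the Brier score for probabilistic yes/no
# predictions, binning by distinct issued probabilities. Toy data invented.

def brier_decomposition(forecasts, outcomes):
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    uncertainty = base_rate * (1 - base_rate)
    bins = {}
    for f, o in zip(forecasts, outcomes):
        bins.setdefault(f, []).append(o)
    # reliability: (conditional) bias of each forecast bin
    reliability = sum(len(os) * (f - sum(os) / len(os)) ** 2
                      for f, os in bins.items()) / n
    # resolution: how far bin-conditional event rates sit from the base rate
    resolution = sum(len(os) * (sum(os) / len(os) - base_rate) ** 2
                     for f, os in bins.items()) / n
    return reliability, resolution, uncertainty

forecasts = [0.8, 0.8, 0.8, 0.8, 0.2, 0.2, 0.2, 0.2]
outcomes  = [1,   1,   1,   0,   0,   0,   1,   0]
rel, res, unc = brier_decomposition(forecasts, outcomes)
brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)
```

The identity `brier == rel - res + unc` holds exactly when bins are the distinct forecast values, which is what makes the decomposition diagnostic rather than just descriptive.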
Comparison of probabilistic and deterministic fiber tracking of cranial nerves.
Zolal, Amir; Sobottka, Stephan B; Podlesek, Dino; Linn, Jennifer; Rieger, Bernhard; Juratli, Tareq A; Schackert, Gabriele; Kitzler, Hagen H
2017-09-01
OBJECTIVE The depiction of cranial nerves (CNs) using diffusion tensor imaging (DTI) is of great interest in skull base tumor surgery, and DTI used with deterministic tracking methods has been reported previously. However, there are still no good methods for eliminating noise from the resulting depictions. The authors hypothesized that probabilistic tracking could lead to more accurate results, because it extracts information from the underlying data more efficiently. Moreover, the authors adapted a previously described technique for noise elimination using gradual threshold increases to probabilistic tracking. To evaluate the utility of this new approach, this work provides a comparison of the gradual threshold increase method in probabilistic and in deterministic tracking of CNs. METHODS Both tracking methods were used to depict CNs II, III, V, and the VII+VIII bundle. Depiction of 240 CNs was attempted with each of the above methods in 30 healthy subjects, whose data were obtained from 2 public databases: the Kirby repository (KR) and the Human Connectome Project (HCP). Elimination of erroneous fibers was attempted by gradually increasing the respective thresholds (fractional anisotropy [FA] and probabilistic index of connectivity [PICo]). The results were compared with predefined ground truth images based on corresponding anatomical scans. Two label overlap measures (false-positive error and Dice similarity coefficient) were used to evaluate the success of both methods in depicting the CNs. Moreover, the differences between these parameters obtained from the KR and HCP (with higher angular resolution) databases were evaluated. Additionally, visualization of 10 CNs in 5 clinical cases was attempted with both methods and evaluated by comparing the depictions with intraoperative findings. RESULTS Maximum Dice similarity coefficients were significantly higher with probabilistic tracking (p < 0.001; Wilcoxon signed-rank test).
The false-positive error of the last obtained depiction was also significantly lower in probabilistic than in deterministic tracking (p < 0.001). The HCP data yielded significantly better results in terms of the Dice coefficient in probabilistic tracking (p < 0.001, Mann-Whitney U-test) and in deterministic tracking (p = 0.02). The false-positive errors were smaller in HCP data in deterministic tracking (p < 0.001) and showed a strong trend toward significance in probabilistic tracking (p = 0.06). In the clinical cases, the probabilistic method visualized 7 of 10 attempted CNs accurately, compared with 3 correct depictions with deterministic tracking. CONCLUSIONS High angular resolution DTI scans are preferable for the DTI-based depiction of the cranial nerves. Probabilistic tracking with a gradual PICo threshold increase is more effective for this task than the previously described deterministic tracking with a gradual FA threshold increase and might represent a method that is useful for depicting cranial nerves with DTI since it eliminates the erroneous fibers without manual intervention.
Using Structured Knowledge Representation for Context-Sensitive Probabilistic Modeling
2008-01-01
Probabilistic Algorithmic Knowledge
2005-12-20
standard possible-worlds sense. Although soundness is not required in the basic definition, it does seem to be useful in many applications. Our interest...think of as describing basic facts about the system, such as “the door is closed” or “agent A sent the message m to B”, more complicated formulas are...messages as long as the adversary knows the decryption key. (The function submsg basically implements the inference rules for ⊢DY .) A DY i (hasi(m
Augmenting Latent Dirichlet Allocation and Rank Threshold Detection with Ontologies
2010-03-01
Probabilistic Latent Semantic Indexing (PLSI) is an automated indexing information retrieval model [20]. It is based on a statistical latent class model which is...uses a statistical foundation that is more accurate in finding hidden semantic relationships [20]. The model uses factor analysis of count data, number...principle of statistical inference which asserts that all of the information in a sample is contained in the likelihood function [20]. The statistical
Synaptic and nonsynaptic plasticity approximating probabilistic inference
Tully, Philip J.; Hennig, Matthias H.; Lansner, Anders
2014-01-01
Learning and memory operations in neural circuits are believed to involve molecular cascades of synaptic and nonsynaptic changes that lead to a diverse repertoire of dynamical phenomena at higher levels of processing. Hebbian and homeostatic plasticity, neuromodulation, and intrinsic excitability all conspire to form and maintain memories. But it is still unclear how these seemingly redundant mechanisms could jointly orchestrate learning in a more unified system. To this end, a Hebbian learning rule for spiking neurons inspired by Bayesian statistics is proposed. In this model, synaptic weights and intrinsic currents are adapted on-line upon arrival of single spikes, which initiate a cascade of temporally interacting memory traces that locally estimate probabilities associated with relative neuronal activation levels. Trace dynamics enable synaptic learning to readily demonstrate a spike-timing dependence, stably return to a set-point over long time scales, and remain competitive despite this stability. Beyond unsupervised learning, linking the traces with an external plasticity-modulating signal enables spike-based reinforcement learning. At the postsynaptic neuron, the traces are represented by an activity-dependent ion channel that is shown to regulate the input received by a postsynaptic cell and generate intrinsic graded persistent firing levels. We show how spike-based Hebbian-Bayesian learning can be performed in a simulated inference task using integrate-and-fire (IAF) neurons that are Poisson-firing and background-driven, similar to the preferred regime of cortical neurons. Our results support the view that neurons can represent information in the form of probability distributions, and that probabilistic inference could be a functional by-product of coupled synaptic and nonsynaptic mechanisms operating over several timescales. 
The model provides a biophysical realization of Bayesian computation by reconciling several observed neural phenomena whose functional effects are only partially understood in concert. PMID:24782758
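A drastically simplified, rate-based sketch of the Bayesian-Hebbian idea in this abstract: low-pass traces estimate the activation probabilities of the pre- and postsynaptic units and their coactivation, and the weight tracks the log-odds w = log(p_ij / (p_i p_j)), in the spirit of BCPNN-style rules. The time constant, the epsilon floor, and the binary activity sequences are invented; the paper's spike-based cascades of interacting traces are far richer.

```python
import math

# Rate-based sketch: exponential traces estimate p_i, p_j, p_ij, and the
# synaptic weight is the log-odds of coactivation. Parameters are invented.

def run_traces(pre, post, dt=1.0, tau=50.0, eps=1e-3):
    p_i = p_j = p_ij = eps       # eps floor keeps the log well-defined
    k = dt / tau
    for x, y in zip(pre, post):
        p_i += k * (x - p_i)      # presynaptic activity trace
        p_j += k * (y - p_j)      # postsynaptic activity trace
        p_ij += k * (x * y - p_ij)  # coactivity trace
    return math.log(p_ij / (p_i * p_j))

corr = [1, 1] * 200                     # pre and post always coactive
w_corr = run_traces(corr, corr)         # p_ij > p_i * p_j: positive weight
ind = [1, 0] * 200
w_ind = run_traces(ind, ind[1:] + [0])  # never coactive: negative weight
```

The same traces that drive the weight can, as in the abstract, also set an intrinsic excitability term from log p_j alone, which is what couples synaptic and nonsynaptic plasticity in one estimator.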
Bayesian analysis of the dynamic cosmic web in the SDSS galaxy survey
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leclercq, Florent; Wandelt, Benjamin; Jasche, Jens, E-mail: florent.leclercq@polytechnique.org, E-mail: jasche@iap.fr, E-mail: wandelt@iap.fr
Recent application of the Bayesian algorithm BORG to the Sloan Digital Sky Survey (SDSS) main sample galaxies resulted in the physical inference of the formation history of the observed large-scale structure from its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the cosmic web into four distinct components (voids, sheets, filaments, and clusters) on the basis of the tidal field. Our inference framework automatically and self-consistently propagates typical observational uncertainties to web-type classification. As a result, this study produces accurate cosmographic classification of large-scale structure elements in the SDSS volume. By also providing the history of these structure maps, the approach allows an analysis of the origin and growth of the early traces of the cosmic web present in the initial density field and of the evolution of global quantities such as the volume and mass filling fractions of different structures. For the problem of web-type classification, the results described in this work constitute the first connection between theory and observations at non-linear scales including a physical model of structure formation and the demonstrated capability of uncertainty quantification. A connection between cosmology and information theory using real data also naturally emerges from our probabilistic approach. Our results constitute quantitative chrono-cosmography of the complex web-like patterns underlying the observed galaxy distribution.
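One standard way to implement the four-way classification from the tidal field is a T-web-style rule: count the eigenvalues of the tidal tensor above a threshold; 0, 1, 2 or 3 collapsing directions correspond to void, sheet, filament and cluster. This sketch is not necessarily the paper's exact criterion, and the example tensors and threshold are invented.

```python
import numpy as np

# T-web-style classification of a point from its (symmetric 3x3) tidal
# tensor: count eigenvalues above lambda_th. Example tensors are invented.

WEB_TYPES = ["void", "sheet", "filament", "cluster"]

def classify(tidal_tensor, lambda_th=0.0):
    eigvals = np.linalg.eigvalsh(tidal_tensor)   # for symmetric matrices
    return WEB_TYPES[int(np.sum(eigvals > lambda_th))]

cluster_like = np.diag([0.9, 0.7, 0.4])    # collapse along all three axes
filament_like = np.diag([0.9, 0.7, -0.4])  # collapse along two axes
void_like = np.diag([-0.1, -0.3, -0.5])    # expansion in every direction
```

Applying such a rule to every data-constrained realization, rather than to a single reconstruction, is what turns the classification into a probability over web types at each point.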
Probabilistic Geoacoustic Inversion in Complex Environments
2015-09-30
Probabilistic Geoacoustic Inversion in Complex Environments Jan Dettmer School of Earth and Ocean Sciences, University of Victoria, Victoria BC...long-range inversion methods can fail to provide sufficient resolution. For proper quantitative examination of variability, parameter uncertainty must...project aims to advance probabilistic geoacoustic inversion methods for complex ocean environments for a range of geoacoustic data types. The work is
NASA Technical Reports Server (NTRS)
Townsend, John S.; Peck, Jeff; Ayala, Samuel
2000-01-01
NASA has funded several major programs (the Probabilistic Structural Analysis Methods Project is an example) to develop probabilistic structural analysis methods and tools for engineers to apply in the design and assessment of aerospace hardware. A probabilistic finite element software code, known as Numerical Evaluation of Stochastic Structures Under Stress (NESSUS), is used to determine the reliability of a critical weld of the Space Shuttle solid rocket booster aft skirt. An external bracket modification to the aft skirt provides a comparison basis for examining the details of the probabilistic analysis and its contributions to the design process. Also, analysis findings are compared with measured Space Shuttle flight data.
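The reliability computation described above can be illustrated with a bare-bones stress-strength Monte Carlo: sample uncertain load and capacity, and estimate the probability that demand exceeds capacity. This is a generic sketch, not the NESSUS analysis; the distributions and all numbers are invented and bear no relation to the actual aft-skirt weld.

```python
import random

# Illustrative stress-strength Monte Carlo reliability estimate.
# Distributions and numbers are invented for the sketch.

def failure_probability(n=100_000, seed=42):
    rng = random.Random(seed)
    failures = 0
    for _ in range(n):
        stress = rng.gauss(300.0, 30.0)     # applied stress, MPa (invented)
        strength = rng.gauss(450.0, 40.0)   # weld strength, MPa (invented)
        if stress > strength:
            failures += 1
    return failures / n

p_fail = failure_probability()  # small but nonzero failure probability
```

Response-surface reliability methods, also mentioned in the surrounding abstracts, replace this brute-force sampling with an approximation of the limit state when each model evaluation is expensive.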
Application of the Probabilistic Dynamic Synthesis Method to the Analysis of a Realistic Structure
NASA Technical Reports Server (NTRS)
Brown, Andrew M.; Ferri, Aldo A.
1998-01-01
The Probabilistic Dynamic Synthesis method is a new technique for obtaining the statistics of a desired response engineering quantity for a structure with non-deterministic parameters. The method uses measured data from modal testing of the structure as the input random variables, rather than more "primitive" quantities like geometry or material variation. This modal information is much more comprehensive and easily measured than the "primitive" information. The probabilistic analysis is carried out using either response surface reliability methods or Monte Carlo simulation. A previous work verified the feasibility of the PDS method on a simple seven degree-of-freedom spring-mass system. In this paper, extensive issues involved with applying the method to a realistic three-substructure system are examined, and free and forced response analyses are performed. The results from using the method are promising, especially when the lack of alternatives for obtaining quantitative output for probabilistic structures is considered.
Application of the Probabilistic Dynamic Synthesis Method to Realistic Structures
NASA Technical Reports Server (NTRS)
Brown, Andrew M.; Ferri, Aldo A.
1998-01-01
The Probabilistic Dynamic Synthesis method is a technique for obtaining the statistics of a desired response engineering quantity for a structure with non-deterministic parameters. The method uses measured data from modal testing of the structure as the input random variables, rather than more "primitive" quantities like geometry or material variation. This modal information is much more comprehensive and easily measured than the "primitive" information. The probabilistic analysis is carried out using either response surface reliability methods or Monte Carlo simulation. In previous work, the feasibility of the PDS method applied to a simple seven degree-of-freedom spring-mass system was verified. In this paper, extensive issues involved with applying the method to a realistic three-substructure system are examined, and free and forced response analyses are performed. The results from using the method are promising, especially when the lack of alternatives for obtaining quantitative output for probabilistic structures is considered.
Probabilistic structural analysis methods for select space propulsion system components
NASA Technical Reports Server (NTRS)
Millwater, H. R.; Cruse, T. A.
1989-01-01
The Probabilistic Structural Analysis Methods (PSAM) project developed at the Southwest Research Institute integrates state-of-the-art structural analysis techniques with probability theory for the design and analysis of complex large-scale engineering structures. An advanced efficient software system (NESSUS) capable of performing complex probabilistic analysis has been developed. NESSUS contains a number of software components to perform probabilistic analysis of structures. These components include: an expert system, a probabilistic finite element code, a probabilistic boundary element code and a fast probability integrator. The NESSUS software system is shown. An expert system is included to capture and utilize PSAM knowledge and experience. NESSUS/EXPERT is an interactive menu-driven expert system that provides information to assist in the use of the probabilistic finite element code NESSUS/FEM and the fast probability integrator (FPI). The expert system menu structure is summarized. The NESSUS system contains a state-of-the-art nonlinear probabilistic finite element code, NESSUS/FEM, to determine the structural response and sensitivities. A broad range of analysis capabilities and an extensive element library is present.
Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances
NASA Astrophysics Data System (ADS)
Stähler, Simon C.; Sigloch, Karin
2016-11-01
Seismic source inversion, a central task in seismology, is concerned with the estimation of earthquake source parameters and their uncertainties. Estimating uncertainties is particularly challenging because source inversion is a non-linear problem. In a companion paper, Stähler and Sigloch (2014) developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements, a problem we address here. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, more commonly used as sample-by-sample measures of misfit based on distances between individual time samples. From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on signal decorrelation D = 1 - CC of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. 
By identifying and quantifying this likelihood function, we make D and thus waveform cross-correlation measurements usable for fully probabilistic sampling strategies, in source inversion and related applications such as seismic tomography.
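As a concrete sketch of the misfit described above (not the authors' code; the log-normal moments mu and sigma below are invented, whereas the paper estimates them from its reference solutions), the decorrelation D = 1 - CC and its log-normal log-likelihood can be written as:

```python
import math

def cross_correlation(a, b):
    """Zero-lag normalized cross-correlation coefficient CC of two waveforms."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

def log_likelihood_D(D, mu, sigma):
    """Log-normal log-likelihood of the decorrelation D = 1 - CC.
    mu and sigma are the moments of log D; the values used below are
    illustrative stand-ins, not the paper's estimates."""
    if D <= 0.0:
        return float("-inf")
    return (-math.log(D * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(D) - mu) ** 2 / (2.0 * sigma ** 2))

obs = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]     # observed waveform (toy)
syn = [0.1, 0.9, 0.6, -0.4, -1.1, 0.05]    # modelled counterpart (toy)
D = 1.0 - cross_correlation(obs, syn)
ll = log_likelihood_D(D, mu=-3.0, sigma=0.8)
```

A sampler would evaluate this log-likelihood, summed over stations, for each candidate source model.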
Nakatani, Yoichiro; McLysaght, Aoife
2017-01-01
Motivation: It has been argued that whole-genome duplication (WGD) exerted a profound influence on the course of evolution. For the purpose of fully understanding the impact of WGD, several formal algorithms have been developed for reconstructing pre-WGD gene order in yeast and plants. However, to the best of our knowledge, those algorithms have never been successfully applied to WGD events in teleosts and vertebrates, impeded by extensive gene shuffling and gene losses. Results: Here, we present a probabilistic model of macrosynteny (i.e. conserved linkage or chromosome-scale distribution of orthologs), develop a variational Bayes algorithm for inferring the structure of pre-WGD genomes, and study estimation accuracy by simulation. Then, by applying the method to the teleost WGD, we demonstrate the effectiveness of the algorithm in a situation where gene-order reconstruction algorithms perform relatively poorly due to a high rate of rearrangement and extensive gene losses. Our high-resolution reconstruction reveals previously overlooked small-scale rearrangements, necessitating a revision of previous views on genome structure evolution in teleosts and vertebrates. Conclusions: We have reconstructed the structure of a pre-WGD genome by employing a variational Bayes approach that was originally developed for inferring topics from millions of text documents. Interestingly, the comparison of the macrosynteny and topic model algorithms suggests that macrosynteny can be regarded as a document on ancestral genome structure. From this perspective, the present study would seem to provide a textbook example of the prevalent metaphor that genomes are documents of evolutionary history. Availability and implementation: The analysis data are available for download at http://www.gen.tcd.ie/molevol/supp_data/MacrosyntenyTGD.zip, and the software written in Java is available upon request.
Contact: yoichiro.nakatani@tcd.ie or aoife.mclysaght@tcd.ie Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28881993
Analyzing gene perturbation screens with nested effects models in R and bioconductor.
Fröhlich, Holger; Beissbarth, Tim; Tresch, Achim; Kostka, Dennis; Jacob, Juby; Spang, Rainer; Markowetz, F
2008-11-01
Nested effects models (NEMs) are a class of probabilistic models introduced to analyze the effects of gene perturbation screens visible in high-dimensional phenotypes like microarrays or cell morphology. NEMs reverse engineer upstream/downstream relations of cellular signaling cascades. NEMs take as input a set of candidate pathway genes and phenotypic profiles of perturbing these genes. NEMs return a pathway structure explaining the observed perturbation effects. Here, we describe the package nem, an open-source software to efficiently infer NEMs from data. Our software implements several search algorithms for model fitting and is applicable to a wide range of different data types and representations. The methods we present summarize the current state-of-the-art in NEMs. Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.
NASA Astrophysics Data System (ADS)
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatorics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
From Weakly Chaotic Dynamics to Deterministic Subdiffusion via Copula Modeling
NASA Astrophysics Data System (ADS)
Nazé, Pierre
2018-03-01
Copula modeling consists in finding a probabilistic distribution, called a copula, whereby its coupling with the marginal distributions of a set of random variables produces their joint distribution. The present work aims to use this technique to connect the statistical distributions of weakly chaotic dynamics and deterministic subdiffusion. More precisely, we decompose the jump distribution of the Geisel-Thomae map into a bivariate one and determine the marginal and copula distributions respectively by infinite ergodic theory and statistical inference techniques. We therefore verify that the characteristic tail distribution of subdiffusion is an extreme value copula coupling Mittag-Leffler distributions. We also present a method to calculate the exact copula and joint distributions in the case where the weakly chaotic dynamics and deterministic subdiffusion statistical distributions are already known. Numerical simulations and consistency with the dynamical aspects of the map support our results.
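The coupling step (Sklar's theorem) can be illustrated with a short sketch. The Gumbel family used here is one standard extreme-value copula, and the exponential marginals are illustrative stand-ins for the Mittag-Leffler laws in the abstract:

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel copula C(u, v) for theta >= 1; an extreme-value copula.
    theta = 1 gives independence: C(u, v) = u * v."""
    return math.exp(-((-math.log(u)) ** theta + (-math.log(v)) ** theta) ** (1.0 / theta))

def joint_cdf(x, y, F, G, theta):
    """Sklar's theorem: joint CDF H(x, y) = C(F(x), G(y)) built by coupling
    the marginal CDFs F and G through a copula."""
    return gumbel_copula(F(x), G(y), theta)

# Illustrative exponential marginals (stand-ins for the Mittag-Leffler
# distributions appearing in the abstract).
F = lambda x: 1.0 - math.exp(-x)
G = lambda y: 1.0 - math.exp(-0.5 * y)
h = joint_cdf(1.0, 2.0, F, G, theta=2.0)
```

Any valid copula satisfies C(u, 1) = u and is bounded above by min(u, v), which makes it a convenient check on an inferred coupling.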
NASA Astrophysics Data System (ADS)
Hunziker, Jürg; Laloy, Eric; Linde, Niklas
2016-04-01
Deterministic inversion procedures can often explain field data, but they only deliver one final subsurface model that depends on the initial model and regularization constraints. This leads to poor insights about the uncertainties associated with the inferred model properties. In contrast, probabilistic inversions can provide an ensemble of model realizations that accurately span the range of possible models that honor the available calibration data and prior information, allowing a quantitative description of model uncertainties. We reconsider the problem of inferring the dielectric permittivity (directly related to radar velocity) structure of the subsurface by inversion of first-arrival travel times from crosshole ground penetrating radar (GPR) measurements. We rely on the DREAM_(ZS) algorithm, a state-of-the-art Markov chain Monte Carlo (MCMC) algorithm. Such algorithms need several orders of magnitude more forward simulations than deterministic algorithms and often become infeasible in high parameter dimensions. To enable high-resolution imaging with MCMC, we use a recently proposed dimensionality reduction approach that allows reproducing 2D multi-Gaussian fields with far fewer parameters than a classical grid discretization. We consider herein a dimensionality reduction from 5000 to 257 unknowns. The first 250 parameters correspond to a spectral representation of random and uncorrelated spatial fluctuations while the remaining seven geostatistical parameters are (1) the standard deviation of the data error, (2) the mean and (3) the variance of the relative electric permittivity, (4) the integral scale along the major axis of anisotropy, (5) the anisotropy angle, (6) the ratio of the integral scale along the minor axis of anisotropy to the integral scale along the major axis of anisotropy and (7) the shape parameter of the Matérn function. The latter essentially defines the type of covariance function (e.g., exponential, Whittle, Gaussian).
We present an improved formulation of the dimensionality reduction, and numerically show how it reduces artifacts in the generated models and provides better posterior estimation of the subsurface geostatistical structure. We next show that the results of the method compare very favorably against previous deterministic and stochastic inversion results obtained at the South Oyster Bacterial Transport Site in Virginia, USA. The long-term goal of this work is to enable MCMC-based full waveform inversion of crosshole GPR data.
Martin, Summer L; Stohs, Stephen M; Moore, Jeffrey E
2015-03-01
Fisheries bycatch is a global threat to marine megafauna. Environmental laws require bycatch assessment for protected species, but this is difficult when bycatch is rare. Low bycatch rates, combined with low observer coverage, may lead to biased, imprecise estimates when using standard ratio estimators. Bayesian model-based approaches incorporate uncertainty, produce less volatile estimates, and enable probabilistic evaluation of estimates relative to management thresholds. Here, we demonstrate a pragmatic decision-making process that uses Bayesian model-based inferences to estimate the probability of exceeding management thresholds for bycatch in fisheries with < 100% observer coverage. Using the California drift gillnet fishery as a case study, we (1) model rates of rare-event bycatch and mortality using Bayesian Markov chain Monte Carlo estimation methods and 20 years of observer data; (2) predict unobserved counts of bycatch and mortality; (3) infer expected annual mortality; (4) determine probabilities of mortality exceeding regulatory thresholds; and (5) classify the fishery as having low, medium, or high bycatch impact using those probabilities. We focused on leatherback sea turtles (Dermochelys coriacea) and humpback whales (Megaptera novaeangliae). Candidate models included Poisson or zero-inflated Poisson likelihood, fishing effort, and a bycatch rate that varied with area, time, or regulatory regime. Regulatory regime had the strongest effect on leatherback bycatch, with the highest levels occurring prior to a regulatory change. Area had the strongest effect on humpback bycatch. Cumulative bycatch estimates for the 20-year period were 104-242 leatherbacks (52-153 deaths) and 6-50 humpbacks (0-21 deaths). The probability of exceeding a regulatory threshold under the U.S. Marine Mammal Protection Act (Potential Biological Removal, PBR) of 0.113 humpback deaths was 0.58, warranting a "medium bycatch impact" classification of the fishery. 
No PBR thresholds exist for leatherbacks, but the probability of exceeding an anticipated level of two deaths per year, stated as part of a U.S. Endangered Species Act assessment process, was 0.0007. The approach demonstrated here would allow managers to objectively and probabilistically classify fisheries with respect to bycatch impacts on species that have population-relevant mortality reference points, and declare with a stipulated level of certainty that bycatch did or did not exceed estimated upper bounds.
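The posterior-predictive classification logic described above can be sketched with a conjugate Gamma-Poisson model. All counts, effort figures and the threshold below are hypothetical, not the fishery's data, and the paper's hierarchical covariate structure is omitted:

```python
import math
import random

random.seed(1)

def poisson_draw(mu):
    """Knuth's Poisson sampler; adequate for the modest means used here."""
    L = math.exp(-mu)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

# Hypothetical inputs: observers recorded 3 deaths over the observed
# fraction of total fishing effort.
deaths_observed = 3
observed_effort = 400.0    # observed sets
total_effort = 2000.0      # all sets fished (20% observer coverage)

# Conjugate Gamma(a0, b0) prior on the per-set mortality rate.
a0, b0 = 0.5, 1.0
a_post, b_post = a0 + deaths_observed, b0 + observed_effort

threshold = 10             # hypothetical management threshold (deaths per period)
n_draws = 5000
exceed = 0
for _ in range(n_draws):
    lam = random.gammavariate(a_post, 1.0 / b_post)     # posterior rate draw
    total_deaths = poisson_draw(lam * total_effort)     # posterior predictive total
    if total_deaths > threshold:
        exceed += 1
prob_exceed = exceed / n_draws     # P(total mortality > threshold | data)
```

The resulting exceedance probability is the quantity that would be compared against cutoffs to declare a low, medium, or high bycatch impact.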
Use of adjoint methods in the probabilistic finite element approach to fracture mechanics
NASA Technical Reports Server (NTRS)
Liu, Wing Kam; Besterfield, Glen; Lawrence, Mark; Belytschko, Ted
1988-01-01
The adjoint method approach to probabilistic finite element methods (PFEM) is presented. When the number of objective functions is small compared to the number of random variables, the adjoint method is far superior to the direct method in evaluating the objective function derivatives with respect to the random variables. The PFEM is extended to probabilistic fracture mechanics (PFM) using an element which has the near crack-tip singular strain field embedded. Since only two objective functions (i.e., mode I and II stress intensity factors) are needed for PFM, the adjoint method is well suited.
A probabilistic Hu-Washizu variational principle
NASA Technical Reports Server (NTRS)
Liu, W. K.; Belytschko, T.; Besterfield, G. H.
1987-01-01
A Probabilistic Hu-Washizu Variational Principle (PHWVP) for the Probabilistic Finite Element Method (PFEM) is presented. This formulation is developed for both linear and nonlinear elasticity. The PHWVP allows incorporation of the probabilistic distributions for the constitutive law, compatibility condition, equilibrium, domain and boundary conditions into the PFEM. Thus, a complete probabilistic analysis can be performed where all aspects of the problem are treated as random variables and/or fields. The Hu-Washizu variational formulation is available in many conventional finite element codes thereby enabling the straightforward inclusion of the probabilistic features into present codes.
Visual shape perception as Bayesian inference of 3D object-centered shape representations.
Erdogan, Goker; Jacobs, Robert A
2017-11-01
Despite decades of research, little is known about how people visually perceive object shape. We hypothesize that a promising approach to shape perception is provided by a "visual perception as Bayesian inference" framework which augments an emphasis on visual representation with an emphasis on the idea that shape perception is a form of statistical inference. Our hypothesis claims that shape perception of unfamiliar objects can be characterized as statistical inference of 3D shape in an object-centered coordinate system. We describe a computational model based on our theoretical framework, and provide evidence for the model along two lines. First, we show that, counterintuitively, the model accounts for viewpoint-dependency of object recognition, traditionally regarded as evidence against people's use of 3D object-centered shape representations. Second, we report the results of an experiment using a shape similarity task, and present an extensive evaluation of existing models' abilities to account for the experimental data. We find that our shape inference model captures subjects' behaviors better than competing models. Taken as a whole, our experimental and computational results illustrate the promise of our approach and suggest that people's shape representations of unfamiliar objects are probabilistic, 3D, and object-centered. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bayes and blickets: Effects of knowledge on causal induction in children and adults
Griffiths, Thomas L.; Sobel, David M.; Tenenbaum, Joshua B.; Gopnik, Alison
2011-01-01
People are adept at inferring novel causal relations, even from only a few observations. Prior knowledge about the probability of encountering causal relations of various types and the nature of the mechanisms relating causes and effects plays a crucial role in these inferences. We test a formal account of how this knowledge can be used and acquired, based on analyzing causal induction as Bayesian inference. Five studies explored the predictions of this account with adults and 4-year-olds, using tasks in which participants learned about the causal properties of a set of objects. The studies varied the two factors that our Bayesian approach predicted should be relevant to causal induction: the prior probability with which causal relations exist, and the assumption of a deterministic or a probabilistic relation between cause and effect. Adults’ judgments (Experiments 1, 2, and 4) were in close correspondence with the quantitative predictions of the model, and children’s judgments (Experiments 3 and 5) agreed qualitatively with this account. PMID:21972897
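The two factors the study manipulates, the prior probability that a causal relation exists and a deterministic versus probabilistic (noisy-OR) mechanism, can be sketched by enumerating hypotheses; the trials and prior below are illustrative, not the experiments' materials:

```python
from itertools import product

def posterior_blicket(data, prior=0.3, power=1.0, background=0.0):
    """Posterior probability that each object is a 'blicket' (a cause),
    by enumerating all 2^n hypotheses about which objects have causal power.
    data: list of (set_of_objects_on_machine, machine_activated) trials.
    power: P(effect | cause present); power=1.0 is the deterministic
    assumption, power < 1.0 a probabilistic (noisy-OR) mechanism."""
    objects = sorted({o for objs, _ in data for o in objs})
    post = {o: 0.0 for o in objects}
    z = 0.0
    for h in product([0, 1], repeat=len(objects)):
        blickets = {o for o, b in zip(objects, h) if b}
        p_h = 1.0
        for b in h:
            p_h *= prior if b else (1.0 - prior)
        like = 1.0
        for objs, activated in data:
            n_causes = len(blickets & set(objs))
            p_on = 1.0 - (1.0 - background) * (1.0 - power) ** n_causes  # noisy-OR
            like *= p_on if activated else (1.0 - p_on)
        z += p_h * like
        for o in blickets:
            post[o] += p_h * like
    return {o: post[o] / z for o in objects}

# 'Backwards blocking': A and B together activate the machine, then A alone does.
p = posterior_blicket([({"A", "B"}, True), ({"A"}, True)])
```

Under the deterministic assumption, A is certainly a blicket after the second trial, while belief in B falls back to the prior, a signature pattern this class of Bayesian model predicts.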
Dynamic Probabilistic Instability of Composite Structures
NASA Technical Reports Server (NTRS)
Chamis, Christos C.
2009-01-01
A computationally effective method is described to evaluate the non-deterministic dynamic instability (probabilistic dynamic buckling) of thin composite shells. The method is a judicious combination of available computer codes for finite element, composite mechanics and probabilistic structural analysis. The solution method is incrementally updated Lagrangian. It is illustrated by applying it to thin composite cylindrical shell subjected to dynamic loads. Both deterministic and probabilistic buckling loads are evaluated to demonstrate the effectiveness of the method. A universal plot is obtained for the specific shell that can be used to approximate buckling loads for different load rates and different probability levels. Results from this plot show that the faster the rate, the higher the buckling load and the shorter the time. The lower the probability, the lower is the buckling load for a specific time. Probabilistic sensitivity results show that the ply thickness, the fiber volume ratio and the fiber longitudinal modulus, dynamic load and loading rate are the dominant uncertainties in that order.
A quantum probability framework for human probabilistic inference.
Trueblood, Jennifer S; Yearsley, James M; Pothos, Emmanuel M
2017-09-01
There is considerable variety in human inference (e.g., a doctor inferring the presence of a disease, a juror inferring the guilt of a defendant, or someone inferring future weight loss based on diet and exercise). As such, people display a wide range of behaviors when making inference judgments. Sometimes, people's judgments appear Bayesian (i.e., normative), but in other cases, judgments deviate from the normative prescription of classical probability theory. How can we combine both Bayesian and non-Bayesian influences in a principled way? We propose a unified explanation of human inference using quantum probability theory. In our approach, we postulate a hierarchy of mental representations, from 'fully' quantum to 'fully' classical, which could be adopted in different situations. In our hierarchy of models, moving from the lowest level to the highest involves changing assumptions about compatibility (i.e., how joint events are represented). Using results from 3 experiments, we show that our modeling approach explains 5 key phenomena in human inference including order effects, reciprocity (i.e., the inverse fallacy), memorylessness, violations of the Markov condition, and antidiscounting. As far as we are aware, no existing theory or model can explain all 5 phenomena. We also explore transitions in our hierarchy, examining how representations change from more quantum to more classical. We show that classical representations provide a better account of data as individuals gain familiarity with a task. We also show that representations vary between individuals, in a way that relates to a simple measure of cognitive style, the Cognitive Reflection Test. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
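The core quantum-probability mechanism behind order effects, sequential application of non-commuting projectors, can be illustrated in a two-dimensional real Hilbert space (the angles below are arbitrary, not fitted to the experiments):

```python
import math

def proj(angle):
    """Rank-1 projector onto the ray at `angle` in a 2-D real Hilbert space."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * c, c * s], [s * c, s * s]]

def apply(P, v):
    """Matrix-vector product P v."""
    return [P[0][0] * v[0] + P[0][1] * v[1],
            P[1][0] * v[0] + P[1][1] * v[1]]

def norm2(v):
    """Squared length: the probability of the sequence of 'yes' answers."""
    return v[0] ** 2 + v[1] ** 2

state = [1.0, 0.0]            # initial belief state (unit vector)
A = proj(math.pi / 5)         # projector for answering 'yes' to question A
B = proj(math.pi / 3)         # projector for answering 'yes' to question B

p_A_then_B = norm2(apply(B, apply(A, state)))   # ask A first, then B
p_B_then_A = norm2(apply(A, apply(B, state)))   # ask B first, then A
```

Because A and B do not commute, the two question orders yield different probabilities; for compatible (commuting) questions the order would not matter, which is the classical limit at the top of the proposed hierarchy.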
Probabilistic boundary element method
NASA Technical Reports Server (NTRS)
Cruse, T. A.; Raveendra, S. T.
1989-01-01
The purpose of the Probabilistic Structural Analysis Method (PSAM) project is to develop structural analysis capabilities for the design analysis of advanced space propulsion system hardware. The boundary element method (BEM) is used as the basis of the Probabilistic Advanced Analysis Methods (PADAM) which is discussed. The probabilistic BEM code (PBEM) is used to obtain the structural response and sensitivity results to a set of random variables. As such, PBEM performs analogously to other structural analysis codes, such as finite element codes, in the PSAM system. For linear problems, unlike the finite element method (FEM), the BEM governing equations are written at the boundary of the body only; thus, the method eliminates the need to model the volume of the body. However, for general body force problems, a direct condensation of the governing equations to the boundary of the body is not possible and therefore volume modeling is generally required.
NASA Technical Reports Server (NTRS)
Price J. M.; Ortega, R.
1998-01-01
Probabilistic methods are not a universally accepted approach for the design and analysis of aerospace structures. The validity of this approach must be demonstrated to encourage its acceptance as a viable design and analysis tool for estimating structural reliability. The objective of this study is to develop a well characterized finite population of similar aerospace structures that can be used to (1) validate probabilistic codes, (2) demonstrate the basic principles behind probabilistic methods, (3) formulate general guidelines for characterization of material drivers (such as elastic modulus) when limited data is available, and (4) investigate how the drivers affect the results of sensitivity analysis at the component/failure mode level.
Browne, Fiona; Wang, Haiying; Zheng, Huiru; Azuaje, Francisco
2010-03-01
This study applied a knowledge-driven data integration framework for the inference of protein-protein interactions (PPI). Evidence from diverse genomic features is integrated using a knowledge-driven Bayesian network (KD-BN). Receiver operating characteristic (ROC) curves may not be the optimal assessment method to evaluate a classifier's performance in PPI prediction, as the majority of the area under the curve (AUC) may not represent biologically meaningful results. It may be of benefit to interpret the AUC of a partial ROC curve whereby biologically interesting results are represented. Therefore, the novel application of the assessment method referred to as the partial ROC has been employed in this study to assess the predictive performance of PPI predictions, along with calculating the true-positive/false-positive rate and true-positive/positive rate. By incorporating domain knowledge into the construction of the KD-BN, we demonstrate improvement in predictive performance compared with previous studies based upon the naive Bayesian approach. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
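A partial ROC summary of the kind advocated above can be computed by clipping the trapezoidal AUC at a false-positive-rate cutoff; the scores and labels below are toy values, not the study's predictions:

```python
def roc_points(scores, labels):
    """ROC curve points (fpr, tpr), sweeping the threshold from high to low."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    P = sum(labels)
    N = len(labels) - P
    tp = fp = 0
    pts = [(0.0, 0.0)]
    for _, y in pairs:
        if y:
            tp += 1
        else:
            fp += 1
        pts.append((fp / N, tp / P))
    return pts

def partial_auc(pts, fpr_max):
    """Trapezoidal area under the ROC curve restricted to fpr <= fpr_max."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= fpr_max:
            break
        x1c = min(x1, fpr_max)
        # interpolate tpr linearly if the segment is clipped at fpr_max
        y1c = y0 + (y1 - y0) * (x1c - x0) / (x1 - x0) if x1 > x0 else y1
        area += (x1c - x0) * (y0 + y1c) / 2.0
    return area

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]   # toy classifier scores
labels = [1, 1, 0, 1, 0, 0]               # 1 = true interaction
pts = roc_points(scores, labels)
pauc = partial_auc(pts, 0.5)              # area over the low-FPR region only
```

Restricting the area to the low-false-positive region is what confines the summary to the biologically interesting part of the curve.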
Genovo: De Novo Assembly for Metagenomes
NASA Astrophysics Data System (ADS)
Laserson, Jonathan; Jojic, Vladimir; Koller, Daphne
Next-generation sequencing technologies produce a large number of noisy reads from the DNA in a sample. Metagenomics and population sequencing aim to recover the genomic sequences of the species in the sample, which could be of high diversity. Methods geared towards single sequence reconstruction are not sensitive enough when applied in this setting. We introduce a generative probabilistic model of read generation from environmental samples and present Genovo, a novel de novo sequence assembler that discovers likely sequence reconstructions under the model. A Chinese restaurant process prior accounts for the unknown number of genomes in the sample. Inference is made by applying a series of hill-climbing steps iteratively until convergence. We compare the performance of Genovo to three other short read assembly programs across one synthetic dataset and eight metagenomic datasets created using the 454 platform, the largest of which has 311k reads. Genovo's reconstructions cover more bases and recover more genes than the other methods, and yield a higher assembly score.
Physical Simulation for Probabilistic Motion Tracking
2008-01-01
learn a low-dimensional embedding of the high-dimensional kinematic data and then attempt to solve the problem in this more manageable low... rotations and foot skate). Such artifacts can be attributed to the general lack of physically plausible priors [2] (that can account for static and/or... temporal priors of the form p(x_{f+1} | x_f) = N(x_f + γ_f, Σ) (where γ_f is a scaled velocity, learned or inferred) have also been proposed [13] and shown to
How social information affects information search and choice in probabilistic inferences.
Puskaric, Marin; von Helversen, Bettina; Rieskamp, Jörg
2018-01-01
When making decisions, people are often exposed to relevant information stemming from qualitatively different sources. For instance, when making a choice between two alternatives people can rely on the advice of other people (i.e., social information) or search for factual information about the alternatives (i.e., non-social information). Prior research in categorization has shown that social information is given special attention when both social and non-social information is available, even when the social information has no additional informational value. The goal of the current work is to investigate whether framing information as social or non-social also influences information search and choice in probabilistic inferences. In a first study, we found that framing cues (i.e., the information used to make a decision) with medium validity as social increased the probability that they were searched for compared to a task where the same cues were framed as non-social information, but did not change the strategy people relied on. A second and a third study showed that framing a cue with high validity as social information facilitated learning to rely on a non-compensatory decision strategy. Overall, the results suggest that social information is given more attention and is learned faster than non-social information. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Ryan, Robert S.; Townsend, John S.
1993-01-01
The prospective improvement of probabilistic methods for space program analysis/design entails the further development of theories, codes, and tools which match specific areas of application, the drawing of lessons from previous uses of probability and statistics data bases, the enlargement of data bases (especially in the field of structural failures), and the education of engineers and managers on the advantages of these methods. An evaluation is presently made of the current limitations of probabilistic engineering methods. Recommendations are made for specific applications.
Emulation for probabilistic weather forecasting
NASA Astrophysics Data System (ADS)
Cornford, Dan; Barillec, Remi
2010-05-01
Numerical weather prediction models are typically very expensive to run due to their complexity and resolution. Characterising the sensitivity of the model to its initial condition and/or to its parameters requires numerous runs of the model, which is impractical for all but the simplest models. To produce probabilistic forecasts requires knowledge of the distribution of the model outputs, given the distribution over the inputs, where the inputs include the initial conditions, boundary conditions and model parameters. Such uncertainty analysis for complex weather prediction models seems a long way off, given current computing power, with ensembles providing only a partial answer. One possible way forward that we develop in this work is the use of statistical emulators. Emulators provide an efficient statistical approximation to the model (or simulator) while quantifying the uncertainty introduced. In the emulator framework, a Gaussian process is fitted to the simulator response as a function of the simulator inputs using some training data. The emulator is essentially an interpolator of the simulator output and the response in unobserved areas is dictated by the choice of covariance structure and parameters in the Gaussian process. Suitable parameters are inferred from the data in a maximum likelihood, or Bayesian framework. Once trained, the emulator allows operations such as sensitivity analysis or uncertainty analysis to be performed at a much lower computational cost. The efficiency of emulators can be further improved by exploiting the redundancy in the simulator output through appropriate dimension reduction techniques. We demonstrate this using both Principal Component Analysis on the model output and a new reduced-rank emulator in which an optimal linear projection operator is estimated jointly with other parameters, in the context of simple low order models, such as the Lorenz 40D system. 
We present the application of emulators to probabilistic weather forecasting, where the construction of the emulator training set replaces the traditional ensemble model runs. Thus the actual forecast distributions are computed using the emulator conditioned on the 'ensemble runs', which are chosen to explore the plausible input space using relatively crude experimental design methods. One benefit here is that the ensemble does not need to be a sample from the true distribution of the input space; rather, it should cover that input space in some sense. The probabilistic forecasts are computed using Monte Carlo methods, sampling from the input distribution and using the emulator to produce the output distribution. Finally we discuss the limitations of this approach and briefly mention how we might use similar methods to learn the model error within a framework that incorporates a data assimilation like aspect, using emulators and learning complex model error representations. We suggest future directions for research in the area that will be necessary to apply the method to more realistic numerical weather prediction models.
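A minimal emulator sketch in the spirit described above: a Gaussian-process (RBF-kernel) mean interpolant of a cheap stand-in 'simulator', with the design points playing the role of the ensemble runs. The lengthscale, jitter and toy simulator are assumptions for illustration, not the authors' setup:

```python
import math

def rbf(x1, x2, ls=0.3, var=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return var * math.exp(-0.5 * ((x1 - x2) / ls) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def emulator(xs, ys, jitter=1e-8):
    """GP mean emulator: precompute alpha = K^-1 y, then predictions are cheap."""
    K = [[rbf(xa, xb) + (jitter if i == j else 0.0)
          for j, xb in enumerate(xs)] for i, xa in enumerate(xs)]
    alpha = solve(K, ys)
    return lambda x: sum(rbf(x, xi) * ai for xi, ai in zip(xs, alpha))

simulator = lambda x: math.sin(3.0 * x)   # cheap stand-in for an expensive model
xs = [0.0, 0.25, 0.5, 0.75, 1.0]          # design points ('ensemble runs')
ys = [simulator(x) for x in xs]
predict = emulator(xs, ys)
```

Once trained, Monte Carlo forecasting samples inputs from their distribution and evaluates `predict` instead of the expensive simulator; a full treatment would also propagate the GP's predictive variance.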
Sayers, Adrian; Ben-Shlomo, Yoav; Blom, Ashley W; Steele, Fiona
2016-01-01
Studies involving the use of probabilistic record linkage are becoming increasingly common. However, the methods underpinning probabilistic record linkage are not widely taught or understood, and therefore these studies can appear to be a ‘black box’ research tool. In this article, we aim to describe the process of probabilistic record linkage through a simple exemplar. We first introduce the concept of deterministic linkage and contrast this with probabilistic linkage. We illustrate each step of the process using a simple exemplar and describe the data structure required to perform a probabilistic linkage. We describe the process of calculating and interpreting match weights and how to convert match weights into posterior probabilities of a match using Bayes' theorem. We conclude this article with a brief discussion of some of the computational demands of record linkage, how you might assess the quality of your linkage algorithm, and how epidemiologists can maximize the value of their record-linked research using robust record linkage methods. PMID:26686842
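The match-weight and Bayes'-theorem steps described in the exemplar can be sketched in the Fellegi-Sunter style; the m- and u-probabilities, fields and prior below are invented for illustration:

```python
import math

def match_weight(agree, m, u):
    """Fellegi-Sunter log2 weight for one field.
    m = P(field agrees | true match), u = P(field agrees | non-match)."""
    return math.log2(m / u) if agree else math.log2((1.0 - m) / (1.0 - u))

def posterior_match(agreements, ms, us, prior):
    """Posterior probability of a true match, via Bayes' theorem applied to
    the summed log-likelihood-ratio match weights."""
    total = sum(match_weight(a, m, u) for a, m, u in zip(agreements, ms, us))
    odds = (2.0 ** total) * prior / (1.0 - prior)   # posterior odds of a match
    return odds / (1.0 + odds)

# Invented m/u probabilities for three fields: surname, birth year, postcode.
ms = [0.95, 0.99, 0.90]
us = [0.01, 0.05, 0.02]
# Candidate pair agrees on surname and birth year but not postcode.
p = posterior_match([True, True, False], ms, us, prior=1e-4)
```

Note how a large likelihood ratio can still leave a modest posterior when the prior is tiny, as it is when one true match hides among very many candidate pairs.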
Probabilistic structural analysis methods of hot engine structures
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Hopkins, D. A.
1989-01-01
Development of probabilistic structural analysis methods for hot engine structures at Lewis Research Center is presented. Three elements of the research program are: (1) composite load spectra methodology; (2) probabilistic structural analysis methodology; and (3) probabilistic structural analysis application. Recent progress includes: (1) quantification of the effects of uncertainties for several variables on high pressure fuel turbopump (HPFT) turbine blade temperature, pressure, and torque of the space shuttle main engine (SSME); (2) the evaluation of the cumulative distribution function for various structural response variables based on assumed uncertainties in primitive structural variables; and (3) evaluation of the failure probability. Collectively, the results demonstrate that the structural durability of hot engine structural components can be effectively evaluated in a formal probabilistic/reliability framework.
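The evaluation of a cumulative distribution function for a structural response variable from assumed uncertainties in primitive variables can be sketched with a plain Monte Carlo calculation; the variables, distributions, and numbers below are illustrative stand-ins, not SSME data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Assumed uncertainties in primitive variables (illustrative values only):
load = rng.normal(100.0, 10.0, n)                 # applied load, kN
area = rng.normal(2.0, 0.05, n)                   # section area, cm^2
strength = rng.lognormal(np.log(65.0), 0.08, n)   # material strength, kN/cm^2

# Structural response variable derived from the primitive variables.
stress = load / area

# Empirical cumulative distribution function of the response.
levels = [40.0, 50.0, 60.0]
cdf = [float((stress <= s).mean()) for s in levels]
print("P(stress <= level):", dict(zip(levels, cdf)))

# Failure probability: response exceeds the (also uncertain) strength.
p_fail = float((stress > strength).mean())
print("estimated failure probability:", p_fail)
```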
A probabilistic and continuous model of protein conformational space for template-free modeling.
Zhao, Feng; Peng, Jian; Debartolo, Joe; Freed, Karl F; Sosnick, Tobin R; Xu, Jinbo
2010-06-01
One of the major challenges in protein template-free modeling is an efficient sampling algorithm that can explore a huge conformation space quickly. The popular fragment assembly method constructs a conformation by stringing together short fragments extracted from the Protein Data Bank (PDB). The discrete nature of this method may limit generated conformations to a subspace in which the native fold does not belong. Another concern is that a protein with a genuinely novel fold may contain fragments not represented in the PDB. This article presents a probabilistic model of protein conformational space to overcome these two limitations. The model employs directional statistics to describe the distribution of backbone angles and second-order Conditional Random Fields (CRFs) to describe the sequence-angle relationship. Using this probabilistic model, we can sample protein conformations in a continuous space, as opposed to the widely used fragment assembly and lattice model methods, which work in a discrete space. We show that, when coupled with a simple energy function, this probabilistic method compares favorably with the fragment assembly method in the blind CASP8 evaluation, especially on alpha or small beta proteins. To our knowledge, this is the first probabilistic method that can search conformations in a continuous space and achieve favorable performance. Our method also generated three-dimensional (3D) models better than template-based methods for a couple of CASP8 hard targets. The method described in this article can also be applied to protein loop modeling, model refinement, and even RNA tertiary structure prediction.
Chen, Jonathan H; Goldstein, Mary K; Asch, Steven M; Mackey, Lester; Altman, Russ B
2017-05-01
Build probabilistic topic model representations of hospital admission processes and compare the ability of such models to predict clinical order patterns against preconstructed order sets. The authors evaluated the first 24 hours of structured electronic health record data for >10,000 inpatients. Drawing an analogy from structured items (e.g., clinical orders) to words in a text document, the authors performed latent Dirichlet allocation probabilistic topic modeling. These topic models use initial clinical information to predict clinical orders for a separate validation set of >4,000 patients. The authors evaluated these topic model-based predictions vs existing human-authored order sets by area under the receiver operating characteristic curve (AUROC), precision, and recall for subsequent clinical orders. Existing order sets predict clinical orders used within 24 hours with AUROC 0.81, precision 16%, and recall 35%. This can be improved to 0.90, 24%, and 47% (P < 10^-20) by using probabilistic topic models to summarize clinical data into up to 32 topics. Many of these latent topics yield natural clinical interpretations (e.g., "critical care," "pneumonia," "neurologic evaluation"). Existing order sets tend to provide nonspecific, process-oriented aid, with usability limitations impairing more precise, patient-focused support. Algorithmic summarization has the potential to breach this usability barrier by automatically inferring patient context, but with potential tradeoffs in interpretability. Probabilistic topic modeling provides an automated approach to detecting thematic trends in patient care and generating decision support content. A potential use case finds related clinical orders for decision support. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association.
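A minimal sketch of the latent Dirichlet allocation approach, using scikit-learn on a synthetic patient-by-order count matrix; the order names, counts, and two-topic structure are invented for illustration (the paper's models used up to 32 topics on real EHR data):

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Synthetic patient-by-clinical-order count matrix; orders play the role
# of words in a document (all names and counts are invented).
orders = ["chest_xray", "blood_culture", "ceftriaxone",
          "head_ct", "eeg", "levetiracetam"]
pneumonia_like = rng.poisson([3, 2, 2, 0.1, 0.1, 0.1], size=(40, 6))
neuro_like = rng.poisson([0.1, 0.1, 0.1, 3, 2, 2], size=(40, 6))
X = np.vstack([pneumonia_like, neuro_like])

# Fit latent Dirichlet allocation with two topics.
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Inspect the top orders associated with each latent topic.
for k, comp in enumerate(lda.components_):
    top = [orders[i] for i in np.argsort(comp)[::-1][:3]]
    print(f"topic {k}: {top}")

# Infer the topic mixture of a new admission from its initial orders,
# then score which orders the inferred topics make likely next.
new_patient = np.array([[2, 1, 0, 0, 0, 0]])
theta = lda.transform(new_patient)        # per-topic probabilities, sums to 1
scores = theta @ lda.components_          # expected affinity for each order
print("order ranking:", [orders[i] for i in np.argsort(scores[0])[::-1]])
```

The `theta @ components_` ranking is one simple way to turn a fitted topic model into order predictions; the paper's full evaluation pipeline is more elaborate.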
‘Particle genetics’: treating every cell as unique
Yvert, Gaël
2014-01-01
Genotype-phenotype relations are usually inferred from a deterministic point of view. For example, quantitative trait loci (QTL), which describe regions of the genome associated with a particular phenotype, are based on a mean trait difference between genotype categories. However, living systems comprise huge numbers of cells (the ‘particles’ of biology). Each cell can exhibit substantial phenotypic individuality, which can have dramatic consequences at the organismal level. Now, with technology capable of interrogating individual cells, it is time to consider how genotypes shape the probability laws of single cell traits. The possibility of mapping single cell probabilistic trait loci (PTL), which link genomic regions to probabilities of cellular traits, is a promising step in this direction. This approach requires thinking about phenotypes in probabilistic terms, a concept that statistical physicists have been applying to particles for a century. Here, I describe PTL and discuss their potential to enlarge our understanding of genotype-phenotype relations. PMID:24315431
A probabilistic framework to infer brain functional connectivity from anatomical connections.
Deligianni, Fani; Varoquaux, Gael; Thirion, Bertrand; Robinson, Emma; Sharp, David J; Edwards, A David; Rueckert, Daniel
2011-01-01
We present a novel probabilistic framework to learn across several subjects a mapping from brain anatomical connectivity to functional connectivity, i.e. the covariance structure of brain activity. This prediction problem must be formulated as a structured-output learning task, as the predicted parameters are strongly correlated. We introduce a model selection framework based on cross-validation with a parametrization-independent loss function suitable to the manifold of covariance matrices. Our model is based on constraining the conditional independence structure of functional activity by the anatomical connectivity. Subsequently, we learn a linear predictor of a stationary multivariate autoregressive model. This natural parameterization of functional connectivity also enforces the positive-definiteness of the predicted covariance and thus matches the structure of the output space. Our results show that functional connectivity can be explained by anatomical connectivity on a rigorous statistical basis, and that a proper model of functional connectivity is essential to assess this link.
A decision network account of reasoning about other people's choices.
Jern, Alan; Kemp, Charles
2015-09-01
The ability to predict and reason about other people's choices is fundamental to social interaction. We propose that people reason about other people's choices using mental models that are similar to decision networks. Decision networks are extensions of Bayesian networks that incorporate the idea that choices are made in order to achieve goals. In our first experiment, we explore how people predict the choices of others. Our remaining three experiments explore how people infer the goals and knowledge of others by observing the choices that they make. We show that decision networks account for our data better than alternative computational accounts that do not incorporate the notion of goal-directed choice or that do not rely on probabilistic inference. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Pernot, Pascal; Savin, Andreas
2018-06-01
Benchmarking studies in computational chemistry use reference datasets to assess the accuracy of a method through error statistics. The commonly used error statistics, such as the mean signed and mean unsigned errors, do not inform end-users on the expected amplitude of prediction errors attached to these methods. We show that, the distributions of model errors being neither normal nor zero-centered, these error statistics cannot be used to infer prediction error probabilities. To overcome this limitation, we advocate for the use of more informative statistics, based on the empirical cumulative distribution function of unsigned errors, namely, (1) the probability for a new calculation to have an absolute error below a chosen threshold and (2) the maximal amplitude of errors one can expect with a chosen high confidence level. Those statistics are also shown to be well suited for benchmarking and ranking studies. Moreover, the standard error on all benchmarking statistics depends on the size of the reference dataset. Systematic publication of these standard errors would be very helpful to assess the statistical reliability of benchmarking conclusions.
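The two advocated statistics are straightforward to compute from the empirical cumulative distribution of unsigned errors; the sketch below uses a simulated skewed, non-zero-centered error sample in place of a real benchmark dataset:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated benchmark errors: skewed and not zero-centered, the situation
# in which mean signed/unsigned errors are misleading summaries.
errors = rng.gamma(shape=2.0, scale=0.5, size=2000) - 0.3
abs_err = np.abs(errors)

# (1) Probability that a new calculation has absolute error below a threshold.
eta = 1.0
p_below = float((abs_err < eta).mean())

# (2) Maximal error amplitude expected at a chosen confidence level:
#     here the 95th percentile of the empirical CDF of unsigned errors.
q95 = float(np.quantile(abs_err, 0.95))

print(f"P(|error| < {eta}) = {p_below:.3f}")
print(f"with 95% confidence, |error| <= {q95:.3f}")
```

Both quantities read directly off the empirical CDF, so they remain meaningful however skewed or biased the error distribution is.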
NASA Technical Reports Server (NTRS)
Singhal, Surendra N.
2003-01-01
The SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division activities include identification and fulfillment of joint industry, government, and academia needs for development and implementation of RMSL technologies. Four projects in the Probabilistic Methods area and two in the area of RMSL have been identified. These are: (1) Evaluation of Probabilistic Technology - progress has been made toward the selection of probabilistic application cases. Future effort will focus on assessment of multiple probabilistic software packages in solving selected engineering problems using probabilistic methods. Relevance to Industry & Government - Case studies of typical problems involving uncertainties, results of solutions to these problems run by different codes, and recommendations on which code is applicable for which problems; (2) Probabilistic Input Preparation - progress has been made in identifying problem cases such as those with no data, little data, and sufficient data. Future effort will focus on developing guidelines for preparing input for probabilistic analysis, especially with no or little data. Relevance to Industry & Government - Too often, we get bogged down thinking we need a lot of data before we can quantify uncertainties. Not true. There are ways to do credible probabilistic analysis with little data; (3) Probabilistic Reliability - a probabilistic reliability literature search has been completed, along with what differentiates it from statistical reliability. Work on computation of reliability based on quantification of uncertainties in primitive variables is in progress. Relevance to Industry & Government - Correct reliability computations both at the component and system level are needed so one can design an item based on its expected usage and life span; (4) Real World Applications of Probabilistic Methods (PM) - A draft of volume 1, comprising aerospace applications, has been released.
Volume 2, a compilation of real world applications of probabilistic methods with essential information demonstrating application type and time/cost savings from the use of probabilistic methods for generic applications, is in progress. Relevance to Industry & Government - Too often, we say, 'The Proof is in the Pudding'. With help from many contributors, we hope to produce such a document. The problem is that not many people are coming forward, due to the proprietary nature of the work. So, we are asking contributors to document only minimum information, including the problem description, what method was used, whether it resulted in any savings, and how much; (5) Software Reliability - software reliability concepts, programs, implementation, guidelines, and standards are being documented. Relevance to Industry & Government - software reliability is a complex issue that must be understood & addressed in all facets of business in industry, government, and other institutions. We address issues, concepts, ways to implement solutions, and guidelines for maximizing software reliability; (6) Maintainability Standards - maintainability/serviceability industry standards/guidelines and industry best practices and methodologies used in performing maintainability/serviceability tasks are being documented. Relevance to Industry & Government - Any industry or government process, project, and/or tool must be maintained and serviced to realize the life and performance it was designed for. We address issues and develop guidelines for optimum performance & life.
Probabilistic simulation of stress concentration in composite laminates
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Murthy, P. L. N.; Liaw, L.
1993-01-01
A computational methodology is described to probabilistically simulate the stress concentration factors in composite laminates. This new approach consists of coupling probabilistic composite mechanics with probabilistic finite element structural analysis. The probabilistic composite mechanics is used to probabilistically describe all the uncertainties inherent in composite material properties while probabilistic finite element is used to probabilistically describe the uncertainties associated with methods to experimentally evaluate stress concentration factors such as loads, geometry, and supports. The effectiveness of the methodology is demonstrated by using it to simulate the stress concentration factors in composite laminates made from three different composite systems. Simulated results match experimental data for probability density and for cumulative distribution functions. The sensitivity factors indicate that the stress concentration factors are influenced by local stiffness variables, by load eccentricities and by initial stress fields.
On the Origins of Suboptimality in Human Probabilistic Inference
Acerbi, Luigi; Vijayakumar, Sethu; Wolpert, Daniel M.
2014-01-01
Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. 
Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior. PMID:24945142
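The ideal Bayesian observer against which subjects are compared can be sketched on a discretized grid: combine an explicitly displayed (here bimodal) prior with a Gaussian cue likelihood and report the posterior mean, the optimal estimate under squared-error loss. All numerical values are illustrative assumptions:

```python
import numpy as np

# Discretized space of candidate target locations.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

def normal_pdf(v, mu, sigma):
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Explicitly displayed bimodal prior over the target location.
prior = 0.5 * normal_pdf(x, -3.0, 1.0) + 0.5 * normal_pdf(x, 3.0, 1.0)
prior /= prior.sum() * dx

# Noisy cue: the target location corrupted by Gaussian noise of known sigma.
cue, sigma_cue = 1.5, 2.0
likelihood = normal_pdf(cue, x, sigma_cue)   # P(cue | target at x)

# Bayes' rule on the grid, then the optimal estimate under squared-error
# loss, which is the posterior mean.
posterior = prior * likelihood
posterior /= posterior.sum() * dx
estimate = float((x * posterior).sum() * dx)

print("posterior mean estimate:", round(estimate, 3))
```

Deviations of subjects' responses from this grid-computed optimum are what the factorial model comparison in the paper attempts to attribute to specific noise sources.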
Reliability and risk assessment of structures
NASA Technical Reports Server (NTRS)
Chamis, C. C.
1991-01-01
Development of reliability and risk assessment of structural components and structures is a major activity at Lewis Research Center. It consists of five program elements: (1) probabilistic loads; (2) probabilistic finite element analysis; (3) probabilistic material behavior; (4) assessment of reliability and risk; and (5) probabilistic structural performance evaluation. Recent progress includes: (1) the evaluation of the various uncertainties in terms of cumulative distribution functions for various structural response variables based on known or assumed uncertainties in primitive structural variables; (2) evaluation of the failure probability; (3) reliability and risk-cost assessment; and (4) an outline of an emerging approach for eventual certification of man-rated structures by computational methods. Collectively, the results demonstrate that the structural durability/reliability of man-rated structural components and structures can be effectively evaluated by using formal probabilistic methods.
An advanced probabilistic structural analysis method for implicit performance functions
NASA Technical Reports Server (NTRS)
Wu, Y.-T.; Millwater, H. R.; Cruse, T. A.
1989-01-01
In probabilistic structural analysis, the performance or response functions usually are implicitly defined and must be solved by numerical analysis methods such as finite element methods. In such cases, the most commonly used probabilistic analysis tool is the mean-based, second-moment method which provides only the first two statistical moments. This paper presents a generalized advanced mean value (AMV) method which is capable of establishing the distributions to provide additional information for reliability design. The method requires slightly more computations than the second-moment method but is highly efficient relative to the other alternative methods. In particular, the examples show that the AMV method can be used to solve problems involving non-monotonic functions that result in truncated distributions.
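For contrast, the mean-based second-moment baseline that the AMV method generalizes can be sketched in a few lines: linearize the (normally implicit) performance function at the mean point and propagate variances. The toy performance function and input statistics below are assumptions, and the full AMV iteration toward the most probable point is not shown:

```python
import numpy as np

def g(x):
    """Toy performance function; in practice an implicit finite-element solve."""
    return x[0] ** 2 / x[1]          # e.g. response from load x[0], stiffness x[1]

mu = np.array([10.0, 50.0])          # means of the random variables (assumed)
sigma = np.array([1.0, 5.0])         # standard deviations, independent (assumed)

# Mean-based second-moment method: finite-difference gradient at the mean.
eps = 1e-6
grad = np.array([(g(mu + eps * np.eye(2)[i]) - g(mu)) / eps for i in range(2)])

mean_g = float(g(mu))                                 # first moment of the response
std_g = float(np.sqrt(np.sum((grad * sigma) ** 2)))   # second moment via linearization

print(f"response mean ~ {mean_g:.3f}, std ~ {std_g:.3f}")
```

This yields only the first two moments; the AMV method's contribution is to go beyond them and establish the response distribution itself at modest extra cost.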
NASA Astrophysics Data System (ADS)
Paasche, Hendrik
2018-01-01
Site characterization requires detailed and ideally spatially continuous information about the subsurface. Geophysical tomographic experiments allow spatially continuous imaging of physical parameter variations, e.g., seismic wave propagation velocities. Such physical parameters are often related to typical geotechnical or hydrological target parameters, e.g., those obtained from 1D direct push or borehole logging. Here, the probabilistic inference of 2D tip resistance, sleeve friction, and relative dielectric permittivity distributions in near-surface sediments is constrained by ill-posed cross-borehole seismic P- and S-wave and radar wave traveltime tomography. In doing so, we follow a discovery-science strategy employing a fully data-driven approach capable of accounting for tomographic ambiguity and for differences in spatial resolution between the geophysical tomograms and the geotechnical logging data used for calibration. We compare the outcome to results achieved with classical hypothesis-driven approaches, i.e., deterministic transfer functions derived empirically for the inference of 2D sleeve friction from S-wave velocity tomograms and theoretically for the inference of 2D dielectric permittivity from radar wave velocity tomograms. The data-driven approach offers maximal flexibility combined with very relaxed assumptions about the character of the expected links. This makes it a versatile tool applicable to almost any combination of data sets. However, error propagation may be critical and may justify a hypothesis-driven pre-selection of an optimal database, which carries the risk of excluding relevant information from the analyses. Results achieved by transfer functions rely on information about the nature of the link and on optimal calibration settings drawn as retrospective hypotheses by other authors.
Applying such transfer functions at other sites turns them into a priori valid hypotheses, which can, particularly for empirically derived transfer functions, result in poor predictions. However, mindful utilization and critical evaluation of the consequences of turning a retrospectively drawn hypothesis into an a priori valid one can still yield good results for inference and prediction problems when using classical transfer function concepts.
Probabilistic safety assessment of the design of a tall building under extreme load
DOE Office of Scientific and Technical Information (OSTI.GOV)
Králik, Juraj, E-mail: juraj.kralik@stuba.sk
2016-06-08
The paper describes experience from the deterministic and probabilistic analysis of the safety of a tall building structure. The methods and requirements of Eurocode EN 1990, standard ISO 2394, and the JCSS are presented. Uncertainties in the model and in the resistance of the structure are considered using simulation methods. The Monte Carlo, LHS, and RSM probabilistic methods are compared with deterministic results. The effectiveness of probability-based design of structures using finite element methods is demonstrated on an example probabilistic safety analysis of a tall building.
Probabilistic safety assessment of the design of a tall building under extreme load
NASA Astrophysics Data System (ADS)
Králik, Juraj
2016-06-01
The paper describes experience from the deterministic and probabilistic analysis of the safety of a tall building structure. The methods and requirements of Eurocode EN 1990, standard ISO 2394, and the JCSS are presented. Uncertainties in the model and in the resistance of the structure are considered using simulation methods. The Monte Carlo, LHS, and RSM probabilistic methods are compared with deterministic results. The effectiveness of probability-based design of structures using finite element methods is demonstrated on an example probabilistic safety analysis of a tall building.
Perturbation Biology: Inferring Signaling Networks in Cellular Systems
Miller, Martin L.; Gauthier, Nicholas P.; Jing, Xiaohong; Kaushik, Poorvi; He, Qin; Mills, Gordon; Solit, David B.; Pratilas, Christine A.; Weigt, Martin; Braunstein, Alfredo; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris
2013-01-01
We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology. PMID:24367245
Stochastic simulation of ecohydrological interactions between vegetation and groundwater
NASA Astrophysics Data System (ADS)
Dwelle, M. C.; Ivanov, V. Y.; Sargsyan, K.
2017-12-01
The complex interactions between groundwater and vegetation in the Amazon rainforest may yield vital ecophysiological interactions in specific landscape niches such as buffering plant water stress during dry season or suppression of water uptake due to anoxic conditions. Representation of such processes is greatly impacted by both external and internal sources of uncertainty: inaccurate data and subjective choice of model representation. The models that can simulate these processes are complex and computationally expensive, and therefore make it difficult to address uncertainty using traditional methods. We use the ecohydrologic model tRIBS+VEGGIE and a novel uncertainty quantification framework applied to the ZF2 watershed near Manaus, Brazil. We showcase the capability of this framework for stochastic simulation of vegetation-hydrology dynamics. This framework is useful for simulation with internal and external stochasticity, but this work will focus on internal variability of groundwater depth distribution and model parameterizations. We demonstrate the capability of this framework to make inferences on uncertain states of groundwater depth from limited in situ data, and how the realizations of these inferences affect the ecohydrological interactions between groundwater dynamics and vegetation function. We place an emphasis on the probabilistic representation of quantities of interest and how this impacts the understanding and interpretation of the dynamics at the groundwater-vegetation interface.
Probabilistic dual heuristic programming-based adaptive critic
NASA Astrophysics Data System (ADS)
Herzallah, Randa
2010-02-01
Adaptive critic (AC) methods have common roots as generalisations of dynamic programming for neural reinforcement learning approaches. Since they approximate dynamic programming solutions, they are potentially suitable for learning in noisy, non-linear and non-stationary environments. In this study, a novel probabilistic dual heuristic programming (DHP)-based AC controller is proposed. Distinct from current approaches, the proposed probabilistic DHP AC method takes uncertainties of the forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterised by functional uncertainty. Theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the probabilistic critic network is then calculated and shown to be equal to the analytically derived correct value. Full derivation of the Riccati solution for this non-standard stochastic linear quadratic control problem is also provided. Moreover, the performance of the proposed probabilistic controller is demonstrated on linear and non-linear control examples.
Probabilistic Aeroelastic Analysis of Turbomachinery Components
NASA Technical Reports Server (NTRS)
Reddy, T. S. R.; Mital, S. K.; Stefko, G. L.
2004-01-01
A probabilistic approach is described for aeroelastic analysis of turbomachinery blade rows. Blade rows with subsonic flow and blade rows with supersonic flow with a subsonic leading edge are considered. To demonstrate the probabilistic approach, the flutter frequency, damping, and forced response of a blade row representing a compressor geometry are considered. The analysis accounts for uncertainties in structural and aerodynamic design variables. The results are presented in the form of probability density functions (PDFs) and sensitivity factors. For the subsonic flow cascade, comparisons are also made with different probabilistic distributions, probabilistic methods, and Monte-Carlo simulation. The approach shows that probabilistic analysis provides a more realistic and systematic way to assess the effect of uncertainties in design variables on aeroelastic instabilities and response.
An undergraduate course, and new textbook, on "Physical Models of Living Systems"
NASA Astrophysics Data System (ADS)
Nelson, Philip
2015-03-01
I'll describe an intermediate-level course on "Physical Models of Living Systems." The only prerequisite is first-year university physics and calculus. The course is a response to rapidly growing interest among undergraduates in several science and engineering departments. Students acquire several research skills that are often not addressed in traditional courses, including: basic modeling skills, probabilistic modeling skills, data analysis methods, computer programming using a general-purpose platform like MATLAB or Python, dynamical systems, particularly feedback control. These basic skills, which are relevant to nearly any field of science or engineering, are presented in the context of case studies from living systems, including: virus dynamics; bacterial genetics and evolution of drug resistance; statistical inference; superresolution microscopy; synthetic biology; naturally evolved cellular circuits. Publication of a new textbook by WH Freeman and Co. is scheduled for December 2014. Supported in part by EF-0928048 and DMR-0832802.
NASA Astrophysics Data System (ADS)
Bin, Che; Ruoying, Yu; Dongsheng, Dang; Xiangyan, Wang
2017-05-01
Distributed generation (DG) integrated into the network causes harmonic pollution, which can damage electrical devices and affect the normal operation of the power system. On the other hand, due to the randomness of wind and solar irradiation, the output of DG is random too, which leads to uncertainty in the harmonics generated by the DG. Thus, probabilistic methods are needed to analyse the impacts of DG integration. In this work we studied the probabilistic distribution of harmonic voltage and the harmonic distortion in a distribution network after integration of a distributed photovoltaic (DPV) system under different weather conditions, mainly sunny, cloudy, rainy, and snowy days. The probabilistic distribution function of the DPV output power in different typical weather conditions was acquired via the parameter identification method of maximum likelihood estimation. The Monte-Carlo simulation method was adopted to calculate the probabilistic distribution of harmonic voltage content at different frequency orders, as well as the total harmonic distortion (THD), in typical weather conditions. The case study was based on the IEEE 33-bus system, and the results for the probabilistic distribution of harmonic voltage content as well as THD in typical weather conditions were compared.
NESSUS/EXPERT - An expert system for probabilistic structural analysis methods
NASA Technical Reports Server (NTRS)
Millwater, H.; Palmer, K.; Fink, P.
1988-01-01
An expert system (NESSUS/EXPERT) is presented which provides assistance in using probabilistic structural analysis methods. NESSUS/EXPERT is an interactive menu-driven expert system that provides information to assist in the use of the probabilistic finite element code NESSUS/FEM and the fast probability integrator. NESSUS/EXPERT was developed with a combination of FORTRAN and CLIPS, a C language expert system tool, to exploit the strengths of each language.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bacvarov, D.C.
1981-01-01
A new method for probabilistic risk assessment of transmission line insulation flashovers caused by lightning strokes is presented. The approach of applying the finite element method to probabilistic risk assessment is demonstrated to be very powerful, for two reasons. First, the finite element method is inherently suitable for analysis of three-dimensional spaces where the parameters, such as trivariate probability densities of the lightning currents, are non-uniformly distributed. Second, the finite element method permits non-uniform discretization of the three-dimensional probability spaces, yielding high accuracy in critical regions, such as the area of low-probability events, while maintaining coarse discretization in the non-critical areas to keep the number of grid points and the size of the problem at a manageably low level. The finite element probabilistic risk assessment method presented here is based on a new multidimensional search algorithm. It utilizes an efficient iterative technique for finite element interpolation of the transmission line insulation flashover criteria computed with an electromagnetic transients program. Compared to other available methods, the new finite element probabilistic risk assessment method is significantly more accurate and approximately two orders of magnitude more computationally efficient. The method is especially suited for accurate assessment of rare, very low probability events.
DOT National Transportation Integrated Search
2009-10-13
This paper describes a probabilistic approach to estimate the conditional probability of release of hazardous materials from railroad tank cars during train accidents. Monte Carlo methods are used in developing a probabilistic model to simulate head ...
Structural reliability methods: Code development status
NASA Astrophysics Data System (ADS)
Millwater, Harry R.; Thacker, Ben H.; Wu, Y.-T.; Cruse, T. A.
1991-05-01
The Probabilistic Structures Analysis Method (PSAM) program integrates state of the art probabilistic algorithms with structural analysis methods in order to quantify the behavior of Space Shuttle Main Engine structures subject to uncertain loadings, boundary conditions, material parameters, and geometric conditions. An advanced, efficient probabilistic structural analysis software program, NESSUS (Numerical Evaluation of Stochastic Structures Under Stress) was developed as a deliverable. NESSUS contains a number of integrated software components to perform probabilistic analysis of complex structures. A nonlinear finite element module NESSUS/FEM is used to model the structure and obtain structural sensitivities. Some of the capabilities of NESSUS/FEM are shown. A Fast Probability Integration module NESSUS/FPI estimates the probability given the structural sensitivities. A driver module, PFEM, couples the FEM and FPI. NESSUS, version 5.0, addresses component reliability, resistance, and risk.
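NESSUS couples finite-element structural sensitivities (NESSUS/FEM) with fast probability integration (NESSUS/FPI) to estimate failure probabilities. As a rough point of reference, the quantity FPI approximates can be sketched with plain Monte Carlo on a hypothetical limit state g = R − S (resistance minus load); the distributions and numbers below are illustrative, not from NESSUS.

```python
import math
import random

def mc_failure_probability(n_samples=200_000, seed=0):
    """Estimate P(g < 0) for a toy limit state g = R - S by Monte Carlo.

    R ~ Normal(mu_R, sd_R) is a resistance, S ~ Normal(mu_S, sd_S) a load;
    both are hypothetical.  For independent normals the exact answer is
    Phi(-beta) with beta = (mu_R - mu_S) / sqrt(sd_R**2 + sd_S**2).
    """
    rng = random.Random(seed)
    mu_r, sd_r = 10.0, 1.0
    mu_s, sd_s = 7.0, 1.5
    failures = sum(
        1 for _ in range(n_samples)
        if rng.gauss(mu_r, sd_r) - rng.gauss(mu_s, sd_s) < 0.0
    )
    return failures / n_samples

def exact_failure_probability():
    """Closed-form check: Phi(-beta) computed via the complementary error function."""
    beta = (10.0 - 7.0) / math.sqrt(1.0 ** 2 + 1.5 ** 2)
    return 0.5 * math.erfc(beta / math.sqrt(2.0))
```

FPI's advantage over this brute-force estimator is precisely that it avoids the very large sample counts needed to resolve small failure probabilities.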
Model-Based Method for Sensor Validation
NASA Technical Reports Server (NTRS)
Vatan, Farrokh
2012-01-01
Fault detection, diagnosis, and prognosis are essential tasks in the operation of autonomous spacecraft, instruments, and in situ platforms. One of NASA's key mission requirements is robust state estimation. Sensing, using a wide range of sensors and sensor-fusion approaches, plays a central role in robust state estimation, and there is a need to diagnose sensor failure as well as component failure. Sensor validation can be considered part of the larger effort of improving reliability and safety. The standard methods for solving the sensor validation problem are based on probabilistic analysis of the system, of which the method based on Bayesian networks is the most popular. Consequently, these methods can only predict the most probable faulty sensors, subject to the initial probabilities defined for the failures. The method developed in this work takes a model-based approach and identifies the faulty sensors (if any) that can be logically inferred from the model of the system and the sensor readings (observations). It is also better suited to systems for which it is hard, or even impossible, to find the probability functions of the system. The method starts with a new mathematical description of the problem and develops a very efficient and systematic algorithm for its solution. It builds on the concept of analytical redundancy relations (ARRs).
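The ARR idea can be sketched as a consistency check: each redundancy relation is a testable constraint among a subset of sensors, and the pattern of satisfied and violated relations logically narrows down the faulty set. The two relations and sensor names below are hypothetical, not from the paper.

```python
def faulty_sensors(readings, tol=1e-6):
    """Infer suspect sensors from analytical redundancy relations (ARRs).

    Toy model (hypothetical): sensors a, b, c, d with physical relations
        ARR1: c = a + b      ARR2: d = 2 * a
    A satisfied ARR exonerates every sensor it involves; a violated ARR
    restricts the suspects to the sensors it involves.
    """
    arrs = {
        "ARR1": (("a", "b", "c"), lambda r: abs(r["c"] - (r["a"] + r["b"]))),
        "ARR2": (("a", "d"), lambda r: abs(r["d"] - 2.0 * r["a"])),
    }
    suspects = set(readings)
    for sensors, residual in arrs.values():
        if residual(readings) <= tol:
            suspects -= set(sensors)   # consistent relation: exonerate
        else:
            suspects &= set(sensors)   # violated relation: restrict
    return sorted(suspects)
```

Note that, unlike a probabilistic ranking, the result is a logically entailed set: with these two relations a fault in b cannot be distinguished from one in c, and the method reports both.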
Probabilistic structural analysis methods for space transportation propulsion systems
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Moore, N.; Anis, C.; Newell, J.; Nagpal, V.; Singhal, S.
1991-01-01
Information on probabilistic structural analysis methods for space propulsion systems is given in viewgraph form. Information is given on deterministic certification methods, probability of failure, component response analysis, stress responses for second-stage turbine blades, Space Shuttle Main Engine (SSME) structural durability, and program plans.
Jin, Suoqin; MacLean, Adam L; Peng, Tao; Nie, Qing
2018-02-05
Single-cell RNA-sequencing (scRNA-seq) offers unprecedented resolution for studying cellular decision-making processes. Robust inference of cell state transition paths and probabilities is an important yet challenging step in the analysis of these data. Here we present scEpath, an algorithm that calculates energy landscapes and probabilistic directed graphs in order to reconstruct developmental trajectories. We quantify the energy landscape using "single-cell energy" and distance-based measures, and find that the combination of these enables robust inference of the transition probabilities and lineage relationships between cell states. We also identify marker genes and gene expression patterns associated with cell state transitions. Our approach produces pseudotemporal orderings that are, in combination, more robust and accurate than current methods, and offers higher-resolution dynamics of the cell state transitions, leading to new insight into key transition events during differentiation and development. Moreover, scEpath is robust to variation in the size of the input gene set, and is broadly unsupervised, requiring few parameters to be set by the user. Applications of scEpath led to the identification of a cell-cell communication network implicated in early human embryo development, and of novel transcription factors important for myoblast differentiation. scEpath allows us to identify common and specific temporal dynamics and transcription factor programs along branched lineages, as well as the transition probabilities that control cell fates. A MATLAB package of scEpath is available at https://github.com/sqjin/scEpath. Contact: qnie@uci.edu. Supplementary data are available at Bioinformatics online.
Towards Breaking the Histone Code – Bayesian Graphical Models for Histone Modifications
Mitra, Riten; Müller, Peter; Liang, Shoudan; Xu, Yanxun; Ji, Yuan
2013-01-01
Background Histones are proteins around which DNA is wrapped in small spherical structures called nucleosomes. Histone modifications (HMs) refer to post-translational modifications of the histone tails. At a particular genomic locus, each of these HMs can be either present or absent, and the combinatorial patterns of presence or absence of multiple HMs, or 'histone codes,' are believed to co-regulate important biological processes. We aim to use raw data on HM markers at different genomic loci to (1) decode the complex biological network of HMs in a single region and (2) demonstrate how the HM networks differ in different regulatory regions. We suggest that these differences in network attributes form a significant link between histones and genomic functions. Methods and Results We develop a powerful graphical model under the Bayesian paradigm. Posterior inference is fully probabilistic, allowing us to compute the probabilities of distinct dependence patterns of the HMs using graphs. Furthermore, our model-based framework allows for easy but important extensions for inference on differential networks under various conditions, such as different annotations of the genomic locations (e.g., promoters versus insulators). We applied these models to ChIP-Seq data based on CD4+ T lymphocytes. The results confirmed many existing findings and provided a unified tool to generate various promising hypotheses. Differential network analyses revealed new insights on co-regulation of HMs of transcriptional activities in different genomic regions. Conclusions The use of Bayesian graphical models and borrowing strength across different conditions provide high power to infer histone networks and their differences. PMID:23748248
Exploratory Causal Analysis in Bivariate Time Series Data
NASA Astrophysics Data System (ADS)
McCracken, James M.
Many scientific disciplines rely on observational data of systems for which it is difficult (or impossible) to implement controlled experiments, so data analysis techniques are required for identifying causal information and relationships directly from observational data. This need has led to the development of many different time series causality approaches and tools, including transfer entropy, convergent cross-mapping (CCM), and Granger causality statistics. In this thesis, the existing time series causality method of CCM is extended by introducing a new method called pairwise asymmetric inference (PAI). It is found that CCM may provide counter-intuitive causal inferences for simple dynamics with strong intuitive notions of causality, and that the CCM causal inference can be a function of physical parameters seemingly unrelated to the existence of a driving relationship in the system. For example, a CCM causal inference might alternate between "voltage drives current" and "current drives voltage" as the frequency of the voltage signal is changed in a series circuit with a single resistor and inductor. PAI is introduced to address both of these limitations. Many of the current approaches in the time series causality literature are not computationally straightforward to apply, do not follow directly from assumptions of probabilistic causality, depend on assumed models for the time series generating process, or rely on embedding procedures. A new approach, called causal leaning, is introduced in this work to avoid these issues. The leaning is found to provide causal inferences that agree with intuition for both simple systems and more complicated empirical examples, including space weather data sets. The leaning may provide a clearer interpretation of the results than existing time series causality tools.
A practicing analyst can explore the literature and find many proposals for identifying drivers and causal connections in time series data sets, but little research exists on how these tools compare to each other in practice. This work introduces and defines exploratory causal analysis (ECA) to address this issue, along with the concept of data causality in the taxonomy of causal studies introduced in this work. The motivation is to provide a framework for exploring potential causal structures in time series data sets. ECA is applied to several synthetic and empirical data sets, and it is found that all of the tested time series causality tools agree with each other (and with intuitive notions of causality) for many simple systems but can provide conflicting causal inferences for more complicated systems. It is proposed that such disagreements between different time series causality tools during ECA might provide deeper insight into the data than could be found otherwise.
The computational nature of memory modification.
Gershman, Samuel J; Monfils, Marie-H; Norman, Kenneth A; Niv, Yael
2017-03-15
Retrieving a memory can modify its influence on subsequent behavior. We develop a computational theory of memory modification, according to which modification of a memory trace occurs through classical associative learning, but which memory trace is eligible for modification depends on a structure learning mechanism that discovers the units of association by segmenting the stream of experience into statistically distinct clusters (latent causes). New memories are formed when the structure learning mechanism infers that a new latent cause underlies current sensory observations. By the same token, old memories are modified when old and new sensory observations are inferred to have been generated by the same latent cause. We derive this framework from probabilistic principles, and present a computational implementation. Simulations demonstrate that our model can reproduce the major experimental findings from studies of memory modification in the Pavlovian conditioning literature.
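The structure-learning step can be sketched with a greedy Chinese-restaurant-process clustering: each new observation either joins the latent cause that best explains it or founds a new one. This is a minimal illustration of the idea with made-up one-dimensional Gaussian observations, not the paper's full implementation.

```python
import math

def _normal_pdf(x, mu, sd):
    return math.exp(-((x - mu) ** 2) / (2.0 * sd * sd)) / (math.sqrt(2.0 * math.pi) * sd)

def assign_latent_causes(observations, alpha=1.0, sigma=1.0, sigma0=10.0):
    """Greedy local-MAP 'latent cause' segmentation (illustrative sketch).

    Each observation joins the existing cause maximizing CRP prior times
    Gaussian likelihood, or founds a new cause whose predictive density is
    a diffuse Normal(0, sigma0).  alpha is the CRP concentration.
    """
    causes, labels = [], []
    for i, obs in enumerate(observations):
        scores = []
        for k, members in enumerate(causes):
            mean = sum(members) / len(members)
            scores.append((len(members) / (i + alpha) * _normal_pdf(obs, mean, sigma), k))
        # score for founding a new latent cause
        scores.append((alpha / (i + alpha) * _normal_pdf(obs, 0.0, sigma0), len(causes)))
        _, k = max(scores)
        if k == len(causes):
            causes.append([])
        causes[k].append(obs)
        labels.append(k)
    return labels
```

On a stream that jumps between two regimes, the sketch forms one cause per regime and reassigns later observations back to the old cause, mirroring how old memories become eligible for modification when old and new observations share a latent cause.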
Networking—a statistical physics perspective
NASA Astrophysics Data System (ADS)
Yeung, Chi Ho; Saad, David
2013-03-01
Networking encompasses a variety of tasks related to the communication of information on networks; it has a substantial economic and societal impact on a broad range of areas including transportation systems, wired and wireless communications, and a range of Internet applications. As transportation and communication networks become increasingly complex, the ever-increasing demand for congestion control, higher traffic capacity, quality of service, robustness, and reduced energy consumption requires new tools and methods to meet these conflicting requirements. The new methodology should serve both to gain a better understanding of the properties of networking systems at the macroscopic level and to develop new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches, as they have been developed specifically to deal with nonlinear large-scale systems. This review presents an overview of tools and methods developed within the statistical physics community that can be readily applied to address emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, and probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications.
Probabilistic structural analysis methods for improving Space Shuttle engine reliability
NASA Technical Reports Server (NTRS)
Boyce, L.
1989-01-01
Probabilistic structural analysis methods are particularly useful in the design and analysis of critical structural components and systems that operate in very severe and uncertain environments. These methods have recently found application in space propulsion systems to improve the structural reliability of Space Shuttle Main Engine (SSME) components. A computer program, NESSUS, based on a deterministic finite-element program and a method of probabilistic analysis (fast probability integration), provides probabilistic structural analysis for selected SSME components. While computationally efficient, it handles correlated and nonnormal random variables as well as implicit functional relationships between independent and dependent variables. The program is used to determine the response of a nickel-based superalloy SSME turbopump blade. Results include blade tip displacement statistics due to variability in blade thickness, modulus of elasticity, Poisson's ratio, or density. Modulus of elasticity contributed significantly to blade tip variability while Poisson's ratio did not. The analysis thus provides a rational method for choosing which parameters to model as random.
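The sensitivity finding can be illustrated with a toy Monte Carlo study on a cantilever-beam stand-in (hypothetical geometry and loads, not the blade model): Poisson's ratio does not enter simple beam bending at all, so sampling it contributes no output variance, while sampling the modulus does.

```python
import random
import statistics

def tip_deflection(E, P=100.0, L=0.5, I=1.0e-6):
    """Cantilever tip deflection delta = P*L**3 / (3*E*I): a toy stand-in
    for a blade response function, not the actual NESSUS analysis."""
    return P * L ** 3 / (3.0 * E * I)

def deflection_variance(vary, n=20_000, seed=0):
    """Monte Carlo variance of tip deflection when a single input varies.

    'E'  draws the modulus from Normal(200 GPa, 10% c.o.v.);
    'nu' draws a Poisson's ratio that the response function never uses,
         so its variance contribution is exactly zero.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        E = rng.gauss(200e9, 20e9) if vary == "E" else 200e9
        nu = rng.gauss(0.3, 0.03) if vary == "nu" else 0.3  # sampled, unused
        samples.append(tip_deflection(E))
    return statistics.pvariance(samples)
```

Ranking inputs by the output variance they induce is the simplest version of the "which parameters deserve to be random" question the abstract describes.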
Denison, Stephanie; Trikutam, Pallavi; Xu, Fei
2014-08-01
A rich tradition in developmental psychology explores physical reasoning in infancy. However, no research to date has investigated whether infants can reason about physical objects that behave probabilistically, rather than deterministically. Physical events are often quite variable, in that similar-looking objects can be placed in similar contexts with different outcomes. Can infants rapidly acquire probabilistic physical knowledge, such as some leaves fall and some glasses break by simply observing the statistical regularity with which objects behave and apply that knowledge in subsequent reasoning? We taught 11-month-old infants physical constraints on objects and asked them to reason about the probability of different outcomes when objects were drawn from a large distribution. Infants could have reasoned either by using the perceptual similarity between the samples and larger distributions or by applying physical rules to adjust base rates and estimate the probabilities. Infants learned the physical constraints quickly and used them to estimate probabilities, rather than relying on similarity, a version of the representativeness heuristic. These results indicate that infants can rapidly and flexibly acquire physical knowledge about objects following very brief exposure and apply it in subsequent reasoning.
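The base-rate adjustment the infants are credited with can be written down directly: weight each object type's count by the probability that the physical constraint lets it reach the opening, then renormalize. The counts and mobilities below are hypothetical, not the study's stimuli.

```python
def constrained_draw_probability(counts, can_move):
    """Posterior probability that a drawn object is of each type, when base
    rates (counts) are modulated by a learned physical constraint:
    can_move[type] is the probability such an object can reach the opening.
    Hypothetical stand-in for the infant task."""
    weights = {k: counts[k] * can_move.get(k, 1.0) for k in counts}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}
```

A pure similarity (representativeness) strategy would instead track the raw 75:25 base rate regardless of the constraint; the two strategies diverge exactly when `can_move` is non-uniform.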
Zhang, Qin
2015-07-01
Probabilistic graphical models (PGMs) such as the Bayesian network (BN) have been widely applied in uncertain causality representation and probabilistic reasoning. The dynamic uncertain causality graph (DUCG) is a newly presented PGM that can be applied to fault diagnosis of large and complex industrial systems, disease diagnosis, and so on. The basic methodology of DUCG has been presented previously, but it addressed only directed acyclic graphs (DAGs) and did not discuss the mathematical meaning of the DUCG. In this paper, the DUCG with directed cyclic graphs (DCGs) is addressed. In contrast, BN does not allow DCGs, as otherwise conditional independence would not be satisfied. The inference algorithm for the DUCG with DCGs is presented, which not only extends the capabilities of DUCG from DAGs to DCGs but also enables users to decompose a large and complex DUCG into a set of small, simple sub-DUCGs, so that a large and complex knowledge base can be easily constructed, understood, and maintained. The basic mathematical definition of a complete DUCG, with or without DCGs, is proved to be a joint probability distribution (JPD) over a set of random variables. An incomplete DUCG, as part of a complete DUCG, may represent a part of the JPD. Examples are provided to illustrate the methodology.
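The claim that a complete model defines a joint probability distribution (JPD) is the familiar factorization property of directed graphical models. A minimal sketch with a three-variable chain A → B → C and made-up conditional probability tables (a generic BN, not a DUCG):

```python
from itertools import product

# Hypothetical CPTs for binary variables in the chain A -> B -> C.
P_A = {0: 0.7, 1: 0.3}
P_B_given_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_C_given_B = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}

def joint(a, b, c):
    """The factored joint: P(a, b, c) = P(a) * P(b|a) * P(c|b)."""
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]

def posterior_a_given_c(c):
    """P(A=1 | C=c) by brute-force enumeration of the joint."""
    num = sum(joint(1, b, c) for b in (0, 1))
    den = sum(joint(a, b, c) for a, b in product((0, 1), repeat=2))
    return num / den
```

Because each CPT row sums to 1, the factored joint automatically sums to 1 over all assignments, which is the content of the "complete model = JPD" statement; any conditional query can then be answered by summation.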
The rational status of quantum cognition.
Pothos, Emmanuel M; Busemeyer, Jerome R; Shiffrin, Richard M; Yearsley, James M
2017-07-01
Classical probability theory (CPT) is generally considered the rational way to make inferences, but some empirical findings show a divergence between human reasoning and CPT principles, inviting the conclusion that humans are irrational. Perhaps the most famous of these findings is the conjunction fallacy (CF). Recently, the CF has been shown to be consistent with the principles of an alternative probabilistic framework, quantum probability theory (QPT). Does this imply that QPT is irrational, or does QPT provide an alternative interpretation of rationality? Our presentation consists of three parts. First, we examine the putative rational status of QPT using the same argument used to establish the rationality of CPT, the Dutch Book (DB) argument, according to which reasoners should not commit to bets that guarantee a loss. We prove the rational status of QPT by formulating it as a particular case of an extended form of CPT, with separate probability spaces produced by changing context. Second, we empirically examine the key requirement for whether a CF can be rational; the results show that participants indeed behave rationally, at least relative to the representations they employ. Finally, we consider whether the conditions for the CF to be rational apply in the outside (nonmental) world. Our discussion provides a general and alternative perspective on rational probabilistic inference, based on the idea that contextuality requires either reasoning in separate CPT probability spaces or reasoning with QPT principles.
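The QPT account of the conjunction fallacy can be reproduced in a two-dimensional toy model: when the two questions correspond to incompatible bases, the sequential judgment A-then-B can receive higher probability than the direct judgment of B. The angles below are illustrative choices, not fitted values from the paper.

```python
import math

def qp_conjunction_demo(psi_deg=0.0, a_deg=45.0, b_deg=85.0):
    """Two-dimensional quantum-probability sketch of the conjunction
    'fallacy': returns (P(B), P(A then B)) for a state at psi_deg and
    outcome axes at a_deg and b_deg.  P(A then B) can exceed P(B) because
    projecting through A first changes the state."""
    def proj_prob(from_deg, onto_deg):
        # probability of projecting a state at from_deg onto axis onto_deg
        return math.cos(math.radians(onto_deg - from_deg)) ** 2

    p_b = proj_prob(psi_deg, b_deg)                              # direct judgment of B
    p_a_then_b = proj_prob(psi_deg, a_deg) * proj_prob(a_deg, b_deg)
    return p_b, p_a_then_b
```

With the default angles, the direct probability of B is below 1% while the sequential A-then-B probability is near 29%: the CPT-impossible ordering P(A and B) > P(B) emerges from projection order, which is the mechanism the abstract appeals to.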
MacNeilage, Paul R.; Ganesan, Narayan; Angelaki, Dora E.
2008-01-01
Spatial orientation is the sense of body orientation and self-motion relative to the stationary environment, fundamental to normal waking behavior and control of everyday motor actions including eye movements, postural control, and locomotion. The brain achieves spatial orientation by integrating visual, vestibular, and somatosensory signals. Over the past years, considerable progress has been made toward understanding how these signals are processed by the brain using multiple computational approaches that include frequency domain analysis, the concept of internal models, observer theory, Bayesian theory, and Kalman filtering. Here we put these approaches in context by examining the specific questions that can be addressed by each technique and some of the scientific insights that have resulted. We conclude with a recent application of particle filtering, a probabilistic simulation technique that aims to generate the most likely state estimates by incorporating internal models of sensor dynamics and physical laws and noise associated with sensory processing as well as prior knowledge or experience. In this framework, priors for low angular velocity and linear acceleration can explain the phenomena of velocity storage and frequency segregation, both of which have been modeled previously using arbitrary low-pass filtering. How Kalman and particle filters may be implemented by the brain is an emerging field. Unlike past neurophysiological research that has aimed to characterize mean responses of single neurons, investigations of dynamic Bayesian inference should attempt to characterize population activities that constitute probabilistic representations of sensory and prior information. PMID:18842952
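A bootstrap particle filter in its simplest form, for a one-dimensional random-walk state with Gaussian observation noise; a minimal sketch of the simulation technique, not the vestibular model described above.

```python
import math
import random

def particle_filter_1d(observations, n_particles=2000, process_sd=0.5,
                       obs_sd=1.0, seed=0):
    """Minimal bootstrap particle filter for a 1-D random-walk state.
    Returns the posterior-mean state estimate after each observation."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # predict: propagate each particle through the motion model
        particles = [p + rng.gauss(0.0, process_sd) for p in particles]
        # update: weight each particle by the likelihood of the observation
        weights = [math.exp(-((z - p) ** 2) / (2.0 * obs_sd ** 2))
                   for p in particles]
        total = sum(weights)
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # resample: draw a new particle set proportional to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

The predict step encodes the internal model of the dynamics (here a plain random walk standing in for priors such as "low angular velocity"), and the weighted particle cloud is exactly the probabilistic population representation the abstract argues neurophysiology should look for.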
NASA Technical Reports Server (NTRS)
Duffy, S. F.; Hu, J.; Hopkins, D. A.
1995-01-01
The article begins by examining the fundamentals of traditional deterministic design philosophy. The initial section outlines the concepts of failure criteria and limit state functions, two traditional notions that are embedded in deterministic design philosophy. This is followed by a discussion of safety factors (a possible limit state function) and the common use of statistical concepts in deterministic engineering design approaches. Next, the fundamental aspects of a probabilistic failure analysis are explored, and it is shown that the deterministic design concepts mentioned in the initial portion of the article are embedded in probabilistic design methods. For components fabricated from ceramic materials (and other similarly brittle materials), the probabilistic design approach yields the widely used Weibull analysis once suitable assumptions are incorporated. The authors point out that Weibull analysis provides the rare instance in which closed-form solutions are available for a probabilistic failure analysis. Since numerical methods are usually required to evaluate component reliabilities, a section on Monte Carlo methods is included to introduce the concept. The article concludes with a presentation of the technical aspects that support the numerical method known as fast probability integration (FPI), including a discussion of the Hasofer-Lind and Rackwitz-Fiessler approximations.
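The closed-form Weibull result the authors highlight is compact enough to state directly: for a uniformly stressed volume V, the two-parameter Weibull probability of failure is Pf = 1 − exp(−V·(σ/σ₀)^m). A sketch with illustrative parameter values (not from the article):

```python
import math

def weibull_failure_probability(sigma, sigma_0, m, volume=1.0):
    """Two-parameter Weibull probability of failure for a brittle component
    under uniform uniaxial stress sigma: Pf = 1 - exp(-V * (sigma/sigma_0)**m).
    sigma_0 is the characteristic strength, m the Weibull modulus."""
    return 1.0 - math.exp(-volume * (sigma / sigma_0) ** m)
```

At the characteristic strength (σ = σ₀, V = 1) the formula gives Pf = 1 − e⁻¹ ≈ 0.632, and a larger Weibull modulus m makes the strength distribution tighter, which is why m is the key scatter parameter in ceramic design.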
Conditional reasoning in autism: activation and integration of knowledge and belief.
McKenzie, Rebecca; Evans, Jonathan St B T; Handley, Simon J
2010-03-01
Everyday conditional reasoning is typically influenced by prior knowledge and belief in the form of specific exceptions known as counterexamples. This study explored whether adolescents with autism spectrum disorder (ASD; N = 26) were less influenced by background knowledge than typically developing adolescents (N = 38) when engaged in conditional reasoning. Participants were presented with pretested valid and invalid conditional inferences with varying available counterexamples. The group with ASD showed significantly less influence of prior knowledge on valid inferences (p = .01) and invalid inferences (p = .01) compared with the typical group. In a secondary probability judgment task, no significant group differences were found in probabilistic judgments of the believability of the premises. Further experiments found that results could not be explained by differences between the groups in the ability to generate counterexamples or any tendency among adolescents with ASD to exhibit a "yes" response pattern. It was concluded that adolescents with ASD tend not to spontaneously contextualize presented material when engaged in everyday reasoning. These findings are discussed with reference to weak central coherence theory and the conditional reasoning literature.
Deng, Wei (Sophia); Sloutsky, Vladimir M.
2015-01-01
Does category representation change in the course of development? And if so, how and why? The current study attempted to answer these questions by examining category learning and category representation. In Experiment 1, 4-year-olds, 6-year-olds, and adults were trained with either a classification task or an inference task and their categorization performance and memory for items were tested. Adults and 6-year-olds exhibited an important asymmetry: they relied on a single deterministic feature during classification training, but not during inference training. In contrast, regardless of the training condition, 4-year-olds relied on multiple probabilistic features. In Experiment 2, 4-year-olds were presented with classification training and their attention was explicitly directed to the deterministic feature. Under this condition, their categorization performance was similar to that of older participants in Experiment 1, yet their memory performance pointed to a similarity-based representation, which was similar to that of 4-year-olds in Experiment 1. These results are discussed in relation to theories of categorization and the role of selective attention in the development of category learning. PMID:25602938
An approximate methods approach to probabilistic structural analysis
NASA Technical Reports Server (NTRS)
Mcclung, R. C.; Millwater, H. R.; Wu, Y.-T.; Thacker, B. H.; Burnside, O. H.
1989-01-01
A probabilistic structural analysis method (PSAM) is described which makes an approximate calculation of the structural response of a system, including the associated probabilistic distributions, with minimal computation time and cost, based on a simplified representation of the geometry, loads, and material. The method employs the fast probability integration (FPI) algorithm of Wu and Wirsching. Typical solution strategies are illustrated by formulations for a representative critical component chosen from the Space Shuttle Main Engine (SSME) as part of a major NASA-sponsored program on PSAM. Typical results are presented to demonstrate the role of the methodology in engineering design and analysis.
Fuzzy-probabilistic model for risk assessment of radioactive material railway transportation.
Avramenko, M; Bolyatko, V; Kosterev, V
2005-01-01
Transportation of radioactive materials is obviously accompanied by a certain risk. A model for risk assessment of emergency situations and terrorist attacks may be useful for choosing possible routes and for comparing the various defence strategies. In particular, risk assessment is crucial for safe transportation of excess weapons-grade plutonium arising from the removal of plutonium from military employment. A fuzzy-probabilistic model for risk assessment of railway transportation has been developed, taking into account the different natures of the risk-affecting parameters (some probabilistic, others not probabilistic but fuzzy). Fuzzy set theory methods as well as standard methods of probability theory have been used for quantitative risk assessment. Information-preserving transformations are applied to realise the correct aggregation of probabilistic and fuzzy parameters. Estimates have also been made of the inhalation doses resulting from possible accidents during plutonium transportation. The obtained data show the scale of the consequences that could arise from plutonium transportation accidents.
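One simple way to combine a probabilistic parameter with a fuzzy one is sketched below: defuzzify a triangular severity by its centroid, then take a probability-weighted product. This is an illustrative toy, not the paper's information-preserving transformations.

```python
def triangular_membership(x, a, b, c):
    """Membership of x in a triangular fuzzy number with support [a, c]
    and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_probabilistic_risk(p_event, severity_grid):
    """Toy aggregation of a probabilistic parameter (event probability) with
    a fuzzy one (severity): defuzzify severity via the centroid of its
    sampled membership function, then risk = probability x severity."""
    num = sum(x * mu for x, mu in severity_grid)
    den = sum(mu for _, mu in severity_grid)
    return p_event * (num / den)
```

For a symmetric triangular severity over [0, 10] peaking at 5, the centroid is 5, so an event probability of 0.1 yields a risk score of 0.5 in whatever severity units were chosen.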
Le, Quang A; Doctor, Jason N
2011-05-01
As quality-adjusted life years have become the standard metric in health economic evaluations, mapping health-profile or disease-specific measures onto preference-based measures to obtain quality-adjusted life years has become a solution when health utilities are not directly available. However, current mapping methods are limited in predictive validity, reliability, and other methodological respects. We employ probability theory together with a graphical model, called a Bayesian network, to convert health-profile measures into preference-based measures and compare the results to those estimated with current mapping methods. A sample of 19,678 adults who completed both the 12-item Short Form Health Survey (SF-12v2) and EuroQoL 5D (EQ-5D) questionnaires from the 2003 Medical Expenditure Panel Survey was split into training and validation sets. Bayesian networks were constructed to explore the probabilistic relationships between each EQ-5D domain and the 12 items of the SF-12v2. The EQ-5D utility scores were estimated on the basis of the predicted probability of each response level of the 5 EQ-5D domains obtained from the Bayesian inference process, using the following methods: Monte Carlo simulation, expected utility, and most-likely probability. Results were then compared with current mapping methods including multinomial logistic regression, ordinary least squares, and censored least absolute deviations. The Bayesian networks consistently outperformed other mapping models in the overall sample (mean absolute error=0.077, mean square error=0.013, and overall R=0.802), across age groups, numbers of chronic conditions, and ranges of the EQ-5D index. Bayesian networks provide a robust and natural new approach to mapping health status responses into health utility measures for health economic evaluations.
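The expected-utility step is straightforward once the per-domain level probabilities have been inferred: sum the probability-weighted utility decrements over domains and subtract from full health. The decrement numbers below are invented for illustration, not a published EQ-5D tariff, and the additive structure is itself a simplifying assumption.

```python
def expected_eq5d_utility(level_probs, decrements):
    """Expected-utility scoring from per-domain level probabilities (e.g.
    the output of a Bayesian network's inference step).  The utility is
    1 minus the expected decrement summed over domains.  Decrement values
    are hypothetical, not a published tariff."""
    utility = 1.0
    for domain, probs in level_probs.items():
        utility -= sum(p * decrements[domain][lvl] for lvl, p in probs.items())
    return utility
```

Taking the expectation over the full predicted distribution, rather than scoring only the most likely level, is what distinguishes the expected-utility method from the most-likely-probability method the abstract lists.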
Bayesian estimation of Karhunen–Loève expansions; A random subspace approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chowdhary, Kenny; Najm, Habib N.
One of the most widely-used statistical procedures for dimensionality reduction of high dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Lo eve expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L 2 sense, i.e, which minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition)more » on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build a probabilistic Karhunen-Lo eve expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite dimensional stochastic process inspired by Brownian motion.« less
2016-04-13
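The classical, non-Bayesian estimate that this paper builds on — the empirical KLE obtained from an SVD of the centered data matrix — can be sketched as follows. The Brownian-motion-like test ensemble is an illustrative stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ensemble: 50 samples of a 200-dimensional random field
# (discretized Brownian-motion-like paths), stacked as rows.
samples = np.cumsum(rng.standard_normal((50, 200)), axis=1)

# Center the data; the empirical KLE comes from the SVD of the centered matrix.
mean = samples.mean(axis=0)
centered = samples - mean
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Rows of Vt are the empirical KL basis functions (principal components);
# eigenvalues of the sample covariance follow from the singular values.
eigvals = s**2 / (samples.shape[0] - 1)

# Truncate to the k leading modes: a low-dimensional surrogate of the field.
k = 5
surrogate = mean + (U[:, :k] * s[:k]) @ Vt[:k, :]

# Among all rank-k projections, this truncation minimizes mean-square error.
mse = np.mean((samples - surrogate) ** 2)
```

The Bayesian procedure in the paper replaces the single point estimate `Vt` with posterior samples of the orthonormal basis, which this sketch does not attempt.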
Li, Zhixi; Peck, Kyung K.; Brennan, Nicole P.; Jenabi, Mehrnaz; Hsu, Meier; Zhang, Zhigang; Holodny, Andrei I.; Young, Robert J.
2014-01-01
Purpose The purpose of this study was to compare the deterministic and probabilistic tracking methods of diffusion tensor white matter fiber tractography in patients with brain tumors. Materials and Methods We identified 29 patients with left brain tumors <2 cm from the arcuate fasciculus who underwent pre-operative language fMRI and DTI. The arcuate fasciculus was reconstructed using a deterministic Fiber Assignment by Continuous Tracking (FACT) algorithm and a probabilistic method based on an extended Monte Carlo Random Walk algorithm. Tracking was controlled using two ROIs corresponding to Broca's and Wernicke's areas. Tracts in tumor-affected hemispheres were examined for extension between Broca's and Wernicke's areas, anterior-posterior length and volume, and compared with the normal contralateral tracts. Results Probabilistic tracts displayed more complete anterior extension to Broca's area than did FACT tracts on the tumor-affected and normal sides (p < 0.0001). The median length ratio for tumor:normal sides was greater for probabilistic tracts than FACT tracts (p < 0.0001). The median tract volume ratio for tumor:normal sides was also greater for probabilistic tracts than FACT tracts (p = 0.01). Conclusion Probabilistic tractography reconstructs the arcuate fasciculus more completely and performs better through areas of tumor and/or edema. The FACT algorithm tends to underestimate the anterior-most fibers of the arcuate fasciculus, which are crossed by primary motor fibers. PMID:25328583
Probabilistic Parameter Uncertainty Analysis of Single Input Single Output Control Systems
NASA Technical Reports Server (NTRS)
Smith, Brett A.; Kenny, Sean P.; Crespo, Luis G.
2005-01-01
The current standards for handling uncertainty in control systems use interval bounds for the definition of the uncertain parameters. This approach gives no information about the likelihood of system performance, but simply gives the response bounds. When used in design, current methods such as μ-analysis can lead to overly conservative controller designs, because worst-case conditions are weighted equally with the most likely conditions. This research explores a unique approach for probabilistic analysis of control systems. Current reliability methods are examined, showing the strengths of each in handling probability. A hybrid method is developed using these reliability tools for efficiently propagating probabilistic uncertainty through classical control analysis problems. The method developed is applied to classical response analysis as well as analysis methods that explore the effects of the uncertain parameters on stability and performance metrics. The benefits of using this hybrid approach for calculating the means, variances, and cumulative distribution functions of responses are shown. Results of the probabilistic analysis of a missile pitch control system and a non-collocated mass-spring system show the added information provided by this hybrid analysis.
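The contrast between interval bounds and probabilistic analysis can be illustrated with brute-force Monte Carlo on a second-order step response. The distribution and numbers are assumptions for illustration, and the hybrid reliability method in the paper is precisely a more efficient alternative to this kind of raw sampling:

```python
import numpy as np

rng = np.random.default_rng(1)

# Uncertain damping ratio described by a distribution rather than an
# interval bound (values illustrative, not from the paper).
zeta = rng.normal(0.5, 0.05, size=10_000)
zeta = np.clip(zeta, 0.05, 0.95)  # keep the underdamped closed form valid

# Closed-form fractional overshoot of a second-order step response.
overshoot = np.exp(-np.pi * zeta / np.sqrt(1.0 - zeta**2))

# Probabilistic answers an interval analysis cannot give: not just the
# response bounds, but the likelihood of each level of performance.
mean, var = overshoot.mean(), overshoot.var()
p_exceeds = np.mean(overshoot > 0.20)  # P(overshoot exceeds 20%)
```

An interval analysis over the same ζ range would report only the worst-case overshoot; here the tail probability shows how unlikely that worst case actually is.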
A probabilistic model framework for evaluating year-to-year variation in crop productivity
NASA Astrophysics Data System (ADS)
Yokozawa, M.; Iizumi, T.; Tao, F.
2008-12-01
Most models describing the relation between crop productivity and weather conditions have so far focused on mean changes in crop yield. For maintaining a stable food supply under abnormal weather as well as climate change, evaluating the year-to-year variations in crop productivity is more essential than evaluating the mean changes. Here we propose a new probabilistic model framework based on Bayesian inference and Monte Carlo simulation. As a first example, we introduce a model of paddy rice production in Japan called PRYSBI (Process-based Regional rice Yield Simulator with Bayesian Inference; Iizumi et al., 2008). The model structure is the same as that of SIMRIW, which was developed and is used widely in Japan. The model includes three sub-models describing phenological development, biomass accumulation, and maturing of the rice crop. These processes are formulated to capture the response of the rice plant to weather conditions. The model was originally developed to predict rice growth and yield at the paddy-plot scale. We applied it to evaluate large-scale rice production while keeping the same model structure, but treated the parameters as stochastic variables. To make the model match actual yields at the larger scale, model parameters were determined from agricultural statistical data for each prefecture of Japan together with weather data averaged over the region. The posterior probability density functions (PDFs) of the parameters included in the model were obtained using Bayesian inference, with a Markov chain Monte Carlo (MCMC) algorithm used to numerically evaluate Bayes' theorem. To evaluate the year-to-year changes in rice growth and yield under this framework, we first iterate simulations with parameter values sampled from the estimated posterior PDFs and then take the ensemble mean weighted by the posterior PDFs. We will also present another example for maize productivity in China.
The framework proposed here provides us information on uncertainties, possibilities and limitations on future improvements in crop model as well.
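The ensemble step — running the crop model over parameter values drawn from the posterior and summarizing the year-to-year spread — can be sketched with a toy stand-in. The yield function and all numbers below are illustrative, not PRYSBI/SIMRIW:

```python
import numpy as np

rng = np.random.default_rng(2)

def toy_yield(temp_sens, temps):
    """Hypothetical stand-in for a crop growth model: yield (t/ha) declines
    quadratically away from an assumed optimal season temperature of 22 C."""
    return np.maximum(0.0, 6.0 - temp_sens * (temps - 22.0) ** 2)

# Pretend these are MCMC draws from the posterior of one model parameter;
# averaging over draws is then equivalent to a posterior-weighted mean.
posterior_draws = rng.normal(0.08, 0.01, size=2_000)

# Year-to-year weather: mean growing-season temperature for five years.
yearly_temps = np.array([21.0, 22.5, 24.0, 20.0, 23.0])

# Ensemble of simulated yield time series, one per posterior draw.
ensemble = np.array([toy_yield(p, yearly_temps) for p in posterior_draws])

# Posterior-mean yield per year, and the spread due to parameter uncertainty.
mean_yield = ensemble.mean(axis=0)
sd_yield = ensemble.std(axis=0)
```

The year closest to the assumed optimum (22.5 C) gets the highest posterior-mean yield, while `sd_yield` quantifies how parameter uncertainty propagates into each year's prediction.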
Bayesian Monitoring Systems for the CTBT: Historical Development and New Results
NASA Astrophysics Data System (ADS)
Russell, S.; Arora, N. S.; Moore, D.
2016-12-01
A project at Berkeley, begun in 2009 in collaboration with CTBTO and more recently with LLNL, has reformulated the global seismic monitoring problem in a Bayesian framework. A first-generation system, NETVISA, has been built comprising a spatial event prior and generative models of event transmission and detection, as well as a Monte Carlo inference algorithm. The probabilistic model allows for seamless integration of various disparate sources of information, including negative information (the absence of detections). Working from arrivals extracted by traditional station processing from International Monitoring System (IMS) data, NETVISA achieves a reduction of around 60% in the number of missed events compared with the currently deployed network processing system. It also finds many events that are missed by the human analysts who postprocess the IMS output. Recent improvements include the integration of models for infrasound and hydroacoustic detections and a global depth model for natural seismicity trained from ISC data. NETVISA is now fully compatible with the CTBTO operating environment. A second-generation model called SIGVISA extends NETVISA's generative model all the way from events to raw signal data, avoiding the error-prone bottom-up detection phase of station processing. SIGVISA's model automatically captures the phenomena underlying existing detection and location techniques such as multilateration, waveform correlation matching, and double-differencing, and integrates them into a global inference process that also (like NETVISA) handles de novo events. Initial results for the Western US in early 2008 (when the transportable US Array was operating) show that SIGVISA finds, from IMS data only, more than twice the number of events recorded in the CTBTO Late Event Bulletin (LEB). For mb 1.0-2.5, the ratio is more than 10; put another way, for this data set, SIGVISA lowers the detection threshold by roughly one magnitude compared to LEB.
The broader message of this work is that probabilistic inference based on a vertically integrated generative model that directly expresses geophysical knowledge can be a much more effective approach for interpreting scientific data than the traditional bottom-up processing pipeline.
Fully probabilistic control for stochastic nonlinear control systems with input dependent noise.
Herzallah, Randa
2015-03-01
Robust controllers for nonlinear stochastic systems with functional uncertainties can be consistently designed using probabilistic control methods. In this paper a generalised probabilistic controller design is presented for the minimisation of the Kullback-Leibler divergence between the actual joint probability density function (pdf) of the closed-loop control system and an ideal joint pdf, emphasising how the uncertainty can be systematically incorporated in the absence of reliable system models. To achieve this objective, all probabilistic models of the system are estimated from process data using mixture density networks (MDNs), where all the parameters of the estimated pdfs are taken to be state and control-input dependent. Based on this dependency of the density parameters on the input values, explicit formulations for the construction of optimal generalised probabilistic controllers are obtained through the techniques of dynamic programming and adaptive critic methods. Using the proposed generalised probabilistic controller, the conditional joint pdfs can be made to follow the ideal ones. A simulation example is used to demonstrate the implementation of the algorithm, and encouraging results are obtained.
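The design objective — shrinking the Kullback-Leibler divergence between the actual closed-loop pdf and an ideal pdf — can be illustrated in the simplest Gaussian case. The closed-form KL below is standard; the assumed gain-to-pdf map and all numbers are illustrative, and the paper's MDN-based controllers handle far more general, input-dependent pdfs:

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    """Closed-form KL divergence D( N(mu0,var0) || N(mu1,var1) )."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

# Ideal closed-loop state pdf specified by the designer.
ideal_mu, ideal_var = 0.0, 0.25

# Sweep a feedback gain k; assume (purely for illustration) the closed-loop
# state pdf under gain k is N(0.5*(1-k), 0.3/(1+k)).
gains = np.linspace(0.0, 1.0, 101)
costs = [kl_gauss(0.5 * (1 - k), 0.3 / (1 + k), ideal_mu, ideal_var)
         for k in gains]

# The "probabilistic controller" here is just the gain minimizing the KL cost.
best = gains[int(np.argmin(costs))]
```

Higher gains pull both the mean offset and the variance toward the ideal pdf, so the KL cost drops steeply from its open-loop value before flattening near the optimum.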
Probabilistic Simulation of Stress Concentration in Composite Laminates
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Murthy, P. L. N.; Liaw, D. G.
1994-01-01
A computational methodology is described to probabilistically simulate the stress concentration factors (SCF's) in composite laminates. This new approach consists of coupling probabilistic composite mechanics with probabilistic finite element structural analysis. The composite mechanics is used to probabilistically describe all the uncertainties inherent in composite material properties, whereas the finite element analysis is used to probabilistically describe the uncertainties associated with methods to experimentally evaluate SCF's, such as loads, geometry, and supports. The effectiveness of the methodology is demonstrated by using it to simulate the SCF's in three different composite laminates. Simulated results match experimental data for probability density and for cumulative distribution functions. The sensitivity factors indicate that the SCF's are influenced by local stiffness variables, by load eccentricities, and by initial stress fields.
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
NASA Astrophysics Data System (ADS)
Barentsen, Geert; Vink, J. S.; Drew, J. E.; Sale, S. E.
2013-03-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main-sequence ages, masses, accretion rates and extinctions to be estimated using two widely available photometric survey databases (Isaac Newton Telescope Photometric Hα Survey r'/Hα/i' and Two Micron All Sky Survey J-band magnitudes). Because our approach does not rely on spectroscopy, it can easily be applied to homogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov chain Monte Carlo method and provide PYTHON source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2264 (Cone Nebula), arriving at a median age of 3.0 Myr, an accretion fraction of 20 ± 2 per cent and a median accretion rate of 10^-8.4 M⊙ yr^-1. The Bayesian analysis formulated in this work delivers results which are in agreement with spectroscopic studies already in the literature, but achieves this with great efficiency by depending only on photometry. It is a significant step forward from previous photometric studies because the probabilistic approach ensures that nuisance parameters, such as extinction and distance, are fully included in the analysis with a clear picture on any degeneracies.
Direct evidence for a dual process model of deductive inference.
Markovits, Henry; Brunet, Marie-Laurence; Thompson, Valerie; Brisson, Janie
2013-07-01
In 2 experiments, we tested a strong version of a dual process theory of conditional inference (cf. Verschueren et al., 2005a, 2005b) that assumes that most reasoners have 2 strategies available, the choice of which is determined by situational variables, cognitive capacity, and metacognitive control. The statistical strategy evaluates inferences probabilistically, accepting those with high conditional probability. The counterexample strategy rejects inferences when a counterexample shows the inference to be invalid. To discriminate strategy use, we presented reasoners with conditional statements (if p, then q) and explicit statistical information about the relative frequency of the probability of p/q (50% vs. 90%). A statistical strategy would accept the more probable inferences more frequently, whereas the counterexample one would reject both. In Experiment 1, reasoners under time pressure used the statistical strategy more, but switched to the counterexample strategy when time constraints were removed; the former took less time than the latter. These data are consistent with the hypothesis that the statistical strategy is the default heuristic. Under a free-time condition, reasoners preferred the counterexample strategy and kept it when put under time pressure. Thus, it is not simply a lack of capacity that produces a statistical strategy; instead, it seems that time pressure disrupts the ability to make good metacognitive choices. In line with this conclusion, in a 2nd experiment, we measured reasoners' confidence in their performance; those under time pressure were less confident in the statistical than the counterexample strategy and more likely to switch strategies under free-time conditions.
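The two strategies can be sketched as minimal decision rules; the 0.7 acceptance threshold is an assumption for illustration, not a value from the paper:

```python
def statistical_strategy(p_q_given_p, threshold=0.7):
    """Accept 'p, therefore q' when the conditional probability P(q|p) is high."""
    return p_q_given_p >= threshold

def counterexample_strategy(counterexamples_retrieved):
    """Reject the inference as soon as any 'p and not-q' case is retrieved."""
    return counterexamples_retrieved == 0

# The experiment's 90% condition: the two strategies dissociate.
accept_stat_90 = statistical_strategy(0.90)       # True  (accept)
accept_cex_90 = counterexample_strategy(3)        # False (reject)

# The 50% condition: both strategies reject.
accept_stat_50 = statistical_strategy(0.50)       # False
accept_cex_50 = counterexample_strategy(5)        # False
```

This is exactly the dissociation the experiments exploit: only in the high-probability condition do the strategies give different answers, so acceptance rates there reveal which strategy a reasoner used.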
Non-Deterministic Dynamic Instability of Composite Shells
NASA Technical Reports Server (NTRS)
Chamis, Christos C.; Abumeri, Galib H.
2004-01-01
A computationally effective method is described to evaluate the non-deterministic dynamic instability (probabilistic dynamic buckling) of thin composite shells. The method is a judicious combination of available computer codes for finite element, composite mechanics, and probabilistic structural analysis. The solution method is incrementally updated Lagrangian. It is illustrated by applying it to a thin composite cylindrical shell subjected to dynamic loads. Both deterministic and probabilistic buckling loads are evaluated to demonstrate the effectiveness of the method. A universal plot is obtained for the specific shell that can be used to approximate buckling loads for different load rates and different probability levels. Results from this plot show that the faster the rate, the higher the buckling load and the shorter the time. The lower the probability, the lower the buckling load at a specific time. Probabilistic sensitivity results show that the ply thickness, the fiber volume ratio, the fiber longitudinal modulus, the dynamic load, and the loading rate are the dominant uncertainties, in that order.
Commercialization of NESSUS: Status
NASA Technical Reports Server (NTRS)
Thacker, Ben H.; Millwater, Harry R.
1991-01-01
A plan was initiated in 1988 to commercialize the Numerical Evaluation of Stochastic Structures Under Stress (NESSUS) probabilistic structural analysis software. The goal of the on-going commercialization effort is to begin the transfer of Probabilistic Structural Analysis Method (PSAM) developed technology into industry and to develop additional funding resources in the general area of structural reliability. The commercialization effort is summarized. The SwRI NESSUS Software System is a general purpose probabilistic finite element computer program using state of the art methods for predicting stochastic structural response due to random loads, material properties, part geometry, and boundary conditions. NESSUS can be used to assess structural reliability, to compute probability of failure, to rank the input random variables by importance, and to provide a more cost effective design than traditional methods. The goal is to develop a general probabilistic structural analysis methodology to assist in the certification of critical components in the next generation Space Shuttle Main Engine.
Bayesian accounts of covert selective attention: A tutorial review.
Vincent, Benjamin T
2015-05-01
Decision making and optimal observer models offer an important theoretical approach to the study of covert selective attention. While their probabilistic formulation allows quantitative comparison to human performance, the models can be complex and their insights are not always immediately apparent. Part 1 establishes the theoretical appeal of the Bayesian approach, and introduces the way in which probabilistic approaches can be applied to covert search paradigms. Part 2 presents novel formulations of Bayesian models of 4 important covert attention paradigms, illustrating optimal observer predictions over a range of experimental manipulations. Graphical model notation is used to present models in an accessible way and Supplementary Code is provided to help bridge the gap between model theory and practical implementation. Part 3 reviews a large body of empirical and modelling evidence showing that many experimental phenomena in the domain of covert selective attention are a set of by-products. These effects emerge as the result of observers conducting Bayesian inference with noisy sensory observations, prior expectations, and knowledge of the generative structure of the stimulus environment.
A probabilistic, distributed, recursive mechanism for decision-making in the brain
Gurney, Kevin N.
2018-01-01
Decision formation recruits many brain regions, but the procedure they jointly execute is unknown. Here we characterize its essential composition, using as a framework a novel recursive Bayesian algorithm that makes decisions based on spike-trains with the statistics of those in sensory cortex (MT). Using it to simulate the random-dot-motion task, we demonstrate it quantitatively replicates the choice behaviour of monkeys, whilst predicting losses of otherwise usable information from MT. Its architecture maps to the recurrent cortico-basal-ganglia-thalamo-cortical loops, whose components are all implicated in decision-making. We show that the dynamics of its mapped computations match those of neural activity in the sensorimotor cortex and striatum during decisions, and forecast those of basal ganglia output and thalamus. This also predicts which aspects of neural dynamics are and are not part of inference. Our single-equation algorithm is probabilistic, distributed, recursive, and parallel. Its success at capturing anatomy, behaviour, and electrophysiology suggests that the mechanism implemented by the brain has these same characteristics. PMID:29614077
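The core of such a recursive Bayesian decision mechanism — accumulating spike-count evidence for competing hypotheses until the posterior crosses a decision threshold — can be sketched as follows. The Poisson rates, prior, and threshold are illustrative assumptions, not the paper's fitted MT statistics:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two hypotheses (e.g., leftward vs. rightward motion) with different mean
# spike counts per time step for an MT-like neuron (rates illustrative).
rates = np.array([3.0, 5.0])             # expected counts under H0, H1
true_h = 1                               # the stimulus actually shown

log_post = np.log(np.array([0.5, 0.5]))  # uniform prior over hypotheses
threshold = np.log(0.99 / 0.01)          # decide at 99:1 posterior odds

for step in range(1_000):
    k = rng.poisson(rates[true_h])       # observed spike count this step
    # Recursive Bayes: add the Poisson log-likelihood of the new evidence.
    # (The -log k! term is common to both hypotheses, so it cancels.)
    log_post = log_post + k * np.log(rates) - rates
    log_post -= np.logaddexp(log_post[0], log_post[1])   # renormalize
    if abs(log_post[1] - log_post[0]) > threshold:
        break

choice = int(log_post[1] > log_post[0])
```

Because the update is a running sum of log-likelihoods, the log-posterior odds perform a drift-diffusion-like random walk toward the decision bound, which is the behavioural signature the paper maps onto cortico-basal-ganglia dynamics.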
Jones, Michael N.
2017-01-01
A central goal of cognitive neuroscience is to decode human brain activity—that is, to infer mental processes from observed patterns of whole-brain activation. Previous decoding efforts have focused on classifying brain activity into a small set of discrete cognitive states. To attain maximal utility, a decoding framework must be open-ended, systematic, and context-sensitive—that is, capable of interpreting numerous brain states, presented in arbitrary combinations, in light of prior information. Here we take steps towards this objective by introducing a probabilistic decoding framework based on a novel topic model—Generalized Correspondence Latent Dirichlet Allocation—that learns latent topics from a database of over 11,000 published fMRI studies. The model produces highly interpretable, spatially-circumscribed topics that enable flexible decoding of whole-brain images. Importantly, the Bayesian nature of the model allows one to “seed” decoder priors with arbitrary images and text—enabling researchers, for the first time, to generate quantitative, context-sensitive interpretations of whole-brain patterns of brain activity. PMID:29059185
Data Assimilation - Advances and Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Brian J.
2014-07-30
This presentation provides an overview of data assimilation (model calibration) for complex computer experiments. Calibration refers to the process of probabilistically constraining uncertain physics/engineering model inputs to be consistent with observed experimental data. An initial probability distribution for these parameters is updated using the experimental information. Utilization of surrogate models and empirical adjustment for model form error in code calibration form the basis for the statistical methodology considered. The role of probabilistic code calibration in supporting code validation is discussed. Incorporation of model form uncertainty in rigorous uncertainty quantification (UQ) analyses is also addressed. Design criteria used within a batch sequential design algorithm are introduced for efficiently achieving predictive maturity and improved code calibration. Predictive maturity refers to obtaining stable predictive inference with calibrated computer codes. These approaches allow for augmentation of initial experiment designs for collecting new physical data. A standard framework for data assimilation is presented and techniques for updating the posterior distribution of the state variables based on particle filtering and the ensemble Kalman filter are introduced.
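The ensemble Kalman filter update mentioned above can be sketched for a scalar state with a direct observation (the prior, observation, and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Prior ensemble for a scalar state (e.g., one uncertain model parameter).
ens = rng.normal(2.0, 1.0, size=500)

obs, obs_var = 3.0, 0.5**2   # observed value and its error variance

# Stochastic EnKF update: perturb the observation once per ensemble member
# so the posterior ensemble carries the right spread.
perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=ens.size)
gain = ens.var() / (ens.var() + obs_var)   # Kalman gain (observation H = 1)
ens_post = ens + gain * (perturbed - ens)

# The posterior mean moves toward the observation and the variance shrinks,
# which is the "probabilistically constraining" step of calibration.
```

With a prior variance of about 1 and observation variance 0.25, the gain is near 0.8, so the posterior mean lands roughly four-fifths of the way from the prior mean to the observation.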
Inverse Problems in Complex Models and Applications to Earth Sciences
NASA Astrophysics Data System (ADS)
Bosch, M. E.
2015-12-01
The inference of the subsurface earth structure and properties requires the integration of different types of data, information and knowledge, by combined processes of analysis and synthesis. To support the process of integrating information, the regular concept of data inversion is evolving to expand its application to models with multiple inner components (properties, scales, structural parameters) that explain multiple data (geophysical survey data, well-logs, core data). The probabilistic inference methods provide the natural framework for the formulation of these problems, considering a posterior probability density function (PDF) that combines the information from a prior information PDF and the new sets of observations. To formulate the posterior PDF in the context of multiple datasets, the data likelihood functions are factorized assuming independence of uncertainties for data originating across different surveys. A realistic description of the earth medium requires modeling several properties and structural parameters, which relate to each other according to dependency and independency notions. Thus, conditional probabilities across model components also factorize. A common setting proceeds by structuring the model parameter space in hierarchical layers. A primary layer (e.g. lithology) conditions a secondary layer (e.g. physical medium properties), which conditions a third layer (e.g. geophysical data). In general, less structured relations within model components and data emerge from the analysis of other inverse problems. They can be described with flexibility via direct acyclic graphs, which are graphs that map dependency relations between the model components. Examples of inverse problems in complex models can be shown at various scales. At local scale, for example, the distribution of gas saturation is inferred from pre-stack seismic data and a calibrated rock-physics model. 
At regional scale, joint inversion of gravity and magnetic data is applied for the estimation of lithological structure of the crust, with the lithotype body regions conditioning the mass density and magnetic susceptibility fields. At planetary scale, the Earth mantle temperature and element composition is inferred from seismic travel-time and geodetic data.
Quantum Graphical Models and Belief Propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leifer, M.S.; Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo Ont., N2L 2Y5; Poulin, D.
Belief Propagation algorithms acting on Graphical Models of classical probability distributions, such as Markov Networks, Factor Graphs and Bayesian Networks, are amongst the most powerful known methods for deriving probabilistic inferences amongst large numbers of random variables. This paper presents a generalization of these concepts and methods to the quantum case, based on the idea that quantum theory can be thought of as a noncommutative, operator-valued, generalization of classical probability theory. Some novel characterizations of quantum conditional independence are derived, and definitions of Quantum n-Bifactor Networks, Markov Networks, Factor Graphs and Bayesian Networks are proposed. The structure of Quantum Markov Networks is investigated and some partial characterization results are obtained, along the lines of the Hammersley-Clifford theorem. A Quantum Belief Propagation algorithm is presented and is shown to converge on 1-Bifactor Networks and Markov Networks when the underlying graph is a tree. The use of Quantum Belief Propagation as a heuristic algorithm in cases where it is not known to converge is discussed. Applications to decoding quantum error correcting codes and to the simulation of many-body quantum systems are described.
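The classical sum-product machinery being generalized here can be sketched on the smallest possible tree: two binary variables joined by one pairwise potential. A message from A to B, combined with B's own potential, reproduces the exact marginal obtained by brute-force summation over the joint (potentials illustrative):

```python
import numpy as np

# Two binary variables on one edge of a tree-structured Markov network.
phi_a = np.array([0.6, 0.4])            # unary potential on A
phi_b = np.array([0.3, 0.7])            # unary potential on B
psi = np.array([[0.9, 0.1],
                [0.2, 0.8]])            # pairwise potential psi[a, b]

# Sum-product message from A to B: sum out A against the pairwise potential.
msg_a_to_b = (phi_a[:, None] * psi).sum(axis=0)

# Belief at B: local potential times incoming message, then normalize.
marg_b = phi_b * msg_a_to_b
marg_b /= marg_b.sum()

# Brute-force check: enumerate the joint and sum out A.
joint = phi_a[:, None] * phi_b[None, :] * psi
brute = joint.sum(axis=0) / joint.sum()
```

On trees this agreement is exact, which is the classical fact the paper's convergence results for Quantum Belief Propagation parallel.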
NASA Astrophysics Data System (ADS)
Schöberl, Markus; Zabaras, Nicholas; Koutsourelakis, Phaedon-Stelios
2017-03-01
We propose a data-driven, coarse-graining formulation in the context of equilibrium statistical mechanics. In contrast to existing techniques which are based on a fine-to-coarse map, we adopt the opposite strategy by prescribing a probabilistic coarse-to-fine map. This corresponds to a directed probabilistic model where the coarse variables play the role of latent generators of the fine scale (all-atom) data. From an information-theoretic perspective, the framework proposed provides an improvement upon the relative entropy method [1] and is capable of quantifying the uncertainty due to the information loss that unavoidably takes place during the coarse-graining process. Furthermore, it can be readily extended to a fully Bayesian model where various sources of uncertainties are reflected in the posterior of the model parameters. The latter can be used to produce not only point estimates of fine-scale reconstructions or macroscopic observables, but more importantly, predictive posterior distributions on these quantities. Predictive posterior distributions reflect the confidence of the model as a function of the amount of data and the level of coarse-graining. The issues of model complexity and model selection are seamlessly addressed by employing a hierarchical prior that favors the discovery of sparse solutions, revealing the most prominent features in the coarse-grained model. A flexible and parallelizable Monte Carlo - Expectation-Maximization (MC-EM) scheme is proposed for carrying out inference and learning tasks. A comparative assessment of the proposed methodology is presented for a lattice spin system and the SPC/E water model.
Probabilistic models for neural populations that naturally capture global coupling and criticality
2017-01-01
Advances in multi-unit recordings pave the way for statistical modeling of activity patterns in large neural populations. Recent studies have shown that the summed activity of all neurons strongly shapes the population response. A separate recent finding has been that neural populations also exhibit criticality, an anomalously large dynamic range for the probabilities of different population activity patterns. Motivated by these two observations, we introduce a class of probabilistic models which takes into account the prior knowledge that the neural population could be globally coupled and close to critical. These models consist of an energy function which parametrizes interactions between small groups of neurons, and an arbitrary positive, strictly increasing, and twice differentiable function which maps the energy of a population pattern to its probability. We show that: 1) augmenting a pairwise Ising model with a nonlinearity yields an accurate description of the activity of retinal ganglion cells which outperforms previous models based on the summed activity of neurons; 2) prior knowledge that the population is critical translates to prior expectations about the shape of the nonlinearity; 3) the nonlinearity admits an interpretation in terms of a continuous latent variable globally coupling the system whose distribution we can infer from data. Our method is independent of the underlying system’s state space; hence, it can be applied to other systems such as natural scenes or amino acid sequences of proteins which are also known to exhibit criticality. PMID:28926564
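The model class described above — a pairwise Ising energy pushed through a positive, strictly increasing nonlinearity instead of the usual exponential — can be sketched on a population small enough to enumerate. All parameter values are random placeholders, and the softplus nonlinearity is one illustrative choice satisfying the paper's conditions:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(5)
n = 5  # small population so all 2**n activity patterns can be enumerated

# Pairwise Ising energy E(x) = -h.x - x'Jx (parameters illustrative).
h = rng.normal(0.0, 0.5, size=n)
J = np.triu(rng.normal(0.0, 0.3, size=(n, n)), 1)  # upper triangle: i < j pairs

def energy(x):
    return -(h @ x) - x @ J @ x

patterns = np.array(list(product([0, 1], repeat=n)), dtype=float)
E = np.array([energy(x) for x in patterns])

# Standard Ising model: P(x) proportional to exp(-E(x)).
p_ising = np.exp(-E)
p_ising /= p_ising.sum()

# The paper's class: replace exp with any positive, strictly increasing,
# twice-differentiable function of -E; softplus is one such example.
p_nonlin = np.log1p(np.exp(-E))
p_nonlin /= p_nonlin.sum()
```

Both constructions assign every pattern a positive probability ordered by energy; the nonlinearity only reshapes how steeply probability falls with energy, which is what lets the augmented model capture global coupling and near-critical statistics.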
Lähdesmäki, Harri; Hautaniemi, Sampsa; Shmulevich, Ilya; Yli-Harja, Olli
2006-01-01
A significant amount of attention has recently been focused on modeling of gene regulatory networks. Two frequently used large-scale modeling frameworks are Bayesian networks (BNs) and Boolean networks, the latter one being a special case of its recent stochastic extension, probabilistic Boolean networks (PBNs). PBN is a promising model class that generalizes the standard rule-based interactions of Boolean networks into the stochastic setting. Dynamic Bayesian networks (DBNs) are a general and versatile model class that is able to represent complex temporal stochastic processes and has also been proposed as a model for gene regulatory systems. In this paper, we concentrate on these two model classes and demonstrate that PBNs and a certain subclass of DBNs can represent the same joint probability distribution over their common variables. The major benefit of introducing the relationships between the models is that it opens up the possibility of applying the standard tools of DBNs to PBNs and vice versa. Hence, the standard learning tools of DBNs can be applied in the context of PBNs, and the inference methods give a natural way of handling the missing values in PBNs which are often present in gene expression measurements. Conversely, the tools for controlling the stationary behavior of the networks, tools for projecting networks onto sub-networks, and efficient learning schemes can be used for DBNs. In other words, the introduced relationships between the models extend the collection of analysis tools for both model classes. PMID:17415411
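The PBN–DBN correspondence can be made concrete with a tiny example: in an independent PBN, each gene picks one of its Boolean predictor functions with a selection probability, and the resulting conditional P(x'_i | x) is exactly the per-node conditional distribution a DBN would attach to that gene in the next time slice. The two-gene network, the predictor functions, and their selection probabilities below are invented for illustration.

```python
import itertools

# Minimal independent PBN on two genes. Each gene updates via one of its
# Boolean predictors, chosen with the listed selection probability.
predictors = {
    0: [((lambda x: x[1]), 0.7),          # gene 0 copies gene 1 ...
        ((lambda x: 1 - x[0]), 0.3)],     # ... or negates itself
    1: [((lambda x: x[0] & x[1]), 1.0)],  # gene 1 is the AND of both genes
}

def transition_prob(x, x_next):
    """P(x' | x) = prod_i sum_f c_f * [f(x) == x'_i].

    This product over genes is precisely the factorized transition
    distribution of the corresponding dynamic Bayesian network.
    """
    p = 1.0
    for i, funcs in predictors.items():
        p *= sum(c for f, c in funcs if f(x) == x_next[i])
    return p

states = list(itertools.product([0, 1], repeat=2))
rows_ok = all(abs(sum(transition_prob(s, t) for t in states) - 1.0) < 1e-12
              for s in states)
```

Each row of the implied 4x4 state-transition matrix sums to one, so the PBN defines a proper Markov chain, i.e. a valid DBN transition model.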
Students’ difficulties in probabilistic problem-solving
NASA Astrophysics Data System (ADS)
Arum, D. P.; Kusmayadi, T. A.; Pramudya, I.
2018-03-01
Many errors can be identified when students solve mathematics problems, particularly probabilistic problems. The present study aims to investigate students' difficulties in solving probabilistic problems, focusing on analyzing and describing the errors students make while solving them. The research used a qualitative method with a case study strategy. The subjects were ten 9th-grade students selected by purposive sampling. The data comprise students' probabilistic problem-solving results and recorded interviews about their difficulties in solving the problems. These data were analyzed descriptively following the steps of Miles and Huberman. The results show that students' difficulties in solving probabilistic problems fall into three categories: first, difficulties in understanding the probabilistic problem; second, difficulties in choosing and using appropriate solution strategies; and third, difficulties with the computational process. The results indicate that students still struggle with probabilistic problems and are not yet able to apply their knowledge and abilities to them. It is therefore important for mathematics teachers to plan probabilistic lessons that optimize students' probabilistic thinking ability.
1994-06-01
“Conditional events” as well-defined objects as in De Finetti [14], Gilio [15]. When the strength of the rule b→a is computed in the context of... uncertain outcome (see, e.g., McGee [5]) or a coherency argument in the sense of De Finetti as employed by Gilio et al. [15], [17] or Coletti et al. [18]... probability through a scoring characterization, extending De Finetti's coherency principle. (See also Gilio et al. [17] for additional results
Disruption of the Right Temporoparietal Junction Impairs Probabilistic Belief Updating.
Mengotti, Paola; Dombert, Pascasie L; Fink, Gereon R; Vossel, Simone
2017-05-31
Generating and updating probabilistic models of the environment is a fundamental modus operandi of the human brain. Although crucial for various cognitive functions, the neural mechanisms of these inference processes remain to be elucidated. Here, we show the causal involvement of the right temporoparietal junction (rTPJ) in updating probabilistic beliefs and we provide new insights into the chronometry of the process by combining online transcranial magnetic stimulation (TMS) with computational modeling of behavioral responses. Female and male participants performed a modified location-cueing paradigm, where false information about the percentage of cue validity (%CV) was provided in half of the experimental blocks to prompt updating of prior expectations. Online double-pulse TMS over rTPJ 300 ms (but not 50 ms) after target appearance selectively decreased participants' updating of false prior beliefs concerning %CV, reflected in a decreased learning rate of a Rescorla-Wagner model. Online TMS over rTPJ also impacted on participants' explicit beliefs, causing them to overestimate %CV. These results confirm the involvement of rTPJ in updating of probabilistic beliefs, thereby advancing our understanding of this area's function during cognitive processing. SIGNIFICANCE STATEMENT Contemporary views propose that the brain maintains probabilistic models of the world to minimize surprise about sensory inputs. Here, we provide evidence that the right temporoparietal junction (rTPJ) is causally involved in this process. Because neuroimaging has suggested that rTPJ is implicated in divergent cognitive domains, the demonstration of an involvement in updating internal models provides a novel unifying explanation for these findings. We used computational modeling to characterize how participants change their beliefs after new observations. 
By interfering with rTPJ activity through online transcranial magnetic stimulation, we showed that participants were less able to update prior beliefs with TMS delivered at 300 ms after target onset. Copyright © 2017 the authors 0270-6474/17/375419-10$15.00/0.
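The Rescorla-Wagner model used to quantify belief updating in this study reduces to a delta-rule update of the estimated cue validity. The sketch below is a generic implementation with invented trial sequences and learning rates; the study's fitted parameter values are not reproduced here.

```python
def rescorla_wagner(outcomes, alpha=0.3, v0=0.5):
    """Track estimated cue validity v via delta-rule updates v += alpha * (o - v).

    outcomes: per-trial cue validity, 1 = valid trial, 0 = invalid trial.
    alpha: learning rate; v0: initial (instructed) belief about %CV.
    """
    v = v0
    trace = [v]
    for o in outcomes:
        v += alpha * (o - v)
        trace.append(v)
    return trace

# Hypothetical block: participant told %CV = 50% (v0 = 0.5), but cues are
# actually valid on 80% of trials, so beliefs should drift upward.
outcomes = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
trace = rescorla_wagner(outcomes)

# A lowered learning rate -- the behavioral signature of TMS over rTPJ at
# 300 ms in this study -- slows the drift away from the false prior.
trace_tms = rescorla_wagner(outcomes, alpha=0.05)
```

With the smaller learning rate, the final belief stays much closer to the false instructed value of 0.5 instead of converging toward the true 0.8, mirroring the reported impairment in updating false prior beliefs.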
The computational nature of memory modification
Gershman, Samuel J; Monfils, Marie-H; Norman, Kenneth A; Niv, Yael
2017-01-01
Retrieving a memory can modify its influence on subsequent behavior. We develop a computational theory of memory modification, according to which modification of a memory trace occurs through classical associative learning, but which memory trace is eligible for modification depends on a structure learning mechanism that discovers the units of association by segmenting the stream of experience into statistically distinct clusters (latent causes). New memories are formed when the structure learning mechanism infers that a new latent cause underlies current sensory observations. By the same token, old memories are modified when old and new sensory observations are inferred to have been generated by the same latent cause. We derive this framework from probabilistic principles, and present a computational implementation. Simulations demonstrate that our model can reproduce the major experimental findings from studies of memory modification in the Pavlovian conditioning literature. DOI: http://dx.doi.org/10.7554/eLife.23763.001 PMID:28294944
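The core inference step of the latent-cause framework, deciding whether a new observation was generated by an old latent cause (triggering memory modification) or by a new one (forming a new memory), can be sketched with a Chinese-restaurant-process prior and a Gaussian likelihood. All distributions and parameter values here are illustrative stand-ins for the paper's full model.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def latent_cause_posterior(obs, cause_means, cause_counts,
                           alpha=1.0, sd=1.0, sd_new=10.0):
    """Posterior over assigning an observation to an old cause or a new one.

    CRP prior: an old cause is weighted by how often it was inferred before
    (cause_counts); a new cause by the concentration parameter alpha. The new
    cause uses a broad predictive distribution (sd_new), since its mean is
    unknown. Returned list: one probability per old cause, then the new cause.
    """
    total = sum(cause_counts) + alpha
    scores = [(c / total) * normal_pdf(obs, m, sd)
              for m, c in zip(cause_means, cause_counts)]
    scores.append((alpha / total) * normal_pdf(obs, 0.0, sd_new))
    Z = sum(scores)
    return [s / Z for s in scores]

# An observation near an existing cause's mean is attributed to that cause
# (the old memory becomes eligible for modification) ...
post_near = latent_cause_posterior(2.1, cause_means=[2.0], cause_counts=[10])

# ... while a very different observation favors inferring a new latent cause
# (a new memory is formed).
post_far = latent_cause_posterior(8.0, cause_means=[2.0], cause_counts=[10])
```

This single assignment step is what, in the full model, gates whether classical associative learning updates an old memory trace or starts a new one.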
Probabilistic cosmological mass mapping from weak lensing shear
Schneider, M. D.; Ng, K. Y.; Dawson, W. A.; ...
2017-04-10
Here, we infer gravitational lensing shear and convergence fields from galaxy ellipticity catalogs under a spatial process prior for the lensing potential. We demonstrate the performance of our algorithm with simulated Gaussian-distributed cosmological lensing shear maps and a reconstruction of the mass distribution of the merging galaxy cluster Abell 781 using galaxy ellipticities measured with the Deep Lens Survey. Given interim posterior samples of lensing shear or convergence fields on the sky, we describe an algorithm to infer cosmological parameters via lens field marginalization. In the most general formulation of our algorithm we make no assumptions about weak shear or Gaussian-distributed shape noise or shears. Because we require solutions and matrix determinants of a linear system of dimension that scales with the number of galaxies, we expect our algorithm to require parallel high-performance computing resources for application to ongoing wide field lensing surveys.
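A drastically simplified version of the inference problem, estimating a single constant shear from noisy galaxy ellipticities under a Gaussian prior, has a closed-form conjugate posterior and shows why the number of galaxies drives both the estimate and its uncertainty. The prior and noise scales below are typical ballpark values chosen for illustration, not the paper's spatial-process model.

```python
def shear_posterior(ellipticities, sigma_p=0.05, sigma_e=0.3):
    """Posterior for a constant shear g with prior g ~ N(0, sigma_p^2) and
    measurements e_i = g + noise, noise ~ N(0, sigma_e^2).

    Conjugate Gaussian update: precisions add, and the posterior mean is the
    precision-weighted combination of the prior mean (0) and the data.
    """
    n = len(ellipticities)
    post_prec = 1.0 / sigma_p ** 2 + n / sigma_e ** 2
    post_mean = (sum(ellipticities) / sigma_e ** 2) / post_prec
    return post_mean, post_prec ** -0.5  # (posterior mean, posterior std. dev.)

# Many noisy galaxies pull the posterior toward the sample mean with small
# uncertainty; few galaxies leave it shrunk toward the zero-shear prior.
mean_many, sd_many = shear_posterior([0.04] * 400)
mean_few, sd_few = shear_posterior([0.04] * 4)
```

In the actual algorithm the unknown is an entire correlated shear field rather than one scalar, which is why the linear systems scale with the galaxy count and demand high-performance computing.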